AMDGPU: Remove dead code configuring f16 is_fpclass
isTypeLegal can never be true here. The register classes
are registered at the end of the target lowering constructor,
and in the subclasses.
libc++ inttypes.h: define __STDC_CONSTANT_MACROS and __STDC_LIMIT_MACROS
Before transitively including the base version of inttypes.h, define
__STDC_CONSTANT_MACROS and __STDC_LIMIT_MACROS, because the base
inttypes.h directly includes sys/stdint.h, instead of going through the
'regular' stdint.h.
The libc++ version of the latter does define those macros, to ensure
things like UINT64_C() and SIZE_MAX are defined even in C++98 or C++03.
MFC after: 3 days
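A minimal sketch of the pattern described above, for illustration only: the translation unit defines both macros before its first include of `inttypes.h`, so `UINT64_C()` and `SIZE_MAX` are visible even where the C library header gates them on those macros. The helper names `big_constant` and `largest_size` are hypothetical, and this is not the actual libc++ header change.

```cpp
// Define the gating macros before the first include of inttypes.h.
#ifndef __STDC_LIMIT_MACROS
#define __STDC_LIMIT_MACROS
#endif
#ifndef __STDC_CONSTANT_MACROS
#define __STDC_CONSTANT_MACROS
#endif
#include <inttypes.h>
#include <stddef.h>

// Hypothetical helper: relies on UINT64_C() being defined.
uint64_t big_constant() { return UINT64_C(0xFFFFFFFFFFFFFFFF); }

// Hypothetical helper: relies on SIZE_MAX being defined.
size_t largest_size() { return SIZE_MAX; }
```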
[NFCI][AMDGPU] Fix the predicate `HasDsSrc2Insts`
I'm not sure why the predicate has a `!`, and more surprisingly, removing it doesn't change anything.
[AMDGPU][GFX1250] Optimize s_wait_xcnt for back-to-back atomic RMWs
This patch optimizes the insertion of s_wait_xcnt instruction for
sequences of atomic read-modify-write (RMW) operations in the
SIInsertWaitcnts pass. The Memory Legalizer conservatively inserts a
soft xcnt instruction before each atomic RMW operation as part of PR
168852, which is correct given the nature of atomic operations.
However, for back-to-back atomic RMWs, only the first s_wait_xcnt is
necessary, and dropping the rest improves runtime performance. This
patch tracks atomic RMW blocks within each basic block and removes
redundant soft xcnt
instructions, keeping only the first wait in each sequence. An atomic
RMW block continues through subsequent atomic RMWs and non-memory
instructions (e.g., ALU operations) but is broken by CU-scoped memory
operations, atomic stores, or basic block boundaries.
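The block-tracking rule above can be sketched on a toy instruction stream. This is a simplified model, not the SIInsertWaitcnts code: the `Kind` enum and `pruneSoftXcntWaits` are hypothetical names, only the instruction categories named in the message are modeled, and soft waits are assumed to appear only in front of atomic RMWs.

```cpp
#include <vector>

// Simplified instruction kinds; a hypothetical model, not LLVM's MachineInstr.
enum class Kind {
  SoftXcntWait,   // soft s_wait_xcnt inserted by the Memory Legalizer
  AtomicRMW,      // atomic read-modify-write
  NonMemory,      // e.g. an ALU operation
  CUScopedMemory, // CU-scoped memory operation
  AtomicStore     // atomic store
};

// Returns one basic block with the redundant soft waits removed.
std::vector<Kind> pruneSoftXcntWaits(const std::vector<Kind> &block) {
  std::vector<Kind> out;
  // State is local to the call, so it resets at each basic block boundary.
  bool inRmwBlock = false;
  for (Kind k : block) {
    switch (k) {
    case Kind::SoftXcntWait:
      if (inRmwBlock)
        continue; // the first wait of this sequence already covers it
      break;
    case Kind::AtomicRMW:
      inRmwBlock = true; // starts or extends the atomic RMW block
      break;
    case Kind::NonMemory:
      break; // does not interrupt the block
    case Kind::CUScopedMemory:
    case Kind::AtomicStore:
      inRmwBlock = false; // breaks the block; the next wait is kept
      break;
    }
    out.push_back(k);
  }
  return out;
}
```

In this model a stream of wait, RMW, ALU, wait, RMW keeps only the first wait, while a wait that follows an atomic store is kept because the store ends the block.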
[SPIR-V] Implement sample and sample_clamp intrinsics for HLSL resources (#177234)
This patch implements the `sample` and `sample_clamp` intrinsics for HLSL
resources in the SPIR-V backend. It adds the necessary intrinsic definitions
in `IntrinsicsDirectX.td` and `IntrinsicsSPIRV.td`, and implements the
instruction selection logic in `SPIRVInstructionSelector.cpp`.
Key changes:
- Added `int_dx_resource_sample` and `int_dx_resource_sample_clamp`
  intrinsics.
- Added `int_spv_resource_sample` and `int_spv_resource_sample_clamp`
  intrinsics.
- Implemented `selectSampleIntrinsic` to handle
  `OpImageSampleImplicitLod` generation.
- Added `ResourceDimension` enum in `DXILABI.h` and `HLSLResource.h`.
- Added a new test case
  `llvm/test/CodeGen/SPIRV/hlsl-resources/Sample.ll` to verify the
  implementation.
[flang] Support -f(no-)protect-parens (#170505)
Driver/compiler option plumbing to get -f(no-)protect-parens supported
in flang. (This option was already supported in clang, so this extends
the option config to enable it in flang.)
In the compiler, support it in code gen options and in lowering options.
Hooked up lowering options with the code by @alexey-bataev that turns
off reassociation transformations.
Co-authored-by: Alexey Bataev <a.bataev at outlook.com>
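Why reassociation matters for parenthesized floating-point expressions can be shown with a small sketch. It is written in C++ rather than Fortran, the function names are hypothetical, and it only illustrates the arithmetic effect, not the flang lowering itself.

```cpp
// Evaluates (a + b) + c, honoring the parentheses as written.
double sumAsWritten(double a, double b, double c) { return (a + b) + c; }

// Evaluates a + (b + c), one possible reassociation of the same expression.
double sumReassociated(double a, double b, double c) { return a + (b + c); }
```

With a = 1e30, b = -1e30 and c = 1.0 under IEEE 754 double arithmetic (no fast-math style flags), `sumAsWritten` yields 1.0 while `sumReassociated` yields 0.0, because the 1.0 is absorbed when it is added to -1e30 first.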
[lldb] Unconditionally setup posix spawn responsible flag (#177451)
# Problem
The TCC support in LLDB was added by
https://github.com/llvm/llvm-project/commit/041c7b84a4b925476d1e21ed302786033bb6035f.
However, on newer macOS machines, when launching and debugging a
Catalyst app on macOS (see
[Host.mm](https://github.com/llvm/llvm-project/blob/1286de408cc4a3ba1bd6cb6fed7d9517c0429462/lldb/source/Host/macosx/objcxx/Host.mm#L1208-L1219)),
the TCC doesn't work as expected. This is because, even though the
launch info doesn't specify `eLaunchFlagInheritTCCFromParent`, the app
is still launched to inherit TCC from its parent (LLDB). This
prevents the user from granting privacy access to the Catalyst app,
which is usually reflected in macOS' "Privacy & Security" settings.
For example, in the following screenshot (see PR), even when microphone
access has already been granted to WhatsApp, trying to use it still
triggers a prompt (as if access had not been granted).
[13 lines not shown]
[clang-tidy] Add a new check 'modernize-use-string-view' (#172170)
Looks for functions returning `std::[w|u8|u16|u32]string` and suggests
changing the return type to `std::[...]string_view` where possible and
profitable.
Example:
```cpp
#include <string>

std::string foo(int i) { // <---- can be replaced with `std::string_view foo(...) {`
  switch (i) {
  case 1:
    return "case1";
  case 2:
    return "case2";
  default:
    return {};
  }
}
```
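For comparison, a hand-written sketch of what the function might look like once the suggestion is applied. This assumes C++17 for `std::string_view` and is not taken from the check's actual fix-it output.

```cpp
#include <string_view>

// Every return value is a string literal or empty, so a non-owning view
// is enough and no std::string has to be constructed.
std::string_view foo(int i) {
  switch (i) {
  case 1:
    return "case1";
  case 2:
    return "case2";
  default:
    return {};
  }
}
```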
Fix SATA NCQ error recovery after 25375b1415
Since that commit, the ahci(4), siis(4) and mvs(4) drivers ended up
using the wrong command to fetch error information for NCQ commands.
Since ATA errors are not very informative to begin with, the only
noticeable effect is a lack of retries on those errors by CAM.
MFC after: 1 week
PR: 279978
(cherry picked from commit 87085c12ba8fa51f777bc636df67008b45e20d1c)
[Github] Add initial workflow to prune unused user branches
This patch starts implementing the long-requested feature of a workflow
to automatically prune user branches that are not tied to a PR. For now
this just consists of a script that finds user branches not attached to
a PR and prints them out. Future patches will add support for dumping
the diff between the branches and main (to persist to artifact storage
so people can recover if they intended to use the branch) and actually
deleting the branches.
Reviewers: cmtice, tstellar, petrhosek, vbvictor
Pull Request: https://github.com/llvm/llvm-project/pull/175693
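The filtering step described above can be sketched as follows. This is an illustration only, not the script from the patch: the function name is hypothetical, the `users/` prefix is an assumed naming convention for user branches, and the branch and PR lists are plain inputs rather than results of GitHub API calls.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the user branches that no pull request uses as its head branch.
// The "users/" prefix is an assumed naming convention for user branches.
std::vector<std::string> findUnattachedUserBranches(
    const std::vector<std::string> &branches,
    const std::set<std::string> &prHeadBranches) {
  const std::string prefix = "users/";
  std::vector<std::string> unattached;
  for (const std::string &name : branches) {
    const bool isUserBranch = name.compare(0, prefix.size(), prefix) == 0;
    if (isUserBranch && prHeadBranches.count(name) == 0)
      unattached.push_back(name); // candidate for reporting, not deletion
  }
  return unattached;
}
```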
arm/gic: Detect broken configurations
Some virtualization platforms provide broken configurations: a GIC
interrupt controller is present, but accessing its CPU interface
registers leads to an external data abort. As these registers are
needed to handle interrupts, we are unable to boot further.
Detect this misconfiguration and panic to tell the user about the issue.
Reviewed by: emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D54832
[MLIR][XeGPU] Add two 8bit float types F8E4M3FN and F8E5M2 to valid XeGPU float type. (#169420)
These float types are already part of MLIR's built-in types.
This PR just adds them as valid float types for the XeGPU dialect.
For the bit format of the two float types, see
https://onnx.ai/onnx/technical/float8.html
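As a rough illustration of the two bit formats, here is a sketch of decoders to `float`. It assumes the layouts as commonly documented: F8E4M3FN has 4 exponent bits, 3 mantissa bits, bias 7, no infinities, and NaN only when all exponent and mantissa bits are set; F8E5M2 has 5 exponent bits, 2 mantissa bits, bias 15, and IEEE-style infinities and NaNs. The function names are hypothetical and are not part of MLIR or XeGPU.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Decodes an F8E4M3FN value (assumed layout: 1 sign, 4 exponent, 3 mantissa).
float decodeF8E4M3FN(std::uint8_t bits) {
  const float sign = (bits & 0x80) ? -1.0f : 1.0f;
  const int exp = (bits >> 3) & 0xF;
  const int man = bits & 0x7;
  if (exp == 0xF && man == 0x7)
    return std::numeric_limits<float>::quiet_NaN(); // the only NaN pattern
  if (exp == 0)
    return sign * std::ldexp(static_cast<float>(man), -9); // man/8 * 2^-6
  return sign * std::ldexp(static_cast<float>(8 + man), exp - 7 - 3);
}

// Decodes an F8E5M2 value (assumed layout: 1 sign, 5 exponent, 2 mantissa).
float decodeF8E5M2(std::uint8_t bits) {
  const float sign = (bits & 0x80) ? -1.0f : 1.0f;
  const int exp = (bits >> 2) & 0x1F;
  const int man = bits & 0x3;
  if (exp == 0x1F)
    return man == 0 ? sign * std::numeric_limits<float>::infinity()
                    : std::numeric_limits<float>::quiet_NaN();
  if (exp == 0)
    return sign * std::ldexp(static_cast<float>(man), -16); // man/4 * 2^-14
  return sign * std::ldexp(static_cast<float>(4 + man), exp - 15 - 2);
}
```

Under these assumptions the largest finite values are 448 for F8E4M3FN (bit pattern 0x7E) and 57344 for F8E5M2 (bit pattern 0x7B).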
[PowerPC] Fix instruction sizes / branch relaxation (#175556)
For PowerPC, having accurate (or at least not too small) instruction
sizes is critical, because the PPCBranchSelector pass relies on them.
Underestimating the size of an instruction can result in the wrong
branch kind being chosen, which will result in an MC error.
This patch introduces validation that the instruction size reported by
TII matches the size of the instruction actually emitted, and fixes
various cases where the two differed.
Fixes https://github.com/llvm/llvm-project/issues/175190.
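A toy model of why an underestimated size is harmful, and of the kind of check the validation performs. Everything here is hypothetical: the struct and function names are invented for illustration, the short branch form is assumed to reach a signed 16-bit byte displacement, and none of this is the PPCBranchSelector or TII code.

```cpp
#include <cstdint>
#include <vector>

struct InstrSize {
  std::uint32_t reported; // size claimed by the size query, in bytes
  std::uint32_t emitted;  // size actually produced by the emitter, in bytes
};

// Assumed reach of the short branch form: a signed 16-bit byte displacement.
bool fitsShortBranch(std::int64_t distance) {
  return distance >= -32768 && distance <= 32767;
}

// Distance across the body as the branch selection would estimate it.
std::int64_t reportedDistance(const std::vector<InstrSize> &body) {
  std::int64_t total = 0;
  for (const InstrSize &i : body)
    total += i.reported;
  return total;
}

// Distance across the body as it really ends up in the object file.
std::int64_t emittedDistance(const std::vector<InstrSize> &body) {
  std::int64_t total = 0;
  for (const InstrSize &i : body)
    total += i.emitted;
  return total;
}

// Index of the first instruction whose reported size differs from its
// emitted size, or -1 if all sizes agree.
int firstSizeMismatch(const std::vector<InstrSize> &body) {
  for (std::size_t idx = 0; idx < body.size(); ++idx)
    if (body[idx].reported != body[idx].emitted)
      return static_cast<int>(idx);
  return -1;
}
```

If the reported distance fits the short form but the emitted distance does not, the short branch would be chosen and then fail to encode, which corresponds to the MC error described above.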
[mlir][Utils] Add verifyElementTypesMatch helper (NFC) (#176668)
This change builds on #174336 and #175880, which introduced shared
VerificationUtils with verifyDynamicDimensionCount() and
verifyRanksMatch() methods.
This patch adds a new verifyElementTypesMatch() verification utility
that checks if two shaped types have matching element types and emits
consistent error messages. The utility is applied to several ops across
the MemRef and Vector dialects.
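The idea of such a shared check can be sketched without MLIR. The types and names below are hypothetical stand-ins: `ToyShapedType` is not an MLIR class, `checkElementTypesMatch` does not mirror the signature of the real utility, and the error wording is invented.

```cpp
#include <optional>
#include <string>

// Toy stand-in for a shaped type; not an MLIR class.
struct ToyShapedType {
  std::string elementType; // e.g. "f32"
  int rank;                // ignored by the element type check
};

// Returns an error message if the element types differ, std::nullopt
// otherwise. Centralizing the message keeps diagnostics consistent.
std::optional<std::string>
checkElementTypesMatch(const ToyShapedType &lhs, const ToyShapedType &rhs,
                       const std::string &lhsName,
                       const std::string &rhsName) {
  if (lhs.elementType == rhs.elementType)
    return std::nullopt;
  return lhsName + " element type (" + lhs.elementType +
         ") does not match " + rhsName + " element type (" +
         rhs.elementType + ")";
}
```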