[AMDGPU][ISel] Reduce `f64` compare to integer compare of upper half (#188356)
Truncate `f64` `setcc`s to upper 32-bit operands where possible.
These transformations are analogous to those in #181238, but for ordered
and unordered fp comparisons.
Fixes #187996.
Alive2 verification of transformations:
- For `eq` / `ne`: [ZRciR6](https://alive2.llvm.org/ce/z/ZRciR6)
- For `lt` / `ge`: [RDGnqr](https://alive2.llvm.org/ce/z/RDGnqr)
- For `le` / `gt`: [v0jlD5](https://alive2.llvm.org/ce/z/v0jlD5)
[lldb] Bring Debuginfod's StreamedHTTPResponseHandler to SymbolLocatorSymStore (#187687)
SymbolLocatorSymStore used a simple local implementation of
HTTPResponseHandler so far. That was fine for basic usage, but it would
cause issues down the line. This patch hoists the
StreamedHTTPResponseHandler class from libDebuginfod to SupportHTTP and
integrates it in SymbolLocatorSymStore. PDB file downloads will now be
buffered on disk, which is necessary since they can be huge.
We use the opportunity to stop logging 404 responses (file not found on
server) and print warnings for all other erroneous HTTP responses. It
was more complicated before, because the old response handler created
the underlying file in any case. The new one does that only once the
first content package comes in.
[LLVM][Support] add nonNull function helper (#188718)
We often see a pattern like:
```
T *ptr = doSomething();
assert(ptr && "doSomething() shouldn't return nullptr");
```
We also have functions like `cantFail`, but those work with
Expected types.
This commit adds a `nonNull` function, which can be used inline. In
practice, one could use:
```
T *ptr = cast<T>(functionReturningT());
```
But that suggests `functionReturningT` might return a
subtype/supertype that we actually need to cast.
[7 lines not shown]
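As a rough illustration of the idea (not the code from this commit), an
inline null-checking helper could look like the sketch below. The
signature and the assertion message are assumptions; the real helper in
LLVM Support may differ.

```cpp
#include <cassert>

// Hypothetical sketch of an inline null-check helper. The signature is an
// assumption made for illustration, not the one added by the commit.
template <typename T> T *nonNull(T *Ptr) {
  // Fires in builds with assertions enabled; compiles away under NDEBUG.
  assert(Ptr && "nonNull: unexpected null pointer");
  return Ptr;
}
```

With such a helper, the two-line pattern above collapses to
`T *ptr = nonNull(doSomething());` without implying any cast.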
[AMDGPU][CodeGen] Implement SimplifyDemandedBitsForTargetNode for readlane, wwm and set.inactive intrinsics. (#190830)
Propagate demanded bits through readlane, wwm, set.inactive intrinsics
in AMDGPUISelLowering in SimplifyDemandedBitsForTargetNode.
This allows upstream zero/sign extensions to be eliminated when only a
subset of bits is used after intrinsics.
Partially addresses https://github.com/llvm/llvm-project/issues/128390.
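The reasoning can be shown with a toy model in plain C++ (not the
DAG combine itself). It assumes readlane behaves as a pure bitwise move
of one lane's value; the 4-lane `Wave` type and all function names are
made up for this sketch. If the consumer only demands the low 16 bits,
a zero-extension applied before the readlane cannot change the result.

```cpp
#include <array>
#include <cstdint>

// Toy wave: one 32-bit value per lane. The lane count is an assumption.
using Wave = std::array<uint32_t, 4>;

// Models readlane as a pure data move: returns the value held by `Lane`.
inline uint32_t readLaneModel(const Wave &W, unsigned Lane) { return W[Lane]; }

// A consumer that only demands the low 16 bits of the readlane result.
inline uint16_t lowHalfAfterReadLane(const Wave &W, unsigned Lane) {
  return static_cast<uint16_t>(readLaneModel(W, Lane));
}

// Applies an explicit 16-bit zero-extension (mask) to every lane.
inline Wave zeroExtend16(Wave W) {
  for (uint32_t &V : W)
    V &= 0xFFFFu;
  return W;
}
```

Because `lowHalfAfterReadLane` gives the same answer with or without
`zeroExtend16`, the extension is dead in this model, which is the kind
of simplification demanded-bits propagation is meant to expose.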
[GlobalISel] Prevent hoisting of CheckIsSameOperand from creating invalid match tables
Fixes #188513
This patch adds logic to ask PredicateMatchers whether they'd like to be hoisted out of a specific Matcher or not.
SameOperandMatcher can use it to check if it's being hoisted out of the RuleMatcher that defines the operand it relies on.
Assisted-By: Claude Opus 4.6
Context of Use: Claude was only used to add LLVM-style RTTI to the matcher class (repetitive work). Claude-generated code was reviewed and cleaned up before committing.
[clang][CIR] Add lowering for vcvt_n_ and vcvtq_n_ conversion intrinsics
This PR adds lowering for the conversion intrinsics with an immediate
argument (identified by `_n_` in the intrinsic name), excluding FP16
variants.
It also moves the corresponding tests from:
* clang/test/CodeGen/AArch64/neon_intrinsics.c
to:
* clang/test/CodeGen/AArch64/neon/intrinsics.c
The lowering follows the existing implementation in
CodeGen/TargetBuiltins/ARM.cpp and adds the `getFloatNeonType` helper
to support it. The remaining changes are code motion and refactoring.
Reference:
[1] https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#conversions
[NVVM] Update properties for non-sync variants of the SHFL intrinsics (#189615)
Non-sync SHFL variants (shfl without .sync) are pure functions of their
SSA operands and the active thread mask. Assign IntrReadMem,
IntrInaccessibleMemOnly and IntrWillReturn so that:
- Reading the implicit mask state is modeled for correct ordering with
  other convergent operations
- Truly dead non-sync shfl code can still be DCE'd
Sync SHFL variants keep IntrInaccessibleMemOnly (no IntrReadMem, no
IntrWillReturn) to model synchronization side effects and prevent unsafe
DCE/reordering.
[NVPTX] Lower nvvm.fmax to maximumnum not maxnum (#189976)
Converting nvvm.{fmin/fmax} into llvm.{min/max}num is slightly
incorrect, as {min/max}(a, sNaN) should produce "a" according to the PTX
spec, but LLVM's {min/max}num intrinsics may return either NaN or "a".
Use the {min/max}imumnum intrinsics instead for correct sNaN behaviour.
Also tidy up NVVM FMin/FMax constant-folding using these tighter
definitions of how the NVVM intrinsics map to {min/max}imum and
{min/max}imumnum.
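For reference, the "a NaN operand yields the other operand" behaviour
described above can be modelled in scalar C++ roughly as follows. This
is only a sketch of the semantics, not the lowering or the
constant-folding code; the function name is made up.

```cpp
#include <cmath>
#include <limits>

// Scalar model of maximumnum-style semantics: a NaN operand (quiet or
// signaling) is ignored in favour of the other operand.
inline double maximumNumModel(double A, double B) {
  if (std::isnan(A))
    return B; // Also covers both-NaN: the result is then NaN.
  if (std::isnan(B))
    return A;
  if (A == B)
    return std::signbit(A) ? B : A; // Treat +0.0 as greater than -0.0.
  return A > B ? A : B;
}
```

In this model `maximumNumModel(a, sNaN)` always returns `a`, which is
the result the PTX spec asks for, whereas maxnum is allowed to return
NaN in that case.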
[lldb][AArch64][Linux] Add tests for SME only core files (#189985)
Part of #138717.
This did not require any changes to core file handling, since a static
snapshot of an SME only system looks pretty much the same as one from
the same state on a system with SVE and SME.
For this reason, we're only testing 2 combinations. In total these
include streaming and non-streaming, ZA on and off, and 2 different
vector lengths. I think this is enough to prove that the existing code
is working.
[AArch64][clang] Use tablegen rather than hard-coded feature dependencies
Refactor AArch64 frontend feature handling so extension relationships come
from the TargetParser extension graph instead of hand-written dependency
code in C++. This makes `llvm::AArch64::ExtensionSet` the source of
truth for dependency expansion while still keeping the short `Has...` names
used in the frontend code.
This removes a large amount of duplicated implication logic from
`handleTargetFeatures` and related feature queries. The frontend now
rebuilds its extension state from TableGen-derived data and then derives
its cached feature state from that, rather than maintaining parallel
dependency rules in C++.
I also preserved several pieces of historical frontend behaviour that are
not represented directly in the extension graph. Explicit disables such as
`no-sme` still win after implied-feature expansion, direct `+fullfp16` and
`+jscvt` still restore the expected NEON-facing state, and SME-family
features no longer incorrectly appear to enable AdvSIMD/NEON.
[4 lines not shown]
[orc-rt] Refactor QueueingTaskDispatcher to use an external TaskQueue. (#190920)
QueueingTaskDispatcher now takes a TaskQueue by reference rather than
maintaining an internal queue. This lets API clients retain direct
access to the queue after transferring dispatcher ownership to the
Session.
TaskQueue operations (takeFirstIn, takeLastIn) are blocking: callers
wait until a task arrives or the queue is shut down. This enables a
simple client idiom:
```
QueueingTaskDispatcher::TaskQueue TQ;
Session S(std::make_unique<QueueingTaskDispatcher>(TQ), ...);
S.attach(<controller access>);
while (auto T = TQ.takeFirstIn())
T->run();
```
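To make the blocking behaviour concrete, here is a minimal standalone
sketch of a queue with the same "wait for a task or shutdown" contract.
It is not the orc-rt TaskQueue: the class name, `push`, `shutdown` and
the use of `std::function` are assumptions, and whether the real queue
drains pending tasks after shutdown is not stated above.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for a blocking task queue; not the orc-rt class.
class BlockingTaskQueue {
public:
  using Task = std::function<void()>;

  // Enqueues a task and wakes one waiting consumer. Ignored after shutdown.
  void push(Task T) {
    {
      std::lock_guard<std::mutex> Lock(M);
      if (Closed)
        return;
      Tasks.push_back(std::move(T));
    }
    CV.notify_one();
  }

  // Marks the queue as shut down and wakes all waiting consumers.
  void shutdown() {
    {
      std::lock_guard<std::mutex> Lock(M);
      Closed = true;
    }
    CV.notify_all();
  }

  // Blocks until a task is available or the queue has been shut down.
  // Returns the oldest task, or an empty Task once the queue is shut down
  // and drained (assumption: pending tasks are still handed out).
  Task takeFirstIn() {
    std::unique_lock<std::mutex> Lock(M);
    CV.wait(Lock, [this] { return !Tasks.empty() || Closed; });
    if (Tasks.empty())
      return Task();
    Task T = std::move(Tasks.front());
    Tasks.pop_front();
    return T;
  }

private:
  std::mutex M;
  std::condition_variable CV;
  std::deque<Task> Tasks;
  bool Closed = false;
};
```

A client loop then mirrors the idiom above:
`while (auto T = Q.takeFirstIn()) T();` runs until `shutdown()` has been
called and the queue is empty.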