[CIR] Add cir.atomic.xchg to target lowering (#180744)
This patch adds the `cir.atomic.xchg` operation to the TargetLowering
pass. The synchronization scope attached to the operation will be
canonicalized there.
[AArch64][ISel] Lower fixed-width i64 vector CLMUL intrinsics (#178876)
CLMUL intrinsics can be lowered with NEON's PMULL/PMULL2 by taking the
lower bits of the result, so long as +aes is present.
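For readers unfamiliar with carry-less multiplication, the following is a minimal software sketch of what "taking the lower bits" of a polynomial product means for a single 64-bit lane. The name `clmul64Low` is hypothetical and this is not the code from the patch, which emits PMULL/PMULL2 rather than a loop.

```cpp
#include <cstdint>

// Reference carry-less multiply (hypothetical helper, not from the patch).
// For every set bit i of `b`, XOR in a copy of `a` shifted left by i.
// Bits shifted past position 63 are discarded, so only the low 64 bits
// of the full 128-bit polynomial product are returned.
inline std::uint64_t clmul64Low(std::uint64_t a, std::uint64_t b) {
  std::uint64_t result = 0;
  for (int i = 0; i < 64; ++i) {
    if ((b >> i) & 1)
      result ^= a << i;
  }
  return result;
}
```

Unlike an ordinary multiply, partial products are combined with XOR, so no carries propagate between bit positions.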
[BOLT][BTI] Patch ignored functions in place when targeting them with
indirect branches
When applying BTI fixups to indirect branch targets, ignored functions are
considered a special case:
- these hold no instructions,
- have no CFG,
- and are not emitted in the new text section.
The solution is to patch the entry points in the original location.
If such a situation occurs in a binary, recompilation using the
-fpatchable-function-entry flag is required. This will place a nop at all
function starts, which BOLT can use to patch the original section.
Without the extra nop, BOLT cannot safely patch the original .text section.
An alternative solution could be to also ignore the function from which
the stub starts. This has not been tried as LongJmp pass - where most
[3 lines not shown]
[MLIR][LLVMIR] Add support for importing ConstantInt/FP vector splats. (#180946)
Updates LLVM IR importing to remove the assumption that
ConstantInt/ConstantFP are always scalar.
[AArch64] Eliminate XTN/SSHLL for vector splats (#180913)
Combine:
sext(duplane(insert_subvector(undef, trunc(X), 0), idx))
Into:
duplane(X, idx)
This avoids XTN/SSHLL instruction sequences that occur when splatting
elements from boolean vectors after type legalization, which is common
when using shufflevector with comparison results.
[llvm-reduce] Add a pass to replace unconditional branches with returns
Unconditional branches could end up in infinite loops in the reduced code,
while the code could have been reduced further.
This patch implements a simple pass that replaces unconditional branches
with returns.
[lldb][doc] Improve documentation for `ScriptedFrameProvider` (#179996)
* Provide a minimal, working example
* Document the instance variables
* Remove mention of `thread.SetScriptedFrameProvider` (which doesn't exist)
* Add missing `@staticmethod` annotation
* Fix rendering of bullet-pointed lists
[libc] Add getc, ungetc, fflush to enable libc++ iostream on baremetal (#175530)
After https://github.com/llvm/llvm-project/pull/168931 landed, getc,
ungetc and fflush are still missing at link time when trying to make
libc++ std::cout work with LLVM libc on baremetal.
The ungetc implementation is minimal and covers only the current standard
streams implementation from the patch above.
[libc++] Avoid including pair in <__functional/hash.h> (#179635)
We already have `_PairT`, which is just a pair of two `size_t`s, so we
might as well use that throughout the file. This avoids the `pair`
include altogether, reducing header parse times a bit in some cases.
[SelectionDAG] Make sure demanded lanes for AND/MUL-by-zero are frozen (#180727)
DAGCombiner can fold a chain of INSERT_VECTOR_ELT into a vector AND/OR
operation. To avoid making the vector more poisonous, this patch freezes
the source vector when the elements that should be set to 0/-1 may be
poison in the source vector.
The patch also fixes a bug in SimplifyDemandedVectorElts for
MUL/MULHU/MULHS/AND that could result in making the vector more
poisonous. The problem was that we skipped demanding elements from Op0 that
were known to be zero in Op1. But that could result in elements being
simplified into poison when simplifying Op0, and then the result would
be poison and not zero after the MUL/MULHU/MULHS/AND. The solution is to
defensively make sure that we demand all the elements originally
demanded also when simplifying Op0.
These bugs were found when analysing the miscompiles in
https://github.com/llvm/llvm-project/issues/179448
[6 lines not shown]
[MLIR][ODS] Make dialect attribute helper member functions const (NFC)
This commit marks member functions of dialect attribute helpers as
`const`. This ensures that these helpers can be added as members of
rewrite patterns, whose `matchAndRewrite` functions are marked as const
as well.
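A minimal sketch of why the const qualification matters, using plain C++ stand-ins. `AttrHelper` and `Pattern` are hypothetical names for illustration only; they are not the ODS-generated helper classes or the MLIR pattern classes.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a generated dialect attribute helper.
class AttrHelper {
public:
  explicit AttrHelper(std::string name) : name(std::move(name)) {}

  // Const-qualified, so it can be called on a const helper object.
  const std::string &getName() const { return name; }

private:
  std::string name;
};

// Hypothetical stand-in for a rewrite pattern that owns a helper.
struct Pattern {
  AttrHelper helper{"example.attr"};

  // Inside a const member function, `helper` is treated as const, so
  // only const member functions of AttrHelper can be called here.
  bool matchAndRewrite(const std::string &attrName) const {
    return helper.getName() == attrName;
  }
};
```

If `getName` were not const-qualified, the call inside `matchAndRewrite` would fail to compile.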
[libc++][NFC] Use std::quoted in fs::path and remove the private __quoted (#181043)
We've provided `std::filesystem` before C++17 in the past, but we don't
anymore, so we can use `std::quoted`.
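For reference, a small sketch of what `std::quoted` (from `<iomanip>`) does when streaming strings. The helper names `quoteForOutput` and `unquoteFromInput` are hypothetical and are not taken from the libc++ sources.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: wraps `s` in double quotes and escapes embedded
// quotes and backslashes with a backslash.
inline std::string quoteForOutput(const std::string &s) {
  std::ostringstream os;
  os << std::quoted(s);
  return os.str();
}

// Hypothetical helper: reverses the transformation when reading, keeping
// whitespace inside the quotes as part of the string.
inline std::string unquoteFromInput(const std::string &s) {
  std::istringstream is(s);
  std::string out;
  is >> std::quoted(out);
  return out;
}
```

The round trip preserves spaces and embedded quotes, which a plain `operator>>` on `std::string` would not.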
[clang][Builtins][ARM] NFC updates in ARM.cpp (#180966)
Updates the logic in `CodeGenFunction::EmitAArch64BuiltinExpr` so that
we always start with the general code and only fall back to
specialised cases (i.e. `switch` stmts) for intrinsics for which the
general code does not apply.
BEFORE (high-level only):
```cpp
Value *CodeGenFunction::EmitAArch64BuiltinExpr() {
(...)
/// 1. SWITCH STMT FOR NON-OVERLOADED INTRINSICS
switch (BuiltinID) {
default: break;
case NEON::BI__builtin_neon_vabsh_f16:
(...)
}
/// 2. GENERAL CODE
[58 lines not shown]
[DAGCombiner] Fix subvector extraction index for big-endian STLF (#180795)
This PR fixes a big-endian regression in `ForwardStoreValueToDirectLoad`
where the wrong subvector was being extracted. In big-endian, memory
offset 0 corresponds to the high bits, so the extraction index needs to
be adjusted.
As suggested by @KennethHilmersson, calculate the extraction index as
the difference between the number of elements in the intermediate vector
and the load vector when in big-endian mode.
Special thanks to Kenneth Hilmersson for providing the fix logic and the
ARM regression test.
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3878065191
https://github.com/llvm/llvm-project/pull/172523#issuecomment-3879575092
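The index calculation described above can be sketched as follows. `extractionIndex` is a hypothetical helper, not the code in `ForwardStoreValueToDirectLoad`; using index 0 for little-endian is an assumption made for illustration, as is the precondition that the intermediate vector has at least as many elements as the load vector.

```cpp
// Hypothetical helper illustrating the big-endian adjustment.
// Assumes intermediateElts >= loadElts.
constexpr unsigned extractionIndex(unsigned intermediateElts,
                                   unsigned loadElts, bool isBigEndian) {
  // Big-endian: use the difference between the two element counts.
  // Little-endian: index 0 is assumed here for illustration.
  return isBigEndian ? intermediateElts - loadElts : 0;
}
```

For example, extracting a 2-element load vector from a 4-element intermediate vector gives index 2 in big-endian mode under these assumptions.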
[ValueTracking] Extend computeConstantRange for add/sub, sext/zext/trunc
Recursively compute operand ranges for add/sub and propagate ranges
through sext/zext/trunc.
For add/sub, the computed range is intersected with any existing range
from setLimitsForBinOp, and NSW/NUW flags are used via addWithNoWrap/
subWithNoWrap to tighten bounds.
The motivation is to enable further folding of reduce.add expressions
in comparisons, where the result range can be bounded by the input
element ranges.
Compile-time impact on llvm-test-suite is <0.1% mean.
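As a rough illustration of the idea, here is a simplified interval sketch of range propagation through add/sub followed by an intersection. `Range`, `addRanges`, `subRanges` and `intersect` are hypothetical names. LLVM's `ConstantRange` also handles wrapping and fixed bit widths; this sketch uses plain inclusive bounds and assumes no overflow and a non-empty intersection.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical inclusive interval [lo, hi]; no wrap-around handling.
struct Range {
  std::int64_t lo;
  std::int64_t hi;
};

// The sum of two values is bounded by the sums of the operand bounds.
inline Range addRanges(Range a, Range b) {
  return {a.lo + b.lo, a.hi + b.hi};
}

// The smallest difference pairs a.lo with b.hi, the largest a.hi with b.lo.
inline Range subRanges(Range a, Range b) {
  return {a.lo - b.hi, a.hi - b.lo};
}

// Combine two independently derived bounds for the same value.
inline Range intersect(Range a, Range b) {
  return {std::max(a.lo, b.lo), std::min(a.hi, b.hi)};
}
```

Intersecting the operand-derived range with an existing one can only tighten the result.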
[ValueTracking] Extend computeConstantRange for add/sub and sext/zext/trunc
Recursively compute operand ranges for add/sub and propagate ranges
through sext/zext/trunc.
For add/sub, the computed range is intersected with any existing range
from setLimitsForBinOp, and NSW/NUW flags are used via addWithNoWrap/
subWithNoWrap to tighten bounds.
The motivation is to enable further folding of reduce.add expressions
in comparisons, where the result range can be bounded by the input
element ranges.
Compile-time impact on llvm-test-suite is <0.1% mean.
[mlir][Math] Fix IPowIOp folding crash for i1 (#179684)
Fixes #179380: an assertion/crash in math.ipowi constant folding when the result type is i1.
`IPowIOp::fold` constructed `APInt(width=1, val=1, isSigned=true)`. Signed i1 cannot represent +1 (range [-1, 0]), so APInt asserts (isIntN).
This commit deactivates folding for `i1`.
---------
Co-authored-by: Jakub Kuderski <kubakuderski at gmail.com>
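The i1 range issue above can be checked with a small sketch of two's-complement bounds. The helper names are hypothetical and this does not use `APInt`. It assumes a width between 1 and 63 bits.

```cpp
#include <cstdint>

// Smallest value of a two's-complement signed integer with `width` bits.
// Assumes 1 <= width <= 63.
constexpr std::int64_t signedMin(unsigned width) {
  return -(std::int64_t{1} << (width - 1));
}

// Largest value of a two's-complement signed integer with `width` bits.
constexpr std::int64_t signedMax(unsigned width) {
  return (std::int64_t{1} << (width - 1)) - 1;
}

// True if `v` is representable as a signed `width`-bit integer.
constexpr bool fitsSigned(std::int64_t v, unsigned width) {
  return v >= signedMin(width) && v <= signedMax(width);
}
```

With a width of 1 the bounds are -1 and 0, so +1 is not representable as a signed value.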
[clang-repl] Fix disambiguation of out-of-line member with private types (#178842)
This patch fixes a bug in clang-repl where out-of-line member function
definitions were incorrectly identified as statements when they involved
private type aliases.
The issue occurred because `isCXXDeclarationStatement` would trigger
immediate access checks during tentative parsing. Since the context of
an out-of-line definition isn't fully established during this phase,
Sema would incorrectly flag private members as inaccessible, causing
the parser to fail the declaration check and fall back to statement
parsing.
Changes:
- In `isCXXDeclarationStatement`, use `TentativeParsingAction` to
ensure the token stream is fully restored.
- Use `SuppressAccessChecks` during the tentative disambiguation phase
to prevent premature access errors.
- Ensure that formal access verification still occurs during the
[5 lines not shown]