[CodeGenPrepare] Report an error if ProfileSummaryAnalysis is not available (#199268)
CodeGenPreparePass can't declare ProfileSummaryAnalysis as required,
because PSA is a module-level analysis, but CGP is a function-level pass.
Therefore it accesses PSA using getCachedResult, and PSA might be null.
In practice this doesn't happen, because the CGP pass pipeline
preparation code ensures that PSA is present. But if you invoke
CGP via opt -passes=codegenprepare, then it's not
there, and we segfault.
Fix for https://github.com/llvm/llvm-project/issues/173360.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[llvm,clang] Don't assume non-erased DenseMap entries remain valid after erase. NFC (#198982)
In preparation for switching DenseMap from tombstone deletion to
backward-shift deletion, update call sites that reuse an iterator or a
bucket reference after erasing another entry from the same map.
These work under tombstone deletion because unrelated buckets stay put,
but backward-shift deletion relocates entries to close the gap.
Add DenseMap::remove_if, similar to SmallPtrSet::remove_if, as
replacement for erase-while-iterating, and use it where applicable.
Aided by Claude Opus 4.7
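The erase-while-iterating hazard described above can be sketched on a standard container. This is only an analogy: it uses `std::unordered_map` rather than DenseMap, and `removeIf` and `dropZeroCounts` are hypothetical helpers, not the signature of the `DenseMap::remove_if` added by the patch. The point it illustrates is that the caller never reuses an iterator or reference after an erase.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a map-level remove_if: erase every entry for
// which Pred returns true. The loop never dereferences or advances an
// iterator whose entry has just been erased.
template <typename MapT, typename PredT>
void removeIf(MapT &Map, PredT Pred) {
  for (auto It = Map.begin(); It != Map.end();) {
    if (Pred(*It))
      It = Map.erase(It); // erase() returns the next valid iterator
    else
      ++It;
  }
}

// Example use: drop all entries whose count is zero.
inline std::unordered_map<std::string, int>
dropZeroCounts(std::unordered_map<std::string, int> Counts) {
  removeIf(Counts, [](const auto &KV) { return KV.second == 0; });
  return Counts;
}
```

Keeping the loop inside one helper means call sites do not depend on whether unrelated entries stay in place after an erase.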
[LifetimeSafety] Extend suggestions for `lifetimebound` to also warn on canonical declarations (#198784)
With this patch, we suggest adding the `clang::lifetimebound` attribute
on the canonical declaration and on the earliest redeclaration in each
other file, preserving diagnostics for declarations visible from other
translation units while avoiding duplicate suggestions within the same
file.
Fixes #198624
Fixes #198628
[X86][GISel] Fix carry-in for selectUAddSub. (#199261)
When G_UADDE/G_USUBE was chained off a previous G_UADDE/G_UADDO/
G_USUBE/G_USUBO, selectUAddSub re-materialized EFLAGS.CF from the
previous SETB byte using CMP r, 1. That computes (r - 1) and sets
CF iff r < 1 unsigned, i.e. CF = (r == 0) -- the inverse of the
desired carry. The following ADC/SBB then consumed the wrong CF and
produced an off-by-one upper word; e.g. `add i128 0xFF..FF, 1` under
-global-isel returned hi=0 lo=0 instead of hi=1 lo=0.
Emit NEG r instead: NEG sets CF iff its operand is non-zero, matching
the SETB byte. NEG is a two-address (tied) instruction, so emit it
into a fresh virtual register rather than redefining the carry-in
vreg.
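The flag arithmetic above can be checked with a small scalar model. This is a sketch, not the selector code: `carryAfterCmpOne`, `carryAfterNeg`, `U128`, and `addWithCarry` are hypothetical names, and the model only captures how CF is derived from the 0/1 byte that SETB produced.

```cpp
#include <cassert>
#include <cstdint>

// CF after `CMP r, 1`: the subtraction r - 1 borrows iff r < 1 (unsigned).
constexpr bool carryAfterCmpOne(uint8_t R) { return R < 1; }

// CF after `NEG r`: set iff the operand is non-zero.
constexpr bool carryAfterNeg(uint8_t R) { return R != 0; }

// R is the 0/1 byte from SETB, so NEG reproduces the carry and CMP inverts it.
static_assert(carryAfterNeg(1) && !carryAfterNeg(0), "NEG matches SETB");
static_assert(!carryAfterCmpOne(1) && carryAfterCmpOne(0), "CMP r, 1 inverts");

// Model of a 128-bit add split into two 64-bit halves, with the carry-in of
// the high half re-materialized from the SETB byte by the given function.
struct U128 {
  uint64_t Hi, Lo;
};

inline U128 addWithCarry(U128 A, U128 B, bool (*Rematerialize)(uint8_t)) {
  uint64_t Lo = A.Lo + B.Lo;
  uint8_t SetB = Lo < A.Lo ? 1 : 0; // carry out of the low half
  bool CF = Rematerialize(SetB);
  return {A.Hi + B.Hi + (CF ? 1u : 0u), Lo};
}
```

With a low word of all ones plus 1, the NEG variant yields hi=1 lo=0, while the CMP variant drops the carry and yields hi=0 lo=0, matching the symptom described above.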
C reproducer (compile on x86_64-linux-gnu and run):
```
// clang -O2 -fglobal-isel repro.c -o repro && ./repro
[32 lines not shown]
[SLP][NFC] Add precommit test for unprofitable ordered fadd reductions (#199428)
Adds a test case reproducing a scenario where the cost model incorrectly
evaluates an unprofitable ordered fadd reduction chain as profitable.
Further details can be found on this issue:
https://github.com/llvm/llvm-project/issues/199267
[libc][math] Implement isnanf16 header-only function (#198115)
Adds `isnanf16`, the float16 variant of isnan, as part of issue
[#195400](https://github.com/llvm/llvm-project/issues/195400), which
tracks adding missing isnan variants for extended floating-point types.
The implementation follows the same pattern as the existing `isnanf`,
`isnan`, and `isnanl` functions.
---------
Co-authored-by: Victor Campos <github at victorcampos.me>
[VPlan] Simplify block deletion in VPlan dtor (NFC) (#199421)
Split deletion loop into 2 simpler loops: first replace all operands of
each recipe with a dummy value. Then delete blocks in second pass.
This avoids unnecessary RAUW and also removes the need to handle
region values explicitly.
[libc++] remove duplicate assertions for void/reference const any_cast
For test cases of the const overload of any_cast, such as:
```C++
void test() {
  std::any a = 0;
  const std::any& a2 = a;
  (void)std::any_cast<int&>(&a2);
}
```
(And similarly for void).
The problem is that the assertions are implemented both in the const and non-const any_cast overloads,
but since the const overload delegates to the non-const overload, that ends up producing the same assertion twice.
workflows/issue-release-workflow: Validate user input in /cherry-pick commands (#199249)
This protects against malicious inputs embedded in comments with
/cherry-pick commands.
[offload] Fix --libomptarget-nvptx-bc-path in tests (#199382)
PR #198622, which landed as 3383f0d6fe01, causes 272 `libomptarget ::
nvptx64-nvidia-cuda` test failures on my system with:
```
clang: error: bitcode library '/home/jdenny/llvm/build/\./lib/x86_64-unknown-linux-gnu/nvptx64-nvidia-cuda' does not exist
```
This patch fixes that.
[MC] Create new MCScheduleOptions cl::opt category (#198746)
This patch creates a new cl::opt category for MCSchedule options. It
enables tools to filter MCSchedule options based on category.
Specifically, llvm-mca now filters them in, and displays them under
`--help-hidden`, which wasn't the case before.
[InstCombine] Fix vector_reduce_mul(sext <n x i1>) for odd n. (#199401)
Before this patch, instcombine folded
vector_reduce_mul(sext (<n x i1> val))
to
zext(vector_reduce_and(<n x i1> val)).
But this is incorrect when n is odd: if every lane is true, the
reduction multiplies n copies of -1, so the result is -1, not 1.
After this patch we only do this fold when n is even.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
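The parity argument can be checked with a scalar model of the two expressions. This is a sketch for illustration only: `sextI1`, `reduceMulOfSext`, and `zextOfReduceAnd` are hypothetical helpers that mimic the IR semantics on `std::vector<bool>`, not InstCombine code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// sext of an i1 lane: true becomes -1, false becomes 0.
inline int32_t sextI1(bool B) { return B ? -1 : 0; }

// Models vector_reduce_mul(sext(<n x i1> val)).
inline int32_t reduceMulOfSext(const std::vector<bool> &Lanes) {
  int32_t Product = 1;
  for (bool B : Lanes)
    Product *= sextI1(B);
  return Product;
}

// Models zext(vector_reduce_and(<n x i1> val)).
inline int32_t zextOfReduceAnd(const std::vector<bool> &Lanes) {
  bool All = true;
  for (bool B : Lanes)
    All = All && B;
  return All ? 1 : 0;
}
```

In this model the two functions agree for every input with an even lane count, and differ only when the lane count is odd and all lanes are true.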
[ConstantFolding] Handle large exponents in ldexp (#199309)
Previously, if you passed llvm.ldexp a constant exponent that did not
fit in `int`, we would silently truncate it to `int` before
using it in scalbn, and thus generate an incorrect result.
We now clamp it to fit within int.
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
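The difference between truncating and clamping can be sketched with `std::ldexp` on host doubles. This is not the ConstantFolding code: `ldexpTruncating` and `ldexpClamping` are hypothetical helpers, and the sketch assumes a 32-bit `int`.

```cpp
#include <cassert>
#include <climits>
#include <cmath>
#include <cstdint>

// Models the old behaviour: keep only the low 32 bits of the exponent.
// Assumes a 32-bit int; the uint32_t-to-int conversion wraps on common
// targets (and is defined to do so since C++20).
inline double ldexpTruncating(double X, int64_t Exp) {
  int Narrow = static_cast<int>(static_cast<uint32_t>(Exp));
  return std::ldexp(X, Narrow);
}

// Models the fix: clamp the exponent into [INT_MIN, INT_MAX] first.
inline double ldexpClamping(double X, int64_t Exp) {
  int64_t Clamped = Exp < INT_MIN ? INT_MIN : (Exp > INT_MAX ? INT_MAX : Exp);
  return std::ldexp(X, static_cast<int>(Clamped));
}
```

For an exponent of 2^32 + 1, truncation leaves an exponent of 1 and returns 2.0 for an input of 1.0, whereas clamping saturates and returns infinity, which is the mathematically expected overflow.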
[VPlan] Remove special cost logic for loads predicated by header mask. (#196630)
Remove the special cost logic for loads predicated by the header mask,
as it does not accurately reflect the cost of the generated VPlan.
Unmasking the load can only be done in general if we don't unroll or if
the address is actually uniform-across-vf-and-uf. The former we cannot
really determine before selecting the VF as UF is picked after VF. The
latter is not really useful in practice.
PR: https://github.com/llvm/llvm-project/pull/196630
[libc][NFC] Make LIBC_MATH safer and some minor improvements for floating point exception tests. (#199392)
- Wrap LIBC_MATH usages inside parentheses.
- Skip clearing exceptions when not needed.
- Skip FE_INEXACT when testing FE_UNDERFLOW / FE_OVERFLOW for basic ops.
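The first bullet is presumably about operator precedence when a macro expands to an unparenthesized expression. A minimal sketch of that hazard follows; every `DEMO_*` name is hypothetical, and how LIBC_MATH is actually defined is an assumption not stated in the commit message.

```cpp
#include <cassert>

// Hypothetical option bits, for illustration only.
#define DEMO_SKIP_ACCURATE_PASS 0x01
#define DEMO_SMALL_TABLES 0x02
#define DEMO_NO_ERRNO 0x04

// Hypothetical option set defined without surrounding parentheses, as could
// happen when the value is supplied on the compiler command line.
#define DEMO_MATH DEMO_SKIP_ACCURATE_PASS | DEMO_SMALL_TABLES

// Unwrapped use expands to 0x01 | (0x02 & 0x04), because & binds tighter
// than |, so the test is non-zero even though DEMO_NO_ERRNO is not set.
constexpr bool noErrnoUnwrapped = (DEMO_MATH & DEMO_NO_ERRNO) != 0;

// Wrapping the macro at the use site evaluates (0x01 | 0x02) & 0x04 instead.
constexpr bool noErrnoWrapped = ((DEMO_MATH) & DEMO_NO_ERRNO) != 0;

static_assert(noErrnoUnwrapped, "unwrapped use reports a flag that is unset");
static_assert(!noErrnoWrapped, "wrapped use reports the flag correctly");
```

Parenthesizing at each use site keeps the check correct regardless of how the macro value was written.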
[X86] LowerFLDEXP: convert widened int exponent to FP before SCALEF (#199263)
For vector ldexp cases that LowerFLDEXP implements by widening to a
512-bit SCALEF operation, the code widened both X and Exp but passed
the widened integer exponent directly to SCALEF, which interprets
its inputs as IEEE-754 floats.
Convert the widened integer exponent to FP and pass that to SCALEF.
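A scalar sketch can show why the integer bits must be converted rather than reinterpreted. It assumes a simplified model of SCALEF as x * 2^floor(y) and ignores its NaN, infinity, and denormal special cases; `scalefModel`, `ldexpViaBitcast`, and `ldexpViaConvert` are hypothetical helpers, and only small non-negative exponents are meaningful in this model.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Simplified scalar model of SCALEF: x * 2^floor(y), with y read as a float.
inline float scalefModel(float X, float Y) {
  return std::ldexp(X, static_cast<int>(std::floor(Y)));
}

// Models the bug: the integer exponent's bit pattern is handed over as-is,
// so a small integer such as 3 is read as a tiny denormal float.
inline float ldexpViaBitcast(float X, int32_t Exp) {
  float AsFloat;
  std::memcpy(&AsFloat, &Exp, sizeof(AsFloat));
  return scalefModel(X, AsFloat);
}

// Models the fix: convert the integer exponent to floating point first.
inline float ldexpViaConvert(float X, int32_t Exp) {
  return scalefModel(X, static_cast<float>(Exp));
}
```

For x = 1.5 and an exponent of 3, the bitcast path sees a value whose floor is 0 and returns 1.5 unchanged, while the converted path returns 12.0.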
Reproducer (clang -O2 -mavx512f repro.c -o repro && ./repro):
```
#include <stdio.h>
typedef float v4f __attribute__((vector_size(16)));
typedef int v4i __attribute__((vector_size(16)));
__attribute__((noinline))
v4f ldexp_v4(v4f x, v4i e) {
return __builtin_elementwise_ldexp(x, e);
[22 lines not shown]
[compiler-rt] Use `size_t` rather than `int` for first argument to `__atomic_load_c` et al. (#197519)
I noticed this discrepancy in emscripten when trying to test 128 bit
atomics under wasm64:
https://github.com/emscripten-core/emscripten/pull/26937
The LLVM CodeGen appears to use `size_t` in this position when it
generates calls to these functions.
I imagine this doesn't affect other platforms because they don't require
signature checking at the linker level.
This doesn't affect wasm32, where size_t and int are the same size.
[libc++] Remove AppleClang workaround for __builtin_verbose_trap (#199171)
We've dropped support for AppleClang versions with a different
`__builtin_verbose_trap`, so we can remove the workaround.