[MC] Create new MCScheduleOptions cl::opt category (#198746)
This patch creates a new cl::opt category for MCSchedule options. It
enables tools to filter MCSchedule options based on category.
Specifically, llvm-mca now includes them in its option filter and displays
them under `--help-hidden`, which wasn't the case before.
[InstCombine] Fix vector_reduce_mul(sext <n x i1>) for odd n. (#199401)
Before this patch, instcombine folded
vector_reduce_mul(sext (<n x i1> val))
to
zext(vector_reduce_and(<n x i1> val)).
But this is incorrect when n is odd: if every lane is true, the reduction
multiplies n copies of -1, so the result is -1, not 1.
After this patch we only do this fold when n is even.
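The parity argument can be checked with a small model. This is only an illustrative sketch, not code from the patch: the helper names and the choice of an 8-bit result type are assumptions made for the example.

```python
def sext_i1(bit, width=8):
    """Sign-extend an i1 to `width` bits: 0 -> 0, 1 -> all-ones (i.e. -1)."""
    return (1 << width) - 1 if bit else 0

def reduce_mul(lanes, width=8):
    """Wrapping multiply-reduction over `width`-bit lanes."""
    mask = (1 << width) - 1
    acc = 1
    for v in lanes:
        acc = (acc * v) & mask
    return acc

def original(bits, width=8):
    """vector_reduce_mul(sext(bits))"""
    return reduce_mul([sext_i1(b, width) for b in bits], width)

def folded(bits):
    """zext(vector_reduce_and(bits))"""
    return int(all(bits))

# Even n: both forms agree. Odd n, all lanes true: 0xFF (-1) versus 1.
print(original([1, 1, 1, 1]), folded([1, 1, 1, 1]))
print(original([1, 1, 1]), folded([1, 1, 1]))
```

Any lane that is false makes both forms 0, so the all-true input is the only case where they can differ.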
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[ConstantFolding] Handle large exponents in ldexp (#199309)
Previously, if you passed llvm.ldexp a constant exponent too large to
fit in `int`, we would silently truncate it to `int` before using it in
scalbn, and thus generate an incorrect result.
We now clamp it to the range of `int`.
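The difference between truncating and clamping can be sketched as below. This is a model, not the constant folder's code: it assumes a 32-bit `int`, and the function names are made up for the example.

```python
import math

# Assumption: C `int` is 32 bits wide.
INT_MIN, INT_MAX = -(2**31), 2**31 - 1

def scalbn(x, n):
    """Model of scalbn for an in-range n: x * 2**n, saturating to +/-inf."""
    try:
        return math.ldexp(x, n)
    except OverflowError:
        return math.copysign(math.inf, x)

def truncate_to_int(e):
    """Keep only the low 32 bits of e, reinterpreted as signed."""
    return (e - INT_MIN) % 2**32 + INT_MIN

def ldexp_truncating(x, e):
    return scalbn(x, truncate_to_int(e))

def ldexp_clamping(x, e):
    return scalbn(x, max(INT_MIN, min(INT_MAX, e)))

big = 2**32  # does not fit in a 32-bit int
print(ldexp_truncating(1.0, big))  # exponent wraps to 0
print(ldexp_clamping(1.0, big))    # exponent saturates, result overflows
```

Clamping is safe because any exponent outside the range of `int` already drives a finite nonzero input to infinity or zero, so saturating the exponent does not change the result.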
This bug was found by a large run of Opus 4.7 looking for bugs in LLVM.
[VPlan] Remove special cost logic for loads predicated by header mask. (#196630)
Remove the special cost logic for loads predicated by the header mask,
as it does not accurately reflect the cost of the generated VPlan.
Unmasking the load can only be done in general if we don't unroll or if
the address is actually uniform-across-vf-and-uf. The former we cannot
really determine before selecting the VF as UF is picked after VF. The
latter is not really useful in practice.
PR: https://github.com/llvm/llvm-project/pull/196630
[libc][NFC] Make LIBC_MATH safer and some minor improvements for floating point exception tests. (#199392)
- Wrap LIBC_MATH usages inside parentheses.
- Skip clearing exceptions when not needed.
- Skip FE_INEXACT when testing FE_UNDERFLOW / FE_OVERFLOW for basic ops.
[X86] LowerFLDEXP: convert widened int exponent to FP before SCALEF (#199263)
For vector ldexp cases that LowerFLDEXP implements by widening to a
512-bit SCALEF operation, the code widened both X and Exp but passed
the widened integer exponent directly to SCALEF, which interprets
its inputs as IEEE-754 floats.
Convert the widened integer exponent to FP and pass that to SCALEF.
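To illustrate why the integer exponent cannot be passed through unchanged, the sketch below models SCALEF as x * 2^floor(y) and ignores its special-case handling of NaN, infinity, and denormals. The helper names are hypothetical and this is not the lowering code.

```python
import math
import struct

def int_bits_as_float(i):
    """Reinterpret the bits of a 32-bit signed int as an IEEE-754 float."""
    return struct.unpack("<f", struct.pack("<i", i))[0]

def scalef_model(x, y):
    """Simplified SCALEF: x * 2**floor(y), with y a floating-point value."""
    return math.ldexp(x, math.floor(y))

e = 3
# Bug: the bits of the integer 3 read as a float are a tiny denormal, so floor() is 0.
print(scalef_model(1.5, int_bits_as_float(e)))
# Fix: convert the integer to FP first, so the exponent really is 3.0.
print(scalef_model(1.5, float(e)))
```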
Reproducer (clang -O2 -mavx512f repro.c -o repro && ./repro):
```
#include <stdio.h>
typedef float v4f __attribute__((vector_size(16)));
typedef int v4i __attribute__((vector_size(16)));
__attribute__((noinline))
v4f ldexp_v4(v4f x, v4i e) {
return __builtin_elementwise_ldexp(x, e);
[22 lines not shown]
```
[compiler-rt] Use `size_t` rather than `int` for first argument to `__atomic_load_c` et al. (#197519)
I noticed this discrepancy in emscripten when trying to test 128 bit
atomics under wasm64:
https://github.com/emscripten-core/emscripten/pull/26937
The LLVM CodeGen appears to use `size_t` in this position when it
generates calls to these functions.
This doesn't affect other platforms, I imagine, because they don't require
signature checking at the linker level.
This doesn't affect wasm32, where size_t and int are the same size.
[libc++] Remove AppleClang workaround for __builtin_verbose_trap (#199171)
We've dropped support for AppleClang versions with a different
`__builtin_verbose_trap`, so we can remove the workaround.
Revert "[AIX] Remove unsupported AIX native echo option -n (llvm#199079)" (#199277)
This reverts commit 593eb2066293c8636786c98cb696c533da9b97ca.
The patch is being reverted because the code changes do not match the
commit message and description, which point to a previous implementation.
Co-authored-by: himadhith <himadhith.v at ibm.com>
Add --fn flag to llvm-lit to inject select-function pass into opt pipelines
Translates --fn=fn0,fn1 into -passes='select-function<fn=fn0;fn=fn1>,...'
by rewriting -passes= arguments in RUN lines after substitution.
Handles both single and double quoted pass pipelines.
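The rewriting described above could look roughly like this. It is a sketch only, not the code added to llvm-lit: `inject_select_function` and its regular expression are hypothetical, and it assumes the pipeline string follows `-passes=` directly.

```python
import re

# Match `-passes=` followed by an optional opening single or double quote.
_PASSES_RE = re.compile(r"""(-passes=)(['"]?)""")

def inject_select_function(run_line, fns):
    """Prepend select-function<fn=...> to every -passes= pipeline in run_line."""
    prefix = "select-function<" + ";".join("fn=" + f for f in fns) + ">,"
    return _PASSES_RE.sub(lambda m: m.group(1) + m.group(2) + prefix, run_line)

print(inject_select_function("opt -passes='instcombine' -S %s", ["fn0", "fn1"]))
print(inject_select_function('opt -passes="sroa,gvn" -S %s', ["fn0"]))
```

Inserting the prefix just after the opening quote leaves the rest of the pipeline, including its closing quote, untouched.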
Add select-function pass to keep only specified functions and their dependencies
Chains InternalizePass, GlobalDCEPass, and StripDeadPrototypesPass to
remove everything not transitively reachable from the selected functions.
Supports multiple roots via select-function<fn=foo;fn=bar>.
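The intended outcome, keeping only what is transitively reachable from the roots, can be modelled as a worklist traversal. This is a conceptual sketch rather than the pass itself, which chains the three existing passes named above. The dict-based call graph is an assumption for the example, and real dependencies also include globals, not just callees.

```python
def select_functions(call_graph, roots):
    """Return the set of functions transitively reachable from `roots`."""
    keep, worklist = set(), list(roots)
    while worklist:
        fn = worklist.pop()
        if fn in keep:
            continue
        keep.add(fn)
        worklist.extend(call_graph.get(fn, ()))
    return keep

# Hypothetical module: each function maps to the functions it references.
graph = {"foo": ["helper"], "bar": ["foo"], "helper": [], "unused": ["helper"]}
print(sorted(select_functions(graph, ["foo"])))
print(sorted(select_functions(graph, ["foo", "bar"])))
```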
[offload] Fix --libomptarget-nvptx-bc-path in tests
PR #198622, which landed as 3383f0d6fe01, causes 272 `libomptarget ::
nvptx64-nvidia-cuda` test failures on my system with:
```
clang: error: bitcode library '/home/jdenny/llvm/build/\./lib/x86_64-unknown-linux-gnu/nvptx64-nvidia-cuda' does not exist
```
This patch fixes that.
[RISCV] Reserve all sub-registers of user reserved GPRs (#199302)
When a GPR is reserved by the user (e.g., via `-mattr=+reserve-x27`)
or marked as constant, only the top-level register was being marked
reserved in `RISCVRegisterInfo::getReservedRegs`. Its sub-registers
(`X27_W` and `X27_H`) remained unreserved.
This broke `LiveIntervals` when register pressure tracking was enabled
by #115445. Because the sub-registers were not reserved, the register
unit was considered non-reserved, causing `LiveIntervals` to track its
liveness and crash in the Machine Verifier due to the reserved
register missing from basic block live-in lists.
Instead, we should ensure that reserving a register also reserves all
of its sub-registers, so that the register unit is correctly
identified as reserved and ignored by `LiveIntervals`.
Fixes #176227
[VPlan] Create casts before ComputeReductionResult (NFC). (#199372)
This ensures ComputeReductionResult is created with operands that have
their correct types set at construction.
[LV] Don't add stride SCEV predicates when runtime checks are disabled. (#199370)
Don't pass symbolic strides to getPtrStride if SCEV runtime checks are
not allowed (e.g. because optimizing for size). This prevents
getPtrStride from adding additional SCEV checks for symbolic strides.
[VPlan] Thread types through VPHeaderPHIRecipe and VPDerivedIVRecipe (NFC) (#195894)
Update VPHeaderPHIRecipe and VPDerivedIVRecipe to set the scalar types
for their defined values.
This requires updating addReductionResultComputation to construct the
new chain for AnyOf reductions up-front.
Depends on https://github.com/llvm/llvm-project/pull/195891
PR: https://github.com/llvm/llvm-project/pull/195894