[Offload] fix OffloadAPI unittests discovery (#198750)
Commit 3383f0d repointed LIBOMPTARGET_LIBRARY_DIR to a different
runtimes lib dir, but the unit lit config still derived the unittest
binary path from it. Pass the unittest directory explicitly instead.
[X86] Update PSADBW tests to more closely match middle-end vector.reduce.add codegen (#198760)
The middle-end detects vector.reduce.add patterns, so update the
CodeGen tests to use the intrinsics directly and add PhaseOrdering tests
to ensure the vector.reduce.add intrinsics are created.
[clang] implement CWG2064: ignore value dependence for decltype
The 'decltype' of a value-dependent (but non-type-dependent) expression should
be known, so this patch makes such types non-opaque instead.
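A minimal sketch of the kind of expression involved, assuming C++17. `sizeof(T)` is value-dependent inside a template but never type-dependent, so its `decltype` is `std::size_t` for every `T`. The names `SizeType` and `sizeTypeIsSizeT` are hypothetical, and the checks below run after instantiation, so they hold with or without this patch; they only illustrate why the type can be known up front.

```cpp
#include <cstddef>
#include <type_traits>

// sizeof(T) is value-dependent here (its value depends on T), but it is not
// type-dependent: the expression has type std::size_t for every T.
template <class T>
struct SizeType {
  using type = decltype(sizeof(T));
};

// These are checked after instantiation, so they do not depend on how the
// compiler represents the decltype inside the template definition.
static_assert(std::is_same_v<SizeType<int>::type, std::size_t>);
static_assert(std::is_same_v<SizeType<char>::type, SizeType<double>::type>);

// Hypothetical helper that makes the same fact observable at run time.
constexpr bool sizeTypeIsSizeT() {
  return std::is_same_v<SizeType<long>::type, std::size_t>;
}
```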
This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.
This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.
Fixes #8740
Fixes #61818
Fixes #190388
[Flang][tests] Add a missing REQUIRES. (#198753)
A newly added test uses `x86_64-unknown-linux-gnu` as a triple without
a `REQUIRES: x86-registered-target` line, so it fails in builds
of LLVM that only target other architectures.
[AArch64][TTI][EarlyCSE] Add support for ld1xN and st1xN intrinsics
Handle ld1x2, ld1x3, ld1x4, st1x2, st1x3, st1x4 in:
- AArch64TTIImpl::getTgtMemIntrinsic
- AArch64TTIImpl::getOrCreateResultFromMemIntrinsic
This enables EarlyCSE to optimize these NEON load/store intrinsics.
To test the changes, a new testcase (intrinsics-1xN.ll) derived from
llvm/test/Transforms/EarlyCSE/AArch64/intrinsics.ll is added.
Revert "[SLP] Support ordered fadd reduction via reduction intrinsics" (#198756)
This caused assertion failures, see discussion on the original PR.
Reverts llvm/llvm-project#189451
[libc] Fix modular printf attributes (#194003)
This fixes the validation error about modular printf missing the format attribute in C++ code by moving the validation to after the implicit format attribute is added for builtins and known library functions.
This also adds a simple C++ test. The C code already compiled successfully because the implicit attributes were added before the validation ran for C code.
Assisted-by: codex, reviewed and cross checked, also tested with ATfE,
by me. Modular printf reduced code size from ~37K to ~13K for int-only
printf sample.
[CodeGen] Always print 64 bit hash value in MachineBlockHashInfoPrinterPass (#198598)
The printed hash must have a fixed length. format_hex counts the 0x prefix
against the width, so increase the width from 16 to 18.
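The width arithmetic can be sketched with a standalone formatter, assuming the width counts the two-character `0x` prefix as described above. `hexWithPrefix` is a hypothetical stand-in written with the standard library only; it is not the LLVM implementation.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a hex formatter whose width includes the "0x"
// prefix. Values that need more digits than the width allows are printed in
// full rather than truncated.
std::string hexWithPrefix(std::uint64_t value, unsigned width) {
  static const char digits[] = "0123456789abcdef";
  std::string hex;
  do {
    hex.insert(hex.begin(), digits[value & 0xF]);
    value >>= 4;
  } while (value != 0);

  const unsigned prefixLen = 2;  // length of "0x"
  const unsigned digitWidth = width > prefixLen ? width - prefixLen : 0;
  if (hex.size() < digitWidth)
    hex = std::string(digitWidth - hex.size(), '0') + hex;  // zero-pad on the left
  return "0x" + hex;
}
```

With a width of 16, a small hash is padded to 16 characters while a full 64-bit hash still needs 18, so the output length varies. With a width of 18 (2 for the prefix plus 16 hex digits), every 64-bit value prints at the same length.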
[DAGCombiner] Don't fold cheap extracts of multiple use splats (#134120)
For out-of-loop sum reductions, the loop vectorizer will emit an initial
reduction vector like this in the preheader, where there might be an
initial value to begin summing from:
```llvm
%v = insertelement <vscale x 4 x i32> zeroinitializer, i32 %initial, i64 0
```
On RISC-V we currently lower this quite poorly with two splats of 0, one
at m1 and one at m2:
    vsetvli a1, zero, e32, m1, ta, ma
    vmv.v.i v10, 0
    vsetvli zero, zero, e32, m1, tu, ma
    vmv.s.x v10, a0
    vsetvli a0, zero, e32, m2, ta, ma
    vmv.v.i v8, 0
[32 lines not shown]
[LoopInterchange] Drop ninf from instructions involved in interchange (#197923)
Applying loop-interchange can alter the order of operations in reduction
calculations. If these operations involve floating‑point arithmetic, the
results may change as well. If an instruction in the chain has the
`ninf` flag, reordering can produce a poison value, which may lead to
undefined behavior that the original program did not have.
This patch addresses this issue by dropping `ninf` flags from the
instructions involved in the transformation, as discussed in #148851.
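The effect of reordering can be sketched in plain C++ with a hypothetical 2x2 input where each row sums to zero. Summed row by row, every intermediate value is finite; summed column by column (the interchanged order), an intermediate sum overflows to infinity. Under `ninf`, an infinite argument or result is poison, which is the hazard described above. The names `kData` and `sumInOrder` are assumptions for illustration, and the sketch assumes IEEE double arithmetic without fast-math.

```cpp
#include <cmath>
#include <limits>

constexpr double kMax = std::numeric_limits<double>::max();
// Hypothetical input: each row is {kMax, -kMax}, so each row sums to zero.
constexpr double kData[2][2] = {{kMax, -kMax}, {kMax, -kMax}};

struct SumResult {
  double total;
  bool sawInfinity;  // true if any intermediate sum was +/-Inf
};

// Sums kData with either the original loop order (rows outer) or the
// interchanged order (columns outer).
SumResult sumInOrder(bool interchanged) {
  SumResult r{0.0, false};
  for (int outer = 0; outer < 2; ++outer) {
    for (int inner = 0; inner < 2; ++inner) {
      const double v = interchanged ? kData[inner][outer] : kData[outer][inner];
      r.total += v;
      if (std::isinf(r.total))
        r.sawInfinity = true;
    }
  }
  return r;
}
```

Here `sumInOrder(false)` stays finite throughout, while `sumInOrder(true)` adds `kMax + kMax` first and reaches infinity.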
Fixes #148851.
[mlir][SPIR-V] Add ISubBorrow canonicalization patterns (#198637)
Mirror the IAddCarry folder, rewrite isubborrow(x, 0) to <x, 0> via
CompositeConstruct, and fold the all-constant case into a single
spirv.Constant struct.
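The identity behind the isubborrow(x, 0) rewrite can be checked with a small model of the operation on 32-bit values: the result is a pair of the low-order bits of the difference and a borrow that is 1 only when the first operand is smaller than the second. `isubBorrow` is a hypothetical scalar model, not the MLIR folder itself.

```cpp
#include <cstdint>
#include <utility>

// Scalar model of a subtract-with-borrow on 32-bit operands.
// Returns {low 32 bits of x - y, borrow}, where borrow is 1 if x < y, else 0.
std::pair<std::uint32_t, std::uint32_t> isubBorrow(std::uint32_t x,
                                                   std::uint32_t y) {
  const std::uint32_t diff = x - y;  // unsigned arithmetic wraps modulo 2^32
  const std::uint32_t borrow = x < y ? 1u : 0u;
  return {diff, borrow};
}
```

Subtracting zero never borrows and leaves the value unchanged, so `isubBorrow(x, 0)` is always `{x, 0}`, which matches the `<x, 0>` form above.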
[lldb] Update TestDelayedBreakpoint test to use the right setting (#198751)
This test should pass regardless of which setting is the default for delayed
breakpoints.
[LoopInterchange] Detect unsupported PHIs in inner loop exit block correctly (#194323)
In the legality check phase, `areInnerLoopExitPHIsSupported` inspects
the PHI nodes in the exit block of the inner loop and bails out if
an unsupported PHI node is found. This function had several
issues:
- Conflating the inner loop with the outer loop
- Being unnecessarily conservative when LCSSA chains exist, which will be
handled by the `simplifyLCSSA` function
This patch fixes the above issues to detect unsupported PHIs correctly.
Fixes #193746