LLVM/project c3a884c flang/test/Driver intrinsic-module-path.f90, flang/test/Driver/Inputs ieee_arithmetic.mod iso_fortran_env.mod

[Flang][bbc] Add support for -fintrinsic-module-path
Delta File
+27 -13 flang/test/Driver/intrinsic-module-path.f90
+0 -8 flang/test/Driver/Inputs/ieee_arithmetic.mod
+0 -8 flang/test/Driver/Inputs/iso_fortran_env.mod
+5 -0 flang/tools/bbc/bbc.cpp
+32 -29 4 files

LLVM/project 12470b3 mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td, mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToWiDistributeExperimental.cpp XeGPULayoutImpl.cpp

[MLIR][XeGPU] Improve deinterleave/interleave/dpas_mx ops handling (#197223)
Delta File
+156 -2 mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToWiDistributeExperimental.cpp
+80 -4 mlir/test/Dialect/XeGPU/sg-to-wi-experimental-unit.mlir
+24 -14 mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+13 -13 mlir/test/Dialect/XeGPU/invalid.mlir
+4 -7 mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+5 -5 mlir/test/Dialect/XeGPU/propagate-layout.mlir
+282 -45 10 files not shown
+313 -71 16 files

LLVM/project 868bf3f utils/bazel/llvm-project-overlay/libc BUILD.bazel

[bazel][libc] Fix 8076d17b61028e7fd5723fa84fd5615c945ae46b (#198553)

Add new targets & deps for syscall wrappers
Delta File
+125 -0 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+125 -0 1 files

LLVM/project 5ff6c5e llvm/lib/Target/AMDGPU SIFoldOperands.cpp, llvm/test/CodeGen/AMDGPU si-fold-scalar-add-sub-imm.mir dagcombine-reassociate-multi-memop.ll

[AMDGPU] SIFoldOperands: constant-fold S_ADD/S_SUB with immediate operands (#198410)

Extend SIFoldOperands::tryConstantFoldOp to recognise three patterns:
  * ADD/SUB(imm, imm) -> S_MOV_B32 (LHS +/- RHS)
  * ADD x, 0          -> COPY x   (Also `0 + x`)
  * SUB x, 0          -> COPY x   (SUB is not commutable)
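
The three patterns can be modelled on plain integers. The sketch below is a hypothetical Python model, not the SIFoldOperands code: `fold_scalar_add_sub` and its tuple results are invented for illustration, registers are represented as strings, and 32-bit wraparound is assumed for the folded immediate.

```python
MASK32 = 0xFFFFFFFF  # assumption: results wrap at 32 bits


def fold_scalar_add_sub(op, lhs, rhs):
    """Model the three folds. Operands are ints (immediates) or strs (registers).

    Returns ("S_MOV_B32", value), ("COPY", reg), or None when nothing folds.
    """
    assert op in ("ADD", "SUB")
    lhs_imm = isinstance(lhs, int)
    rhs_imm = isinstance(rhs, int)
    if lhs_imm and rhs_imm:
        # ADD/SUB(imm, imm) -> S_MOV_B32 (LHS +/- RHS)
        value = lhs + rhs if op == "ADD" else lhs - rhs
        return ("S_MOV_B32", value & MASK32)
    if rhs_imm and rhs == 0:
        # ADD x, 0 and SUB x, 0 -> COPY x
        return ("COPY", lhs)
    if op == "ADD" and lhs_imm and lhs == 0:
        # 0 + x -> COPY x; there is no such rule for SUB, which is not commutable
        return ("COPY", rhs)
    return None
```

Under this model `fold_scalar_add_sub("SUB", 0, "x")` returns `None`, which mirrors the note that SUB is not commutable.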

Assisted-by: Claude Opus 4.7
Delta File
+161 -0 llvm/test/CodeGen/AMDGPU/si-fold-scalar-add-sub-imm.mir
+30 -0 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+1 -1 llvm/test/CodeGen/AMDGPU/dagcombine-reassociate-multi-memop.ll
+192 -1 3 files

LLVM/project 8148e16 llvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 gfni-operand-and-fold.ll

[X86] Fold splat AND on VGF2P8AFFINEQB source (#193364)

Each row of `vgf2p8affineqb`'s matrix controls which source bits are
selected, so zeroing the same bit in every row treats the corresponding
source bit as zero. This means an AND of the input with any splatted
8-bit value can be folded into the matrix. This patch:

- Can eliminate a constant and/or reduce the instruction count from 2
to 1.
- Only applies when the matrix is constant, ensuring that it can't
increase the dependency chain.
- Doesn't apply if the AND is multi-use while the splat isn't constant,
preventing additional operations.
- Works with both constant 8-bit splats and scalar values that were
splatted to a vector.
- Includes test coverage for positive cases (constants, variable
scalars, non-zero immediates) and negative cases (multi-use, larger
splats, variable matrices).
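
The reason the fold is valid can be checked with a small per-byte model. This is a hedged Python sketch, not the X86ISelLowering code: `affine_byte`, `fold_and_into_matrix`, and `check_fold` are hypothetical names, and the row-to-bit ordering follows one reading of the instruction (the argument does not depend on that ordering).

```python
import random


def parity(v):
    """Return 1 if v has an odd number of set bits, else 0."""
    return bin(v).count("1") & 1


def affine_byte(rows, x, imm=0):
    """Per-byte affine transform over GF(2).

    rows holds eight 8-bit matrix rows; result bit i is the parity of
    rows[7 - i] & x, XORed with bit i of imm.
    """
    out = 0
    for i in range(8):
        bit = parity(rows[7 - i] & x) ^ ((imm >> i) & 1)
        out |= bit << i
    return out


def fold_and_into_matrix(rows, mask):
    """Clear in every row the bits that the AND mask would clear in the source."""
    return [r & mask for r in rows]


def check_fold(trials=50, seed=0):
    """Compare affine(rows, x & mask) with affine(folded rows, x) for every byte x."""
    rng = random.Random(seed)
    for _ in range(trials):
        rows = [rng.randrange(256) for _ in range(8)]
        mask = rng.randrange(256)
        imm = rng.randrange(256)
        folded = fold_and_into_matrix(rows, mask)
        for x in range(256):
            if affine_byte(rows, x & mask, imm) != affine_byte(folded, x, imm):
                return False
    return True
```

The equality holds because `row & (x & mask)` and `(row & mask) & x` are the same value, so each parity is unchanged.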

Fixes #191325
Delta File
+179 -0 llvm/test/CodeGen/X86/gfni-operand-and-fold.ll
+60 -0 llvm/lib/Target/X86/X86ISelLowering.cpp
+239 -0 2 files

LLVM/project ce7b6ed clang/lib/Driver/ToolChains CommonArgs.cpp, clang/lib/Driver/ToolChains/Arch AMDGPU.cpp

[NFC][AMDGPU] Remove AMDGPU related code from generic TargetParser.cpp
Delta File
+659 -0 llvm/lib/TargetParser/AMDGPUTargetParser.cpp
+1 -643 llvm/lib/TargetParser/TargetParser.cpp
+109 -0 llvm/include/llvm/TargetParser/AMDGPUTargetParser.h
+1 -90 llvm/include/llvm/TargetParser/TargetParser.h
+0 -2 clang/lib/Driver/ToolChains/Arch/AMDGPU.cpp
+1 -1 clang/lib/Driver/ToolChains/CommonArgs.cpp
+771 -736 27 files not shown
+797 -760 33 files

LLVM/project e0f8b79 clang/test/CodeGen link-builtin-bitcode.c, clang/test/CodeGenOpenCL builtins-amdgcn.cl

[AMDGPU] Add three target features msad-insts, mqsad-pk-insts, and mqsad-insts (#198432)
Delta File
+43 -7 llvm/lib/Target/AMDGPU/AMDGPU.td
+29 -0 llvm/lib/TargetParser/TargetParser.cpp
+8 -10 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+5 -7 llvm/lib/Target/AMDGPU/VOP3Instructions.td
+11 -0 llvm/test/MC/AMDGPU/gfx12_5_generic_asm_vop3_err.s
+3 -3 clang/test/CodeGen/link-builtin-bitcode.c
+99 -27 6 files not shown
+113 -35 12 files

LLVM/project fb4686b mlir/include/mlir/Interfaces MemorySlotInterfaces.td, mlir/lib/Transforms Mem2Reg.cpp

[mlir][mem2reg] fix 197158 by moving visitReplacedValues call
Delta File
+45 -19 mlir/lib/Transforms/Mem2Reg.cpp
+16 -16 mlir/include/mlir/Interfaces/MemorySlotInterfaces.td
+25 -0 mlir/test/Dialect/LLVMIR/mem2reg-dbginfo.mlir
+86 -35 3 files

LLVM/project af0b42d libc/src/__support/OSUtil/linux/syscall_wrappers link.h open.h

[libc] prefer *at syscalls in sys/stat wrappers (#197940)

- These changes flip the #ifdef order to prefer the *at syscalls over
the plain ones.
- On modern architectures, *at system calls are preferred over the plain
system calls because of safety issues.
- Checking for the "*at" system calls first ensures better
compatibility with modern systems.
- The plain syscalls then move to the #else or #elif branch to support
older systems.
  - From merged PR (#195792) and issue (#195620)
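
The reordering amounts to a preference rule, which the following Python sketch models. It is not the libc code: `pick_syscall` and the `available` set are hypothetical stand-ins for the `#ifdef SYS_...` checks, and the table lists the usual Linux *at counterparts of the six wrappers touched by this change.

```python
# Plain syscall name -> its *at counterpart on Linux.
AT_VARIANT = {
    "open": "openat",
    "link": "linkat",
    "access": "faccessat",
    "readlink": "readlinkat",
    "rmdir": "unlinkat",   # used together with AT_REMOVEDIR
    "unlink": "unlinkat",
}


def pick_syscall(plain, available):
    """Choose which syscall a wrapper would use, given the names a target defines."""
    at_name = AT_VARIANT[plain]
    if at_name in available:   # models the leading #ifdef
        return at_name
    if plain in available:     # models the #elif / #else fallback
        return plain
    raise LookupError("no usable syscall for " + plain)
```

With both `open` and `openat` available the model picks `openat`; the plain name is only used when the *at variant is missing.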

---------

Signed-off-by: udaykiriti <udaykiriti624 at gmail.com>
Co-authored-by: Jeff Bailey <jbailey at raspberryginger.com>
Delta File
+4 -4 libc/src/__support/OSUtil/linux/syscall_wrappers/link.h
+3 -3 libc/src/__support/OSUtil/linux/syscall_wrappers/open.h
+3 -3 libc/src/__support/OSUtil/linux/syscall_wrappers/access.h
+3 -3 libc/src/__support/OSUtil/linux/syscall_wrappers/readlink.h
+3 -3 libc/src/__support/OSUtil/linux/syscall_wrappers/rmdir.h
+3 -3 libc/src/__support/OSUtil/linux/syscall_wrappers/unlink.h
+19 -19 6 files

LLVM/project e04895f llvm/test/CodeGen/X86 avgfloors.ll vector-shift-ashr-sub128.ll

[X86] Lower vector i8 ashr-by-1 using pavgb (#198487)

For vector i8 arithmetic shift right by 1, the current lowering produces
a 5-instruction sequence (psrlw + pand + xor + psubb plus a constant
load) with a 4-deep dependency chain.

This patch uses the identity

  ashr(x, 1) == avgceilu(x, -1) ^ (~x & 0x80)

to lower to ISD::AVGCEILU + a short fixup, producing 4 instructions on
SSE/AVX/AVX2 and 3 on AVX-512BW (after vpternlogd fusion of the AND/XOR
pair), with two parallel dependency chains instead of one long one.

The freeze on R is required because the target reads it twice, matching
the pattern of the existing `shl R, 1 -> add R, R` case in
LowerShiftByScalarImmediate.
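
The identity can also be checked exhaustively for a single i8 lane. This is a minimal Python sketch with hypothetical helper names (`avgceilu8`, `ashr1_i8`, `ashr1_via_avg`); it models one byte and says nothing about the actual instruction selection.

```python
def avgceilu8(a, b):
    """Unsigned rounding-up average of two bytes, i.e. what pavgb computes per lane."""
    return (a + b + 1) >> 1


def ashr1_i8(x):
    """Reference: arithmetic shift right by 1 of an 8-bit value given as 0..255."""
    signed = x - 256 if x & 0x80 else x
    return (signed >> 1) & 0xFF


def ashr1_via_avg(x):
    """ashr(x, 1) == avgceilu(x, -1) ^ (~x & 0x80), with -1 being 0xFF as a byte."""
    return avgceilu8(x, 0xFF) ^ (~x & 0x80)
```

`avgceilu8(x, 0xFF)` equals `0x80 + (x >> 1)`, so the top bit is always set; the XOR clears it again exactly when `x` is non-negative.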

Alive2 proof: https://alive2.llvm.org/ce/z/LbXPhE

Fixes #198061
Delta File
+258 -288 llvm/test/CodeGen/X86/avgfloors.ll
+69 -135 llvm/test/CodeGen/X86/vector-shift-ashr-sub128.ll
+34 -57 llvm/test/CodeGen/X86/vector-shift-ashr-256.ll
+28 -44 llvm/test/CodeGen/X86/vector_splat-const-shift-of-constmasked.ll
+23 -45 llvm/test/CodeGen/X86/vector-shift-ashr-128.ll
+8 -15 llvm/test/CodeGen/X86/vector-shift-ashr-512.ll
+420 -584 1 files not shown
+430 -584 7 files

LLVM/project 6fd09a5 flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP declare-simd-interface-body.f90

[flang][OpenMP] Skip declare simd lowering for interface bodies (#197010)

When DECLARE SIMD appears in the specification part of an interface
body, the PFT records the directive as an evaluation of the enclosing
program unit rather than of the interface body's subprogram. Its clause
operands (linear/aligned/uniform) reference dummy arguments local to the
interface body, which have no address in the enclosing scope, causing a
crash.

Detect the mismatch by comparing the program unit containing the
directive with the procedure currently being lowered, and skip op
emission when they differ.

This handles both explicit declare simd(proc-name) and implicit forms in
any enclosing context.

Fixes #192581
Delta File
+32 -0 flang/test/Lower/OpenMP/declare-simd-interface-body.f90
+23 -0 flang/lib/Lower/OpenMP/OpenMP.cpp
+55 -0 2 files

LLVM/project 9146776 flang/test/Integration/OpenMP atomic-compare.f90, llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

[Flang] [OpenMP] atomic compare (#184761)

Add support for `omp atomic compare` in flang.
Combining multiple clauses, such as capture with compare, is not supported.

An issue for this was raised earlier at
[181116](https://github.com/llvm/llvm-project/issues/181116)

---------

Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
Delta File
+517 -0 mlir/test/Dialect/OpenMP/invalid.mlir
+359 -0 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+233 -62 llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+249 -0 flang/test/Integration/OpenMP/atomic-compare.f90
+209 -0 mlir/test/Target/LLVMIR/openmp-llvm.mlir
+171 -0 mlir/include/mlir/Dialect/OpenACCMPCommon/Interfaces/AtomicInterfaces.td
+1,738 -62 13 files not shown
+2,285 -131 19 files

LLVM/project e32fc5d llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp

[SelectionDAG] Use getExtractSubvector. NFC (#198450)
Delta File
+1 -2 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+1 -2 1 files

LLVM/project 4b358b2 llvm/test/CodeGen/AArch64 atomic-ops-lse.ll cheap-as-a-move-MOVaddr.ll

[AArch64] Consider MOVaddr* as cheap if fuse-adrp-add
Delta File
+88 -88 llvm/test/CodeGen/AArch64/atomic-ops-lse.ll
+55 -0 llvm/test/CodeGen/AArch64/cheap-as-a-move-MOVaddr.ll
+15 -9 llvm/test/CodeGen/AArch64/machine-outliner-loh.ll
+9 -9 llvm/test/CodeGen/AArch64/memcmp.ll
+8 -8 llvm/test/CodeGen/AArch64/atomic-ops.ll
+6 -6 llvm/test/CodeGen/AArch64/cgdata-outline-gvar.ll
+181 -120 6 files not shown
+201 -129 12 files

LLVM/project c9c4a3b llvm/test/CodeGen/Thumb2 mve-satmul-loops.ll mve-fpclamptosat_vec.ll, llvm/test/CodeGen/Thumb2/LowOverheadLoops fast-fp-loops.ll

[RegisterCoalescer] Don't remat trivial defs without a size benefit

isAsCheapAsAMove doesn't imply "one machine instruction". AArch64 marks
multi-instruction pseudos cheap when their fused latency matches a real
move (MOVaddr = adrp+add, MOVi64imm = MOVZ+MOVK). The trivial remat
duplicates such defs at every COPY use.
Delta File
+257 -256 llvm/test/CodeGen/Thumb2/mve-satmul-loops.ll
+98 -98 llvm/test/CodeGen/Thumb2/mve-fpclamptosat_vec.ll
+61 -61 llvm/test/CodeGen/Thumb2/mve-fptosi-sat-vector.ll
+52 -55 llvm/test/CodeGen/Thumb2/mve-scmp.ll
+54 -53 llvm/test/CodeGen/Thumb2/mve-fptoui-sat-vector.ll
+51 -50 llvm/test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll
+573 -573 37 files not shown
+932 -864 43 files

LLVM/project 9ccaf83 llvm/include/llvm/IR OptBisect.h

[LLVM] Add a function to reset the opt bisector (#197723)

For daemonized testing, we need to be able to reset the global opt
bisector between test runs. This PR just adds a small function to the
OptPassGate class to reset its state.
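
Why a reset matters for a long-lived process can be seen in a toy model. The Python sketch below is a simplified, hypothetical stand-in for a bisecting pass gate (the class, its fields, and `run_pipeline` are invented here and are not the OptBisect.h API): the gate counts passes and stops running them past a limit, so a second test run in the same process sees a stale counter unless it is reset.

```python
class PassGateModel:
    """Toy bisecting gate: runs only the first `limit` passes it is asked about."""

    def __init__(self, limit):
        self.limit = limit    # -1 means bisection is disabled
        self.last_num = 0     # number of passes counted so far

    def should_run_pass(self, pass_name):
        if self.limit == -1:
            return True
        self.last_num += 1
        return self.last_num <= self.limit

    def reset(self):
        """Forget the counter so the next test run starts counting from one."""
        self.last_num = 0


def run_pipeline(gate, passes):
    """Return the passes the gate lets through, in order."""
    return [p for p in passes if gate.should_run_pass(p)]
```

In this model a second `run_pipeline` call on the same gate runs nothing until `reset()` is called.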
Delta File
+9 -0 llvm/include/llvm/IR/OptBisect.h
+9 -0 1 files

LLVM/project 8f6ed9f lldb/source/Target Target.cpp

[lldb] Fix possible invalidated iterator. (#198482)

The begin or end iterator may be invalidated when idx_pos is erased
from the vector.

Unblocks sanitised CI.
Delta File
+3 -4 lldb/source/Target/Target.cpp
+3 -4 1 files

LLVM/project 38949db llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64Arm64ECCallLowering.cpp, llvm/test/CodeGen/AArch64 arm64ec-exit-thunks.ll arm64ec-hybrid-patchable.ll

Revert "[AArch64] Copy x4/x5 vararg payload into the x64 stack in Arm64EC exit thunks" (#198540)

Reverts llvm/llvm-project#190933

Reported issues with an EXPENSIVE_CHECKS_BUILD. Reverting so this can be
fixed without undue time pressure.
Delta File
+6 -208 llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
+4 -62 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+9 -11 llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
+1 -9 llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+20 -290 4 files

LLVM/project 960ae6f clang/include/clang/Options FlangOptions.td, flang/include/flang/Frontend FrontendOptions.h

[Flang] Adding -ffree-line-length-<value> flag (#192941)

Added support for the `-ffree-line-length-<value>` flag in Flang, the
free-form counterpart of `-ffixed-line-length-<value>`.
gfortran supports this flag, and some applications use it.

---------

Co-authored-by: Tarun Prabhu <tarunprabhu at gmail.com>
Co-authored-by: Andre Kuhlenschmidt <andre.kuhlenschmidt at gmail.com>
Delta File
+48 -0 flang/test/Driver/ffree-line-length.f90
+16 -9 flang/lib/Frontend/CompilerInvocation.cpp
+8 -2 flang/include/flang/Frontend/FrontendOptions.h
+5 -1 clang/include/clang/Options/FlangOptions.td
+5 -0 flang/lib/Parser/prescan.h
+4 -0 flang/lib/Parser/prescan.cpp
+86 -12 4 files not shown
+90 -13 10 files

LLVM/project 4cb4bbc clang/include/clang/Lex Lexer.h, clang/lib/Format FormatTokenLexer.cpp

[Clang] [NFC] Use `unique_ptr<Lexer>` everywhere (#198393)

Replace every instance of `new Lexer` with `make_unique<Lexer>` and
adjust `Lexer::Create_PragmaLexer()` to return a `std::unique_ptr<Lexer>` 
instead.

The Preprocessor was already storing a `unique_ptr<Lexer>`, so there’s
no need to change how that works.
Delta File
+7 -6 clang/lib/Lex/PPLexerChange.cpp
+5 -6 clang/lib/Lex/Lexer.cpp
+5 -5 clang/include/clang/Lex/Lexer.h
+5 -3 clang/lib/Format/FormatTokenLexer.cpp
+3 -3 clang/lib/Lex/Pragma.cpp
+2 -2 clang/lib/Frontend/FrontendAction.cpp
+27 -25 1 files not shown
+29 -26 7 files

LLVM/project f6fb8f5 lldb/test/API/functionalities/gdb_remote_client TestGDBRemoteClient.py

[lldb][windows] remove path separator replacement from TestGDBRemoteClient.py (#198537)

Since https://github.com/llvm/llvm-project/pull/197942, vRun packets use
the native path separators. TestGDBRemoteClient.py now fails on Windows
because it converts the path to POSIX style paths, which is a workaround
for what https://github.com/llvm/llvm-project/pull/197942 fixed.

rdar://177342572
Delta File
+3 -6 lldb/test/API/functionalities/gdb_remote_client/TestGDBRemoteClient.py
+3 -6 1 files

LLVM/project 673b17e llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 arm64-neon-v1i1-setcc.ll

[DAG] scalarizeExtractedBinOp - extract from non-constant one use buildvectors (#198013)

When attempting to scalarize a vector binop that has a single extract,
we currently only fold if either of the binop's operands is a constant
buildvector - but we can extract from non-constant buildvectors without
increasing instruction count as long as the vector binop was the only
use of the buildvector.
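
The equivalence behind the fold can be sketched with Python lists standing in for vectors. All names here (`build_vector`, `vector_binop`, `extract`, `scalarized`) are hypothetical models, not DAGCombiner APIs, and the model ignores the one-use check that makes the fold profitable.

```python
import operator


def build_vector(*scalars):
    """Model of a buildvector node: a vector assembled from scalar operands."""
    return list(scalars)


def vector_binop(op, lhs, rhs):
    """Lane-wise binary operation on two equally sized vectors."""
    return [op(a, b) for a, b in zip(lhs, rhs)]


def extract(vec, idx):
    """Model of extracting a single element from a vector."""
    return vec[idx]


def scalarized(op, scalars, rhs_vec, idx):
    """Folded form: take the scalar operand directly and apply a scalar binop."""
    return op(scalars[idx], extract(rhs_vec, idx))
```

For any lane `i`, `extract(vector_binop(op, build_vector(*s), v), i)` and `scalarized(op, s, v, i)` give the same value, but the second form never materialises the vector.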

More yak shaving for #196493
Delta File
+44 -60 llvm/test/CodeGen/X86/ifma-combine-vpmadd52.ll
+25 -27 llvm/test/CodeGen/X86/masked_gather_scatter_widen.ll
+2 -6 llvm/test/CodeGen/X86/i128-add.ll
+3 -4 llvm/test/CodeGen/X86/known-signbits-vector.ll
+1 -2 llvm/test/CodeGen/AArch64/arm64-neon-v1i1-setcc.ll
+2 -1 llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+77 -100 6 files

LLVM/project bcfa53e flang/include/flang/Lower OpenACC.h, flang/lib/Lower OpenACC.cpp Bridge.cpp

[flang][acc] Handle Fortran do loops as acc loops in acc routine (#198420)

As was previously done for do loops in acc compute constructs in
https://github.com/llvm/llvm-project/issues/149614, this PR does the
same for do loops in `acc routine`. The rules are as follows:
- Do loops not marked with `acc loop` are considered `auto`
- Do concurrent loops are considered `independent`
- Any loops in an `acc routine seq` are considered `seq`

This ensures that the IV is correctly privatized and attached to the acc
loop.
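
The three rules can be written as a small decision function. This Python sketch is only an illustration: `implicit_loop_kind` is a hypothetical name, and the precedence between the rules (seq routine first, then do concurrent, then unmarked loops) is an assumption, since the commit message does not state it.

```python
def implicit_loop_kind(is_do_concurrent, has_acc_loop, in_routine_seq):
    """Return the implied parallelism of a do loop inside an `acc routine`.

    Returns "seq", "independent", "auto", or None when an explicit
    `acc loop` directive decides instead.
    """
    if in_routine_seq:
        return "seq"            # any loop in an `acc routine seq`
    if is_do_concurrent:
        return "independent"    # do concurrent loops
    if not has_acc_loop:
        return "auto"           # plain do loop without `acc loop`
    return None                 # explicit `acc loop`: its own clauses apply
```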
Delta File
+108 -1 flang/test/Lower/OpenACC/do-loops-to-acc-loops.f90
+81 -24 flang/lib/Lower/OpenACC.cpp
+8 -0 flang/include/flang/Lower/OpenACC.h
+4 -3 flang/lib/Lower/Bridge.cpp
+201 -28 4 files

LLVM/project 75e4aaf llvm/lib/CodeGen ShadowStackGCLowering.cpp, llvm/test/CodeGen/Generic shadow-stack-gc-lowering.ll

Reland "[CodeGen] Use byte offsets and ptradd in ShadowStackGCLowering" (#197436)

Replace typed struct GEPs with byte array allocation and ptradd
operations:

1. Track root offsets as byte offsets instead of building typed struct.
2. Use `ComputeFrameLayout` to compute byte offsets based on DataLayout,
properly accounting for each root's size and alignment.
3. Allocate frame as `[FrameSize x i8]` byte array instead of typed
struct.
4. Replace all CreateGEP operations with CreatePtrAdd using computed
offsets.
5. Frame layout unchanged: `[Next ptr | Map ptr | Root 0 | Root 1 | ...
| Root N]` where each root is placed at its computed aligned offset.
6. Zero out padding between roots with memset for deterministic frame
contents for GC.
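
The byte-offset computation described in steps 1 to 6 can be sketched as follows. This is a hypothetical Python model, not the actual `ComputeFrameLayout`: the function names, the `(size, alignment)` input format, the 8-byte pointer size, and the choice not to round up the total frame size are all assumptions made for illustration.

```python
def align_to(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def compute_frame_layout(roots, ptr_size=8):
    """roots is a list of (size, alignment) pairs in bytes.

    Returns (offsets, frame_size) for a frame laid out as
    [Next ptr | Map ptr | Root 0 | Root 1 | ...].
    """
    offset = 2 * ptr_size          # header: Next ptr and Map ptr
    offsets = []
    for size, alignment in roots:
        offset = align_to(offset, alignment)
        offsets.append(offset)
        offset += size
    return offsets, offset


def padding_ranges(roots, offsets, ptr_size=8):
    """Byte ranges (start, end) between roots that a memset would zero."""
    gaps = []
    cursor = 2 * ptr_size
    for (size, _), off in zip(roots, offsets):
        if off > cursor:
            gaps.append((cursor, off))
        cursor = off + size
    return gaps
```

For roots of sizes 8, 4 and 16 bytes with matching alignments, the model places them at offsets 16, 24 and 32 and reports bytes 28 to 32 as padding.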

Benefits:
- Removes dependency on `getAllocatedType` for building frame struct

    [7 lines not shown]
Delta File
+101 -86 llvm/lib/CodeGen/ShadowStackGCLowering.cpp
+30 -44 llvm/test/CodeGen/Generic/shadow-stack-gc-lowering.ll
+131 -130 2 files

LLVM/project 52ca170 .github/workflows/containers/github-action-ci-tooling Dockerfile

[Github] Hashpin base container in CI Tooling containerfile (#197315)

https://github.com/llvm/llvm-project/security/code-scanning/1492
Delta File
+2 -2 .github/workflows/containers/github-action-ci-tooling/Dockerfile
+2 -2 1 files

LLVM/project 213b329 llvm/lib/Target/AMDGPU GCNSubtarget.h AMDGPUAsmPrinter.cpp, llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp

[AMDGPU][NFCI] Change MCSubtargetInfo references in AMDGPUBaseInfo.h/.cpp to be const ref instead of pointers (#197038)

Change all `AMDGPU::IsaInfo` functions and `initDefaultAMDKernelCodeT`
to take `const MCSubtargetInfo &` instead of `const MCSubtargetInfo *`.
These functions never accept null, so a reference better expresses the
contract.

Also change `AMDGPUMCKernelCodeT::initDefault` to take a const reference
for consistency, and convert local `MCSubtargetInfo` pointer variables
to references in `AMDGPUMCExpr.cpp` where the pointer is always
dereferenced.

Requested by @arsenm in
https://github.com/llvm/llvm-project/pull/192306#discussion_r2076113671.

Co-authored-by: Claude Opus 4 (1M context) <noreply at anthropic.com>
Delta File
+72 -72 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+30 -30 llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+17 -17 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+8 -8 llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+5 -6 llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+4 -5 llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+136 -138 8 files not shown
+153 -155 14 files

LLVM/project 94b1d19 llvm/lib/Transforms/Utils Local.cpp, llvm/test/DebugInfo/Generic dbg-value-lower-linenos.ll

[Utils] Examine debug info type instead of alloca type to guess the debug behavior of the alloca uses (#177480)

Replace the `isArray` and `isStructure` helpers, which queried the alloca
IR type, with an `isCompositeType` helper that checks the debug variable's
source-level type from debug info metadata to decide whether it seems
profitable to convert the debug info from a #debug_declare to
a #debug_value.

This changes behavior: the lowering decision is now based on the
source-level type from debug info rather than the IR alloca type, which
is more semantically correct for debug info processing. This should
have minimal effect on clang, but may change behavior more
significantly on front-ends like Rust that have not used semantically
meaningful alloca element types.

Removes all uses of getAllocatedType() from Utils/Local.cpp.

This seemed slightly more semantically correct to me, though it is
slightly challenging to enumerate all of the possible scalar debug

    [7 lines not shown]
Delta File
+35 -9 llvm/lib/Transforms/Utils/Local.cpp
+4 -3 llvm/test/DebugInfo/Generic/dbg-value-lower-linenos.ll
+2 -2 llvm/test/Transforms/InstCombine/dbg-simplify-alloca-size.ll
+1 -1 llvm/test/Transforms/InstCombine/dbg-scalable-store-fixed-frag.ll
+42 -15 4 files

LLVM/project 7d6ed54 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer alternate-non-profitable.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+74 -48 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+24 -24 llvm/test/Transforms/SLPVectorizer/RISCV/buildvector-all-external-scalars.ll
+6 -6 llvm/test/Transforms/SLPVectorizer/X86/pr48879-sroa.ll
+4 -4 llvm/test/Transforms/SLPVectorizer/X86/vect_copyable_in_binops.ll
+3 -3 llvm/test/Transforms/SLPVectorizer/X86/copyable_reorder.ll
+3 -3 llvm/test/Transforms/SLPVectorizer/alternate-non-profitable.ll
+114 -88 15 files not shown
+135 -109 21 files

LLVM/project b7cc800 llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize select-cmp-predicated.ll

[VPlan] Simplify select x, (i1 y | z), y -> y | (x && z) (#190196)

Fixes https://github.com/llvm/llvm-project/issues/189553

This adds a canonicalization `select x, (i1 y | z), y -> y | (x && z)`
([Alive2](https://alive2.llvm.org/ce/z/qcQRn6)). InstCombine already
performs this.

The canonicalization causes the `lhs | (headermask && rhs)
-> vp.merge rhs, true, lhs, evl` pattern in optimizeMasksToEVL to match,
improving the RISC-V codegen for an anyof select reduction.
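
The rewrite can be confirmed with a truth table over plain booleans. This Python sketch uses hypothetical helper names and deliberately ignores poison, which the linked Alive2 proof covers.

```python
from itertools import product


def select(cond, if_true, if_false):
    """Boolean model of a select instruction."""
    return if_true if cond else if_false


def original(x, y, z):
    """select x, (y | z), y"""
    return select(x, y or z, y)


def canonical(x, y, z):
    """y | (x && z)"""
    return y or (x and z)


def forms_agree():
    """Check both forms on all eight boolean inputs."""
    return all(
        original(x, y, z) == canonical(x, y, z)
        for x, y, z in product([False, True], repeat=3)
    )
```

When `x` is false both forms reduce to `y`; when `x` is true both reduce to `y | z`.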
Delta File
+7 -7 llvm/test/Transforms/LoopVectorize/select-cmp-predicated.ll
+9 -0 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+2 -3 llvm/test/Transforms/LoopVectorize/RISCV/select-cmp-reduction.ll
+2 -2 llvm/test/Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
+20 -12 4 files

LLVM/project 29f345e llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64Arm64ECCallLowering.cpp, llvm/test/CodeGen/AArch64 arm64ec-exit-thunks.ll arm64ec-hybrid-patchable.ll

Revert "[AArch64] Copy x4/x5 vararg payload into the x64 stack in Arm64EC exi…"

This reverts commit e6a12781bcc2d1713f9e5593de36f68cc00aaab6.
Delta File
+6 -208 llvm/test/CodeGen/AArch64/arm64ec-exit-thunks.ll
+4 -62 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+9 -11 llvm/test/CodeGen/AArch64/arm64ec-hybrid-patchable.ll
+1 -9 llvm/lib/Target/AArch64/AArch64Arm64ECCallLowering.cpp
+20 -290 4 files