LLVM/project a068c90flang/include/flang/Optimizer/Builder HLFIRTools.h, flang/lib/Lower ConvertArrayConstructor.cpp ConvertCall.cpp

[flang][HLFIR] Fix crash in WHERE with exactly_once inside elemental (#194443)

Fix a segfault in LowerHLFIROrderedAssignments when compiling a WHERE
statement whose mask contains an array constructor with an implied-do
loop (e.g. WHERE([(f(J), J=1,N)]) ...). The hlfir.exactly_once op inside
the hlfir.elemental has live-in values that are block arguments from the
enclosing elemental, which canonicalizeExactlyOnceInsideWhere cannot
pull into the exactly_once region.

The fix has two parts:

1. In canonicalizeExactlyOnceInsideWhere, skip exactly_once ops nested
inside hlfir.elemental and skip block argument live-ins, since these
cannot be relocated.

2. In both overloads of inlineElementalOp, handle hlfir.exactly_once by
inlining its body and cleanup operations instead of cloning the op
verbatim (which left an illegal op after lowering).


    [6 lines not shown]
DeltaFile
+22-0flang/test/Lower/HLFIR/where-implied-do-mask.f90
+1-13flang/lib/Lower/ConvertArrayConstructor.cpp
+13-0flang/lib/Optimizer/Builder/HLFIRTools.cpp
+1-7flang/lib/Lower/ConvertCall.cpp
+4-0flang/include/flang/Optimizer/Builder/HLFIRTools.h
+41-205 files

LLVM/project df65c8blldb/source/Plugins/ExpressionParser/Clang InjectPointerSigningFixups.cpp

[lldb] Handle ConstantExpr constants in InjectPointerSigningFixupCode (#194476)

Currently, the injection code assumes we encounter ConstantAggregate
constants and emits a GEP to access the fields/members. However, it's
possible for a CPA (ConstantPtrAuth) to be an operand of a ConstantExpr (e.g. a bitcast).
Emitting a GEP in that scenario doesn't make sense. This should instead
be handled by keeping track of the path to the CPA (which operands need
to be followed from the top-level ConstantExpr).
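
The idea of recording which operands lead from a top-level constant down to a nested one can be sketched on a toy tree. This is only an illustration, not the LLDB code: `Node`, `IsTarget` and `findPath` are hypothetical names, and the tree stands in for a ConstantExpr whose operands may eventually contain the CPA.

```cpp
#include <vector>

// Hypothetical stand-in for a constant: a node with operands. IsTarget marks
// the nested constant we are looking for (the CPA in the commit's terms).
struct Node {
  bool IsTarget = false;
  std::vector<Node> Operands;
};

// Depth-first search from Root. On success, Path holds the operand index to
// follow at each level, from Root down to the target; on failure Path is
// left as it was on entry.
bool findPath(const Node &Root, std::vector<unsigned> &Path) {
  if (Root.IsTarget)
    return true;
  for (unsigned I = 0; I < Root.Operands.size(); ++I) {
    Path.push_back(I);
    if (findPath(Root.Operands[I], Path))
      return true;
    Path.pop_back(); // This operand did not lead to the target.
  }
  return false;
}
```

With a root whose second operand wraps the target as its first operand, the recorded path is {1, 0}: follow operand 1, then operand 0.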

After this change, most arm64e tests that crash LLDB either pass or fail
with some other issue. The main exception is TestWeakSymbols.py which
needs more work.
DeltaFile
+69-31lldb/source/Plugins/ExpressionParser/Clang/InjectPointerSigningFixups.cpp
+69-311 files

LLVM/project 7e13df6llvm/utils/gn/secondary/clang/unittests/Lex BUILD.gn

[gn build] Port 1e10f9a82222 (#194919)
DeltaFile
+1-0llvm/utils/gn/secondary/clang/unittests/Lex/BUILD.gn
+1-01 files

LLVM/project 12c9ffbllvm/utils/gn/secondary/lldb/source/Host BUILD.gn

[gn build] Port 6617aac292a1 (#194920)
DeltaFile
+1-0llvm/utils/gn/secondary/lldb/source/Host/BUILD.gn
+1-01 files

LLVM/project f0af59allvm/utils/gn/secondary/llvm/lib/ProfileData BUILD.gn

[gn build] Port 69c38be83991 (#194921)
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/lib/ProfileData/BUILD.gn
+1-01 files

LLVM/project 5776682llvm/utils/gn/secondary/llvm/test BUILD.gn

[gn] port 69c38be83991374 (#194918)
DeltaFile
+1-0llvm/utils/gn/secondary/llvm/test/BUILD.gn
+1-01 files

LLVM/project e328845mlir/include/mlir/Dialect/XeGPU/IR XeGPUAttrs.td, mlir/lib/Dialect/XeGPU/Transforms XeGPUPeepHoleOptimizer.cpp

[MLIR][XeGPU] Clean up stale convert_layout on single-element vector in peephole (#194043)

Extend MultiRed2dOpPattern in xegpu-optimize-peephole to also erase
consumer xegpu.convert_layout ops when a 2D vector.multi_reduction
produces a single-element vector (e.g. vector<1xf32>).
DeltaFile
+51-12mlir/lib/Dialect/XeGPU/Transforms/XeGPUPeepHoleOptimizer.cpp
+49-4mlir/test/Dialect/XeGPU/peephole-optimize.mlir
+2-1mlir/include/mlir/Dialect/XeGPU/IR/XeGPUAttrs.td
+102-173 files

LLVM/project b6824bfllvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVISelLowering.h

[RISCV] Remove codegen for vp_and, vp_or, vp_xor, vp_sra, vp_srl, vp_shl. NFC (#194904)
DeltaFile
+2-50llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+0-1llvm/lib/Target/RISCV/RISCVISelLowering.h
+2-512 files

LLVM/project 326162bllvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator ret-aggregates.ll args-simd.ll

[WebAssembly][GlobalISel] Remove unnecessary `-verify-machineinstrs` from tests (NFC) (#194799)

Removes all uses of `-verify-machineinstrs` from the Wasm GISel tests.

This only impacts `*.ll` in practice, as `-verify-machineinstrs` appears
to be implicitly enabled when processing `.mir` files.
DeltaFile
+4-4llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/ret-aggregates.ll
+3-3llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args-simd.ll
+2-2llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args.ll
+2-2llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/ret-basics.ll
+2-2llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/args-swiftcc.ll
+2-2llvm/test/CodeGen/WebAssembly/GlobalISel/irtranslator/ret-simd.ll
+15-1511 files not shown
+26-2617 files

LLVM/project 88d2615lldb/packages/Python/lldbsuite/test decorators.py, lldb/packages/Python/lldbsuite/test/tools/lldb-server gdbremote_testcase.py

[lldb] Add skipIfWasm decorator and skip unsupported WebAssembly tests (#194761)

Add a new `skipIfWasm` test decorator that skips tests on the "wasip1"
and "wasi" platforms, and apply it to the test classes that rely on
expression evaluation or lldb-server, neither of which is available when
debugging WebAssembly targets.
DeltaFile
+5-0lldb/packages/Python/lldbsuite/test/decorators.py
+2-1lldb/test/API/lang/cpp/typedef/TestCppTypedef.py
+2-0lldb/test/API/lang/cpp/covariant-return-types/TestCovariantReturnTypes.py
+2-0lldb/test/API/commands/target/stop-hooks/TestStopHooks.py
+2-0lldb/test/API/commands/expression/context-object/TestContextObject.py
+2-0lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
+15-150 files not shown
+68-156 files

LLVM/project a7a4a8alldb/packages/Python/lldbsuite/test decorators.py, lldb/test/API/commands/target/create-deps TestTargetCreateDeps.py

[lldb] Decorate tests that use shared libraries (#193118)

`wasip1` does not support shared libraries in the traditional POSIX
sense. It was designed primarily as a monolithic system interface for
standalone modules where everything is statically linked. `wasip2`
introduced a "component model" where components achieve the goals of
shared libraries.
DeltaFile
+9-0lldb/packages/Python/lldbsuite/test/decorators.py
+5-0lldb/test/API/commands/target/create-deps/TestTargetCreateDeps.py
+2-0lldb/test/API/functionalities/breakpoint/move_nearest/TestMoveNearest.py
+2-0lldb/test/API/lang/cpp/step-through-trampoline/TestStepThroughTrampoline.py
+2-0lldb/test/API/functionalities/completion/TestCompletion.py
+1-0lldb/test/API/functionalities/executable_first/TestExecutableFirst.py
+21-036 files not shown
+57-042 files

LLVM/project a8f9baellvm/lib/Target/AMDGPU SIProgramInfo.cpp

[NFC] typo fixes (#194908)
DeltaFile
+3-3llvm/lib/Target/AMDGPU/SIProgramInfo.cpp
+3-31 files

LLVM/project 202adc3llvm/include/llvm/Support DebugCounter.h, llvm/unittests/Support DebugCounterTest.cpp

[Support] Introduce a function to reset all debug counters (#194864)

This PR adds a function to reset all debug counters, and extends the
unit test to verify that the debug counters are reset as expected. This
is required for running tools repeatedly in the same process.
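
To see why a reset matters when a tool runs repeatedly in one process, consider the following minimal sketch. It is a toy registry, not the llvm::DebugCounter API: `CounterRegistry`, `resetAll` and `runOnce` are hypothetical names chosen for illustration.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for a process-wide counter registry.
class CounterRegistry {
public:
  // Bump the named counter and return its new value.
  std::int64_t increment(const std::string &Name) { return ++Counts[Name]; }

  // Current value of the named counter, or 0 if it was never bumped.
  std::int64_t get(const std::string &Name) const {
    auto It = Counts.find(Name);
    return It == Counts.end() ? 0 : It->second;
  }

  // Zero every counter while keeping the registrations.
  void resetAll() {
    for (auto &Entry : Counts)
      Entry.second = 0;
  }

private:
  std::map<std::string, std::int64_t> Counts;
};

// Simulates one tool run that bumps the "steps" counter Steps times and
// returns the value the run observes at the end.
std::int64_t runOnce(CounterRegistry &R, int Steps) {
  for (int I = 0; I < Steps; ++I)
    R.increment("steps");
  return R.get("steps");
}
```

Without a reset, a second `runOnce(R, 3)` observes 6 instead of 3; calling `resetAll()` between runs makes each run start from zero again.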
DeltaFile
+13-0llvm/include/llvm/Support/DebugCounter.h
+13-0llvm/unittests/Support/DebugCounterTest.cpp
+26-02 files

LLVM/project 4acbff3clang/lib/CodeGen CGCleanup.cpp CGExprCXX.cpp, clang/test/CodeGen windows-seh-EHa-TryInFinally.cpp

[WinEH] Fix crash when deleting C++ objects inside SEH __try (#180144)

Introduce a dedicated cleanup flag for SEH __finally blocks and use it
to separate SEH try cleanup emission from C++ object cleanup emission.

This prevents __finally cleanups from emitting seh.scope.begin/end and
keeps destructor/delete cleanups paired with seh.scope markers.

Fix #109576
DeltaFile
+117-6clang/test/CodeGen/windows-seh-EHa-TryInFinally.cpp
+15-10clang/lib/CodeGen/CGCleanup.cpp
+25-0clang/test/CodeGenCXX/exceptions-seh.cpp
+12-0clang/lib/CodeGen/CGExprCXX.cpp
+7-0clang/lib/CodeGen/CGCleanup.h
+4-2clang/lib/CodeGen/CGException.cpp
+180-181 files not shown
+183-187 files

LLVM/project 4d7c1c6llvm/cmake config-ix.cmake, llvm/cmake/modules FindLibXml2.cmake

[cmake] Fix find libxml2 for Windows static libraries (#194894)

* Add the usual Windows static library name "libxml2s"
* Add the compiler define that a Windows build with static libxml2 requires
DeltaFile
+3-0llvm/cmake/modules/FindLibXml2.cmake
+1-0llvm/cmake/config-ix.cmake
+4-02 files

LLVM/project 5193332clang/lib/CIR/CodeGen CIRGenExprScalar.cpp, clang/test/CIR/CodeGen vector-ext.cpp vector.cpp

[CIR] Ternary expression for VectorType with vector cond (#194128)

Support ternary expressions for VectorType with a vector condition.

Issue #192311
DeltaFile
+42-0clang/test/CIR/CodeGen/vector-ext.cpp
+42-0clang/test/CIR/CodeGen/vector.cpp
+3-3clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+87-33 files

LLVM/project 1242f93llvm/docs LangRef.rst TransformMetadata.rst, llvm/lib/Transforms/Scalar WarnMissedTransforms.cpp

Revert "[LoopVectorize] Add metadata to distinguish vectorized loop body from scalar remainder (#190258)" (#194901)

Reverts llvm/llvm-project#190258

This commit is causing crashes on the `intel-sycl-gpu` buildbot:
https://lab.llvm.org/buildbot/#/builders/225/builds/7157

The crash is a SEGFAULT in
`LoopVectorizationPlanner::updateLoopMetadataAndProfileInfo` when
optimization remarks are enabled
(`-pass-remarks-analysis=loop-vectorize`). Reverting while investigating
the root cause.
DeltaFile
+0-120llvm/test/Transforms/LoopTransformWarning/vectorizer-loop-kind-unroll-warning.ll
+0-110llvm/test/Transforms/LoopUnroll/vectorizer-loop-kind-remarks.ll
+0-42llvm/docs/LangRef.rst
+0-21llvm/docs/TransformMetadata.rst
+3-10llvm/lib/Transforms/Scalar/WarnMissedTransforms.cpp
+0-12llvm/lib/Transforms/Utils/LoopUtils.cpp
+3-3155 files not shown
+9-34611 files

LLVM/project 27b8441llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[SLP][NFC] Fix building on Windows, NFC



Pull Request: https://github.com/llvm/llvm-project/pull/194903
DeltaFile
+4-4llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+4-41 files

LLVM/project ff8abfallvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv fixed-vectors-trunc-vp.ll vtrunc-vp.ll

[RISCV] Remove codegen for vp_trunc (#194886)

Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off vp_truncate from #179622.
DeltaFile
+148-687llvm/test/CodeGen/RISCV/rvv/fixed-vectors-trunc-vp.ll
+59-156llvm/test/CodeGen/RISCV/rvv/vtrunc-vp.ll
+8-128llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+23-26llvm/test/CodeGen/RISCV/rvv/vp-vaaddu.ll
+12-13llvm/test/CodeGen/RISCV/rvv/vtrunc-vp-mask.ll
+12-12llvm/test/CodeGen/RISCV/rvv/fixed-vectors-trunc-vp-mask.ll
+262-1,0227 files not shown
+283-1,06313 files

LLVM/project 6a82589clang/unittests/Lex ModuleMapTest.cpp

Remove an unused include which causes Bazel dep-check failure (#194902)
DeltaFile
+0-1clang/unittests/Lex/ModuleMapTest.cpp
+0-11 files

LLVM/project e6d46f1llvm/test/Transforms/LoopVectorize uniform_across_vf_induction2.ll uniform_across_vf_induction1_lshr.ll, llvm/test/Transforms/LoopVectorize/AArch64 scalable-strict-fadd.ll

[VPlan] Expand DerivedIV into executable recipes (#187589)

This allows us to strip DerivedIVRecipe::execute, and remove the
dependency on emitTransformedIndex. It allows us to benefit from
existing simplifications in VPlan.
DeltaFile
+636-636llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll
+288-288llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_lshr.ll
+228-228llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1.ll
+115-115llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction1_and.ll
+66-68llvm/test/Transforms/LoopVectorize/X86/interleaved-accesses-hoist-load-across-store.ll
+58-58llvm/test/Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll
+1,391-1,39399 files not shown
+1,978-2,126105 files

LLVM/project e784c7dllvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rvp-simd-64.ll

[RISCV] Fix crashes and add RV32 RUN line to rvp-simd-64.ll (#194782)

Prevent combinePExtTruncate from forming RISCVISD nodes with an illegal
type. Remove an unnecessary call to getSimpleVT().

Legalize the shift amount when custom legalizing i64 shifts.
SelectionDAGBuilder usually pre-legalizes shift amounts, but if we
scalarize a vXi64 vector shift, the shift amount will be i64.
DeltaFile
+3,250-1,325llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+12-4llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+3,262-1,3292 files

LLVM/project 1230cfdllvm/test/CodeGen/AMDGPU ptr-arg-dbg-value.ll, llvm/test/CodeGen/BPF/CORE offset-reloc-basic.ll

[AMDGPU] Propagate debug info to constant materialization instr (#192669)

Set the debug location on non-target constant nodes so that the
resulting machine instructions inherit the correct source location.
DeltaFile
+5-4llvm/test/DebugInfo/COFF/jump-table-with-indirect-ptr-null.ll
+4-4llvm/test/tools/llvm-objdump/ELF/AMDGPU/source-lines.ll
+3-3llvm/test/CodeGen/AMDGPU/ptr-arg-dbg-value.ll
+2-2llvm/test/CodeGen/BPF/CORE/offset-reloc-basic.ll
+1-1llvm/test/DebugInfo/AMDGPU/debug-loc-copy.ll
+1-1llvm/test/DebugInfo/ARM/single-constant-use-preserves-dbgloc.ll
+16-151 files not shown
+18-157 files

LLVM/project 70a26a2llvm/docs LangRef.rst TransformMetadata.rst, llvm/lib/Transforms/Scalar WarnMissedTransforms.cpp

[LoopVectorize] Add metadata to distinguish vectorized loop body from scalar remainder (#190258)

Add two new loop metadata attributes — `llvm.loop.vectorize.body` and
`llvm.loop.vectorize.epilogue` — that the loop vectorizer sets on the
generated vector loop and epilogue loop respectively. The metadata is
only emitted when optimization remarks are enabled (`ORE->enabled()`),
so it has zero cost in normal compilation.

These enable downstream passes (LoopUnroll, WarnMissedTransforms) to
produce more precise optimization remarks. Instead of the generic "loop
not unrolled" warning on a source line that was vectorized, the unroller
can now report:
- **"vectorized loop"** for the main vector body
- **"epilogue loop"** for the scalar epilogue/remainder
- **"epilogue vectorized loop"** for an epilogue that was itself
vectorized during epilogue vectorization (carries both attributes)
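
The mapping from the two attributes to the three remark prefixes listed above can be sketched as follows. This is a standalone illustration, not the helper added by the commit: `loopKindPrefix` and its two boolean parameters are hypothetical, standing in for the presence of `llvm.loop.vectorize.body` and `llvm.loop.vectorize.epilogue` on a loop, and the fallback string for loops with neither attribute is an assumption.

```cpp
#include <string>

// Hypothetical mapping from attribute presence to a remark prefix.
// HasBody:     the loop carries the vectorized-body attribute.
// HasEpilogue: the loop carries the epilogue attribute.
std::string loopKindPrefix(bool HasBody, bool HasEpilogue) {
  if (HasBody && HasEpilogue)
    return "epilogue vectorized loop"; // Epilogue that was itself vectorized.
  if (HasBody)
    return "vectorized loop"; // Main vector body.
  if (HasEpilogue)
    return "epilogue loop"; // Scalar epilogue/remainder.
  return "loop"; // Assumed fallback when neither attribute is present.
}
```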

A shared `getLoopVectorizeKindPrefix()` helper in
`LoopUtils.h`/`LoopUtils.cpp` reads the metadata and returns the

    [17 lines not shown]
DeltaFile
+120-0llvm/test/Transforms/LoopTransformWarning/vectorizer-loop-kind-unroll-warning.ll
+110-0llvm/test/Transforms/LoopUnroll/vectorizer-loop-kind-remarks.ll
+42-0llvm/docs/LangRef.rst
+21-0llvm/docs/TransformMetadata.rst
+10-3llvm/lib/Transforms/Scalar/WarnMissedTransforms.cpp
+12-0llvm/lib/Transforms/Utils/LoopUtils.cpp
+315-35 files not shown
+346-911 files

LLVM/project 63c0520llvm/lib/Target/RISCV RISCVSchedSiFive7.td, llvm/test/CodeGen/RISCV short-forward-branch-opt.ll sifive7-enable-intervals.mir

[RISCV] Use BufferSize = 0 for ProcResGroup in SiFive7 scheduling models (#194754)

As it turns out, even if a `ProcResGroup` consists of in-order pipes, as
long as its (the group's) BufferSize is not zero, Machine Scheduler will
not use in-order scheduling on instructions that consume it. Since
BufferSize also defaults to -1 for `ProcResGroup`, we have been
scheduling the resource consumption of SiFive7's `PipeAB` (scalar pipes)
and `VA1OrVA2` (vector pipes) in an out-of-order fashion!

Co-authored-by: Min Hsu <min.hsu at sifive.com>
DeltaFile
+18-18llvm/test/CodeGen/RISCV/short-forward-branch-opt.ll
+7-7llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
+3-3llvm/test/CodeGen/RISCV/sifive7-enable-intervals.mir
+28-283 files

LLVM/project 2be72edllvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[SLP][NFC] Reduce compile time of isTreeTinyAndNotFullyVectorizable

Cache root entry and SLPCostThreshold queries once, group
!ForReduction-only checks under two blocks, extract a shared benign-node
predicate from the two duplicated lambdas, and skip HasSingleLoad and
allConstant work when results are dead.

Pull Request: https://github.com/llvm/llvm-project/pull/194895
DeltaFile
+268-198llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+268-1981 files

LLVM/project b46904aclang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-perm.c

[CIR][AArch64] Lower NEON vzip intrinsics (#193658)

### Summary

Part of https://github.com/llvm/llvm-project/issues/185382

Lowers part of the intrinsics listed at:
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements

Lower NEON::BI__builtin_neon_vzip_v and NEON::BI__builtin_neon_vzipq_v
in CIRGenBuiltinAArch64.cpp by porting the existing incubator logic
(`clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp`) onto ClangIR:
two bitcasts on the input vectors, two rounds of cir.vec.shuffle
generating the low/high interleave patterns, each stored through a
ptr_stride of the sret base pointer.
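
The low/high interleave patterns mentioned above can be illustrated with plain vectors. This is a minimal sketch of the zip semantics, not the CIR lowering itself: `zipMask` and `applyShuffle` are hypothetical helpers, and the element count is assumed to be even.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shuffle mask for one half of a zip over two N-element vectors.
// Half == 0 selects the low interleave, Half == 1 the high interleave.
// Mask entries >= N select lanes from the second input.
std::vector<std::size_t> zipMask(std::size_t N, std::size_t Half) {
  assert(N % 2 == 0 && "element count is assumed to be even");
  std::vector<std::size_t> Mask;
  for (std::size_t I = 0; I != N; I += 2) {
    std::size_t Lane = (I + Half * N) >> 1;
    Mask.push_back(Lane);     // Lane from the first input.
    Mask.push_back(Lane + N); // Matching lane from the second input.
  }
  return Mask;
}

// Apply a two-input shuffle mask to A and B.
std::vector<int> applyShuffle(const std::vector<int> &A,
                              const std::vector<int> &B,
                              const std::vector<std::size_t> &Mask) {
  assert(A.size() == B.size() && "inputs must have the same length");
  std::vector<int> Out;
  Out.reserve(Mask.size());
  for (std::size_t Index : Mask) {
    assert(Index < 2 * A.size() && "mask index out of range");
    Out.push_back(Index < A.size() ? A[Index] : B[Index - A.size()]);
  }
  return Out;
}
```

For four-element inputs the masks are {0, 4, 1, 5} and {2, 6, 3, 7}, so zipping {0, 1, 2, 3} with {10, 11, 12, 13} gives {0, 10, 1, 11} and {2, 12, 3, 13}.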

### Test
- test_vzip_mf8
- test_vzipq_mf8

    [11 lines not shown]
DeltaFile
+0-376clang/test/CodeGen/AArch64/neon-perm.c
+372-0clang/test/CodeGen/AArch64/neon/perm.c
+0-36clang/test/CodeGen/AArch64/fp8-intrinsics/acle_neon_fp8_untyped.c
+24-1clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+396-4134 files

LLVM/project f11ad99libcxx/test/selftest/dsl dsl.sh.py, libcxx/utils/libcxx/test/features localization.py

[libcxx][lit] Fixing libcxx test failures on Windows (#194752)

PR #194368 changed how line breaks are handled on Windows, which broke
several libcxx tests on Windows, including
libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp.
This patch addresses the issue.
DeltaFile
+5-3libcxx/test/selftest/dsl/dsl.sh.py
+1-1libcxx/utils/libcxx/test/features/localization.py
+6-42 files

LLVM/project 507caafllvm/lib/Target/X86 X86ISelLowering.cpp, llvm/test/CodeGen/X86 vector-extract-last-active.ll vector-reduce-smin.ll

[X86] Add custom ISD::VEC_REDUCE_*MIN/MAX lowering (#194848)

Pulled out of #194473 - update combineMinMaxReduction to fold to an
ISD::VECREDUCE_SMAX/SMIN/UMAX/UMIN node and then perform the lowering
later on.

combineMinMaxReduction will go away once we can use
shouldExpandReduction and rely on the middle-end to recognise
reductions, instead of recreating them from the expanded patterns.

I've added pre-SSE41 handling using vector unrolling - hopefully this
will go away once #194672 is in place.
DeltaFile
+109-243llvm/test/CodeGen/X86/vector-extract-last-active.ll
+118-56llvm/lib/Target/X86/X86ISelLowering.cpp
+39-39llvm/test/CodeGen/X86/vector-reduce-smin.ll
+22-44llvm/test/CodeGen/X86/intrinsic-cttz-elts.ll
+28-28llvm/test/CodeGen/X86/vector-reduce-smax.ll
+25-25llvm/test/CodeGen/X86/vector-reduce-umin.ll
+341-4351 files not shown
+355-4497 files

LLVM/project f933bbfllvm/test/TableGen directive2.td directive1.td, llvm/utils/TableGen/Basic DirectiveEmitter.cpp

[TableGen] Use guarded assert in constexpr functions (#194728)

The constexpr functions in question take a scoped enum as an argument
and a switch statement returns a value for each value of the enum. These
are all legal statements in a constexpr function in C++14.

Under constexpr rules, the evaluation of a constexpr function cannot
lead to an evaluation of any prohibited forms of expressions. An
evaluation of the functions being discussed with a valid argument will
terminate at the switch, and any code that follows will not be evaluated.

Using "llvm_unreachable" after the switch should be ok as long as the
expansion of the llvm_unreachable macro does not contain any statements
not allowed to appear in a constexpr function. At the same time, GCC
before v9 did not tolerate any unguarded calls to non-constexpr
functions after the switch.

To avoid using "llvm_unreachable", which can have multiple expansions,
use an assert with an explicit condition that the underlying value of
the argument lies between the minimum and maximum values of the enum.
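
The pattern described above can be sketched in isolation. The enum, bounds and function below are hypothetical stand-ins rather than the code TableGen emits: `Kind`, `KindMin`, `KindMax` and `getWeight` are illustrative names only.

```cpp
#include <cassert>

// Hypothetical scoped enum standing in for a generated directive kind.
enum class Kind { First, Second, Third };

// Hypothetical bounds of the enum's underlying values.
constexpr int KindMin = static_cast<int>(Kind::First);
constexpr int KindMax = static_cast<int>(Kind::Third);

constexpr int getWeight(Kind K) {
  switch (K) {
  case Kind::First:
    return 1;
  case Kind::Second:
    return 2;
  case Kind::Third:
    return 3;
  }
  // Reached only for an out-of-range value. The explicit range condition
  // guards the failure path, so no valid constant evaluation touches it.
  assert(KindMin <= static_cast<int>(K) && static_cast<int>(K) <= KindMax &&
         "Kind value out of range");
  return 0;
}

// Every valid enumerator returns inside the switch, so the function stays
// usable in constant expressions.
static_assert(getWeight(Kind::Second) == 2, "evaluated at compile time");
```

Because the assert states its condition explicitly, the code does not depend on which expansion of a macro such as llvm_unreachable is active.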
DeltaFile
+16-12llvm/utils/TableGen/Basic/DirectiveEmitter.cpp
+10-12llvm/test/TableGen/directive2.td
+10-12llvm/test/TableGen/directive1.td
+36-363 files