[flang][HLFIR] Fix crash in WHERE with exactly_once inside elemental (#194443)
Fix a segfault in LowerHLFIROrderedAssignments when compiling a WHERE
statement whose mask contains an array constructor with an implied-do
loop (e.g. WHERE([(f(J), J=1,N)]) ...). The hlfir.exactly_once op inside
the hlfir.elemental has live-in values that are block arguments from the
enclosing elemental, which canonicalizeExactlyOnceInsideWhere cannot
pull into the exactly_once region.
The fix has two parts:
1. In canonicalizeExactlyOnceInsideWhere, skip exactly_once ops nested
inside hlfir.elemental and skip block argument live-ins, since these
cannot be relocated.
2. In both overloads of inlineElementalOp, handle hlfir.exactly_once by
inlining its body and cleanup operations instead of cloning the op
verbatim (which left an illegal op after lowering).
[6 lines not shown]
[lldb] Handle ConstantExpr constants in InjectPointerSigningFixupCode (#194476)
Currently, the injection code assumes we encounter ConstantAggregate
constants and emits a GEP to access the fields/members. However, it's
possible for a CPA to be an operand of a ConstantExpr (e.g. a bitcast).
Emitting a GEP in that scenario doesn't make sense. This should instead
be handled by keeping track of the path to the CPA (which operands need
to be followed from the top-level ConstantExpr).
After this change, most arm64e tests that crash LLDB either pass or fail
with some other issue. The main exception is TestWeakSymbols.py which
needs more work.
[MLIR][XeGPU] Clean up stale convert_layout on single-element vector in peephole (#194043)
Extend MultiRed2dOpPattern in xegpu-optimize-peephole to also erase
consumer xegpu.convert_layout ops when a 2D vector.multi_reduction
produces a single-element vector (e.g. vector<1xf32>).
[WebAssembly][GlobalISel] Remove unnecessary `-verify-machineinstrs` from tests (NFC) (#194799)
Removes all uses of `-verify-machineinstrs` from the Wasm GISel tests.
This only impacts `*.ll` in practice, as `-verify-machineinstrs` appears
to be implicitly enabled when processing `.mir` files.
[lldb] Add skipIfWasm decorator and skip unsupported WebAssembly tests (#194761)
Add a new `skipIfWasm` test decorator that skips tests on the "wasip1"
and "wasi" platforms, and apply it to the test classes that rely on
expression evaluation or lldb-server, neither of which is available when
debugging WebAssembly targets.
[lldb] Decorate tests that use shared libraries (#193118)
`wasip1` does not support shared libraries in the traditional POSIX
sense. It was designed primarily as a monolithic system interface for
standalone modules where everything is statically linked. `wasip2`
introduced a "component model" where components achieve the goals of
shared libraries.
[Support] Introduce a function to reset all debug counters (#194864)
This PR adds a function to reset all debug counters, and extends the
unit test to verify that the debug counters are reset as expected. This
is required for running tools repeatedly in the same process.
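Why a reset matters for repeated in-process runs can be shown with a minimal sketch. It assumes a hypothetical `CounterRegistry` class and `runTool` helper invented for this example; neither mirrors the actual `llvm::DebugCounter` interface.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a process-wide counter registry. The names and
// structure are illustrative and do not mirror llvm::DebugCounter.
class CounterRegistry {
public:
  // Increment the named counter and return its new value.
  std::uint64_t bump(const std::string &Name) { return ++Counts[Name]; }

  // Zero every counter while keeping the registered names.
  void resetAll() {
    for (auto &Entry : Counts)
      Entry.second = 0;
  }

private:
  std::map<std::string, std::uint64_t> Counts;
};

// Simulates one tool invocation that bumps a counter Steps times and
// returns the value seen on the last bump.
std::uint64_t runTool(CounterRegistry &Registry, int Steps) {
  assert(Steps > 0 && "expected at least one step");
  std::uint64_t Last = 0;
  for (int I = 0; I < Steps; ++I)
    Last = Registry.bump("my-transform");
  return Last;
}
```

In this model, a second `runTool` call on the same registry continues from the count left by the first call unless `resetAll()` runs in between, so any decision keyed on the counter value would differ between otherwise identical runs.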
[WinEH] Fix crash when deleting C++ objects inside SEH __try (#180144)
Introduce a dedicated cleanup flag for SEH __finally blocks and use it
to separate SEH try cleanup emission from C++ object cleanup emission.
This prevents __finally cleanups from emitting seh.scope.begin/end and
keeps destructor/delete cleanups paired with seh.scope markers.
Fixes #109576
[cmake] Fix find libxml2 for Windows static libraries (#194894)
* Add the usual Windows static library name "libxml2s"
* Windows builds with static libxml2 require a compiler define
Revert "[LoopVectorize] Add metadata to distinguish vectorized loop body from scalar remainder (#190258)" (#194901)
Reverts llvm/llvm-project#190258
This commit is causing crashes on the `intel-sycl-gpu` buildbot:
https://lab.llvm.org/buildbot/#/builders/225/builds/7157
The crash is a SEGFAULT in
`LoopVectorizationPlanner::updateLoopMetadataAndProfileInfo` when
optimization remarks are enabled
(`-pass-remarks-analysis=loop-vectorize`). Reverting while investigating
the root cause.
[VPlan] Expand DerivedIV into executable recipes (#187589)
This allows us to strip DerivedIVRecipe::execute, and remove the
dependency on emitTransformedIndex. It allows us to benefit from
existing simplifications in VPlan.
[RISCV] Fix crashes and add RV32 RUN line to rvp-simd-64.ll (#194782)
Prevent combinePExtTruncate from forming RISCVISD nodes with an illegal
type. Remove an unnecessary call to getSimpleVT().
Legalize the shift amount when custom legalizing i64 shifts.
SelectionDAGBuilder usually pre-legalizes shift amounts. If we scalarize
a vXi64 vector shift, the shift amount will be i64.
[AMDGPU] Propagate debug info to constant materialization instr (#192669)
Set the debug location on non-target constant nodes so that the
resulting machine instructions inherit the correct source location.
[LoopVectorize] Add metadata to distinguish vectorized loop body from scalar remainder (#190258)
Add two new loop metadata attributes — `llvm.loop.vectorize.body` and
`llvm.loop.vectorize.epilogue` — that the loop vectorizer sets on the
generated vector loop and epilogue loop respectively. The metadata is
only emitted when optimization remarks are enabled (`ORE->enabled()`),
so it has zero cost in normal compilation.
These enable downstream passes (LoopUnroll, WarnMissedTransforms) to
produce more precise optimization remarks. Instead of the generic "loop
not unrolled" warning on a source line that was vectorized, the unroller
can now report:
- **"vectorized loop"** for the main vector body
- **"epilogue loop"** for the scalar epilogue/remainder
- **"epilogue vectorized loop"** for an epilogue that was itself
vectorized during epilogue vectorization (carries both attributes)
A shared `getLoopVectorizeKindPrefix()` helper in
`LoopUtils.h`/`LoopUtils.cpp` reads the metadata and returns the
[17 lines not shown]
[RISCV] Use BufferSize = 0 for ProcResGroup in SiFive7 scheduling models (#194754)
As it turns out, even if a `ProcResGroup` consists of in-order pipes, as
long as the group's own BufferSize is not zero, the Machine Scheduler will
not use in-order scheduling for instructions that consume it. Since
BufferSize also defaults to -1 for `ProcResGroup`, we have been
scheduling the resource consumption of SiFive7's `PipeAB` (scalar pipes)
and `VA1OrVA2` (vector pipes) in an out-of-order fashion!
Co-authored-by: Min Hsu <min.hsu at sifive.com>
[SLP][NFC] Reduce compile time of isTreeTinyAndNotFullyVectorizable
Cache root entry and SLPCostThreshold queries once, group
!ForReduction-only checks under two blocks, extract a shared benign-node
predicate from the two duplicated lambdas, and skip HasSingleLoad and
allConstant work when results are dead.
Pull Request: https://github.com/llvm/llvm-project/pull/194895
[CIR][AArch64] Lower NEON vzip intrinsics (#193658)
### Summary
Part of https://github.com/llvm/llvm-project/issues/185382
Lowers part of the intrinsics listed in:
https://arm-software.github.io/acle/neon_intrinsics/advsimd.html#zip-elements
Lower NEON::BI__builtin_neon_vzip_v and NEON::BI__builtin_neon_vzipq_v
in CIRGenBuiltinAArch64.cpp by porting the existing incubator logic
(`clangir/clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp`) onto ClangIR:
two bitcasts on the input vectors, two rounds of cir.vec.shuffle
generating the low/high interleave patterns, each stored through a
ptr_stride of the sret base pointer.
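The low/high interleave patterns can be illustrated with a small scalar model. This is a sketch of the zip semantics only, not the CIR lowering itself: `zipMask` and `shuffle` are hypothetical helpers written for this example, and the mask indexes into the concatenation of the two inputs in the usual two-operand shuffle convention.

```cpp
#include <cassert>
#include <vector>

// Build the shuffle mask for one round of a zip over two N-element inputs.
// Round 0 interleaves the low halves, round 1 the high halves. Indices
// 0..N-1 select from the first input and N..2N-1 from the second.
// Hypothetical helper for illustration only.
std::vector<int> zipMask(int N, int Round) {
  std::vector<int> Mask;
  for (int I = 0; I < N; I += 2) {
    int Idx = (I + Round * N) >> 1;
    Mask.push_back(Idx);     // lane taken from the first input
    Mask.push_back(Idx + N); // matching lane from the second input
  }
  return Mask;
}

// Apply a two-input shuffle mask to plain integer vectors.
std::vector<int> shuffle(const std::vector<int> &A, const std::vector<int> &B,
                         const std::vector<int> &Mask) {
  const int N = static_cast<int>(A.size());
  std::vector<int> Out;
  for (int M : Mask) {
    assert(M >= 0 && M < 2 * N && "mask index out of range");
    Out.push_back(M < N ? A[M] : B[M - N]);
  }
  return Out;
}
```

For four-element inputs, round 0 yields the mask `{0, 4, 1, 5}` and round 1 yields `{2, 6, 3, 7}`, which correspond to the two result vectors stored one after the other.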
### Test
- test_vzip_mf8
- test_vzipq_mf8
[11 lines not shown]
[libcxx][lit] Fixing libcxx test failures on Windows (#194752)
PR #194368 changed how line breaks are handled on Windows, which broke
several libcxx tests there, including
libcxx/test/std/localization/locale.categories/facet.numpunct/locale.numpunct.byname/thousands_sep.pass.cpp
This patch fixes those failures.
[X86] Add custom ISD::VEC_REDUCE_*MIN/MAX lowering (#194848)
Pulled out of #194473: update combineMinMaxReduction to fold to an
ISD::VECREDUCE_SMAX/SMIN/UMAX/UMIN node and then perform the lowering
later on.
combineMinMaxReduction will go away once we can use
shouldExpandReduction and rely on the middle-end to recognise reductions
instead of recreating them from the expanded patterns.
I've added pre-SSE41 handling using vector unrolling - hopefully this
will go away once #194672 is in place.
[TableGen] Use guarded assert in constexpr functions (#194728)
The constexpr functions in question take a scoped enum as an argument
and a switch statement returns a value for each value of the enum. These
are all legal statements in a constexpr function in C++14.
Under constexpr rules, the evaluation of a constexpr function cannot
lead to an evaluation of any prohibited forms of expressions. An
evaluation of the functions being discussed with a valid argument will
terminate at the switch, and any code that follows will not be evaluated.
Using "llvm_unreachable" after the switch should be ok as long as the
expansion of the llvm_unreachable macro does not contain any statements
not allowed to appear in a constexpr function. At the same time, GCC
before v9 did not tolerate any unguarded calls to non-constexpr
functions after the switch.
To avoid using "llvm_unreachable", which can have multiple expansions,
use an assert with an explicit condition that the underlying value of
the argument lies between the minimum and maximum values of the enum.
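The pattern described above might look like the following sketch. The `Kind` enum and `widthOf` function are hypothetical and not taken from the TableGen sources; they only show an assert with an explicit range condition placed after an exhaustive switch in a C++14 constexpr function.

```cpp
#include <cassert>

// Hypothetical scoped enum, used only for this illustration.
enum class Kind : int { First = 1, Second = 2, Third = 3 };

constexpr int widthOf(Kind K) {
  switch (K) {
  case Kind::First:
    return 8;
  case Kind::Second:
    return 16;
  case Kind::Third:
    return 32;
  }
  // Only reached for a value outside the enumerators. The condition is
  // spelled out explicitly, so a valid argument never evaluates the
  // non-constexpr failure path of the assert macro.
  assert(static_cast<int>(K) >= static_cast<int>(Kind::First) &&
         static_cast<int>(K) <= static_cast<int>(Kind::Third) &&
         "Kind value out of range");
  return 0;
}

// Valid arguments return from inside the switch, so the function remains
// usable in constant expressions.
static_assert(widthOf(Kind::Second) == 16, "evaluated at compile time");
```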