[VPlan] Remove manual region removal when simplifying for VF and UF. (#181252)
Replace manual region dissolution code in
simplifyBranchConditionForVFAndUF with the general
removeBranchOnConst. simplifyBranchConditionForVFAndUF now just creates
a (BranchOnCond true) or updates BranchOnTwoConds.
The loop then gets automatically removed by running removeBranchOnConst.
This removes a bunch of special logic to handle header phi replacements
and CFG updates. With the new code, there's no restriction on what kind
of header phi recipes the loop contains.
Note that VPEVLBasedIVRecipe needs to be marked as readnone. This is
technically unrelated, but I could not find an independent test that
would be impacted.
The code to deal with epilogue resume values now needs updating, because
we may simplify a reduction directly to the start value.
PR: https://github.com/llvm/llvm-project/pull/181252
[clang] stop error recovery in SFINAE for narrowing in converted constant expressions
A narrowing conversion in a converted constant expression should produce an
invalid expression so that [temp.deduct.general]p7 is satisfied, by stopping
substitution at this point.
Fixes #167709
[pdb] Fix libc++ strict-weak-ordering assertion failures from gsiRecordCmp (#183749)
Builds using libc++ hardening were hitting asserts like
libc++ Hardening assertion
!__comp(*(__first + __a), *(__first + __b)) failed:
Your comparator is not a valid strict-weak ordering
printf-debugging revealed that symbols like "?ST@@3JA" were not
comparing equal with themselves. It turns out the comparison was done
with
return S1.compare_insensitive(S2.data());
and even when &S1 == &S2, S1 and S2.data() may not refer to identical
strings, since data() may not have a null terminator where the StringRef
locally ends.
This fixes the ordering, simplifies the code, and makes it a little
[2 lines not shown]
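The failure mode described above can be reproduced outside LLVM. The following is a minimal sketch using std::string_view rather than StringRef; compareInsensitive, buggyCompare and fixedCompare are hypothetical names for illustration and are not the functions touched by the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a case-insensitive three-way compare. This is
// not LLVM's StringRef API, only an illustration with std::string_view.
int compareInsensitive(std::string_view A, std::string_view B) {
  std::size_t N = std::min(A.size(), B.size());
  for (std::size_t I = 0; I < N; ++I) {
    int CA = std::tolower(static_cast<unsigned char>(A[I]));
    int CB = std::tolower(static_cast<unsigned char>(B[I]));
    if (CA != CB)
      return CA < CB ? -1 : 1;
  }
  if (A.size() == B.size())
    return 0;
  return A.size() < B.size() ? -1 : 1;
}

// Buggy pattern: passing data() re-measures the right-hand side up to the
// next NUL byte, so it can be longer than the view it came from.
int buggyCompare(std::string_view A, std::string_view B) {
  return compareInsensitive(A, B.data());
}

// Fixed pattern: compare the two views directly, keeping their lengths.
int fixedCompare(std::string_view A, std::string_view B) {
  return compareInsensitive(A, B);
}
```

With a view covering only the first 8 characters of a longer buffer, fixedCompare reports the view equal to itself, while buggyCompare does not, which is the kind of inconsistency a hardened sort rejects.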
[mlir][vector] Rename `ReduceMultiDimReductionRank` -> `FlattenMultiReduction` (NFC) (#183721)
The updated name better captures what the pattern does and matches the
corresponding `populat*` hook,
`populateVectorMultiReductionFlatteningPatterns`, that only contains
this pattern.
[SystemZ] Add indirect reference bit XATTR REFERENCE(INDIRECT) for indirect symbol handling support (#183441)
This is the first of three patches aimed at supporting indirect symbol
handling for the SystemZ backend. This PR introduces a `GOFF:ERAttr` to
represent indirect references, handles indirect symbols within
`setSymbolAttribute()` by setting the indirect reference bit, and also
updates the HLASM streamer to emit `XATTR REFERENCE(INDIRECT)` and
various other combinations.
[clang][DebugInfo] Rename _vtable$ to __clang_vtable (#183617)
Discussion is a follow-up from
https://github.com/llvm/llvm-project/issues/182762#issuecomment-3965207289
(where we're discussing how LLDB could make use of this symbol for
vtable detection).
`_vtable$` is not a reserved identifier in C or C++. In order for
debuggers to reliably use this symbol without accidentally reaching into
user-identifiers, this patch renames it such that it is reserved. The
naming follows the style of the recently added `__clang_trap_msg`
debug-info symbol.
[SPIRV][NFCI] Use unordered data structures for SPIR-V extensions (#183567)
Review follow-up from https://github.com/llvm/llvm-project/pull/183325
No reason for these data structures to be ordered.
Minor annoyance when trying to use `DenseMap` because of the C++ code
for enums generated by TableGen, but not too bad.
Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
[SCEV] Introduce SCEVUse wrapper type (NFC)
Add SCEVUse as a PointerIntPair wrapper around const SCEV * to prepare
for storing additional per-use information.
This commit contains the mechanical changes of adding an initial SCEVUse
wrapper and updating all relevant interfaces to take SCEVUse. Note that
currently the integer part is never set, and all SCEVUses are
considered canonical.
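To illustrate the idea of a pointer-plus-integer wrapper, here is a minimal standalone sketch. It does not use llvm::PointerIntPair or the real SCEVUse interface; Node, NodeUse, the 2-bit flag width and isCanonical are all assumptions made for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the pointee type. The 8-byte alignment
// guarantees that the low bits of any valid pointer are zero.
struct alignas(8) Node {
  int Value;
};

// Hypothetical wrapper that packs a small integer into the unused low
// bits of an aligned pointer.
class NodeUse {
  static constexpr std::uintptr_t FlagMask = 0x3;
  std::uintptr_t Bits = 0;

public:
  NodeUse(const Node *N, unsigned Flags = 0)
      : Bits(reinterpret_cast<std::uintptr_t>(N) | (Flags & FlagMask)) {
    // The pointer must not already use the bits reserved for flags.
    assert((reinterpret_cast<std::uintptr_t>(N) & FlagMask) == 0);
  }

  const Node *getPointer() const {
    return reinterpret_cast<const Node *>(Bits & ~FlagMask);
  }
  unsigned getFlags() const { return static_cast<unsigned>(Bits & FlagMask); }

  // In this sketch, a use with no flags set is treated as canonical.
  bool isCanonical() const { return getFlags() == 0; }
};
```

A wrapper built without flags round-trips the pointer unchanged, which is why introducing such a type can be a purely mechanical change while the integer part stays unset.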
[lldb][Process/FreeBSDKernelCore] Implement DoWriteMemory() (#183553)
Implement `ProcessFreeBSDKernelCore::DoWriteMemory()` to write data to a
kernel crash dump or `/dev/mem`. For safety reasons (e.g. writing a
wrong value to `/dev/mem` can trigger a kernel panic), this feature is
only enabled when `plugin.process.freebsd-kernel-core.read-only` is set
to false (true by default).
Since 85a1fe6 (#183237) was reverted as it was prematurely merged, I'm
committing changes again with corrections here.
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[mlir][transforms] Fix crash in remove-dead-values when function has non-call users (#183655)
`processFuncOp` asserts that all symbol uses of a function are
`CallOpInterface` operations. This is violated when a function is
referenced by a non-call operation such as `spirv.EntryPoint`, which
uses the function symbol for metadata purposes without calling it.
Fix this by replacing the assertion with an early return: if any user of
the function symbol is not a `CallOpInterface`, skip the function
entirely. This is safe because the pass cannot determine the semantics
of arbitrary non-call references, so it should leave such functions
alone.
Fixes #180416
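The early-return logic can be sketched in isolation as follows. This is a toy model, not MLIR code: UserKind and canProcessFunction are hypothetical names, and the real pass inspects symbol uses through MLIR's own interfaces.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of the operations that reference a
// function symbol.
enum class UserKind { Call, NonCall };

// Returns true only if every user is a call. A single non-call user
// (e.g. a metadata-style reference) makes the function ineligible, so
// the caller should skip it instead of asserting.
bool canProcessFunction(const std::vector<UserKind> &SymbolUsers) {
  for (UserKind K : SymbolUsers)
    if (K != UserKind::Call)
      return false;
  return true;
}
```

Skipping is the conservative choice here: a function with no users or only call users is still processed, and anything else is left untouched.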
[mlir][tensor] Fix crash in tensor.from_elements fold with non-scalar element types (#183659)
The fold for tensor.from_elements attempted to always produce a
DenseElementsAttr by calling DenseElementsAttr::get(type, elements).
However, DenseElementsAttr::get only handles basic scalar element types
(integer, index, float, complex) directly. For other element types such
as vector types, it expects StringAttr (raw bytes) for each element,
which folded constants won't provide, triggering an assertion.
Fix this by guarding the fold: only attempt the DenseElementsAttr fold
when the tensor element type is integer, index, float, or complex.
Fixes #180459
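The guard amounts to an allow-list over element types. A minimal sketch, assuming a hypothetical ElemKind enum in place of MLIR's type hierarchy (canFoldToDense is likewise an illustrative name, not the function in the patch):

```cpp
#include <cassert>

// Hypothetical summary of the tensor's element type.
enum class ElemKind { Integer, Index, Float, Complex, Vector, Other };

// Only the basic scalar kinds named in the commit message are allowed
// to take the dense-attribute fold. Everything else bails out.
bool canFoldToDense(ElemKind K) {
  switch (K) {
  case ElemKind::Integer:
  case ElemKind::Index:
  case ElemKind::Float:
  case ElemKind::Complex:
    return true;
  case ElemKind::Vector:
  case ElemKind::Other:
    return false;
  }
  return false;
}
```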
[SelectionDAG] Fix CLMULR/CLMULH expansion (#183537)
For v8i8 on AArch64, `expandCLMUL` picked the zext path (ExtVT=v8i16) since ZERO_EXTEND/SRL were legal, but CLMUL on v8i16 is not, resulting in a bit-by-bit expansion (~42 insns). Prefer the bitreverse path when CLMUL is legal on VT but not ExtVT.
v8i8 CLMULR: 42 → 4 instructions.
Fixes #182780
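For intuition on why a bitreverse path can replace a widening path, here is a scalar 8-bit sketch. It assumes that CLMULR denotes bits [n-1, 2n-2] of the full 2n-bit carry-less product; all function names are hypothetical and none of this is SelectionDAG code.

```cpp
#include <cassert>
#include <cstdint>

// Full 16-bit carry-less product of two 8-bit values.
std::uint16_t clmulWide8(std::uint8_t A, std::uint8_t B) {
  std::uint16_t R = 0;
  for (int I = 0; I < 8; ++I)
    if (B & (1u << I))
      R ^= static_cast<std::uint16_t>(A << I);
  return R;
}

// Low 8 bits of the carry-less product.
std::uint8_t clmul8(std::uint8_t A, std::uint8_t B) {
  return static_cast<std::uint8_t>(clmulWide8(A, B));
}

std::uint8_t bitreverse8(std::uint8_t X) {
  std::uint8_t R = 0;
  for (int I = 0; I < 8; ++I)
    if (X & (1u << I))
      R |= static_cast<std::uint8_t>(1u << (7 - I));
  return R;
}

// Widening path: compute the 16-bit product, then take bits [7, 14].
std::uint8_t clmulrViaWide8(std::uint8_t A, std::uint8_t B) {
  return static_cast<std::uint8_t>(clmulWide8(A, B) >> 7);
}

// Bitreverse path: only an 8-bit carry-less multiply is needed.
std::uint8_t clmulrViaBitreverse8(std::uint8_t A, std::uint8_t B) {
  return bitreverse8(clmul8(bitreverse8(A), bitreverse8(B)));
}

// Exhaustively checks that both paths agree for all 8-bit inputs.
bool pathsAgree8() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B)
      if (clmulrViaWide8(static_cast<std::uint8_t>(A),
                         static_cast<std::uint8_t>(B)) !=
          clmulrViaBitreverse8(static_cast<std::uint8_t>(A),
                               static_cast<std::uint8_t>(B)))
        return false;
  return true;
}
```

The bitreverse path never needs a multiply wider than the original type, which matches the motivation given above for avoiding ExtVT when CLMUL is not legal there.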
[MLIR][Vector] Enhance shape_cast unrolling support in case the target shape is [1, 1, ..1] (#183436)
This PR fixes a minor issue in shape_cast unrolling: when all target
dimensions are unit-sized, it no longer removes all leading unit
dimensions.
[MIR] Error on signed integer in getUnsigned (#183171)
Previously we effectively took the absolute value of the APSInt. Instead,
diagnose the unexpected negative value.
Change-Id: I4efe961e7b29fdf1d5f97df12f8139aac12c9219
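The behavioral difference can be sketched with plain integers. This is an illustration only: toUnsigned, its signature and its error strings are hypothetical and do not mirror the MIR parser's actual getUnsigned.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <string>

// Hypothetical helper: converts a parsed signed value to unsigned and
// reports an error for negative or too-large inputs instead of silently
// dropping the sign.
std::optional<unsigned> toUnsigned(std::int64_t V, std::string &Error) {
  if (V < 0) {
    Error = "expected an unsigned integer, got a negative value";
    return std::nullopt;
  }
  if (V > static_cast<std::int64_t>(std::numeric_limits<unsigned>::max())) {
    Error = "integer does not fit in an unsigned value";
    return std::nullopt;
  }
  return static_cast<unsigned>(V);
}
```

Returning an empty optional forces the caller to handle the diagnostic, whereas an absolute-value conversion would quietly accept `-7` as `7`.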
[AMDGPU][Scheduler] Add `GCNRegPressure`-based methods to `GCNRPTarget` (#182853)
This adds a few methods to `GCNRPTarget` that can estimate/perform RP
savings based on `GCNRegPressure` instead of a single `Register`,
opening the door to model/incorporate more complex savings made up of
multiple registers of potentially different classes. The scheduler's
rematerialization stage now uses this new API.
Although there are no test changes, this is not really NFC since register
pressure savings in the rematerialization stage are now computed through
`GCNRegPressure` instead of the stage itself. If anything this makes
them more consistent with the rest of the RP-tracking infrastructure.
[LLVM][Runtimes] Add 'llvm-gpu-loader' to dependency list (#183601)
Summary:
This is used to run the unit tests in libc / libc++. It must exist in
the build directory's binary path, but without this dependency we may
not build it before running the runtimes build. This should ensure that
it's present, and only if we have tests enabled.