[SystemZ] Add indirect reference bit XATTR REFERENCE(INDIRECT) for indirect symbol handling support (#183441)
This is the first of three patches that add support for indirect symbol
handling to the SystemZ backend. This PR introduces a `GOFF:ERAttr` to
represent indirect references, handles indirect symbols within
`setSymbolAttribute()` by setting the indirect reference bit, and also
updates the HLASM streamer to emit `XATTR REFERENCE(INDIRECT)` and
various other combinations.
[clang][DebugInfo] Rename _vtable$ to __clang_vtable (#183617)
This is a follow-up to the discussion in
https://github.com/llvm/llvm-project/issues/182762#issuecomment-3965207289
(about how LLDB could make use of this symbol for vtable detection).
`_vtable$` is not a reserved identifier in C or C++. In order for
debuggers to reliably use this symbol without accidentally reaching into
user-identifiers, this patch renames it such that it is reserved. The
naming follows the style of the recently added `__clang_trap_msg`
debug-info symbol.
[SPIRV][NFCI] Use unordered data structures for SPIR-V extensions (#183567)
Review follow-up from https://github.com/llvm/llvm-project/pull/183325
No reason for these data structures to be ordered.
Minor annoyance when trying to use `DenseMap` because of the C++ code
for enums generated by TableGen, but not too bad.
Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
[SCEV] Introduce SCEVUse wrapper type (NFC)
Add SCEVUse as a PointerIntPair wrapper around const SCEV * to prepare
for storing additional per-use information.
This commit contains the mechanical changes of adding an initial SCEVUse
wrapper and updating all relevant interfaces to take SCEVUse. Note that
currently the integer part is never set, and all SCEVUses are
considered canonical.
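The pointer/int packing idea can be sketched in isolation. The following is a minimal model, not the actual `SCEVUse` or `llvm::PointerIntPair` code: `Expr`, `ExprUse`, the two-bit flag width, and the rule that "no flags set" means canonical are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::SCEV; the real class is far richer.
struct alignas(8) Expr {
  int Id;
};

// Hypothetical, simplified pointer/int pair: the low bits of an aligned
// pointer are always zero, so they can carry a small integer.
class ExprUse {
  static constexpr std::uintptr_t FlagMask = 0x3; // two spare low bits
  std::uintptr_t Bits = 0;

public:
  ExprUse() = default;

  // Implicit conversion lets call sites that pass a raw pointer keep working.
  ExprUse(const Expr *E, unsigned Flags = 0)
      : Bits(reinterpret_cast<std::uintptr_t>(E) | (Flags & FlagMask)) {
    assert((reinterpret_cast<std::uintptr_t>(E) & FlagMask) == 0 &&
           "pointer must be aligned so the low bits are free");
  }

  const Expr *getPointer() const {
    return reinterpret_cast<const Expr *>(Bits & ~FlagMask);
  }
  unsigned getFlags() const { return static_cast<unsigned>(Bits & FlagMask); }

  // Assumption of this sketch: a use with no flags set is canonical.
  bool isCanonical() const { return getFlags() == 0; }

  const Expr *operator->() const { return getPointer(); }
};
```

Because the wrapper converts implicitly from a raw pointer and the integer part defaults to zero, code that never sets the flags behaves as if it were still passing plain pointers, which is consistent with the change being described as NFC.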
[lldb][Process/FreeBSDKernelCore] Implement DoWriteMemory() (#183553)
Implement `ProcessFreeBSDKernelCore::DoWriteMemory()` to write data to a
kernel crash dump or `/dev/mem`. For safety reasons (e.g. writing a
wrong value to `/dev/mem` can trigger a kernel panic), this feature is
only enabled when `plugin.process.freebsd-kernel-core.read-only` is set
to false (it is true by default).
Since 85a1fe6 (#183237) was reverted as it was prematurely merged, I'm
committing changes again with corrections here.
---------
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
[mlir][transforms] Fix crash in remove-dead-values when function has non-call users (#183655)
`processFuncOp` asserts that all symbol uses of a function are
`CallOpInterface` operations. This is violated when a function is
referenced by a non-call operation such as `spirv.EntryPoint`, which
uses the function symbol for metadata purposes without calling it.
Fix this by replacing the assertion with an early return: if any user of
the function symbol is not a `CallOpInterface`, skip the function
entirely. This is safe because the pass cannot determine the semantics
of arbitrary non-call references, so it should leave such functions
alone.
Fixes #180416
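The shape of the fix can be illustrated without MLIR. This is a minimal sketch with hypothetical stand-in types (`UserKind`, `SymbolUser`, `processFunction`); it is not the pass's actual code and only mirrors the "skip instead of assert" control flow described above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for MLIR concepts; none of these are real MLIR types.
enum class UserKind { Call, EntryPointMetadata, Other };

struct SymbolUser {
  UserKind Kind;
};

// True only if every user of the function symbol is call-like.
bool allUsersAreCalls(const std::vector<SymbolUser> &Users) {
  return std::all_of(
      Users.begin(), Users.end(),
      [](const SymbolUser &U) { return U.Kind == UserKind::Call; });
}

// Returns true if the function was processed, false if it was skipped.
bool processFunction(const std::vector<SymbolUser> &Users) {
  // Early return replaces the former assertion: a non-call reference has
  // unknown semantics, so the function is left untouched.
  if (!allUsersAreCalls(Users))
    return false;
  // ... dead argument and result removal would happen here ...
  return true;
}
```

A function referenced only by calls is still processed, while one that is also named by a metadata-style operation is skipped rather than crashing.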
[mlir][tensor] Fix crash in tensor.from_elements fold with non-scalar element types (#183659)
The fold for tensor.from_elements attempted to always produce a
DenseElementsAttr by calling DenseElementsAttr::get(type, elements).
However, DenseElementsAttr::get only handles basic scalar element types
(integer, index, float, complex) directly. For other element types such
as vector types, it expects StringAttr (raw bytes) for each element,
which folded constants won't provide — triggering an assertion.
Fix this by guarding the fold: only attempt the DenseElementsAttr fold
when the tensor element type is integer, index, float, or complex.
Fixes #180459
[SelectionDAG] Fix CLMULR/CLMULH expansion (#183537)
For v8i8 on AArch64, `expandCLMUL` picked the zext path (ExtVT=v8i16) since ZERO_EXTEND/SRL were legal, but CLMUL on v8i16 is not, resulting in a bit-by-bit expansion (~42 insns). Prefer the bitreverse path when CLMUL is legal on VT but not ExtVT.
v8i8 CLMULR: 42 → 4 instructions.
Fixes #182780
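The bitreverse path presumably relies on the standard identity clmulr(a, b) = bitreverse(clmul(bitreverse(a), bitreverse(b))). The sketch below checks that identity exhaustively for 8-bit scalars. It assumes the RISC-V-style definitions (CLMUL is the low N bits of the carry-less product, CLMULH is bits 2N-1..N, CLMULR is bits 2N-2..N-1); all function names are made up for illustration and this is not the `expandCLMUL` code.

```cpp
#include <cassert>
#include <cstdint>

// Full-width carry-less (XOR) product of two 8-bit values (15 significant bits).
std::uint16_t clmulWide(std::uint8_t A, std::uint8_t B) {
  std::uint16_t R = 0;
  for (int I = 0; I < 8; ++I)
    if ((B >> I) & 1)
      R ^= static_cast<std::uint16_t>(A << I);
  return R;
}

// Low 8 bits of the product.
std::uint8_t clmul8(std::uint8_t A, std::uint8_t B) {
  return static_cast<std::uint8_t>(clmulWide(A, B) & 0xFF);
}

// Bits 14..7 of the product.
std::uint8_t clmulr8(std::uint8_t A, std::uint8_t B) {
  return static_cast<std::uint8_t>((clmulWide(A, B) >> 7) & 0xFF);
}

// Bits 15..8 of the product (bit 15 is always zero).
std::uint8_t clmulh8(std::uint8_t A, std::uint8_t B) {
  return static_cast<std::uint8_t>(clmulWide(A, B) >> 8);
}

std::uint8_t bitreverse8(std::uint8_t V) {
  std::uint8_t R = 0;
  for (int I = 0; I < 8; ++I)
    if ((V >> I) & 1)
      R |= static_cast<std::uint8_t>(1u << (7 - I));
  return R;
}

// Checks both identities for every pair of 8-bit inputs.
bool bitreverseIdentityHolds() {
  for (unsigned A = 0; A < 256; ++A)
    for (unsigned B = 0; B < 256; ++B) {
      std::uint8_t X = static_cast<std::uint8_t>(A);
      std::uint8_t Y = static_cast<std::uint8_t>(B);
      std::uint8_t ViaReverse =
          bitreverse8(clmul8(bitreverse8(X), bitreverse8(Y)));
      if (ViaReverse != clmulr8(X, Y))
        return false;
      if (clmulh8(X, Y) != (clmulr8(X, Y) >> 1))
        return false;
    }
  return true;
}
```

With this identity, CLMULR needs only a CLMUL at the original width plus bit reversals, so no widened CLMUL is required. CLMULH then follows from CLMULR with a one-bit right shift.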
[MLIR][Vector] Enhance shape_cast unrolling support in case the target shape is [1, 1, ..1] (#183436)
This PR fixes a minor issue in shape_cast unrolling: when all target
dimensions are unit-sized, it no longer removes all leading unit
dimensions.
[MIR] Error on signed integer in getUnsigned (#183171)
Previously we effectively took the absolute value of the APSInt. Instead,
diagnose the unexpected negative value.
Change-Id: I4efe961e7b29fdf1d5f97df12f8139aac12c9219
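The behavior change can be sketched with a standalone helper. This is a hypothetical model, not the MIR parser's `getUnsigned`: the name `toUnsigned`, the error strings, and the 32-bit range check are assumptions. It follows the common LLVM convention of returning `true` on error.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the parser helper: converts a parsed signed
// integer to an unsigned 32-bit value. Returns true on error.
bool toUnsigned(std::int64_t Value, std::uint32_t &Result,
                std::string &Error) {
  // Diagnose negative input instead of silently dropping the sign.
  if (Value < 0) {
    Error = "expected unsigned integer, got " + std::to_string(Value);
    return true;
  }
  // Assumed range check for this sketch: the value must fit in 32 bits.
  if (Value > static_cast<std::int64_t>(UINT32_MAX)) {
    Error = "integer too large: " + std::to_string(Value);
    return true;
  }
  Result = static_cast<std::uint32_t>(Value);
  return false;
}
```

Under the old behavior an input such as `-5` would effectively have been accepted as `5`; in this sketch it produces an error and leaves `Result` untouched.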
[AMDGPU][Scheduler] Add `GCNRegPressure`-based methods to `GCNRPTarget` (#182853)
This adds a few methods to `GCNRPTarget` that can estimate/perform RP
savings based on `GCNRegPressure` instead of a single `Register`,
opening the door to model/incorporate more complex savings made up of
multiple registers of potentially different classes. The scheduler's
rematerialization stage now uses this new API.
Although there are no test changes, this is not really NFC since register
pressure savings in the rematerialization stage are now computed through
`GCNRegPressure` instead of the stage itself. If anything this makes
them more consistent with the rest of the RP-tracking infrastructure.
[LLVM][Runtimes] Add 'llvm-gpu-loader' to dependency list (#183601)
Summary:
`llvm-gpu-loader` is used to run the unit tests in libc / libc++. It must
exist in the build directory's binary path, but without this dependency
it may not be built before the runtimes build runs. This change ensures
that it is present, and only when tests are enabled.
[X86] Fold XOR of two vgf2p8affineqb instructions with same input (#179900)
This patch implements an optimization to fold XOR of two
`vgf2p8affineqb` instructions operating on the same input.
This optimization:
- Reduces instruction count from 3 to 2
- Eliminates one `vgf2p8affineqb` instruction
Added `combineXorWithTwoGF2P8AFFINEQB` function in X86ISelLowering.cpp, which:
- Uses `sd_match` pattern matching for consistency with existing code
- Checks that both operations have single use to avoid code bloat
- Verifies both operations use the same input
- Handles commutative XOR patterns automatically
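The fold presumably rests on the fact that the affine transform is linear over GF(2): (A1·x ^ b1) ^ (A2·x ^ b2) = (A1 ^ A2)·x ^ (b1 ^ b2). The sketch below models the per-byte operation as documented for `GF2P8AFFINEQB` (result bit i is the parity of matrix byte 7-i ANDed with the input byte, XORed with bit i of the immediate) and checks the identity for every input byte. The function names are made up for illustration, and whether the patch combines the immediates in exactly this way is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Parity (XOR of all bits) of an 8-bit value.
std::uint8_t parity8(std::uint8_t V) {
  V ^= V >> 4;
  V ^= V >> 2;
  V ^= V >> 1;
  return V & 1;
}

// Scalar model of one byte lane of GF2P8AFFINEQB.
std::uint8_t affineByte(std::uint64_t Matrix, std::uint8_t X,
                        std::uint8_t Imm) {
  std::uint8_t R = 0;
  for (int I = 0; I < 8; ++I) {
    // Matrix byte 7-I is the row that produces result bit I.
    std::uint8_t Row = static_cast<std::uint8_t>(Matrix >> (8 * (7 - I)));
    std::uint8_t Bit =
        parity8(static_cast<std::uint8_t>(Row & X)) ^ ((Imm >> I) & 1);
    R |= static_cast<std::uint8_t>(Bit << I);
  }
  return R;
}

// Checks that XOR-ing two affine results equals one affine transform with
// XOR-ed matrices and XOR-ed immediates, for every input byte.
bool foldHolds(std::uint64_t M1, std::uint8_t Imm1, std::uint64_t M2,
               std::uint8_t Imm2) {
  for (unsigned V = 0; V < 256; ++V) {
    std::uint8_t X = static_cast<std::uint8_t>(V);
    std::uint8_t Separate = affineByte(M1, X, Imm1) ^ affineByte(M2, X, Imm2);
    std::uint8_t Folded =
        affineByte(M1 ^ M2, X, static_cast<std::uint8_t>(Imm1 ^ Imm2));
    if (Separate != Folded)
      return false;
  }
  return true;
}
```

In this model, two affine instructions plus an XOR of their results become one XOR of the matrix operands followed by a single affine instruction, which matches the stated reduction from 3 instructions to 2.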