[mlir][spirv] Remove ConstantLike trait from spirv.ARM.GraphConstant (#198054)
Operations with the `ConstantLike` trait can always be folded into a
concrete attribute value. However, the `spirv.ARM.GraphConstant` op
cannot be folded, because its GraphConstantID is merely a unique
identifier used to map to the actual constants defined in the SPIR-V
module. Therefore, the `ConstantLike` trait should be removed from
`spirv.ARM.GraphConstant`. Fixes #197970.
[llvm-ir2vec] Breaking up llvm-ir2vec lib implementation to clean up MIR deps from ir2vec python bindings (#194414)
The Python bindings only expose IR2Vec functionality. MIR2Vec has no
Python API. However, the single `LLVMEmbUtils` library bundled both
IR2VecTool and MIR2VecTool, causing CodeGen and Target components to be
linked into the nanobind module unnecessarily.
This patch splits the library along that boundary. LLVMIREmbUtils covers
IR2Vec and is linked by both the CLI tool and the Python bindings.
LLVMMIREmbUtils covers MIR2Vec and is linked only by the CLI tool.
Result: Python wheel size reduces from ~14 MB to ~4 MB.
[llvm-ir2vec] Setting up ir2vec python bindings testing for ml-opt bots (#194593)
- ~We are enabling IR2Vec Python binding tests in the LLVM monolithic
Linux CI by adding -D LLVM_IR2VEC_ENABLE_PYTHON_BINDINGS=ON to
monolithic-linux.sh.~
- We're adding testing for the ir2vec Python bindings on the ml-opt
buildbots. To that end, we need to add pip install requirements and
other relevant flags to allow a seamless, warning-free LLVM build.
The following changes are made here:
- Adding a requirements.txt file with an explicit nanobind
requirement.
- Adding the option for downstream users to test bindings as part of the
`check llvm` umbrella, by passing the appropriate bindings flag.
- Suppressing warnings from the nanobind headers, in order to ensure a
seamless LLVM CI build.
[clang-tidy] Fix false positives about reinitialization detection in `bugprone-use-after-move` (#197438)
When calling a base class's `operator=` through a derived object, an
implicit cast with `UncheckedDerivedToBase` is generated:
```
void foo() {
Base b;
Derived d;
std::move(d);
d = b;
}
```
AST for `d = b`'s `d`:
```
|-ImplicitCastExpr <col:3> 'GH62206::Base' lvalue <UncheckedDerivedToBase (Base)>
| `-DeclRefExpr <col:3> 'Derived' lvalue Var 0x1d11a400 'd' 'Derived'
```
This patch considers possible `implicitCastExpr` in the reinit matcher,
[8 lines not shown]
[PHIElimination] Clear stale LiveVariables AliveBlocks for undef PHI sources (#197764)
When PHI Elimination lowers a PHI with an undef source (e.g. from an
`IMPLICIT_DEF`), it skips the LiveVariables kill/AliveBlocks update
because the value is undefined. However, the source register's
AliveBlocks may still mark intermediate blocks as live-through from its
definition to the (now eliminated) PHI use. This causes MachineVerifier
failures in EXPENSIVE_CHECKS builds.
Fix by calling `recomputeForSingleDefVirtReg` on undef source registers
when their last PHI use on a CFG edge is eliminated, which correctly
clears the stale AliveBlocks entries.
Fixes the EXPENSIVE_CHECKS failure introduced by #196895.
[OpenACC] Fix invalid using inside of an openacc directive (#198058)
Bug report #197858 comes up with a reproducer where an invalid `using`
declaration checks the Scope it is in, and asserts if it isn't in a
DeclScope. Since all of the important directives that create scopes end
up causing a new scope anyway, this patch adds 'DeclScope' to the parse
scope for an OpenACC directive. This follows the guidance of the OpenMP
directives.
Fixes: #197858
[clang][bytecode] Fix wrong 'never produces a constant expression' diagnostic with static data members (#197881)
They can be initialized later, similar to extern variables.
[libc] Make cpp::byte alias-safe (#194171)
Change LIBC_NAMESPACE::cpp::byte from an enum-backed type to unsigned
char so libc’s raw-memory utilities and sorting code can legally access
object representations without violating C++ strict-aliasing rules.
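A minimal sketch of why the underlying type matters, assuming a stand-in alias named `sketch::byte` (not the real `LIBC_NAMESPACE::cpp::byte`) and two hypothetical helpers. `unsigned char` is one of the types C++ allows to alias any object's storage, whereas a user-defined enum with an `unsigned char` underlying type gets no such exemption:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace sketch {

// Stand-in for the alias described above; the name is hypothetical.
using byte = unsigned char;

// Counts the nonzero bytes in the object representation of `obj`.
// Reading through `const unsigned char *` is permitted for any object type.
template <typename T> std::size_t countNonZeroBytes(const T &obj) {
  const byte *p = reinterpret_cast<const byte *>(&obj);
  std::size_t n = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i)
    if (p[i] != 0)
      ++n;
  return n;
}

// Swaps two trivially copyable objects one byte at a time, the kind of
// raw-memory access a generic sorting routine performs on its elements.
template <typename T> void swapBytes(T &a, T &b) {
  byte *pa = reinterpret_cast<byte *>(&a);
  byte *pb = reinterpret_cast<byte *>(&b);
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    byte tmp = pa[i];
    pa[i] = pb[i];
    pb[i] = tmp;
  }
}

} // namespace sketch
```

Both helpers only touch storage through `unsigned char` pointers, so they stay within the aliasing rules regardless of `T`.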
[MemoryBuiltins] Capture more information for alloc/free from attributes
We now read the `alloc_align` attribute to provide better alignment
information to users. `alloc-family` should be used as well, as
described in the LangRef. Two new helpers provide argument numbers,
rather than values.
[flang] Recognize effects on non-addressable resources in opt-bufferization.
opt-bufferization has only been handling `fir::DebuggingResource`
explicitly. This patch adds support for other non-addressable
resources, such as `fir::VolatileMemoryResource`. This allows
merging elemental/assign for the `volatile_src_nonvolatile_dst`
example in the updated LIT test.
[flang] Pass-through fir.volatile_cast in FIR AliasAnalysis.
It should be safe to pass-through `fir.volatile_cast` for the purpose
of alias analysis. The missing pass-through prevented optimization
of the `nonvolatile_src_volatile_dst` test (see updated LIT test).
[libc] Fix install-libc to work with LLVM_LIBC_FULL_BUILD=OFF (#197366)
Initialize variables that are conditionally set to avoid undefined
references in install-libc and install-libc-stripped targets:
- Initialize added_bitcode_targets to empty string (may be undefined
when LIBC_TARGET_OS_IS_GPU=OFF)
- Initialize startup_target to empty string and only set to
"libc-startup" when both LLVM_LIBC_FULL_BUILD=ON and NOT baremetal
(startup directory is only included in full builds)
- Initialize header_install_target to empty string (may be undefined
when LLVM_LIBC_FULL_BUILD=OFF)
[DirectX] Do not emit !dbg on function definitions (#197449)
This was not done in LLVM 3.7. Instead, the !DISubprogram contains a
reference to the function (already emitted).
[libc] Add config option to use memory builtin functions. (#197977)
Add a new CMake and C++ definition configuration option
`LIBC_CONF_USE_MEM_BUILTINS` to allow users to use compiler builtins for
memory utility functions (memcpy, memset, memmove, memcmp, and bcmp)
instead of LLVM libc's internal implementations. Main use-cases are:
- when users want to bring their own memory function implementations
that are highly optimized for their targets
- to improve portability by providing a fallback for targets for which LLVM
libc does not have memory utility implementations yet
- for use in libc/shared functions and their tests, as we expect
libc/shared functions to provide their own memory functions.
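The general shape of such a switch might look like the sketch below. The macro `SKETCH_USE_MEM_BUILTINS` and the `sketch::inline_*` helpers are hypothetical names used for illustration; how `LIBC_CONF_USE_MEM_BUILTINS` is actually consumed inside LLVM libc is not shown in this message. `__builtin_memcpy` and `__builtin_memset` are the GCC/Clang builtins.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the configuration option described above.
#ifndef SKETCH_USE_MEM_BUILTINS
#define SKETCH_USE_MEM_BUILTINS 1
#endif

namespace sketch {

inline void *inline_memcpy(void *dst, const void *src, std::size_t n) {
  assert((dst && src) || n == 0);
#if SKETCH_USE_MEM_BUILTINS
  // Defer to the compiler, which may expand this inline or call the
  // platform's (possibly user-provided) memcpy.
  return __builtin_memcpy(dst, src, n);
#else
  // Portable byte-by-byte fallback standing in for an internal routine.
  unsigned char *d = static_cast<unsigned char *>(dst);
  const unsigned char *s = static_cast<const unsigned char *>(src);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = s[i];
  return dst;
#endif
}

inline void *inline_memset(void *dst, int value, std::size_t n) {
  assert(dst || n == 0);
#if SKETCH_USE_MEM_BUILTINS
  return __builtin_memset(dst, value, n);
#else
  unsigned char *d = static_cast<unsigned char *>(dst);
  for (std::size_t i = 0; i < n; ++i)
    d[i] = static_cast<unsigned char>(value);
  return dst;
#endif
}

} // namespace sketch
```

Callers use the same helper either way, so the choice between builtin and internal implementation stays a build-time decision.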
[lldb] Fix data race in ObjectFile::GetSectionList (#197812)
The early `m_sections_up == nullptr` check was performed outside the
module mutex, so two threads sharing the same Module could both enter
the branch and race on the write in CreateSections. Restructure so the
check and populate both happen under the module mutex; this is a
standard double-checked locking fix.
Found by ThreadSanitizer as part of #197792.
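The restructuring can be illustrated with a simplified sketch. `ObjectFileSketch`, its section names, and `CreateCount` are hypothetical; in LLDB the mutex belongs to the Module rather than to the object file, and the real `CreateSections` parses the file. The point shown is only that the null check and the populate step sit under the same lock:

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical, simplified stand-in for an object file with a lazily
// created section list.
class ObjectFileSketch {
public:
  const std::vector<std::string> *GetSectionList() {
    // Take the lock before inspecting the pointer, so the null check and
    // the write in CreateSections() cannot interleave across threads.
    std::lock_guard<std::recursive_mutex> guard(m_module_mutex);
    if (!m_sections_up)
      CreateSections();
    return m_sections_up.get();
  }

  // Test helper only; reads the counter without locking.
  int CreateCount() const { return m_create_count; }

private:
  // Must be called with m_module_mutex held.
  void CreateSections() {
    ++m_create_count;
    m_sections_up = std::make_unique<std::vector<std::string>>(
        std::vector<std::string>{".text", ".data"});
    assert(m_sections_up);
  }

  std::recursive_mutex m_module_mutex;
  std::unique_ptr<std::vector<std::string>> m_sections_up;
  int m_create_count = 0;
};

} // namespace sketch
```

With the check outside the lock, two threads could both observe a null pointer and both run the populate step; here the second caller always sees the list the first one created.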
[PGO] Drop consecutive-zeros.ll test
pgo-memop-opt previously validated VP metadata and bailed if it ran
into duplicate values. VP metadata values will soon be deduplicated
at construction, which makes this check unnecessary and will also
cause this test to fail, so drop it. Keep the
verification/deduplication in pgo-memop-opt for now to avoid leaving main
in a broken state.
Reviewers: mtrofin, ormris
Pull Request: https://github.com/llvm/llvm-project/pull/197615
[AMDGPU] Fix VOPD assembler validation for GFX12+ (#198034)
The related `codegen` side of this change was already landed by
https://github.com/llvm/llvm-project/commit/c510ee553e2057f94c2f023c72abb3c9afec0962
("[AMDGPU] VOPD: AllowSameVGPR on GFX12"), which changed
`GCNVOPDUtils.cpp` to use `hasGFX12Insts()` instead of
`hasGFX1250Insts()`.
However, the assembler validation in `AMDGPUAsmParser.cpp` was not
updated to match, causing it to reject valid VOPD instruction pairs that
share the same VGPR as src0 on `gfx1200`.
This fix aligns the assembler with the `codegen` by changing
`isGFX1250Plus()` to `isGFX12Plus()` in `checkVOPDRegBankConstraints`,
and adds a positive test case to verify same-VGPR src0 pairs assemble
correctly on `gfx12`.
[Instrumentor] Add call instrumentation support
We can now instrument call instructions and extract information about
the arguments, (de)allocation, intrinsic kind, etc.
[IR] Note that duplicate profile values are illegal in VP metadata
It is not legal to have duplicate VP metadata as it should be merged
appropriately before it actually ends up transcribed into the IR.
I will put up a verifier patch for this to follow this one, but do so
separately in case we need to revert due to detecting actual issues in
the code base.
Reviewers: david-xl, teresajohnson, mtrofin
Pull Request: https://github.com/llvm/llvm-project/pull/193077
[CIR][CUDA] Support device-side printf for NVPTX (#196573)
Implement device-side printf lowering for NVPTX targets in CIR codegen.
The variadic arguments are packed into a stack-allocated struct and
passed to vprintf, matching the classic codegen behavior in
CGGPUBuiltin.cpp.
When the target triple is NVPTX and the builtin is
printf/__builtin_printf, we route to emitNVPTXDevicePrintfCallExpr.
The no-varargs case passes a null pointer directly.
AMDGCN device printf remains NYI.
Part of https://github.com/llvm/llvm-project/issues/179278
[MLIR] Add `IntegerDivisibilityAnalysis` and `InferIntDivisibilityOpInterface` (#197728)
This patch is a port from
https://github.com/iree-org/iree/blob/main/compiler/src/iree/compiler/Dialect/Util/Analysis/IntegerDivisibilityAnalysis.cpp
to upstream.
It introduces a dataflow analysis that tracks integer divisibility
(divisor + remainder lattice) for SSA values, plus an op interface
`InferIntDivisibilityOpInterface` for ops to participate.
It adds:
* `IntegerDivisibilityAnalysis` produces a `Divisibility` lattice
`{divisor, remainder}`
* `InferIntDivisibilityOpInterface` interface
* External-model implementations for `arith` and `affine` ops
* `test-int-divisibility` test pass + lit tests
Example:
Here is the usual approach to load element `i` from an `i4` buffer emulated
[11 lines not shown]
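To give a feel for what a `{divisor, remainder}` lattice can express, here is a generic congruence-domain sketch. It is not the patch's implementation: the struct layout, the function names, and the restriction to non-negative values with `divisor >= 1` are all assumptions made for illustration, and overflow is ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>

namespace sketch {

// Hypothetical lattice element: every tracked value v satisfies
// v % divisor == remainder, with divisor >= 1 and remainder < divisor.
// {1, 0} carries no information, since every integer matches it.
struct Divisibility {
  std::uint64_t divisor;
  std::uint64_t remainder;
};

// Join at a control-flow merge: the result must hold for values described
// by either input, so the divisor shrinks to a common divisor of both
// divisors and of the distance between the remainders.
inline Divisibility join(Divisibility a, Divisibility b) {
  assert(a.divisor >= 1 && b.divisor >= 1);
  std::uint64_t diff = a.remainder > b.remainder ? a.remainder - b.remainder
                                                 : b.remainder - a.remainder;
  std::uint64_t d = std::gcd(std::gcd(a.divisor, b.divisor), diff);
  return {d, a.remainder % d};
}

// Transfer function for addition: remainders add modulo the common divisor.
inline Divisibility add(Divisibility a, Divisibility b) {
  assert(a.divisor >= 1 && b.divisor >= 1);
  std::uint64_t d = std::gcd(a.divisor, b.divisor);
  return {d, (a.remainder + b.remainder) % d};
}

// Transfer function for multiplication by a positive constant: both the
// divisor and the remainder scale by the constant.
inline Divisibility mulByConstant(Divisibility a, std::uint64_t c) {
  assert(a.divisor >= 1 && c >= 1);
  return {a.divisor * c, a.remainder * c};
}

} // namespace sketch
```

For instance, joining "1 mod 4" with "3 mod 6" yields "1 mod 2": both inputs only describe odd numbers, and that is the strongest fact the two share.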
[flang][acc] Accept component of global variable in `acc declare` (#197819)
This MR partially extends the current implementation to accept cases of
`acc declare` on a `parent%comp` whenever the `parent` has been `acc
declare`d with the same clause. This is done by generating the acc
global constructor only for mapping the parent, as the child is expected
to be part of the parent.
The limitations still remain as a TODO unless it can be proven that the
parent is mapped. A generic implementation would need either
compiler-generated ordering on the global constructors used for mapping
or runtime-managed ordering.
[AArch64] Do not pass debug insn to liveness analysis (#198021)
Fix another stepBackward location.
Debug instructions must not affect liveness analysis. stepBackward has
an assertion failure on debug instructions after
https://github.com/llvm/llvm-project/pull/193104.
Signed-off-by: John Lu <John.Lu at amd.com>
[RISCV][MCA] Use the new infrastructure for SiFive P500 and P800's tests. NFC (#198016)
Some tests -- mostly vector crypto -- are kept for SiFive P800.
NFC.
[flang][NFC] Finishing touches on legacy lowering conversion (#197973)
At the beginning of legacy lowering conversion, some tests were
initially converted to emit FIR. After some discussion, it was decided
to revisit those tests and convert them to emit HLFIR. This change
completes that step and should be the final change in removing vestiges
of legacy lowering.
Assisted-by: AI