[Linalg] Add *Pooling* matchers (#172351)
-- This commit is the eighth in a series adding matchers
for linalg.*conv*/*pool*. See:
https://github.com/llvm/llvm-project/pull/163724
-- This commit adds matchers for all variants of Pooling ops.
Signed-off-by: Abhishek Varma <abhvarma at amd.com>
[ORC] Initialize the native target in ReOptimize unit test. (#172955)
The ReOptimize unit test was accidentally depending on native target
initialization in previous tests, causing it to be skipped if run on its
own (using --gtest_filter).
Calling OrcNativeTarget::initialize() in the SetUp method of the test
fixes this.
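The order dependence described above can be modeled with a minimal sketch. It is an illustration only, not the actual unit test: `_native_target_ready`, `initialize_native_target`, `ReOptimizeLikeTest`, and `run_in_isolation` are hypothetical names standing in for the process-wide target state and the C++ fixture.

```python
import unittest

# Stand-in for process-wide state such as native target registration.
_native_target_ready = False


def initialize_native_target():
    """Hypothetical idempotent initializer, modeling OrcNativeTarget::initialize()."""
    global _native_target_ready
    _native_target_ready = True


class ReOptimizeLikeTest(unittest.TestCase):
    def setUp(self):
        # Initialize here instead of relying on an earlier test having done it.
        initialize_native_target()

    def test_reoptimize(self):
        # Without the setUp call, this test would be skipped when run alone.
        if not _native_target_ready:
            self.skipTest("native target not initialized")
        self.assertTrue(_native_target_ready)


def run_in_isolation(test_case):
    """Run a single test class on its own, as a test filter would."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Calling `run_in_isolation(ReOptimizeLikeTest)` runs the test rather than skipping it, because the fixture no longer depends on which tests ran before it.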
[MLIR] Add DefaultValuedEnumAttr decorator (#172916)
Introduce DefaultValuedEnumAttr, which, similarly to DefaultValuedAttr,
decorates an enum attribute to have a default value from a specific enum
case when not present. The default is constructed as the fully-qualified
enum case symbol.
In comparison to DefaultValuedAttr, this allows using a TableGen EnumCase
variable instead of a raw string.
[MemProf] Update metadata verification for a single string tag (#172543)
The memprof metadata verifier supported multiple string tags, but in
reality, the other code (e.g. addCallStack) only supports a single such
tag. Update the verifier to reflect that limitation, and the associated
tests.
Fixes #157217
[NFC][llvm-ir2vec] llvm_ir2vec.cpp breakup to extract a reusable header for IR2VecTool, and MIR2VecTool classes (#172304)
Refactor llvm-ir2vec: Extract reusable header for Python bindings
Separated the IR2Vec/MIR2Vec tool implementation into a header file
(`llvm-ir2vec.h`) and implementation file (`llvm-ir2vec.cpp`) to enable
reuse in Python bindings and other projects.
Changes
- **Created `llvm-ir2vec.h`**: Contains `IR2VecTool` and `MIR2VecTool`
class definitions with all implementations, making it a standalone
header-only library
- **Simplified `llvm-ir2vec.cpp`**: Now contains only command-line
interface code (options, main function, and helper functions)
Motivation
The original monolithic `.cpp` file made it impossible to use
IR2Vec/MIR2Vec functionality in Python bindings without compiling the
entire command-line tool. This refactoring enables clean separation
[5 lines not shown]
[CodeGen][NPM] Update MPDT similar to MDT after unreachable BB elimination (#172421)
After unreachable machine basic blocks are removed, MPDT should also be
updated with the latest block numbers alongside MDT.
[CodeGenPrepare][NPM] Remove incorrect LoopAnalysis preservation in CodeGenPrepare (#172418)
CodeGenPrepare modifies and restructures loops and control flow, so it
must not preserve LoopAnalysis.
The test `llvm/test/CodeGen/AMDGPU/cf-loop-on-constant.ll` shows
CodeGenPrepare modifying loop structure, which is why the analysis
cannot be preserved.
[DirectX] Move memset and memcpy handling to a new pass. NFC (#172921)
This introduces the DXILMemIntrinsics pass and moves memset and memcpy
handling from DXILLegalize into it. Memory intrinsics must be handled
before the DXILResourceAccess pass so that arrays and large structures
in resources are dealt with properly.
[RISCV][llvm] Remove custom legalization of fixed-length vector SPLAT_VECTOR (#172870)
BUILD_VECTOR is combined to SPLAT_VECTOR if the operation action of
SPLAT_VECTOR is not Expand. However, we already have custom handling of
BUILD_VECTOR for fixed-length vectors, which uses an explicit constant VL
instead of the VLMAX it would get if lowered through SPLAT_VECTOR.
[NVPTX] Add support for barrier.cta.red.* instructions (#172541)
This change adds full support for the PTX `barrier.cta.red` instruction,
following the same conventions as are already used for
`barrier.cta.sync` and `barrier.cta.arrive`.
In addition, this change removes the following intrinsics, which are no
longer needed (each is shown with its replacement):
* llvm.nvvm.barrier0.popc --> llvm.nvvm.barrier.cta.red.popc.aligned.all(0, c)
* llvm.nvvm.barrier0.and --> llvm.nvvm.barrier.cta.red.and.aligned.all(0, z)
* llvm.nvvm.barrier0.or --> llvm.nvvm.barrier.cta.red.or.aligned.all(0, z)
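As a rough model of what the three reduction flavors compute, the sketch below treats a CTA as a list of per-thread predicate values and ignores the synchronization itself. The function names are hypothetical, and the code assumes the usual PTX semantics: `popc` counts threads with a true predicate, `and` is true only if all are true, and `or` is true if any is true.

```python
def barrier_red_popc(predicates):
    """Count the threads whose predicate is non-zero (models .red.popc)."""
    return sum(1 for p in predicates if p)


def barrier_red_and(predicates):
    """True only if every thread's predicate is non-zero (models .red.and)."""
    return all(bool(p) for p in predicates)


def barrier_red_or(predicates):
    """True if at least one thread's predicate is non-zero (models .red.or)."""
    return any(bool(p) for p in predicates)


# Hypothetical per-thread predicate values for a tiny four-thread CTA.
example_predicates = [1, 0, 1, 1]
```

For `example_predicates`, the `popc` reduction gives 3, the `and` reduction is false because one thread contributed 0, and the `or` reduction is true.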
[CIR] Upstream convert to mask builtins in CIR codegen (#171694)
This PR is part of https://github.com/llvm/llvm-project/issues/167752.
It upstreams the codegen and tests for the convert-to-mask builtins
implemented in the incubator.
Upstreamed X86 mask conversion builtins from clangir:
- cvtmask2b/w/d/q*
- cvtb/w/d/q2mask*
Upstreamed helpers:
- emitX86MaskedCompare()
- emitX86ConvertToMask()
- emitX86SExtMask()
[MLIR] Enable dylib init/deinit in execution engine on AArch64 platform (#172833)
This PR enables JIT initialization for AArch64. Until now it was disabled
because of #71963, which was recently fixed by #71968.
[libclc][NFC] Move convert builtins from Python generator to .cl sources (#172634)
Remove the Python dependency for generating convert builtins, aligning
with how other builtins are defined.
In addition, our downstream target relies on this PR to override convert
implementations.
llvm-diff shows no changes in any of the bitcode files:
amdgcn--amdhsa.bc, barts-r600--.bc, cayman-r600--.bc, cedar-r600--.bc,
clspv64--.bc, clspv--.bc, cypress-r600--.bc, nvptx64--.bc,
nvptx64--nvidiacl.bc, nvptx--.bc, nvptx--nvidiacl.bc, tahiti-amdgcn--.bc
and tahiti-amdgcn-mesa-mesa3d.bc.
[libc++] Add missing %{flags} substitution to clang-tidy (#171689)
Flags that should be used both for compiling and for linking are
provided through the %{flags} substitution. Our clang-tidy tests should
be using them, not only %{compile_flags}.
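The effect of omitting a substitution can be illustrated with a small sketch of textual expansion. This is not lit's implementation: the `expand` helper, the flag values, and both command lines are hypothetical and only show why a command that expands `%{compile_flags}` alone misses anything supplied through `%{flags}`.

```python
def expand(command, substitutions):
    """Replace each substitution key in the command with its value."""
    for key, value in substitutions.items():
        command = command.replace(key, value)
    return command


# Hypothetical values; real ones come from the test suite configuration.
substitutions = {
    "%{flags}": "--target=x86_64-unknown-linux-gnu",
    "%{compile_flags}": "-nostdinc++ -Iinclude",
}

# Hypothetical command lines modeling a clang-tidy test before and after the fix.
before = "clang-tidy test.cpp -- %{compile_flags}"
after = "clang-tidy test.cpp -- %{flags} %{compile_flags}"

expanded_before = expand(before, substitutions)
expanded_after = expand(after, substitutions)
```

In this model, only `expanded_after` carries the `--target=` option, so the tool sees the same shared flags that compilation and linking use.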