[CMake][LLVM] Add PCH infrastructure and LLVMSupport PCH (#176420)
This patch implements PCH support. PCH is enabled by default, unless
noted below, and can be disabled with
-DCMAKE_DISABLE_PRECOMPILE_HEADERS=ON.
* Libraries can define precompiled headers using a newly added
PRECOMPILE_HEADERS keyword. If specified, the listed headers will be
compiled into a pre-compiled header using standard CMake mechanisms.
* Libraries that don't define their own PRECOMPILE_HEADERS but directly
depend on a library or component that defines its own PCH will reuse
that PCH. This reuse is not transitive, to prevent excessive use of
unrelated headers. If multiple dependencies provide a reusable PCH, the
first one with the longest dependency chain (stored in the CMake target
property LLVM_PCH_PRIORITY) is used. However, due to CMake limitations,
only PCH from targets that are already defined can be reused; therefore
libraries that should reuse a PCH must be defined later in the CMake
file (=> add_subdirectory order matters).
[34 lines not shown]
AMDGPU/GlobalISel: Regbanklegalize rules for INTRIN_IMAGE
Regbanklegalize rules for INTRIN_IMAGE loads and stores.
Because of the very large number of different type signatures, the rule specifies
only a function for lowering (waterfall lowering of the RsrcIdx operand if needed),
and this function also applies register banks.
[NFC][SPIRV] Replace uses of `removeFromParent` by `eraseFromParent` (#182330)
`removeFromParent` doesn't deallocate the resources associated with the
`MachineInstr`.
I was not able to remove all the uses of `removeFromParent` in the file,
but these are the safe ones.
There is an extra advantage with `eraseFromParent`: if we reuse the
deleted instruction, the address sanitizer will catch the mistake.
[analyzer] Remove the alpha.core.FixedAddr checker (#182033)
This checker is way too simplistic. It's also alpha. I don't think it's
worth keeping, especially since we have an
`optin.core.FixedAddressDereference` checker that largely supersedes
this alpha checker.
I propose the removal of this checker.
Also relates to:
https://github.com/llvm/llvm-project/pull/181858#discussion_r2818756964
[CIR] Handle Type::OverflowBehavior in CIR CodeGen
Add handling for the newly introduced Type::OverflowBehavior type class
in CIRGenItaniumCXXABI.cpp and CIRGenFunction.cpp switch statements to
fix -Werror,-Wswitch compilation errors.
[InstCombine] Do not perform fcmp -> icmp transformation if denormal inputs may be flushed (#181899)
Commit 4827771234276 added the following transformation:
fcmp oeq/une (bitcast X), 0.0 --> (and X, SignMaskC) ==/!= 0
This transformation is only valid if denormal inputs are preserved. If
they are flushed, the two comparisons can return different results.
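A minimal Python sketch of the mismatch, assuming 32-bit floats, a flush-to-zero mode for denormal inputs, and that `SignMaskC` stands for a mask that clears the sign bit (0x7fffffff). The helper names are hypothetical and are not LLVM APIs:

```python
import struct

ABS_MASK = 0x7FFFFFFF  # assumption: the mask keeps every bit except the sign bit


def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit integer as an IEEE-754 single (models the bitcast)."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def flush_denormal(bits: int) -> int:
    """Model flush-to-zero: zero exponent with nonzero mantissa becomes signed zero."""
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        return bits & 0x80000000
    return bits


def fcmp_oeq_zero(bits: int, flush: bool) -> bool:
    """fcmp oeq (bitcast X), 0.0, optionally flushing denormal inputs first."""
    if flush:
        bits = flush_denormal(bits)
    return bits_to_f32(bits) == 0.0


def icmp_form(bits: int) -> bool:
    """The integer form: (X & mask) == 0."""
    return (bits & ABS_MASK) == 0


smallest_denormal = 0x00000001
# Denormals preserved: both forms agree (neither sees zero).
print(fcmp_oeq_zero(smallest_denormal, flush=False), icmp_form(smallest_denormal))
# Denormals flushed: the float compare sees zero, the integer form does not.
print(fcmp_oeq_zero(smallest_denormal, flush=True), icmp_form(smallest_denormal))
```

With denormals preserved the two forms agree for this input; with flushing they disagree, which is the case the patch guards against.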
---------
Co-authored-by: Justin Holewinski <jholewinski at nvidia.com>
[ASan] Fix test IsPoisonedDoesNotCrashOnMemoryBoundaries for 32-bit targets (#182412)
Make sure there is no address space overflow in the test on 32-bit platforms.
32-bit: __asan_region_is_poisoned(0xffffffff, 1) fails since 0xffffffff + 1 = 0x0
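The wraparound can be reproduced with a small Python sketch that models 32-bit pointer arithmetic by masking. The function names are hypothetical and only illustrate the arithmetic, not the ASan implementation:

```python
MASK32 = 0xFFFFFFFF  # assumption: addresses are 32 bits wide


def region_end_32(begin: int, size: int) -> int:
    """End of [begin, begin + size) computed with 32-bit wraparound."""
    return (begin + size) & MASK32


def region_wraps_32(begin: int, size: int) -> bool:
    """True if the end address wrapped past the top of the address space."""
    return size > 0 and region_end_32(begin, size) <= begin


print(hex(region_end_32(0xFFFFFFFF, 1)))   # 0x0
print(region_wraps_32(0xFFFFFFFF, 1))      # True
print(region_wraps_32(0xFFFFFFFE, 1))      # False
```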
[MLIR][Python] Remove redundant methods (#182459)
At the moment, Pylance reports errors in `ir.pyi`, saying that some
overloaded methods are invalid. After looking into it, I found that some
of these methods have duplicate signatures and are defined more than
once. This PR is mainly to clean up those methods.
[mlir][tosa] Improve broadcasting behaviour in elementwise folders (#181114)
This commit aims to improve elementwise folder behaviour when
broadcasting is involved. In particular, it ensures correctness of
folders when the input operands are dynamic and it is not clear whether
broadcasting is involved.
For example, previously, the tosa.add folder could result in shape
information loss:
```
func.func @test(%arg0: tensor<?x4xi32>) -> tensor<?x4xi32> {
%one = "tosa.const"() {values = dense<0> : tensor<2x1xi32>} : () -> tensor<2x1xi32>
%div = tosa.add %one, %arg0 : (tensor<2x1xi32>, tensor<?x4xi32>) -> tensor<?x4xi32>
return %div : tensor<?x4xi32>
}
$ mlir-opt --canonicalize test.mlir
func.func @test(%arg0: tensor<?x4xi32>) -> tensor<?x4xi32> {
[15 lines not shown]
[RISCV] Remove VMConstraint from VAESKF1_VI/VAESKF2_VI. (#181887)
These instructions don't have a VM operand. If these instructions use a
V0 destination, the VMConstraint code calls getReg() on the last
operand, which is an immediate. This triggers an assertion. Not sure
what happens on a release build. It probably treats the immediate as a
value in the RISCV register info enum.
(cherry picked from commit 6eae1759f2c08fcd36b3673c6603297ba3f8d7d3)
[MC][ARM] Don't set funcs to Thumb as a side effect of .hidden (#181156)
When assembling a source file which switches between Arm and Thumb state
using `.arm` and `.thumb`, if you define a function in Arm state and
mark it as hidden at dynamic link time using `.hidden`, but don't
actually issue the `.hidden` directive until you have switched back to
Thumb state, then the function would be accidentally marked as Thumb as
a side effect of making it hidden.
This happened in `ARMELFStreamer::emitSymbolAttribute`, and the comment
suggests that it was semi-deliberate: it was intended to happen as a
side effect of `.type foo,%function`, because the function label might
have already been defined without a type, and shouldn't be marked as
Thumb until it's known that it's a function. But I think it was an
accident that the same behavior also applies to any other addition of a
symbol attribute, such as `.hidden`: the call to `setIsThumbFunc` was
conditioned on whether the symbol has function type after setting the
attribute, not whether function type was the attribute _actually being
set_. So if you set the symbol to function type and _then_ use
[12 lines not shown]
[MLIR][Python] Fix generic class signature of ir.Value (#182447)
In the type stub, `Generic` isn’t explicitly imported. This causes
Pyright/Pylance to report an error and treat `Value` as not being a
generic type. Explicitly using `typing.Generic` fixes this.
[AArch64][llvm] Tighten SYSP; don't disassemble invalid encodings
Tighten SYSP aliases, so that invalid encodings are disassembled
to `<unknown>`. This is because:
```
Cn is a 4-bit unsigned immediate, in the range 8 to 9
Cm is a 4-bit unsigned immediate, in the range 0 to 7
op1 is a 3-bit unsigned immediate, in the range 0 to 6
op2 is a 3-bit unsigned immediate, in the range 0 to 7
```
Ensure we check this when disassembling, and also constrain
tablegen for compile-time errors of invalid encodings.
Also adjust the testcases in `armv9-sysp-diagnostics.s` and
`llvm/test/MC/AArch64/armv9a-sysp.s` as they were invalid,
and add a few invalid (out-of-range) SYSP-alikes to
test that `<unknown>` is printed.
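As a rough illustration of the range check described above, here is a Python sketch that validates the four fields against the listed ranges. The function and table names are hypothetical, and this is not how the tablegen or disassembler code is written:

```python
# Allowed ranges exactly as listed in the commit message.
SYSP_FIELD_RANGES = {
    "op1": range(0, 7),   # 0 to 6
    "Cn": range(8, 10),   # 8 to 9
    "Cm": range(0, 8),    # 0 to 7
    "op2": range(0, 8),   # 0 to 7
}


def sysp_fields_valid(op1: int, cn: int, cm: int, op2: int) -> bool:
    """Return True only if every field lies inside its allowed range."""
    fields = {"op1": op1, "Cn": cn, "Cm": cm, "op2": op2}
    return all(value in SYSP_FIELD_RANGES[name] for name, value in fields.items())


print(sysp_fields_valid(op1=0, cn=8, cm=0, op2=0))   # True
print(sysp_fields_valid(op1=0, cn=7, cm=0, op2=0))   # False: Cn below 8
print(sysp_fields_valid(op1=7, cn=9, cm=7, op2=7))   # False: op1 above 6
```

An encoding for which this check fails corresponds to the case that should now be printed as `<unknown>`.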
[Hexagon] Fix SplitVectors crash in HVX type legalization (#181377)
When LegalizeHvxResize splits a multi-step TL_EXTEND (e.g., v128i32 from
v128i8, which is i8->i32), SplitVectorOp halves both input and output
types. This creates operand types that are half the HVX vector width
(e.g., v64i8 = 512 bits on 128-byte HVX), which are not legal HVX types.
These sub-HVX intermediate types confuse the DAG type legalizer's map
tracking, causing "Unprocessed value in a map! SplitVectors" assertions
with EXPENSIVE_CHECKS or -enable-legalize-types-checking.
Fix by first expanding multi-step TL_EXTEND/TL_TRUNCATE operations into
a chain of single-step operations via ExpandHvxResizeIntoSteps before
splitting. Each single-step operation (e.g., i16->i32) can be safely
split because halving its operand type produces a legal HVX type (e.g.,
v64i16 = HVX single vector).
(cherry picked from commit 4d3217d68914ddac47d760b215d71441b820720e)
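The width arithmetic behind this fix can be checked with a short Python sketch. It assumes 128-byte HVX mode and that each expansion step doubles the element width; `resize_steps` is a hypothetical stand-in for the idea behind ExpandHvxResizeIntoSteps, not its implementation:

```python
HVX_VECTOR_BITS = 128 * 8  # 128-byte HVX mode, as in the commit message


def vector_bits(num_elems: int, elem_bits: int) -> int:
    """Total width in bits of a vector type such as v64i8."""
    return num_elems * elem_bits


def resize_steps(src_bits: int, dst_bits: int) -> list:
    """Break an extend into steps; assumption: each step doubles the element width."""
    steps = []
    width = src_bits
    while width < dst_bits:
        steps.append((width, width * 2))
        width *= 2
    return steps


# Splitting the multi-step extend directly halves v128i8 into v64i8.
print(vector_bits(64, 8), vector_bits(64, 8) == HVX_VECTOR_BITS)     # 512 False
# Expanding first gives single steps i8->i16 and i16->i32.
print(resize_steps(8, 32))                                           # [(8, 16), (16, 32)]
# Halving the operand of the i16->i32 step gives v64i16, one full HVX vector.
print(vector_bits(64, 16), vector_bits(64, 16) == HVX_VECTOR_BITS)   # 1024 True
```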
[RISCV] Correct the LMUL operand for __riscv_sf_vc_i_se_u8mf4 and __riscv_sf_vc_i_se_u8mf2 intrinsics. (#182345)
mf2 should be 7 (-1 in 3 bits). mf4 should be 6 (-2 in 3 bits).
(cherry picked from commit d93ad10a2e9fb07132771cc5c9f356d4439c8950)
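The two values follow from 3-bit two's complement, which a short Python sketch can confirm. It assumes the operand encodes log2 of the LMUL fraction (-1 for mf2, -2 for mf4); the helper name is hypothetical:

```python
def to_3bit_twos_complement(value: int) -> int:
    """Encode a signed value in the range -4..3 as an unsigned 3-bit field."""
    if not -4 <= value <= 3:
        raise ValueError("value does not fit in 3 bits")
    return value & 0b111


# Assumption: the operand is log2(LMUL), so mf2 -> -1 and mf4 -> -2.
print(to_3bit_twos_complement(-1))  # 7
print(to_3bit_twos_complement(-2))  # 6
```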
[PowerPC] Only set QualName symbol on first section switch (#179253)
We were setting it every time when switching to the section. This caused
problems when the debug_aranges emission performed a switch at the end
of the section, resulting in symbols incorrectly pointing to the end
instead of the start of the function.
(cherry picked from commit 90c632ab48748808e95d9bb8cd4f3028888dc1b0)
[Flang-RT][unittests] Fix buffer over-read (#182176)
The unittest `Reductions.InfSums` defines a test array descriptor with
shape 2x3 (i.e. 6 elements), but only provides values for 2 elements.
The result is an access to likely uninitialized memory when reading the
additional 4 elements. In most cases the additional values get gobbled
up by the infinity, but if one happens to be NaN or the negated infinity,
the result becomes NaN and fails the test.
Fix by reducing the shape of the test array to 2. Fixes the flakiness of
the test on the flang-x86_64-windows buildbot.
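Why the stray values usually do not matter, and when they do, can be seen in a small Python sketch. It assumes the two provided values are both +infinity (the commit message does not list them) and uses made-up numbers for the four over-read elements:

```python
import math


def total(values) -> float:
    """Plain left-to-right floating-point sum."""
    result = 0.0
    for value in values:
        result += value
    return result


provided = [math.inf, math.inf]  # assumption: the two initialized elements

# Arbitrary finite garbage is absorbed by the infinity.
print(total(provided + [1.5, -2.0, 0.0, 3.0]))               # inf
# A NaN in the over-read elements turns the sum into NaN.
print(math.isnan(total(provided + [math.nan, 0.0, 0.0, 0.0])))   # True
# So does the negated infinity, since inf + (-inf) is NaN.
print(math.isnan(total(provided + [-math.inf, 0.0, 0.0, 0.0])))  # True
```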
[AMDGPU] Update f16 builtin definitions to use _Float16 instead of __fp16 (#182331)
Change the type signature of 16-bit-insts half-precision builtins from
`__fp16` to `_Float16` in the tablegen builtin definitions.
[mlir][Linalg][Tensor] Preserve attrs on `tensor.pad` when lowering to dst-style (#182064)
When canonicalizing to generic ops within `EliminateEmptyTensors`, we
should take care to preserve the attributes. For example, this attribute
mechanism is employed within IREE's SPIRV pipeline to pass on tiling
configurations together with the ops.
---------
Signed-off-by: Artem Gindinson <gindinson at roofline.ai>