[TargetLowering] Refactor expandDIVREMByConstant to share more code. NFC (#187582)
Make the (1 << HBitWidth) % Divisor == 1 path a special case within
the recently added chunk summing algorithm. This allows us to
share the trailing zero shifting code.
While here, improve some comments and avoid creating unnecessary
nodes.
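The arithmetic behind chunk summing can be sketched as follows. This is a minimal Python model, not the SelectionDAG code: the function name and parameters are hypothetical, and the trailing-zero handling shown is only the standard identity for even divisors. With `chunk_width` equal to the half width, it reduces to the `(1 << HBitWidth) % Divisor == 1` special case.

```python
def urem_by_chunk_sum(x: int, divisor: int, bit_width: int, chunk_width: int) -> int:
    """Model of x % divisor computed by summing chunk_width-bit chunks of x.

    Assumes the odd part of divisor is > 1 and satisfies
    (1 << chunk_width) % odd_divisor == 1.
    """
    assert 0 <= x < (1 << bit_width)

    # Peel off trailing zero bits of the divisor; the low bits of x pass
    # straight through to the remainder.
    tz = (divisor & -divisor).bit_length() - 1
    odd_divisor = divisor >> tz
    low_bits = x & ((1 << tz) - 1)
    x >>= tz

    assert odd_divisor > 1
    assert (1 << chunk_width) % odd_divisor == 1

    # Because 2**chunk_width is congruent to 1 modulo odd_divisor, the sum
    # of the chunks is congruent to x modulo odd_divisor.
    mask = (1 << chunk_width) - 1
    total = 0
    while x:
        total += x & mask
        x >>= chunk_width

    return ((total % odd_divisor) << tz) | low_bits
```

For example, `urem_by_chunk_sum(x, 3, 32, 16)` sums the two 16-bit halves of a 32-bit value, and `urem_by_chunk_sum(x, 6, 32, 16)` exercises the trailing-zero path.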
[RISCV][Docs] Removed 'specified in' text from SiFive custom instruction links. NFC (#187817)
The URL isn't printed; the text in the backticks is the link text.
Fix declare simd linear stride rescaling and arg_types verifier
1. Rescale constant linear steps from source-level element counts to byte
strides in Flang's processLinear(). For reference-like parameters
(pointers or non-VALUE dummy arguments) with Linear or LinearRef ABI
kind, the step must be multiplied by the element size in bytes. This
matches Clang's rescaling in CGOpenMPRuntime.cpp. Val and UVal kinds
are not rescaled as they describe value changes, not pointer strides.
Var-strides are also not rescaled as the value is an argument index.
2. Add a verifier check in DeclareSimdOp to ensure 'arg_types' length
matches the number of function arguments, preventing out-of-bounds
access during MLIR-to-LLVM IR translation.
Also restructure processLinear() to compute stepOperand per-variable
instead of appending the same operand for all objects in the clause,
enabling per-variable rescaling.
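The rescaling rule and the length check described above can be modeled roughly as below. This is a sketch only: `LinearKind`, `rescale_linear_step`, and `check_arg_types` are hypothetical names for illustration and do not correspond to Flang or MLIR APIs.

```python
from enum import Enum


class LinearKind(Enum):
    LINEAR = "linear"
    REF = "ref"
    VAL = "val"
    UVAL = "uval"


def rescale_linear_step(step, kind, is_reference_like, elem_size_bytes,
                        is_var_stride=False):
    """Return the step to record for one linear variable."""
    # A var-stride names an argument index, so it is never rescaled.
    if is_var_stride:
        return step
    # Pointer-like parameters with Linear or LinearRef kind need a byte stride.
    if is_reference_like and kind in (LinearKind.LINEAR, LinearKind.REF):
        return step * elem_size_bytes
    # Val and UVal describe value changes, not pointer strides.
    return step


def check_arg_types(arg_types, num_function_args):
    """Reject an arg_types list whose length differs from the argument count."""
    if len(arg_types) != num_function_args:
        raise ValueError(
            f"'arg_types' has {len(arg_types)} entries, "
            f"expected {num_function_args}")
```

Computing the step per variable is what makes the rule applicable: two variables in the same clause can have different element sizes and therefore different byte strides.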
Assisted by Copilot.
[Clang][AMDGPU] Lower `__amdgpu_texture_t` to `<8 x i32>` instead of ptr addrspace(0) (#187774)
Fix the IR lowering for `__amdgpu_texture_t` to generate a single
256-bit load instead of a double indirection through a flat pointer.
Previously, `__amdgpu_texture_t` was lowered to `ptr addrspace(0)`
(64-bit flat pointer), which caused the double load and indirection.
Using the same reproducer as #187697:
```c
#define TSHARP __constant uint *
// Old tsharp handling:
// #define LOAD_TSHARP(I) *(__constant uint8 *)I
#define LOAD_TSHARP(I) *(__constant __amdgpu_texture_t *)I
float4 test_image_load_1D(TSHARP i, int c) {
return __builtin_amdgcn_image_load_1d_v4f32_i32(15, c, LOAD_TSHARP(i), 0, 0);
[24 lines not shown]
[MLIR][Python] Make init parameters follow the field definition order (#186574)
Currently, Python-defined operations automatically generate an
`__init__` function to serve as the operation builder. Previously, the
parameters of this `__init__` function followed a fairly complex set of
rules. For example:
* All result fields were moved to the front to align with other op
builders.
* Fields of `Optional` type were automatically moved to the end and
treated as keyword parameters.
* If the types of all result fields could be inferred automatically,
then all result fields were removed from the parameter list.
* Other than that, the parameter order followed the field definition
order.
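To make the old rules concrete, they can be modeled roughly as below. This is an illustrative sketch, not the MLIR Python implementation: the `Field` class and `old_init_params` function are hypothetical, and the handling of a field that is both a result and `Optional` is an assumption, since the rules above do not cover it.

```python
from dataclasses import dataclass


@dataclass
class Field:
    name: str
    is_result: bool = False
    optional: bool = False


def old_init_params(fields, results_inferable):
    """Return (positional, keyword) parameter names under the old rules."""
    # If every result type can be inferred, result fields are dropped.
    if results_inferable:
        fields = [f for f in fields if not f.is_result]

    # Optional fields move to the end as keyword parameters.
    # Assumption: this also applies to optional result fields.
    keyword = [f.name for f in fields if f.optional]
    required = [f for f in fields if not f.optional]

    # Result fields move to the front; everything else keeps definition order.
    results = [f.name for f in required if f.is_result]
    others = [f.name for f in required if not f.is_result]
    return results + others, keyword
```

With fields defined in the order `lhs`, `res` (a result), `flag` (optional), `rhs`, the model yields the positional list `res, lhs, rhs` plus the keyword `flag`, or `lhs, rhs` plus `flag` when the result type is inferable. Neither matches the definition order.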
These rules may seem reasonable, and they have worked well in practice,
but they have one major drawback: users cannot easily tell what the
actual `__init__` parameter list will look like when writing code,
[28 lines not shown]
[llvm-pdbutil] Hash type records in yaml2pdb (#187593)
The TPI and IPI streams didn't include a hash map for the generated
types, because the types never got hashes. A hash map is necessary to
resolve forward references when dumping the PDB (the dumper checks
`TpiStream::supportsTypeLookup`, which in turn checks for the hash map).
With this PR, the hashes are generated. There's no good test that we do
this for the IPI stream as well, because it doesn't have forward
references.
[TargetLowering] Use legally typed shifts to split chunks in expandDIVREMByConstant. (#187567)
This replaces LegalVT with HiLoVT and LegalWidth with HBitWidth as
they are the same for all current uses.
Then we rewrite the shifts to operate on LL and LH.
There's a slight regression on RISC-V due to different node creation
order leading to different DAG combine order. I have other refactoring
I'd like to explore first; then I may try to fix that.
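As a rough illustration of splitting a double-width value into chunks using only half-width operations, consider the model below. It is a sketch under assumptions: `split_chunks` is a hypothetical name, `lo` and `hi` stand in for LL and LH, and the real code builds SelectionDAG shift nodes rather than Python integers.

```python
def split_chunks(lo: int, hi: int, half_width: int, chunk_width: int):
    """Split the value (hi:lo) into chunk_width-bit chunks, low chunk first.

    Every intermediate value fits in half_width bits, mirroring shifts
    performed on a legal half-width type.
    """
    assert 0 < chunk_width <= half_width
    half_mask = (1 << half_width) - 1
    chunk_mask = (1 << chunk_width) - 1

    chunks = []
    for start in range(0, 2 * half_width, chunk_width):
        if start >= half_width:
            # The chunk lies entirely in the high half.
            part = hi >> (start - half_width)
        else:
            part = lo >> start
            if start + chunk_width > half_width:
                # The chunk straddles both halves: bring in the low bits
                # of the high half above the remaining bits of the low half.
                part |= (hi << (half_width - start)) & half_mask
        chunks.append(part & chunk_mask)
    return chunks
```

Reassembling the chunks with `sum(c << (i * chunk_width) for i, c in enumerate(chunks))` gives back the original value, which is a convenient check on the shift amounts.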
[libclang/python] Fix Type.get_offset annotation (#187841)
As discussed in
https://github.com/llvm/llvm-project/pull/180876#discussion_r2934372753,
`Type.get_offset` can process `bytes` arguments as well. For consistency
with other functions taking `str` arguments, its type annotation is
adapted to reflect this.
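A widened annotation of this kind might look like the sketch below. The helper `to_c_bytes` is hypothetical and only illustrates accepting both `str` and `bytes`; it is not the function used by the bindings.

```python
from typing import Union


def to_c_bytes(name: Union[str, bytes]) -> bytes:
    """Accept either str or bytes and return bytes suitable for a C API."""
    if isinstance(name, bytes):
        return name
    return name.encode("utf8")
```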
[OpenMP][flang] Fix crash in host offload
Guard `getGridValue` in `OMPIRBuilder` to avoid reaching the
`unreachable` in `getGridValue` when offloading to host device without
an explicit num_threads clause.
Reproducer (`-fopenmp -fopenmp-targets=x86_64-unknown-linux-gnu`):
```
program test
implicit none
!$omp target
!$omp end target
end program test
```
(Note: the linker still fails, but that's another issue.)
[lldb] Fix LLVMSupportHTTP linkage against libLLVM (#187848)
Regression introduced in 39d6bb21804d21abe2fa0ec019919d72104827ac.
Signed-off-by: Michał Górny <mgorny at gentoo.org>
[flang][OpenMP] Provide reasons for calculated sequence length
If the length was limited by some factor, include what caused the
reduction.
Issue: https://github.com/llvm/llvm-project/issues/185287
[SLP]Do not consider copyable node with SplitVectorize parent
If the copyables are schedulable and the parent node is a SplitVectorize
node, skip the scheduling analysis for such nodes to avoid a compiler
crash.
[CIR] Add RecursiveMemoryEffects to region-bearing ops
Add the RecursiveMemoryEffects trait to cir.if, cir.case, loop ops
(cir.while/cir.do/cir.for), cir.ternary, cir.await,
cir.array.ctor, cir.array.dtor, and cir.try. Without this trait,
MLIR conservatively assumes unknown memory effects for ops with
regions, preventing DCE of ops whose bodies are provably pure.
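The effect of the trait on dead-code elimination can be modeled with the simplified sketch below. The `Op` class and `may_have_side_effects` function are hypothetical and collapse MLIR's memory-effect interfaces into a single boolean per op, so this only illustrates the recursive reasoning.

```python
from dataclasses import dataclass, field


@dataclass
class Op:
    name: str
    has_own_effects: bool = False    # e.g. a store or a call
    recursive_effects: bool = False  # models the RecursiveMemoryEffects trait
    body: list = field(default_factory=list)  # ops nested in the op's regions


def may_have_side_effects(op: Op) -> bool:
    """Conservative answer to: could removing this op change behavior?"""
    if op.has_own_effects:
        return True
    if not op.body:
        return False
    # Without the trait, an op with regions is treated as having unknown
    # effects, regardless of what its body contains.
    if not op.recursive_effects:
        return True
    # With the trait, the op is only as effectful as its nested ops.
    return any(may_have_side_effects(child) for child in op.body)
```

In this model a `cir.while` that carries the trait and contains only a pure `cir.if` reports no side effects and is a DCE candidate, while the same nest containing a store is preserved.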
Also fix a crash in ConditionOp::getSuccessorRegions where the
missing early return after the loop-op case would fall through to
cast<AwaitOp>(...), which asserts when the parent is a loop rather
than an await op.
Add tests verifying that region ops with pure bodies are eliminated
and ops with stores or calls are preserved, including two-level nested
propagation (cir.if inside cir.while).
[libclang/python] export libclang version to the bindings (#86931)
It's useful to know which clang library the python bindings are running.
---------
Co-authored-by: Vlad Serebrennikov <serebrennikov.vladislav at gmail.com>
[lldb] Fix linking liblldb in a dylib build after 39d6bb21804d21ab
Referencing libSupportHTTP under LINK_LIBS of add_lldb_library() pulls
in the static archive even in a build configuration with
LLVM_LINK_LLVM_DYLIB=On, where libSupportHTTP is part of libLLVM. This
patch moves it to LINK_COMPONENTS to fix the issue.
This is the same fix as in
036429881f8d3037894042c6268b2a94eac8c950, applied on another
library.
[CIR] Fix reference alignment to use pointee type (#186667)
getNaturalTypeAlignment on a reference type returned pointer alignment
instead of pointee alignment. Pass the pointee type with
forPointeeType=true to match traditional codegen's
getNaturalPointeeTypeAlignment behavior. Fix applies to both argument
and return type attribute construction paths.
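The difference between the two alignments can be seen outside the compiler as well. The sketch below uses Python's `ctypes` as a stand-in model: `natural_alignment` is a hypothetical helper, not the CIR function, and `for_pointee_type` merely mirrors the flag named above.

```python
import ctypes


def natural_alignment(pointee, for_pointee_type: bool) -> int:
    """Return the alignment of the pointee or of a pointer to it."""
    if for_pointee_type:
        return ctypes.alignment(pointee)
    return ctypes.alignment(ctypes.POINTER(pointee))


# For a reference to char, the pointee alignment is 1 byte, while the
# pointer alignment is whatever the platform uses for pointers.
pointee_align = natural_alignment(ctypes.c_char, for_pointee_type=True)
pointer_align = natural_alignment(ctypes.c_char, for_pointee_type=False)
```

Attaching `pointer_align` to a `char&` parameter would overstate what the callee may assume about the referenced object, which is the kind of mismatch the fix addresses.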
InstCombine: Fold out nanless canonicalize pattern
Pattern match a wrapper around llvm.canonicalize which
weakens the semantics to not require quieting signaling
nans. Depending on the denormal mode and FP type, we can
either drop the pattern entirely or reduce it only to
a canonicalize call. I'm inventing this pattern to deal
with LLVM's lax canonicalization model in math library
code.
The math library code currently has explicit checks for
the denormal mode, and conditionally canonicalizes the
result if there is flushing. Semantically, this could be
directly replaced with a simple call to llvm.canonicalize,
but doing so would incur an additional cost when using
standard IEEE behavior. If we do not care about quieting
a signaling nan, this should be a no-op unless the denormal
mode may flush. This will allow replacement of the
conditional code with a zero cost abstraction utility
[17 lines not shown]