[mlir][linalg] Lower unpack - capture handle to created copy op (#183744)
Adds the copy op created during unpack lowering to the lowering results.
The corresponding transform op is also updated with the new result value.
[RISCV][NFC] Prepare for Short Forward Branch of branches with immediates (#182456)
This NFC patch introduces two key updates:
- It replaces the `gpr` operand type with `sfb_rhs` for the `rhs`
operand in the short forward branch optimization pseudos. The `sfb_rhs`
type supports both register and immediate operands.
- It updates the pseudos to use branch opcodes instead of condition
codes, which were used prior to this change.
Together, these changes prepare the existing codebase to support short
forward branches that compare a register with an immediate value.
Currently, short forward branch support is limited to
register-to-register comparisons.
[RISCV] Handle Zvabd and XRivosVizip EEWs in RISCVVLOptimizer (#184117)
This allows the VL optimizer to handle more cases that
RISCVVectorPeephole currently catches.
The XRivosVizip instructions have ReadsPastVL=true, so only the vl of
the zip instruction itself is reduced, not its inputs.
[llvm] Turn misc copy-assign to move-assign (#184143)
This is an automated patch generated from clang-tidy
performance-use-std-move as a follow-up to #184136.
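As an illustration of the kind of rewrite described here, the following is a minimal sketch, assuming the change targets copy-assignments from a local object that is not used afterwards. `Record`, `fillByCopy`, `fillByMove`, and `checkEquivalent` are hypothetical names that do not come from the patch.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical type, used only for illustration.
struct Record {
  std::vector<std::string> Names;
};

// Before: Local is copied into Out.Names even though it is never used again.
void fillByCopy(Record &Out) {
  std::vector<std::string> Local = {"a", "b", "c"};
  Out.Names = Local; // copy-assignment: copies every element
}

// After: the assignment takes over Local's buffer instead of copying it.
void fillByMove(Record &Out) {
  std::vector<std::string> Local = {"a", "b", "c"};
  Out.Names = std::move(Local); // move-assignment
}

// Both variants leave Out in the same state; only the cost differs.
void checkEquivalent() {
  Record A, B;
  fillByCopy(A);
  fillByMove(B);
  assert(A.Names == B.Names);
}
```

The rewrite is only safe when the source object is not read again after the assignment, which is why it suits a local that is about to go out of scope.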
[CIR] Split cir.binop into separate per-operation binary ops
LLVM lowering uses per-op patterns generated by the CIRLowering.inc TableGen
infrastructure instead of a monolithic TypeSwitch dispatch.
[AMDGPU] Make the options consistent across 3 RA pipelines(NFC) (#184190)
Adding the missing option for the wwm-regalloc in the test
attr-amdgpu-flat-work-group-size-vgpr-limit.ll. The existing
test already specifies -sgpr-regalloc=fast & -vgpr-regalloc=fast
to ensure that the fast register allocator is preferred over
the default greedy allocator. For consistency, the same
preference should also be applied to the wwm-regalloc pipeline.
[LegalizeVectorOps][RISCV][PowerPC][AArch64][X86] Enable the clmul/clmulr/clmulh expansion code. (#184257)
These opcodes weren't added to the master switch statement that
determines if they should be considered vector ops.
[LIT] Use forward slashes in substitutions when LLVM_WINDOWS_PREFER_FORWARD_SLASH is set (#179865)
When building with `-DLLVM_WINDOWS_PREFER_FORWARD_SLASH=ON`, tools like
lld output paths with forward slashes on Windows. However, lit's default
substitutions (`%t`, `%p`) typically use backslashes on Windows, causing
FileCheck failures in tests that strictly match path separators.
This patch propagates the `LLVM_WINDOWS_PREFER_FORWARD_SLASH` build flag
to llvm-lit via `builtin_parameters`. It also updates lit's TestRunner
to respect the 'use_normalized_slashes' parameter. When enabled, lit
normalizes paths in substitutions to use forward slashes, ensuring that
test expectations align with the tool output.
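The normalization step can be sketched as follows. This is an illustrative C++ model only: lit itself is written in Python, and `normalizeSlashes`, its `UseNormalizedSlashes` argument, and `checkNormalizeSlashes` are hypothetical names that merely mirror the 'use_normalized_slashes' parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: rewrite backslashes to forward slashes in a
// substitution value, but only when normalization is enabled.
std::string normalizeSlashes(std::string Path, bool UseNormalizedSlashes) {
  if (UseNormalizedSlashes)
    std::replace(Path.begin(), Path.end(), '\\', '/');
  return Path;
}

// Usage: the same substitution value with and without normalization.
void checkNormalizeSlashes() {
  assert(normalizeSlashes("C:\\build\\test\\a.tmp", true) ==
         "C:/build/test/a.tmp");
  assert(normalizeSlashes("C:\\build\\test\\a.tmp", false) ==
         "C:\\build\\test\\a.tmp");
}
```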
With this fix, the number of failed tests with
`-DLLVM_WINDOWS_PREFER_FORWARD_SLASH=ON` changes as follows:
- The total number of failed tests: 303 -> 168
- Break down:
- `Builtins-i386-windows` tests: 99 -> 0
[9 lines not shown]
[RISCV] Extends RISCVMoveMerger to merge GPRPairs independent of even/odd pair instruction order. (#183657)
This PR addresses post-commit reviews in #182416.
Previously, `RISCVMoveMerger` only identified and merged 32-bit moves
into a 64-bit GPRPair move if the even-indexed register move appeared
before the odd-indexed register move.
This patch extends the pass to merge the two moves regardless of the
order in which the even- and odd-indexed moves appear.
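The order-independent check can be sketched as below. This is a simplified model, not the pass's code: registers are plain indices, a pair is assumed to be an aligned (2k, 2k+1) pair on both the destination and the source side, and instructions between the two moves are ignored. `Move`, `PairMove`, and `tryMerge` are hypothetical names.

```cpp
#include <cassert>
#include <utility>

// Hypothetical model of a 32-bit register move.
struct Move {
  unsigned Dst;
  unsigned Src;
};

// Hypothetical model of a 64-bit pair move; each field holds the even
// (low) register index of an aligned pair (2k, 2k+1).
struct PairMove {
  unsigned DstEven;
  unsigned SrcEven;
};

static bool isEven(unsigned Reg) { return Reg % 2 == 0; }

// Try to merge two moves into one pair move, accepting either order.
// Returns true and fills Out on success.
bool tryMerge(Move A, Move B, PairMove &Out) {
  // Normalize so that A is the move writing the even register.
  if (!isEven(A.Dst))
    std::swap(A, B);
  // Assumption: both destination and source must start an aligned pair.
  if (!isEven(A.Dst) || !isEven(A.Src))
    return false;
  // The other move must cover the odd halves of the same two pairs.
  if (B.Dst != A.Dst + 1 || B.Src != A.Src + 1)
    return false;
  Out = PairMove{A.Dst, A.Src};
  return true;
}

// Usage: the odd-indexed move comes first and the merge still succeeds.
void checkOddFirst() {
  PairMove P{0, 0};
  bool Merged = tryMerge(Move{11, 13}, Move{10, 12}, P);
  assert(Merged && P.DstEven == 10 && P.SrcEven == 12);
}
```

Swapping the two moves up front keeps a single code path for both orders, so the even-first case behaves exactly as it would without the swap.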
[OpenMP][clang] Indirect and Virtual function call mapping from host to device (#159857)
This patch implements the CodeGen logic for calling __llvm_omp_indirect_call_lookup
on the device when an indirect function call or a virtual function call is made
within an OpenMP target region.
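Conceptually, a host-to-device lookup of this kind can be modeled as a search in a table of address pairs. The sketch below is an assumption-laden illustration, not the runtime's implementation: `Entry`, `lookupDeviceAddr`, the sorted-table layout, and the fallback to the original address are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table entry mapping a host function address to its
// device counterpart.
struct Entry {
  std::uintptr_t Host;
  std::uintptr_t Device;
};

// Hypothetical lookup: Table must be sorted by Host. Returns the device
// address for Host, or Host itself when no entry exists (an assumption).
std::uintptr_t lookupDeviceAddr(const std::vector<Entry> &Table,
                                std::uintptr_t Host) {
  assert(std::is_sorted(Table.begin(), Table.end(),
                        [](const Entry &L, const Entry &R) {
                          return L.Host < R.Host;
                        }));
  // Binary search for the first entry whose host address is >= Host.
  auto It = std::lower_bound(
      Table.begin(), Table.end(), Host,
      [](const Entry &E, std::uintptr_t Key) { return E.Host < Key; });
  if (It != Table.end() && It->Host == Host)
    return It->Device;
  return Host;
}
```

In this model, an indirect or virtual call inside a target region would first translate the callee address through the lookup and then call the returned address.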
---------
Co-authored-by: Youngsuk Kim
[AMDGPU] Insert readfirstlane for uniform VGPR arguments (#178198)
Fix inreg arguments that are uniform but are assigned to VGPRs because
the available SGPRs have run out.
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault at amd.com>
[HLSL] Add globals for resources embedded in structs
For each resource or resource array member of a struct declared
at global scope or inside a cbuffer, create an implicit global
variable of the same resource type. The variable name will be
derived from the struct instance name and the member name.
The new global is associated with the struct declaration using
a new attribute HLSLAssociatedResourceDeclAttr.
Closes #182988
[mlir][acc] Add ACCRecipeMaterialization pass and reduction ops (#184252)
Pass
----
Add the `acc-recipe-materialization` pass, which materializes OpenACC
privatization, firstprivate and reduction recipes by inlining their
init, copy, combiner, and destroy regions into the operation for the
construct. The pass runs on acc.parallel, acc.serial, acc.kernels, and
acc.loop.
- Firstprivate: Inserts acc.firstprivate_map so the initial value is
available on the device, then clones the recipe init and copy regions
into the construct and replaces uses with the materialized alloca.
Optional destroy region is cloned before the region terminator.
- Private: Clones the recipe init region into the construct (at region
entry or at the loop op for acc.loop private). Replaces uses of the
recipe result with the materialized alloca. Optional destroy region is
cloned before the region terminator.
[42 lines not shown]
[Github] Respect LLVM_VERSION when building windows container (#184231)
Otherwise, setting LLVM_VERSION has no effect. With this change, a
toolchain bump requires updating just 1 place instead of ~8 different
locations in the file.
[Github] Bump Github Runner to v2.332.0 (#184230)
This keeps us ahead of the support horizon. A cursory glance at the
release notes shows no major feature changes or bug fixes.
[Clang] Add missing extension cl_intel_split_work_group_barrier declaration (#184269)
All the OpenCL extensions must be declared in OpenCLExtensions.def,
otherwise the frontend won't recognize them and won't be able to use
them in the code. This patch adds the missing declaration for the
`cl_intel_split_work_group_barrier` extension.