LLVM/project 0fff939mlir/include/mlir/Dialect/Linalg/TransformOps LinalgTransformOps.td, mlir/include/mlir/Dialect/Linalg/Transforms Transforms.h

[mlir][linalg] Lower unpack - capture handle to created copy op (#183744)

Adds the copy op created during unpack lowering to the lowering results,
where it was previously missing. The corresponding transform op is also
updated to return the new result value.
DeltaFile
+18-9mlir/test/Dialect/Linalg/transform-lower-pack.mlir
+4-3mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
+4-2mlir/test/Dialect/Linalg/transform-tile-and-fuse-pack-unpack.mlir
+4-2mlir/lib/Dialect/Linalg/Transforms/Transforms.cpp
+2-1mlir/test/Integration/Dialect/Linalg/CPU/pack-unpack-mmt4d.mlir
+2-1mlir/include/mlir/Dialect/Linalg/Transforms/Transforms.h
+34-181 files not shown
+35-187 files

LLVM/project 78ac964llvm/lib/Target/RISCV RISCVInstrInfo.cpp RISCVInstrInfoSFB.td, llvm/test/CodeGen/RISCV opt-w-instrs.mir

[RISCV][NFC] Prepare for Short Forward Branch of branches with immediates (#182456)

This NFC patch introduces two key updates:

- It replaces the `gpr` operand type with `sfb_rhs` for the `rhs`
operand in the short forward branch optimization pseudos. The `sfb_rhs`
type supports both register and immediate operands.
- It updates the pseudos to use branch opcodes instead of condition
codes, which were used prior to this change.

Together, these changes prepare the existing codebase to support short
forward branches that compare a register with an immediate value.
Currently, short forward branch support is limited to
register-to-register comparisons.
DeltaFile
+140-29llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
+69-45llvm/lib/Target/RISCV/RISCVInstrInfoSFB.td
+34-35llvm/lib/Target/RISCV/RISCVExpandPseudoInsts.cpp
+12-12llvm/lib/Target/RISCV/RISCVOptWInstrs.cpp
+6-9llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+4-4llvm/test/CodeGen/RISCV/opt-w-instrs.mir
+265-1343 files not shown
+277-1379 files

LLVM/project f67c2cdllvm/lib/Target/RISCV RISCVVLOptimizer.cpp, llvm/test/CodeGen/RISCV/rvv vl-opt.mir fixed-vectors-shuffle-deinterleave2.ll

[RISCV] Handle Zvabd and XRivosVizip EEWs in RISCVVLOptimizer (#184117)

This allows the VL optimizer to handle more cases that
RISCVVectorPeephole currently catches.

The XRivosVizip instructions have ReadsPastVL=true, so only the vl of
the zip instruction itself is reduced, not its inputs.
DeltaFile
+164-1llvm/test/CodeGen/RISCV/rvv/vl-opt.mir
+16-0llvm/lib/Target/RISCV/RISCVVLOptimizer.cpp
+2-2llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-deinterleave2.ll
+2-2llvm/test/CodeGen/RISCV/rvv/vabd.ll
+2-2llvm/test/CodeGen/RISCV/rvv/vabdu.ll
+186-75 files

LLVM/project 0504af9llvm/lib/Analysis InlineCost.cpp CmpInstAnalysis.cpp, llvm/lib/IR ConstantRange.cpp

[llvm] Turn misc copy-assign to move-assign (#184143)

This is an automated patch generated by the clang-tidy check
performance-use-std-move, as a follow-up to #184136.
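
As a rough illustration of the kind of rewrite described above (a minimal
sketch, not taken from the patch; `Widget`, `setLabelCopy` and `setLabelMove`
are hypothetical names): when a local value is assigned to another object and
never read again, wrapping it in `std::move` lets the assignment take over the
buffer instead of duplicating it.

```cpp
#include <string>
#include <utility>

// Hypothetical holder type, used only for illustration.
struct Widget {
  std::string Label;
};

// Before: copy-assignment duplicates the string's buffer even though Tmp is
// never read again.
inline void setLabelCopy(Widget &W, const std::string &Prefix) {
  std::string Tmp = Prefix + "-label";
  W.Label = Tmp; // copy-assign
}

// After: move-assignment transfers the buffer instead of copying it.
inline void setLabelMove(Widget &W, const std::string &Prefix) {
  std::string Tmp = Prefix + "-label";
  W.Label = std::move(Tmp); // move-assign; Tmp is left valid but unspecified
}
```

Both functions leave `W.Label` with the same contents; only the cost of the
assignment differs.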
DeltaFile
+6-6llvm/lib/IR/ConstantRange.cpp
+6-6llvm/lib/Analysis/InlineCost.cpp
+6-6llvm/lib/Support/APFixedPoint.cpp
+5-5llvm/lib/Analysis/CmpInstAnalysis.cpp
+4-4llvm/lib/Target/X86/X86ISelLowering.cpp
+3-3llvm/tools/llvm-readobj/ELFDumper.cpp
+30-3016 files not shown
+53-5322 files

LLVM/project 00df973clang-tools-extra/clang-doc JSONGenerator.cpp, clang-tools-extra/clang-doc/assets/md namespace-template.mustache

review comments, format
DeltaFile
+22-25clang-tools-extra/clang-doc/JSONGenerator.cpp
+10-8clang-tools-extra/test/clang-doc/namespace.cpp
+4-10clang-tools-extra/test/clang-doc/enum.cpp
+5-5clang-tools-extra/test/clang-doc/templates.cpp
+4-4clang-tools-extra/test/clang-doc/basic-project.mustache.test
+1-1clang-tools-extra/clang-doc/assets/md/namespace-template.mustache
+46-531 files not shown
+47-547 files

LLVM/project f9c3755clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/Lowering/DirectToLLVM LowerToLLVM.cpp

[CIR] Split cir.binop into separate per-operation binary ops

LLVM lowering uses per-op patterns generated by the CIRLowering.inc TableGen
infrastructure instead of a monolithic TypeSwitch dispatch.
DeltaFile
+491-491clang/test/CIR/CodeGenBuiltins/X86/avx512dq-builtins.c
+241-56clang/include/clang/CIR/Dialect/IR/CIROps.td
+124-124clang/test/CIR/CodeGen/complex-mul-div.cpp
+129-106clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+92-92clang/test/CIR/CodeGenBuiltins/X86/sse41-builtins.c
+73-73clang/test/CIR/CodeGenOpenACC/private-clause-pointer-array-recipes-CtorDtor.cpp
+1,150-942102 files not shown
+2,338-2,094108 files

LLVM/project e4def2dllvm/test/CodeGen/AMDGPU attr-amdgpu-flat-work-group-size-vgpr-limit.ll

[AMDGPU] Make the options consistent across 3 RA pipelines(NFC) (#184190)

Adding the missing option for the wwm-regalloc in the test
attr-amdgpu-flat-work-group-size-vgpr-limit.ll. The existing
test already specifies -sgpr-regalloc=fast & -vgpr-regalloc=fast
to ensure that the fast register allocator is preferred over
the default greedy allocator. For consistency, the same
preference should also be applied to the wwm-regalloc pipeline.
DeltaFile
+10-10llvm/test/CodeGen/AMDGPU/attr-amdgpu-flat-work-group-size-vgpr-limit.ll
+10-101 files

LLVM/project d20395cllvm/test/CodeGen/AArch64 clmul-fixed.ll, llvm/test/CodeGen/PowerPC clmul-vector.ll

[LegalizeVectorOps][RISCV][PowerPC][AArch64][X86] Enable the clmul/clmulr/clmulh expansion code. (#184257)

These opcodes weren't added to the master switch statement that
determines if they should be considered vector ops.
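
The failure mode described above can be sketched as follows. This is an
illustration under assumptions, not the actual LegalizeVectorOps code: the
`Opcode` enumeration and both function names are hypothetical. An opcode that
is missing from the switch silently falls into the default case, so its
expansion code is never reached.

```cpp
// Hypothetical opcode set; the real enumeration is far larger.
enum class Opcode { Add, Mul, Clmul, Clmulr, Clmulh, Br };

// Before (as modelled here): the carry-less multiply opcodes are not listed,
// so they fall into the default case and are not treated as vector ops.
inline bool isVectorOpBefore(Opcode Op) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Mul:
    return true;
  default:
    return false;
  }
}

// After: the three opcodes are listed, so the existing expansion code for
// them becomes reachable.
inline bool isVectorOpAfter(Opcode Op) {
  switch (Op) {
  case Opcode::Add:
  case Opcode::Mul:
  case Opcode::Clmul:
  case Opcode::Clmulr:
  case Opcode::Clmulh:
    return true;
  default:
    return false;
  }
}
```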
DeltaFile
+11,541-22,066llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+2,945-2,952llvm/test/CodeGen/X86/clmul-vector.ll
+3,042-2,286llvm/test/CodeGen/PowerPC/clmul-vector.ll
+1,138-1,324llvm/test/CodeGen/AArch64/clmul-fixed.ll
+1,158-1,078llvm/test/CodeGen/X86/clmul-vector-512.ll
+1,003-1,013llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+20,827-30,7194 files not shown
+22,329-32,40410 files

LLVM/project 4ea39c4llvm/utils/lit/lit TestRunner.py LitConfig.py, llvm/utils/lit/lit/llvm config.py

[LIT] Use forward slashes in substitutions when LLVM_WINDOWS_PREFER_FORWARD_SLASH is set (#179865)

When building with `-DLLVM_WINDOWS_PREFER_FORWARD_SLASH=ON`, tools like
lld output paths with forward slashes on Windows. However, lit's default
substitutions (`%t`, `%p`) typically use backslashes on Windows, causing
FileCheck failures in tests that strictly match path separators.

This patch propagates the `LLVM_WINDOWS_PREFER_FORWARD_SLASH` build flag
to llvm-lit via `builtin_parameters`. It also updates lit's TestRunner
to respect the 'use_normalized_slashes' parameter. When enabled, lit
normalizes paths in substitutions to use forward slashes, ensuring that
test expectations align with the tool output.
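
The normalization step described above amounts to the following. lit itself
is written in Python; this C++ sketch only illustrates the idea, and
`normalizeSubstitution` and its `UseNormalizedSlashes` argument are
hypothetical names, not lit's API.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: rewrites backslashes to forward slashes when
// UseNormalizedSlashes is true and returns the path unchanged otherwise.
inline std::string normalizeSubstitution(std::string Path,
                                         bool UseNormalizedSlashes) {
  if (UseNormalizedSlashes)
    std::replace(Path.begin(), Path.end(), '\\', '/');
  return Path;
}
```

With the flag enabled, a value such as `C:\build\test\Output` would be
substituted as `C:/build/test/Output`, which matches what a tool built with
forward-slash preference prints.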

With this fix, the number of failed tests with
`-DLLVM_WINDOWS_PREFER_FORWARD_SLASH=ON` changes as follows:

- The total number of failed tests: 303 -> 168
- Break down:
  - `Builtins-i386-windows` tests: 99 -> 0

    [9 lines not shown]
DeltaFile
+40-9llvm/utils/lit/lit/TestRunner.py
+10-3llvm/utils/lit/tests/lit.cfg
+8-0llvm/utils/lit/lit/llvm/config.py
+5-0llvm/utils/llvm-lit/llvm-lit.in
+2-0llvm/utils/lit/lit/LitConfig.py
+1-1llvm/utils/lit/tests/shtest-readfile.py
+66-136 files

LLVM/project 75b0cf3llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rv64p.ll

[RISCV] Add scalar saturating add/sub operations for i32 for RV64P (#184062)
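
For reference, the arithmetic meaning of a saturating 32-bit add/sub can be
sketched as below. The commit message does not say whether the signed forms,
the unsigned forms, or both are covered; this sketch shows the signed case
only, and the function names are hypothetical.

```cpp
#include <cstdint>
#include <limits>

// Clamps a 64-bit value into the int32_t range.
inline int32_t clampToInt32(int64_t V) {
  const int64_t Max = std::numeric_limits<int32_t>::max();
  const int64_t Min = std::numeric_limits<int32_t>::min();
  if (V > Max)
    return static_cast<int32_t>(Max);
  if (V < Min)
    return static_cast<int32_t>(Min);
  return static_cast<int32_t>(V);
}

// Signed saturating add: the exact sum is computed in 64 bits, then clamped
// instead of wrapping.
inline int32_t saturatingAdd32(int32_t A, int32_t B) {
  return clampToInt32(static_cast<int64_t>(A) + static_cast<int64_t>(B));
}

// Signed saturating subtract, using the same widening-and-clamping approach.
inline int32_t saturatingSub32(int32_t A, int32_t B) {
  return clampToInt32(static_cast<int64_t>(A) - static_cast<int64_t>(B));
}
```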

DeltaFile
+37-0llvm/test/CodeGen/RISCV/rv64p.ll
+22-10llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+59-102 files

LLVM/project 84d0f87clang/lib/Headers CMakeLists.txt

[RISCV] Alphabetize riscv_files in clang/lib/Headers/CMakeLists.txt. NFC (#184024)

DeltaFile
+5-3clang/lib/Headers/CMakeLists.txt
+5-31 files

LLVM/project 30fc31allvm/include/llvm/TableGen CodeGenHelpers.h

[NFC][TableGen] Add deleted copy operations for RAII guard classes (#184168)
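
The general pattern named in the title looks roughly like this. It is a
minimal sketch, not the code from CodeGenHelpers.h; `ScopeGuard` is a
hypothetical class. A guard that emits something in its destructor must not be
copied, because each copy would run the destructor again.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical RAII guard: records an opening marker on construction and the
// matching closing marker on destruction.
class ScopeGuard {
public:
  ScopeGuard(std::vector<std::string> &Out, std::string Name)
      : Out(Out), Name(std::move(Name)) {
    this->Out.push_back("open " + this->Name);
  }
  ~ScopeGuard() { Out.push_back("close " + Name); }

  // A copy would emit a second closing marker when it is destroyed, so both
  // copy operations are deleted.
  ScopeGuard(const ScopeGuard &) = delete;
  ScopeGuard &operator=(const ScopeGuard &) = delete;

private:
  std::vector<std::string> &Out;
  std::string Name;
};

// Accidental copies are now rejected at compile time.
static_assert(!std::is_copy_constructible<ScopeGuard>::value,
              "guards must not be copyable");
```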

DeltaFile
+16-0llvm/include/llvm/TableGen/CodeGenHelpers.h
+16-01 files

LLVM/project eba4a76compiler-rt/test/cfi/icall bad-signature.c

[CFI] Expand test to include minimal runtime (#183646)

`ubsan_minimal` contains some CFI tests, but it would be nice to have one
on the CFI side.
DeltaFile
+6-0compiler-rt/test/cfi/icall/bad-signature.c
+6-01 files

LLVM/project da8929butils/bazel/llvm-project-overlay/mlir BUILD.bazel

[bazel][mlir][acc] Port e63e55cae8ce29150f38a758555d9cc712a1cf4c (#184289)

Co-authored-by: Pranav Kant <prka at google.com>
DeltaFile
+4-0utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+4-01 files

LLVM/project a85dbcfclang/lib/AST/ByteCode Compiler.cpp

[clang][bytecode] Reject non-VarDecl DeclRefExprs (#184141)

I have no idea how to test this, but this is what the current
interpreter does.
DeltaFile
+1-1clang/lib/AST/ByteCode/Compiler.cpp
+1-11 files

LLVM/project 5a53fcellvm/lib/Target/RISCV RISCVMoveMerger.cpp, llvm/test/CodeGen/RISCV double-convert.ll double-mem.ll

[RISCV] Extends RISCVMoveMerger to merge GPRPairs independent of even/odd pair instruction order. (#183657)

This PR addresses post-commit reviews in #182416

Previously, `RISCVMoveMerger` only identified and merged 32-bit moves
into a 64-bit GPRPair move if the even-indexed register move appeared
before the odd-indexed register move.

This patch extends the pass to merge the two moves regardless of the
order in which the even- and odd-indexed halves appear.
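
The order-independent matching described above can be sketched as follows.
This is a simplified model, not the pass itself: `Move`, `formsPair` and
`findMergeablePair` are hypothetical names, registers are plain indices, and
the sketch assumes both the source and the destination must be even/odd pairs.

```cpp
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical model of a 32-bit register move: Dst <- Src.
struct Move {
  unsigned Dst;
  unsigned Src;
};

// Returns true if A and B copy the two halves of an even/odd register pair,
// whichever of the two comes first.
inline bool formsPair(const Move &A, const Move &B) {
  const Move &Even = (A.Dst % 2 == 0) ? A : B;
  const Move &Odd = (A.Dst % 2 == 0) ? B : A;
  return Even.Dst % 2 == 0 && Even.Src % 2 == 0 &&
         Odd.Dst == Even.Dst + 1 && Odd.Src == Even.Src + 1;
}

// Finds the first two moves that could be merged and returns their indices.
// This sketch ignores clobbers between the two moves, which a real pass
// would have to check.
inline std::optional<std::pair<std::size_t, std::size_t>>
findMergeablePair(const std::vector<Move> &Moves) {
  for (std::size_t I = 0; I < Moves.size(); ++I)
    for (std::size_t J = I + 1; J < Moves.size(); ++J)
      if (formsPair(Moves[I], Moves[J]))
        return std::make_pair(I, J);
  return std::nullopt;
}
```

In this model, a sequence that moves the odd half first (for example
`{11, 5}` followed by `{10, 4}`) is matched just like the even-first order.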
DeltaFile
+117-60llvm/lib/Target/RISCV/RISCVMoveMerger.cpp
+2-4llvm/test/CodeGen/RISCV/double-convert.ll
+2-4llvm/test/CodeGen/RISCV/double-mem.ll
+1-2llvm/test/CodeGen/RISCV/double-stack-spill-restore.ll
+1-2llvm/test/CodeGen/RISCV/double-select-fcmp.ll
+1-2llvm/test/CodeGen/RISCV/copysign-casts.ll
+124-746 files

LLVM/project 198f85eclang/lib/AST/ByteCode InterpBuiltin.cpp, clang/test/CodeGenCXX pfp-member-pointer-offsetof.cpp

[clang][bytecode] Fix newly added pfp test (#184137)

Do the same thing 370d7ce58011eccfab8105eddbc028cc09c4c5e5 did in
ExprConstant.cpp
DeltaFile
+1-0clang/test/CodeGenCXX/pfp-member-pointer-offsetof.cpp
+1-0clang/lib/AST/ByteCode/InterpBuiltin.cpp
+2-02 files

LLVM/project b234386offload/test/api omp_virtual_func_multiple_inheritance_02.cpp omp_virtual_func_multiple_inheritance_01.cpp

[OpenMP][clang] Indirect and Virtual function call mapping from host to device (#159857)

This patch implements the CodeGen logic for calling __llvm_omp_indirect_call_lookup
on the device when an indirect function call or a virtual function call is made
within an OpenMP target region.
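
Conceptually, such a lookup translates a host function address into the
matching device address before the call is made. The sketch below is a
simplified model under assumptions, not the runtime's implementation:
`IndirectEntry` and `lookupDeviceAddr` are hypothetical names, the table is
assumed to be sorted by host address, and returning the input unchanged when
no entry exists is an assumed fallback.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical table entry mapping a host function address to its device
// counterpart.
struct IndirectEntry {
  std::uintptr_t HostAddr;
  std::uintptr_t DeviceAddr;
};

// Binary-searches a table sorted by HostAddr. Returns the device address for
// HostAddr, or HostAddr itself when no entry exists (assumed fallback).
inline std::uintptr_t lookupDeviceAddr(const std::vector<IndirectEntry> &Table,
                                       std::uintptr_t HostAddr) {
  auto It = std::lower_bound(
      Table.begin(), Table.end(), HostAddr,
      [](const IndirectEntry &E, std::uintptr_t Key) {
        return E.HostAddr < Key;
      });
  if (It != Table.end() && It->HostAddr == HostAddr)
    return It->DeviceAddr;
  return HostAddr;
}
```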
---------
Co-authored-by: Youngsuk Kim
DeltaFile
+403-0offload/test/api/omp_virtual_func_multiple_inheritance_02.cpp
+400-0offload/test/api/omp_virtual_func_multiple_inheritance_01.cpp
+322-0offload/test/api/omp_indirect_func_struct.c
+153-0offload/test/api/omp_virtual_func.cpp
+124-0offload/test/api/omp_indirect_func_array.c
+95-0offload/test/api/omp_indirect_func_basic.c
+1,497-014 files not shown
+1,808-120 files

LLVM/project 908782fclang/lib/Sema SemaHLSL.cpp

Reorder and format
DeltaFile
+47-53clang/lib/Sema/SemaHLSL.cpp
+47-531 files

LLVM/project 572a0e4llvm/lib/Target/AMDGPU SIInstrInfo.cpp

AMDGPU: Remove "MBUF" from "loadMBUFScalarOperandsFromVGPR" (#184282)

There is nothing MBUF-specific about this function.
DeltaFile
+11-12llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+11-121 files

LLVM/project 6d25af0llvm/utils/lit/lit TestRunner.py display.py

[utils] use annotations from __future__ in lit (#184225)

DeltaFile
+4-6llvm/utils/lit/lit/TestRunner.py
+3-3llvm/utils/lit/lit/display.py
+7-92 files

LLVM/project 768240dllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

[AMDGPU] Insert readfirstlane for uniform VGPR arguments (#178198)

Fix the handling of inreg arguments that are uniform but end up in VGPRs
because the SGPRs have run out.

---------

Co-authored-by: Matt Arsenault <Matthew.Arsenault at amd.com>
DeltaFile
+84,419-78,498llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+25,751-24,782llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+23,663-20,281llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+21,867-18,577llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+19,112-16,445llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+17,646-15,131llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+192,458-173,71432 files not shown
+244,721-216,71538 files

LLVM/project 43a2695llvm/lib/Target/AMDGPU SIInstrInfo.cpp

AMDGPU: Remove "MBUF" from "loadMBUFScalarOperandsFromVGPR"

There is nothing MBUF-specific about this function.

commit-id:3c711dc9
DeltaFile
+11-12llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+11-121 files

LLVM/project 1a3c736clang/include/clang/AST HLSLResource.h, clang/include/clang/Basic Attr.td

[HLSL] Add globals for resources embedded in structs

For each resource or resource array member of a struct declared
at global scope or inside a cbuffer, create an implicit global
variable of the same resource type. The variable name will be
derived from the struct instance name and the member name.

The new global is associated with the struct declaration using
a new attribute HLSLAssociatedResourceDeclAttr.

Closes #182988
DeltaFile
+163-8clang/lib/Sema/SemaHLSL.cpp
+167-0clang/test/AST/HLSL/resources-in-structs.hlsl
+46-0clang/lib/AST/HLSLResource.cpp
+34-0clang/include/clang/AST/HLSLResource.h
+8-6clang/include/clang/Sema/SemaHLSL.h
+8-0clang/include/clang/Basic/Attr.td
+426-143 files not shown
+440-149 files

LLVM/project 99a6b3eclang-tools-extra/clang-doc/assets/md class-template.mustache namespace-template.mustache, clang-tools-extra/test/clang-doc enum.cpp templates.cpp

fix conflicts and update tests
DeltaFile
+4-10clang-tools-extra/test/clang-doc/enum.cpp
+5-5clang-tools-extra/test/clang-doc/templates.cpp
+1-1clang-tools-extra/clang-doc/assets/md/class-template.mustache
+1-1clang-tools-extra/clang-doc/assets/md/namespace-template.mustache
+11-174 files

LLVM/project e63e55cflang/test/Transforms/OpenACC acc-recipe-materialization-firstprivate-derived.fir, mlir/include/mlir/Dialect/OpenACC OpenACCCGOps.td

[mlir][acc] Add ACCRecipeMaterialization pass and reduction ops (#184252)

Pass
----
Add the `acc-recipe-materialization` pass, which materializes OpenACC
privatization, firstprivate and reduction recipes by inlining their
init, copy, combiner, and destroy regions into the operation for the
construct. The pass runs on acc.parallel, acc.serial, acc.kernels, and
acc.loop.

- Firstprivate: Inserts acc.firstprivate_map so the initial value is
available on the device, then clones the recipe init and copy regions
into the construct and replaces uses with the materialized alloca.
Optional destroy region is cloned before the region terminator.

- Private: Clones the recipe init region into the construct (at region
entry or at the loop op for acc.loop private). Replaces uses of the
recipe result with the materialized alloca. Optional destroy region is
cloned before the region terminator.

    [42 lines not shown]
DeltaFile
+459-0mlir/lib/Dialect/OpenACC/Transforms/ACCRecipeMaterialization.cpp
+59-40mlir/lib/Dialect/OpenACC/Utils/OpenACCUtilsLoop.cpp
+86-0mlir/unittests/Dialect/OpenACC/OpenACCUtilsLoopTest.cpp
+66-0mlir/include/mlir/Dialect/OpenACC/OpenACCCGOps.td
+63-0mlir/lib/Dialect/OpenACC/IR/OpenACCCG.cpp
+60-0flang/test/Transforms/OpenACC/acc-recipe-materialization-firstprivate-derived.fir
+793-4017 files not shown
+1,329-4123 files

LLVM/project 92aa2d3.github/workflows/containers/github-action-ci-windows Dockerfile

[Github] Respect LLVM_VERSION when building windows container (#184231)

Otherwise, setting LLVM_VERSION does not actually do anything. This
reduces a toolchain bump from updating ~8 different locations in the
file to updating just 1 place.
DeltaFile
+5-5.github/workflows/containers/github-action-ci-windows/Dockerfile
+5-51 files

LLVM/project 52f32d7.github/workflows/containers/github-action-ci Dockerfile, .github/workflows/containers/github-action-ci-windows Dockerfile

[Github] Bump Github Runner to v2.332.0 (#184230)

To stay ahead of the support horizon. There were no major feature
changes/bug fixes from a cursory glance at the release notes.
DeltaFile
+1-1.github/workflows/containers/github-action-ci-windows/Dockerfile
+1-1.github/workflows/containers/github-action-ci/Dockerfile
+2-22 files

LLVM/project 8decfb8mlir/lib/Conversion/ArithToEmitC ArithToEmitCPass.cpp, mlir/lib/Conversion/FuncToEmitC FuncToEmitCPass.cpp

[mlir][emitc] Do not convert illegal types to emitc (#156222)

This PR adds fallbacks for other types instead of converting unsupported
types to emitc.
DeltaFile
+8-0mlir/test/Conversion/ArithToEmitC/arith-to-emitc-failed.mlir
+6-1mlir/lib/Conversion/ArithToEmitC/ArithToEmitCPass.cpp
+6-1mlir/lib/Conversion/FuncToEmitC/FuncToEmitCPass.cpp
+6-0mlir/test/Conversion/FuncToEmitC/func-to-emitc-failed.mlir
+26-24 files

LLVM/project 2407564clang/include/clang/Basic OpenCLExtensions.def, clang/test/SemaOpenCL extension-version.cl

[Clang] Add missing extension cl_intel_split_work_group_barrier declaration (#184269)

All the OpenCL extensions must be declared in OpenCLExtensions.def,
otherwise the frontend won't recognize them and won't be able to use
them in the code. This patch adds the missing declaration for the
`cl_intel_split_work_group_barrier` extension.
DeltaFile
+12-0clang/test/SemaOpenCL/extension-version.cl
+1-0clang/include/clang/Basic/OpenCLExtensions.def
+13-02 files