LLVM/project c4e6cf0 flang/include/flang/Evaluate tools.h, flang/lib/Optimizer/Transforms/CUDA CUFAddConstructor.cpp CUFOpConversionLate.cpp

[flang][cuda] Support non-allocatable module-level managed variables (#188526)

Add support for non-allocatable module-level CUDA managed variables
using pointer indirection through a companion global in
__nv_managed_data__. The CUDA runtime populates this pointer with the
unified memory address via __cudaRegisterManagedVar and
__cudaInitModule.

1. Create a .managed.ptr companion global in the __nv_managed_data__
section and register it with _FortranACUFRegisterManagedVariable
(CUFAddConstructor.cpp)
2. Call __cudaInitModule after registration to populate the managed
pointer (registration.cpp)
3. Annotate managed globals in gpu.module with nvvm.managed for PTX
.attribute(.managed) generation (cuda-code-gen.mlir)
4. Suppress cuf.data_transfer for assignments to/from non-allocatable
module managed variables, since cudaMemcpy would target the shadow
address rather than the actual unified memory (tools.h)
5. Preserve cuf.data_transfer for device_var = managed_var assignments
where explicit transfer is still required
Delta File
+70 -14 flang/lib/Optimizer/Transforms/CUDA/CUFAddConstructor.cpp
+39 -0 flang/test/Fir/CUDA/cuda-device-address.mlir
+36 -1 flang/test/Fir/CUDA/cuda-constructor-2.f90
+31 -5 flang/include/flang/Evaluate/tools.h
+36 -0 flang/test/Lower/CUDA/cuda-data-transfer.cuf
+20 -2 flang/lib/Optimizer/Transforms/CUDA/CUFOpConversionLate.cpp
+232 -22 7 files not shown
+293 -23 13 files

LLVM/project 0e0a045 utils/bazel/llvm-project-overlay/clang BUILD.bazel

[Bazel] Port d08ebbe8eba1dd2453a00ae0a0a6bd04b0d88d51
Delta File
+3 -0 utils/bazel/llvm-project-overlay/clang/BUILD.bazel
+3 -0 1 file

LLVM/project d471902 clang/lib/CIR/CodeGen CIRGenItaniumCXXABI.cpp, clang/test/CIR/CodeGen try-catch.cpp

[CIR] Implement reference type of record ptr in initCatchParam (#185214)

Implement the reference type of record ptr in initCatchParam
Delta File
+128 -0 clang/test/CIR/CodeGen/try-catch.cpp
+20 -3 clang/lib/CIR/CodeGen/CIRGenItaniumCXXABI.cpp
+148 -3 2 files

LLVM/project 65720ad mlir/lib/Dialect/GPU/Pipelines GPUToXeVMPipeline.cpp, mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToWiDistributeExperimental.cpp

[MLIR][XeGPU] Switch to the new sg to wi pass (#188627)

This PR has changes required to switch the pipeline to use the new sg to
wi pass.
Delta File
+21 -23 mlir/lib/Dialect/XeGPU/Utils/XeGPUUtils.cpp
+39 -0 mlir/test/Dialect/XeGPU/sg-to-wi-experimental-unit.mlir
+8 -1 mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToWiDistributeExperimental.cpp
+2 -1 mlir/lib/Dialect/GPU/Pipelines/GPUToXeVMPipeline.cpp
+70 -25 4 files

LLVM/project 9e77a45 mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

[mlir][OpenMP][NFC] Refactor fillAffinityIteratorLoop (#189418)

Extract affinity-specific logic from fillAffinityIteratorLoop into a
callback so that the iterator loop codegen logic can be shared with
other clauses such as depend clause and target clause.
Delta File
+23 -16 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+23 -16 1 file
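
The refactoring pattern in the commit above can be illustrated with a small sketch. It is not the C++ from OpenMPToLLVMIRTranslation.cpp: the names `fill_iterator_loop` and `demo` and the string payloads are hypothetical, chosen only to show a shared loop skeleton that takes the clause-specific work as a callback.

```python
from typing import Callable, List, Tuple


def fill_iterator_loop(trip_count: int, body: Callable[[int], None]) -> None:
    """Shared loop skeleton: visit each iteration and delegate the work."""
    for i in range(trip_count):
        body(i)


def demo() -> Tuple[List[str], List[str]]:
    """Reuse the same skeleton for two (hypothetical) clause kinds."""
    affinity: List[str] = []
    depend: List[str] = []
    # Clause-specific logic lives in the callbacks, not in the loop itself.
    fill_iterator_loop(3, lambda i: affinity.append(f"affinity[{i}]"))
    fill_iterator_loop(2, lambda i: depend.append(f"depend[{i}]"))
    return affinity, depend
```

Keeping the loop free of clause-specific code is what would let a later depend or target lowering pass its own callback instead of duplicating the loop.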

LLVM/project 85d7927 flang/lib/Lower/OpenMP ClauseProcessor.cpp, flang/test/Lower/OpenMP depend-iterator.f90

[Flang][OpenMP] Support iterator modifier in depend clause (#189412)

Lower the iterator modifier on depend clause to omp.iterator. Iterated
depend objects emit `!omp.iterated<!llvm.ptr>` by using
`getDataOperandBaseAddr` to generate base address and
`genIteratorCoordinate` to get the addr+offset. The non-iterated objects
in depend clause still use existing lowering path.

This patch is part of feature work for #188061.

Assisted with copilot.
Delta File
+439 -0 flang/test/Lower/OpenMP/depend-iterator.f90
+63 -14 flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+0 -10 flang/test/Lower/OpenMP/Todo/depend-clause.f90
+502 -24 3 files

LLVM/project 7ff0dc4 mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

[mlir][OpenMP] Add iterator support to depend clause (#189090)

Extend the depend clause to support `!omp.iterated<Ty>` handles
alongside plain depend vars, so the IR can represent both forms.

Assisted with copilot

This is part of feature work for
https://github.com/llvm/llvm-project/issues/188061
Delta File
+107 -58 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+35 -2 mlir/test/Dialect/OpenMP/ops.mlir
+30 -0 mlir/test/Target/LLVMIR/openmp-todo.mlir
+24 -4 mlir/test/Dialect/OpenMP/invalid.mlir
+11 -5 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+7 -0 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+214 -69 3 files not shown
+223 -72 9 files

LLVM/project fce3a66 flang/lib/Semantics mod-file.cpp resolve-names.cpp, flang/test/Semantics modfile84.f90

[flang] Preserve UseErrorDetails in module files (#189423)

When the same name is USE-associated with two or more distinct ultimate
symbols, and they are not both generic procedure interfaces, it's not an
error unless the name is actually referenced in the scope. But when the
scope is itself a module or submodule, our module files don't preserve
the error for later diagnosis -- instead, the UseErrorDetails symbol
that serves as a "poison pill" in case of later use is discarded when
the module file is generated. So emit additional USE statements to the
module file so that a UseErrorDetails symbol is created anew when the
module file is read.
Delta File
+19 -0 flang/test/Semantics/modfile84.f90
+16 -3 flang/lib/Semantics/mod-file.cpp
+14 -0 flang/test/Semantics/Inputs/modfile84.f90
+1 -1 flang/lib/Semantics/resolve-names.cpp
+1 -0 flang/lib/Semantics/mod-file.h
+51 -4 5 files

LLVM/project c98e438 utils/bazel/llvm-project-overlay/mlir/test/mlir-tblgen BUILD.bazel

[Bazel] Port b544ad57039588d0fe24a1f512202cc5c0bd3a67
Delta File
+1 -0 utils/bazel/llvm-project-overlay/mlir/test/mlir-tblgen/BUILD.bazel
+1 -0 1 file

LLVM/project 728de26 llvm/test/CodeGen/AArch64 neon-abd.ll

[NFC][AArch64] neon-abd.ll - remove unnecessary entry labels to reduce diff size in #186659 (#189690)
Delta File
+8 -16 llvm/test/CodeGen/AArch64/neon-abd.ll
+8 -16 1 file

LLVM/project 6477f3a mlir/lib/Conversion/ArithToSPIRV ArithToSPIRV.cpp, mlir/test/Conversion/ArithToSPIRV arith-to-spirv.mlir

[mlir][ArithToSPIRV] Fix invalid SPIRV and crashes when lowering integer ops on i1 (#189239)

Several arith integer operations on i1 / vector<Ni1> types were either
crashing or producing invalid SPIRV. The i1 type maps to spirv.bool in
SPIRV, not to a SPIRV integer — so standard integer SPIRV ops
(spirv.IAdd, spirv.UDiv, spirv.GLSMax, etc.) are illegal on it.
Add dedicated boolean patterns for all affected arith integer ops, each
with benefit=2 to take priority over the generic elementwise patterns.
The semantics for i1 follow from treating true = 1 / false = 0 with
two's complement wrapping:
   
- addi, subi → spirv.LogicalNotEqual (XOR on bits)
- muli, divui, divsi → spirv.LogicalAnd
- remui, remsi, shli, shrui → spirv.LogicalAnd(a, spirv.LogicalNot(b)) (a & ~b)
- shrsi → identity (arithmetic right shift of a 1-bit signed value is always the input)
- maxui, minsi → spirv.LogicalOr (true is the larger value when unsigned and, as -1, the smaller value when signed)

    [4 lines not shown]
Delta File
+111 -0 mlir/lib/Conversion/ArithToSPIRV/ArithToSPIRV.cpp
+79 -0 mlir/test/Conversion/ArithToSPIRV/arith-to-spirv.mlir
+190 -0 2 files
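
The i1 mappings listed in the commit above can be sanity-checked by brute force. The sketch below is independent of MLIR: it models i1 in plain Python integers, and the helper names `u1`, `s1`, and `check_i1_identities` are hypothetical. It assumes that division or remainder by zero and shifts by an amount equal to or larger than the bit width have no defined result, so it skips them (except for shrsi, where the identity holds for both shift amounts in this model).

```python
from itertools import product


def u1(x: int) -> int:
    """Wrap an integer to an unsigned 1-bit value (0 or 1)."""
    return x & 1


def s1(x: int) -> int:
    """Interpret the low bit as a signed 1-bit value (0 or -1)."""
    return -(x & 1)


def check_i1_identities() -> int:
    """Assert each mapping on every defined input; return the check count."""
    checks = 0
    for a, b in product((False, True), repeat=2):
        x, y = int(a), int(b)
        cases = [
            (u1(x + y), a != b),              # addi  -> LogicalNotEqual
            (u1(x - y), a != b),              # subi  -> LogicalNotEqual
            (u1(x * y), a and b),             # muli  -> LogicalAnd
            (max(x, y), a or b),              # maxui -> LogicalOr
            (u1(min(s1(x), s1(y))), a or b),  # minsi -> LogicalOr
            (u1(s1(x) >> y), a),              # shrsi -> identity
        ]
        if b:  # divisor is non-zero
            cases += [
                (u1(x // y), a and b),                # divui -> LogicalAnd
                (u1(s1(x) // s1(y)), a and b),        # divsi, wrapped to 1 bit
                (u1(x % y), a and not b),             # remui -> a & ~b
                (u1(s1(x) % s1(y)), a and not b),     # remsi -> a & ~b
            ]
        else:  # shift amount is below the bit width
            cases += [
                (u1(x << y), a and not b),  # shli  -> a & ~b
                (u1(x >> y), a and not b),  # shrui -> a & ~b
            ]
        for got, want in cases:
            assert bool(got) == want
            checks += 1
    return checks
```

Calling `check_i1_identities()` raises `AssertionError` if any listed mapping disagrees with 1-bit wrapping arithmetic on the inputs it covers.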

LLVM/project 4087c5f lldb/examples/python formatter_bytecode.py

[lldb][bytecode] Add append mode for compiler output (#189693)
Delta File
+9 -1 lldb/examples/python/formatter_bytecode.py
+9 -1 1 file

LLVM/project 5f23024 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Address comment

Created using spr 1.3.7
Delta File
+1 -1 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+1 -1 1 file

LLVM/project 720b8ea llvm CMakeLists.txt, llvm/runtimes CMakeLists.txt

[openmp] Add support for Arm64X to libomp (#176157)

This patch allows building libomp.dll and libomp.lib as Arm64X binaries
containing both arm64 and arm64ec code and useable from applications
compiled for both architectures.
Delta File
+98 -0 openmp/runtime/cmake/arm64x.cmake
+16 -0 llvm/CMakeLists.txt
+10 -0 openmp/runtime/CMakeLists.txt
+6 -0 llvm/runtimes/CMakeLists.txt
+130 -0 4 files

LLVM/project 007b49c llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/test/MC/AArch64 armv9a-tlbip.s

fixup! More PR cleanups following comments
Delta File
+7 -11 llvm/lib/Target/AArch64/AArch64SystemOperands.td
+0 -4 llvm/test/MC/AArch64/armv9a-tlbip.s
+7 -15 2 files

LLVM/project 0773346 llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp, llvm/lib/Target/AArch64/Utils AArch64BaseInfo.h

fixup! More simplification
Delta File
+413 -443 llvm/test/MC/AArch64/armv9a-tlbip.s
+1 -15 llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+7 -9 llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+421 -467 3 files

LLVM/project 56d4c24 llvm/test/MC/AArch64 armv9a-tlbip.s armv9a-tlbip-tlbid.s

fixup! Split armv9a-tlbip.s into two test files
Delta File
+0 -823 llvm/test/MC/AArch64/armv9a-tlbip.s
+413 -0 llvm/test/MC/AArch64/armv9a-tlbip-tlbid.s
+351 -0 llvm/test/MC/AArch64/armv9a-tlbip-d128.s
+764 -823 3 files

LLVM/project c0b77b8 llvm/test/MC/AArch64 armv9a-tlbip.s

fixup! Optimise RUN lines in armv9a-tlbip.s
Delta File
+58 -114 llvm/test/MC/AArch64/armv9a-tlbip.s
+58 -114 1 file

LLVM/project 04a3b0f llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/Utils AArch64BaseInfo.h

fixup! Fix commits after rebase to main
Delta File
+19 -29 llvm/lib/Target/AArch64/AArch64SystemOperands.td
+5 -6 llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+6 -0 llvm/test/MC/AArch64/armv9a-tlbip.s
+30 -35 3 files

LLVM/project a572e60 llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Simplify logic after suggestions from Marian
Delta File
+13 -10 llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+13 -10 1 file

LLVM/project ce277ba llvm/test/MC/AArch64 tlbip-tlbid-or-d128.s armv9a-tlbip.s

fixup! Fix using Marian's suggestion
Delta File
+0 -259 llvm/test/MC/AArch64/tlbip-tlbid-or-d128.s
+160 -0 llvm/test/MC/AArch64/armv9a-tlbip.s
+160 -259 2 files

LLVM/project 916b509 llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! More optimisations
Delta File
+10 -11 llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+7 -6 llvm/lib/Target/AArch64/AArch64SystemOperands.td
+1 -8 llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+18 -25 3 files

LLVM/project 41d452d llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

fixup! Don't use ExtraRequires. Instead, set a boolean in TLBITableBase
Delta File
+27 -22 llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+17 -1 llvm/lib/Target/AArch64/AArch64SystemOperands.td
+7 -7 llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+51 -30 3 files

LLVM/project 2b9ebc2 llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp, llvm/lib/Target/AArch64/Utils AArch64BaseInfo.h

[AArch64][llvm] Gate some `tlbip` insns with +tlbid or +d128

Change the gating of `tlbip` instructions containing `*E1IS*`, `*E1OS*`,
`*E2IS*` or `*E2OS*` to be used with `+tlbid` or `+d128`. This is because
the 2025 Armv9.7-A MemSys specification says:

```
  All TLBIP *E1IS*, TLBIP*E1OS*, TLBIP*E2IS* and TLBIP*E2OS* instructions
  that are currently dependent on FEAT_D128 are updated to be dependent
  on FEAT_D128 or FEAT_TLBID
```
Delta File
+259 -0 llvm/test/MC/AArch64/tlbip-tlbid-or-d128.s
+66 -66 llvm/test/MC/AArch64/armv9a-tlbip.s
+15 -5 llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+20 -0 llvm/lib/Target/AArch64/Utils/AArch64BaseInfo.h
+360 -71 4 files
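
The gating rule quoted in the commit above can be modelled as a small predicate. This is a sketch of the rule as stated, not the logic in AArch64AsmParser.cpp: the function `tlbip_allowed`, the feature strings, and the operand names in the usage note are illustrative assumptions. It also assumes the remaining `tlbip` operands stay gated on `+d128` alone, as the "currently dependent on FEAT_D128" wording suggests.

```python
from typing import AbstractSet

# Substrings named in the commit message as selecting the relaxed gating.
RELAXED_MARKERS = ("E1IS", "E1OS", "E2IS", "E2OS")


def tlbip_allowed(operand: str, features: AbstractSet[str]) -> bool:
    """Return True if a tlbip operand is accepted under the given features."""
    if any(marker in operand.upper() for marker in RELAXED_MARKERS):
        # Relaxed gating: either feature is sufficient.
        return "d128" in features or "tlbid" in features
    # Assumption: every other tlbip operand still requires d128.
    return "d128" in features
```

Under these assumptions, `tlbip_allowed("VAE1IS", {"tlbid"})` is accepted while `tlbip_allowed("VAE1", {"tlbid"})` is rejected, because `VAE1` contains none of the four markers.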

LLVM/project 1f7a67f llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 fma-conversion-multi-use-guard.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta File
+6 -14 llvm/test/Transforms/SLPVectorizer/AArch64/fma-conversion-multi-use-guard.ll
+16 -0 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+22 -14 2 files

LLVM/project b66d98a lldb/examples/python formatter_bytecode.py, lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode RigidArrayLLDBFormatterSwift.txt

[lldb][bytecode] Improvements to compiler generated Swift (#189425)

Following feedback from @benrimmington in
https://github.com/apple/swift-collections/pull/607, this changes the
following:

1. Uses `objectFormat()` compiler conditional instead of `os()` (see
"Cross-platform object file format support" in
[SE-0492](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0492-section-control.md#cross-platform-object-file-format-support))
2. Uses a raw identifier for the generated Swift symbol name, instead of
an escaped name (see
[SE-0451](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0451-escaped-identifiers.md))
Delta File
+4 -8 lldb/examples/python/formatter_bytecode.py
+2 -2 lldb/test/Shell/ScriptInterpreter/Python/Inputs/FormatterBytecode/RigidArrayLLDBFormatterSwift.txt
+6 -10 2 files

LLVM/project 67c3429 mlir/docs Interfaces.md, mlir/docs/Tools mlir-reduce.md

[mlir][docs] dialect interfaces and mlir reduce documentation fix (#189258)

Two modifications:

1. Reflect newly added dialect interface methods in the documentation
2. Remove the bug in the `MLIR Reduce` documentation
Delta File
+24 -2 mlir/docs/Interfaces.md
+3 -0 mlir/docs/Tools/mlir-reduce.md
+27 -2 2 files

LLVM/project e891812 llvm/lib/CodeGen ExpandVectorPredication.cpp, llvm/lib/Target/RISCV RISCVISelLowering.cpp

[RISCV] Remove codegen for vp_minimum, vp_maximum (#189550)

Part of the work to remove trivial VP intrinsics from the RISC-V
backend, see
https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999

This splits off two intrinsics from #179622.
Delta File
+462 -654 llvm/test/CodeGen/RISCV/rvv/fminimum-vp.ll
+462 -654 llvm/test/CodeGen/RISCV/rvv/fmaximum-vp.ll
+221 -269 llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fminimum-vp.ll
+221 -269 llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fmaximum-vp.ll
+3 -22 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+3 -1 llvm/lib/CodeGen/ExpandVectorPredication.cpp
+1,372 -1,869 1 file not shown
+1,372 -1,871 7 files

LLVM/project 38c0f53 llvm/test/Transforms/SLPVectorizer/AArch64 fma-conversion-multi-use-guard.ll

[SLP][NFC] Add a test for incorrect fma-conversion for fmuls with multi uses
Delta File
+153 -0 llvm/test/Transforms/SLPVectorizer/AArch64/fma-conversion-multi-use-guard.ll
+153 -0 1 file

LLVM/project ff4e229 llvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize/RISCV riscv-vector-reverse.ll tail-folding-reverse-load-store.ll

Revert "[VPlan] Extract reverse mask from reverse accesses" (#189637)

Reverts llvm/llvm-project#155579

Assertion added triggers on some buildbots
clang:
/home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/llvm/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:3840:
virtual InstructionCost
llvm::VPWidenMemoryRecipe::computeCost(ElementCount, VPCostContext &)
const: Assertion `!IsReverse() && "Inconsecutive memory access should
not have reverse order"' failed.
PLEASE submit a bug report to
https://github.com/llvm/llvm-project/issues/ and include the crash
backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments:
/home/tcwg-buildbot/worker/clang-aarch64-sve2-vla/stage1.install/bin/clang
-DNDEBUG -mcpu=neoverse-v2 -mllvm -scalable-vectorization=preferred -O3
-std=gnu17 -fcommon -Wno-error=incompatible-pointer-types -MD -MT

    [3 lines not shown]
Delta File
+42 -50 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+30 -34 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+27 -12 llvm/lib/Transforms/Vectorize/VPlan.h
+21 -18 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+8 -8 llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+8 -6 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reverse-load-store.ll
+136 -128 8 files not shown
+160 -149 14 files