LLVM/project 24b8613llvm/lib/Target/M68k M68kCollapseMOVEMPass.cpp, llvm/test/CodeGen/M68k CollapseMOVEM.mir

[M68k] Fix MOVEM collapse pass for 2 instances of same register (#174349)

Add test case for MOVEM collapse opt pass failure and fix pass handling
of 2 appearances of the same register in a MOVEM block.
DeltaFile
+13-0llvm/test/CodeGen/M68k/CollapseMOVEM.mir
+5-0llvm/lib/Target/M68k/M68kCollapseMOVEMPass.cpp
+18-02 files

LLVM/project e81befaclang/cmake/caches VectorEngine.cmake

Remove abuse of OPENMP_STANDALONE_BUILD
DeltaFile
+0-8clang/cmake/caches/VectorEngine.cmake
+0-81 files

LLVM/project 48acfa9openmp/runtime/unittests CMakeLists.txt

Remove standalone gtest handling
DeltaFile
+1-20openmp/runtime/unittests/CMakeLists.txt
+1-201 files

LLVM/project 5bcfe6ecompiler-rt/lib/sanitizer_common/tests sanitizer_procmaps_mac_test.cpp

[Sanitizers] Remove unused variable (#177061)

Must've remained from debugging the test case.

rdar://119958411

Co-authored-by: Mariusz Borsa <m_borsa at apple.com>
DeltaFile
+0-1compiler-rt/lib/sanitizer_common/tests/sanitizer_procmaps_mac_test.cpp
+0-11 files

LLVM/project 44e71feopenmp/cmake OpenMPTesting.cmake

LLVM_RUNTIME_OUTPUT_INTDIR -> LLVM_TOOLS_BINARY_DIR
DeltaFile
+2-2openmp/cmake/OpenMPTesting.cmake
+2-21 files

LLVM/project 4699334llvm/utils/TableGen/Common CodeGenRegisters.cpp

[TableGen] Prefer base class on tied RC sizes

When searching for a matching subclass, TableGen's behavior is
non-deterministic if several classes have the same size.
Break the tie by choosing the class with the smaller BaseClassOrder.
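
The tie-break described above can be sketched roughly as follows. This is an illustration only, not the code in CodeGenRegisters.cpp: `RegClassInfo` and `pickPreferred` are hypothetical names, and the rule that larger classes win before the tie-break applies is an assumption made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a register class; not TableGen's real type.
struct RegClassInfo {
  std::string Name;
  unsigned Size;            // number of registers in the class
  unsigned BaseClassOrder;  // lower value = preferred base class
};

// Returns the preferred candidate, or nullptr if there are none.
// Assumption for this sketch: larger classes win; equal sizes fall back to
// the smaller BaseClassOrder, so the result does not depend on input order.
const RegClassInfo *pickPreferred(const std::vector<RegClassInfo> &Candidates) {
  const RegClassInfo *Best = nullptr;
  for (const RegClassInfo &RC : Candidates) {
    if (!Best || RC.Size > Best->Size ||
        (RC.Size == Best->Size && RC.BaseClassOrder < Best->BaseClassOrder))
      Best = &RC;
  }
  return Best;
}
```

Without the second comparison, two classes of equal size would be chosen according to whatever order the container happened to hold them in, which is the kind of instability the commit message describes.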
DeltaFile
+7-2llvm/utils/TableGen/Common/CodeGenRegisters.cpp
+7-21 files

LLVM/project 688a884clang/unittests/ASTMatchers ASTMatchersNarrowingTest.cpp ASTMatchersNodeTest.cpp

[clang][test] Specify value of `-fopenmp=libomp` for tests. (#177239)

`libomp` is the default value when unconfigured in cmake, but llvm can
be configured to have `libgomp` be the default instead. Explicitly
specify this value so the test does not fail when it assumes libomp is
always the default.

Fix for f369d23ceaa49ffa9e6ef9673851749d66b35b3f (#150580)
DeltaFile
+8-4clang/unittests/ASTMatchers/ASTMatchersNarrowingTest.cpp
+4-2clang/unittests/ASTMatchers/ASTMatchersNodeTest.cpp
+12-62 files

LLVM/project 084916aclang/docs ReleaseNotes.rst, clang/include/clang/Basic DiagnosticGroups.td LangOptions.def

[LifetimeSafety] Remove "experimental-" prefix from flags and diagnostics (#176821)

Remove the "experimental-" prefix from lifetime safety diagnostic groups
and command-line options. This enables the analysis in `-Wall`.

We are now in a pretty stable state with no crashes. This change
indicates that lifetime safety analysis is no longer considered
experimental and is now a stable feature. By removing the
"experimental-" prefix, we're signaling to users that this functionality
is ready for use.

- Renamed diagnostic groups from `experimental-lifetime-safety*` to
`lifetime-safety*`
- Updated command-line options from `-fexperimental-lifetime-safety*` to
`-flifetime-safety*` and this is now ON by default.
- Added a check to only enable lifetime safety analysis when relevant
diagnostics are enabled
- Updated test files to use the new flag names
DeltaFile
+27-0clang/docs/ReleaseNotes.rst
+11-1clang/lib/Sema/AnalysisBasedWarnings.cpp
+6-6clang/include/clang/Basic/DiagnosticGroups.td
+3-3clang/include/clang/Options/Options.td
+3-3clang/include/clang/Basic/LangOptions.def
+2-2clang/test/Sema/warn-lifetime-safety.cpp
+52-155 files not shown
+58-2211 files

LLVM/project 5e4f8d7clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGenBuiltins/X86 avx512f-builtins.c avx512vl-builtins.c

[CIR][X86] Add support for shuff32x4/shufi32x4 builtins (#172960)

This implementation is adapted from the existing code for
`X86::BI__builtin_ia32_shuf_i*` and `X86::BI__builtin_ia32_shuf_f*` from
`/llvm-project/clang/lib/CodeGen/TargetBuiltins/X86.cpp`.

It adds support for the following X86 builtins:
- __builtin_ia32_shuf_f32x4
- __builtin_ia32_shuf_f64x2
- __builtin_ia32_shuf_i32x4
- __builtin_ia32_shuf_i64x2
- __builtin_ia32_shuf_f32x4_256
- __builtin_ia32_shuf_f64x2_256
- __builtin_ia32_shuf_i32x4_256
- __builtin_ia32_shuf_i64x2_256
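
For orientation, the 512-bit `shuf_i32x4` form can be modelled in scalar code as below. This is a reference sketch of the lane selection as commonly documented for the corresponding AVX-512 shuffle, not the CIR lowering from this commit; `V16` and `shufI32x4Model` are hypothetical names.

```cpp
#include <array>
#include <cstdint>

using V16 = std::array<std::int32_t, 16>;  // sixteen 32-bit elements = 512 bits

// Scalar model: the result has four 128-bit lanes (4 elements each).
// Lanes 0 and 1 are picked from A, lanes 2 and 3 from B; each pick is
// controlled by a 2-bit field of the immediate.
V16 shufI32x4Model(const V16 &A, const V16 &B, unsigned Imm) {
  V16 Out{};
  for (unsigned Lane = 0; Lane < 4; ++Lane) {
    const V16 &Src = Lane < 2 ? A : B;
    unsigned Sel = (Imm >> (2 * Lane)) & 3u;  // which source lane to copy
    for (unsigned E = 0; E < 4; ++E)
      Out[Lane * 4 + E] = Src[Sel * 4 + E];
  }
  return Out;
}
```

The `f32x4` form follows the same selection with float elements; the `_256` variants work on two lanes and are not modelled here.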

Part of https://github.com/llvm/llvm-project/issues/167765
DeltaFile
+53-0clang/test/CIR/CodeGenBuiltins/X86/avx512f-builtins.c
+52-0clang/test/CIR/CodeGenBuiltins/X86/avx512vl-builtins.c
+28-1clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+133-13 files

LLVM/project 1faa0dallvm/utils/TableGen/Common CodeGenRegisters.cpp

[TableGen] Prefer base class on tied RC sizes

When searching for a matching subclass, TableGen's behavior is
non-deterministic if several classes have the same size.
Break the tie by choosing the class with the smaller BaseClassOrder.
DeltaFile
+7-6llvm/utils/TableGen/Common/CodeGenRegisters.cpp
+7-61 files

LLVM/project c3c22c9lldb/test/API/functionalities/ubsan/basic TestUbsanBasic.py

[lldb] Skip TestUbsanBasic (#177263)

The fix for this is being discussed in
https://github.com/llvm/llvm-project/issues/177064. Skip the test for
now to get the bots green.
DeltaFile
+1-0lldb/test/API/functionalities/ubsan/basic/TestUbsanBasic.py
+1-01 files

LLVM/project e2b0cf5openmp CMakeLists.txt

Literal string
DeltaFile
+2-2openmp/CMakeLists.txt
+2-21 files

LLVM/project 643b31aclang/docs ReleaseNotes.rst, clang/lib/Sema SemaStmt.cpp

[clang] Fix lifetime extension of temporaries in for-range-initializers in templates (#177191)

Fixes https://github.com/llvm/llvm-project/issues/165182.

This patch fixes the lifetime extension of temporaries in
for-range-initializers in templates. The issue occurred when the
for-range statement was in a dependent context but was not itself
type- or value-dependent.
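
A minimal sketch of the kind of code affected, assuming the rule in question is the C++23 extension of temporaries in the for-range-initializer ([class.temporary]/6). `Probe`, `StaticRange`, `makeRange`, and `probesAliveDuringLoop` are hypothetical names, and this is not the test added by the patch. The range iterates over static storage, so the code is well-defined whether or not the temporary is extended.

```cpp
// Hypothetical probe type: counts how many instances are currently alive.
struct Probe {
  static int alive;
  Probe() { ++alive; }
  Probe(const Probe &) { ++alive; }
  ~Probe() { --alive; }
};
int Probe::alive = 0;

static const int kData[3] = {1, 2, 3};

// A range over static storage, so iterating it is valid whether or not the
// Probe temporary is still alive.
struct StaticRange {
  const int *begin() const { return kData; }
  const int *end() const { return kData + 3; }
};

StaticRange makeRange(const Probe &) { return StaticRange{}; }

// The loop sits in a template (a dependent context), but the range
// expression itself depends on no template parameter.
template <typename T>
int probesAliveDuringLoop() {
  int seen = 0;
  for (int v : makeRange(Probe{})) {
    (void)v;
    seen += Probe::alive;  // 1 per iteration if the temporary was extended
  }
  return seen;
}
```

Calling `probesAliveDuringLoop<int>()` would be expected to return 3 where the temporary is kept alive for the whole loop, and 0 where it is destroyed before the first iteration.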

---------

Signed-off-by: Wang, Yihan <yronglin777 at gmail.com>
DeltaFile
+34-0clang/test/CXX/special/class.temporary/p6.cpp
+7-8clang/lib/Sema/SemaStmt.cpp
+2-0clang/docs/ReleaseNotes.rst
+43-83 files

LLVM/project a6e332ellvm/test/CodeGen/RISCV clmul.ll, llvm/test/CodeGen/X86 clmul-vector.ll clmul.ll

Merge branch 'main' into users/meinersbur/openmp_remove-standalone-build
DeltaFile
+12,546-0llvm/test/CodeGen/RISCV/clmul.ll
+6,126-5,116llvm/test/CodeGen/X86/clmul-vector.ll
+4,065-1,302llvm/test/MC/AMDGPU/gfx11_asm_vop3_dpp8.s
+3,137-2,053mlir/utils/vscode/package-lock.json
+2,373-2,733llvm/test/CodeGen/X86/clmul.ll
+0-4,569llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp8.txt
+28,247-15,7732,647 files not shown
+152,524-79,5522,653 files

LLVM/project d80ba5ellvm/lib/Target/AArch64 AArch64SchedNeoverseN2.td, llvm/test/tools/llvm-mca/AArch64/Neoverse N2-forwarding.s N2-basic-instructions.s

[AArch64] Model Neoverse N2 late forwarding (#176331)

This patch models late forwarding for N2 as per the [N2
SWOG](https://developer.arm.com/documentation/109914/latest/).
DeltaFile
+1,961-0llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-forwarding.s
+174-54llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td
+81-81llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-basic-instructions.s
+18-18llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-sve-instructions.s
+2,234-1534 files

LLVM/project 2692f5ellvm/lib/Target/AMDGPU GCNSubtarget.h AMDGPU.td

[NFCI][AMDGPU] Convert more `SubtargetFeatures` to use `AMDGPUSubtargetFeature` and X-macros (#177256)

Extend the X-macro pattern to eliminate boilerplate for additional
subtarget features.

This removes ~50 lines of repetitive member declarations and getter
definitions.
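
The X-macro idea can be sketched as follows. The feature names, the macro names, and `SketchSubtarget` are placeholders made up for illustration; they are not the actual AMDGPU feature list or the macros used in GCNSubtarget.h.

```cpp
// Hypothetical feature list; the names are placeholders, not real features.
#define SKETCH_SUBTARGET_FEATURES(X)                                           \
  X(FastFMA)                                                                   \
  X(WideLoads)                                                                 \
  X(ScalarCache)

struct SketchSubtarget {
  // Expands to one bool member per feature, e.g. `bool HasFastFMA = false;`.
#define DECLARE_MEMBER(Name) bool Has##Name = false;
  SKETCH_SUBTARGET_FEATURES(DECLARE_MEMBER)
#undef DECLARE_MEMBER

  // Expands to one getter per feature, e.g. `bool hasFastFMA() const`.
#define DECLARE_GETTER(Name)                                                   \
  bool has##Name() const { return Has##Name; }
  SKETCH_SUBTARGET_FEATURES(DECLARE_GETTER)
#undef DECLARE_GETTER
};
```

Adding a feature then means adding one line to the list rather than a member declaration plus a getter.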
DeltaFile
+123-140llvm/lib/Target/AMDGPU/GCNSubtarget.h
+83-146llvm/lib/Target/AMDGPU/AMDGPU.td
+3-3llvm/lib/Target/AMDGPU/R600Subtarget.h
+2-2llvm/lib/Target/AMDGPU/AMDGPUFeatures.td
+1-1llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+212-2925 files

LLVM/project 8eac375clang/test/CodeGenObjC arc-foreach.m arc-unsafeclaim.m, clang/test/CodeGenObjCXX auto-release-result-assert.mm

Revert "[CGObjC] Allow clang.arc.attachedcall on -O0 (#164875)"

This reverts commit 5c29b64fda6a5a66e09378eec9f28a42066a7c6a.

This was causing failures at HEAD on x86-64 Linux.
DeltaFile
+0-231llvm/test/CodeGen/AArch64/call-rv-marker.ll
+89-89clang/test/CodeGenObjC/arc-foreach.m
+5-45clang/test/CodeGenObjC/arc-unsafeclaim.m
+16-16clang/test/CodeGenObjC/os_log.m
+1-22clang/test/CodeGenObjC/arc-arm.m
+6-12clang/test/CodeGenObjCXX/auto-release-result-assert.mm
+117-41510 files not shown
+155-46216 files

LLVM/project b887b52llvm/lib/Transforms/Instrumentation MemorySanitizer.cpp, llvm/test/Instrumentation/MemorySanitizer/AArch64 arm64-vcvt.ll arm64-vcvt_n.ll

[msan] Handle aarch64_neon_vcvt* (#177243)

This fills in missing gaps in MSan's AArch64 NEON vector conversion
intrinsic handling (intrinsics named aarch64_neon_vcvt* instead of
aarch64_neon_fcvt*). SVE support sold separately.

It also generalizes handleNEONVectorConvertIntrinsic to handle
conversions to/from fixed-point.
DeltaFile
+39-101llvm/test/Instrumentation/MemorySanitizer/AArch64/arm64-vcvt.ll
+19-58llvm/test/Instrumentation/MemorySanitizer/AArch64/arm64-vcvt_n.ll
+13-38llvm/test/Instrumentation/MemorySanitizer/AArch64/arm64-vcvt_f32_su32.ll
+34-6llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp
+105-2034 files

LLVM/project 5c29b64clang/test/CodeGenObjC arc-foreach.m arc-unsafeclaim.m, clang/test/CodeGenObjCXX auto-release-result-assert.mm

[CGObjC] Allow clang.arc.attachedcall on -O0 (#164875)

It is supported in GlobalISel there. On X86, we always kick to
SelectionDAG anyway, so there is no point in not doing it for X86 too.

I do not have merge permissions.
DeltaFile
+231-0llvm/test/CodeGen/AArch64/call-rv-marker.ll
+89-89clang/test/CodeGenObjC/arc-foreach.m
+45-5clang/test/CodeGenObjC/arc-unsafeclaim.m
+16-16clang/test/CodeGenObjC/os_log.m
+22-1clang/test/CodeGenObjC/arc-arm.m
+12-6clang/test/CodeGenObjCXX/auto-release-result-assert.mm
+415-11710 files not shown
+462-15516 files

LLVM/project 3beb520llvm/lib/Transforms/Vectorize VPlanUtils.cpp

[VPlan] Support VPWidenPointerInduction in getSCEVExprForVPValue (NFCI)

Support VPWidenPointerInductionRecipe in getSCEVExprForVPValue.

This is used in code paths when computing SCEV expressions in the
VPlan-based cost model, which should produce costs matching the legacy
cost model.
DeltaFile
+12-0llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+12-01 files

LLVM/project c65b032llvm/lib/Target/AMDGPU GCNSubtarget.h AMDGPU.td

[NFCI][AMDGPU] Convert more `SubtargetFeatures` to use `AMDGPUSubtargetFeature` and X-macros

Extend the X-macro pattern to eliminate boilerplate for additional subtarget features.

This removes ~50 lines of repetitive member declarations and getter definitions.
DeltaFile
+123-140llvm/lib/Target/AMDGPU/GCNSubtarget.h
+83-146llvm/lib/Target/AMDGPU/AMDGPU.td
+3-3llvm/lib/Target/AMDGPU/R600Subtarget.h
+2-2llvm/lib/Target/AMDGPU/AMDGPUFeatures.td
+1-1llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+212-2925 files

LLVM/project 83b13e6llvm/lib/Transforms/Vectorize VPlanUtils.cpp

[VPlan] Update formatting in getSCEVExprForVPValue (NFC).

Reformat TypeSwitch in getSCEVExprForVPValue, to reduce diff in
follow-up changes.
DeltaFile
+46-42llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+46-421 files

LLVM/project 06709f9mlir/lib/Dialect/MemRef/Transforms FoldMemRefAliasOps.cpp, mlir/test/Dialect/MemRef fold-memref-alias-ops.mlir

[mlir][MemRef] Make fold-memref-alias-ops use memref interfaces

This replaces the large switch-cases and operation-specific patterns
in FoldMemRefAliasOps with patterns that use the new
IndexedAccessOpInterface and IndexedMemCopyOpInterface, which will
allow us to remove the memref transforms' dependency on the NVGPU
dialect.

This also resolves some bugs and potential unsoundness:
1. We will no longer fold in expand_shape into vector.load or
vector.transfer_read in cases where that would alter the strides
between dimensions in multi-dimensional loads. For example, if we have
a `vector.load %e[%i, %j, %k] : memref<8x8x9xf32>, vector<2x3xf32>`
where %e is
`expand_shape %m [[0], [1], [2, 3]] : memref<8x8x3x3xf32> to 8x8x9xf32`,
we will no longer fold in that shape, since that would change which
value would be read (the previous patterns tried to account for this
but failed).
2. Subviews that have non-unit strides in positions that aren't being

    [15 lines not shown]
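
The stride problem in item 1 can be illustrated with plain index arithmetic. This sketch assumes contiguous row-major layouts and that a 2-D vector read covers the two innermost memref dimensions; the function names are hypothetical and nothing here is MLIR API.

```cpp
#include <array>

// Linear element offsets touched by a 2x3 vector read whose rows are
// RowStride elements apart, starting at linear offset Base.
std::array<int, 6> offsets2x3(int Base, int RowStride) {
  std::array<int, 6> Out{};
  for (int R = 0; R < 2; ++R)
    for (int C = 0; C < 3; ++C)
      Out[R * 3 + C] = Base + R * RowStride + C;
  return Out;
}

// Read at [I, J, K] of the 8x8x9 view: the vector's rows step along
// dimension 1, whose stride is 9.
std::array<int, 6> readVia8x8x9(int I, int J, int K) {
  return offsets2x3(I * 72 + J * 9 + K, 9);
}

// Same starting element addressed as [I, J, K / 3, K % 3] in the 8x8x3x3
// view: the vector's rows now step along dimension 2, whose stride is 3.
std::array<int, 6> readVia8x8x3x3(int I, int J, int K) {
  return offsets2x3(I * 72 + J * 9 + (K / 3) * 3 + (K % 3), 3);
}
```

Both reads start at the same element, but the second row begins 9 elements later in one view and 3 elements later in the other, so folding the reshape into the load would read different values.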
DeltaFile
+401-419mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
+292-1mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
+693-4202 files

LLVM/project e1d76e7mlir/include/mlir/Interfaces VectorInterfaces.td VectorInterfaces.h

[mlir] Add [may]updateStartingPosition to VectorTransferOpInterface

This commit adds methods to VectorTransferOpInterface that allow
transfer operations to be queried for whether their base memref (or
tensor) and permutation map can be updated in some particular way and
then for performing this update. This is part of a series of changes
designed to make passes like fold-memref-alias-ops more generic,
allowing downstream operations, like IREE's transfer_gather, to
participate in them without needing to duplicate patterns.
DeltaFile
+67-1mlir/include/mlir/Interfaces/VectorInterfaces.td
+1-0mlir/include/mlir/Interfaces/VectorInterfaces.h
+68-12 files

LLVM/project 135f4d1mlir/include/mlir/Dialect/MemRef/IR MemRefOps.td, mlir/include/mlir/Dialect/NVGPU/IR NVGPUOps.td

[mlir] Implement indexed access op interfaces for memref, vector, gpu, nvgpu

This commit implements the IndexedAccessOpInterface and
IndexedMemCopyOpInterface for all operations in the memref and vector
dialects that it would appear to apply to. It follows the code in
FoldMemRefAliasOps and ExtractAddressComputations to define the
interface implementations. This commit also adds the interface to the
GPU subgroup MMA load and store operations and to any NVGPU operations
currently being handled by the in-memref transformations (there may be
more suitable operations in the NVGPU dialect, but I haven't gone
looking systematically).

This code will be tested by a later commit that updates
fold-memref-alias-ops.

Assisted-by: Claude Code, Cursor (interface boilerplate, sketching out
implementations)
DeltaFile
+162-0mlir/lib/Dialect/Vector/Transforms/IndexedAccessOpInterfaceImpl.cpp
+66-64mlir/include/mlir/Dialect/NVGPU/IR/NVGPUOps.td
+115-0mlir/lib/Dialect/GPU/Transforms/IndexedAccessOpInterfaceImpl.cpp
+81-18mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
+90-0mlir/lib/Dialect/NVGPU/Transforms/MemoryAccessOpInterfacesImpl.cpp
+36-8mlir/include/mlir/Dialect/MemRef/IR/MemRefOps.td
+550-9015 files not shown
+678-9621 files

LLVM/project 514e573mlir/include/mlir/Dialect/MemRef/IR MemoryAccessOpInterfaces.td MemoryAccessOpInterfaces.h, mlir/lib/Dialect/MemRef/IR MemoryAccessOpInterfaces.cpp CMakeLists.txt

[mlir][memref] Define interfaces for ops that access memrefs at an index

This commit defines interfaces for operations that perform certain
kinds of indexed access on a memref. These interfaces are defined so
that passes like fold-memref-alias-ops and the memref flattener can be
made generic over operations that, informally, have the forms
`op ... %m[%i0, %i1, ...] ...` (an IndexedAccessOpInterface) or the
form `op %src[%s0, %s1, ...], %dst[%d0, %d1, ...] size ...` (an
IndexedMemCopyOpInterface).

These interfaces have been designed such that all the passes under
MemRef/Transforms that currently have a big switch-case on
memref.load, vector.load, nvgpu.ldmatrix, etc. can be migrated to use
them.

(This'll also let us get rid of the awkward fact that we have memref
transforms depending on the GPU and NVGPU dialects.)

While the interface doesn't currently contemplate changing element

    [6 lines not shown]
DeltaFile
+200-0mlir/include/mlir/Dialect/MemRef/IR/MemoryAccessOpInterfaces.td
+64-0mlir/lib/Dialect/MemRef/IR/MemoryAccessOpInterfaces.cpp
+32-0mlir/include/mlir/Dialect/MemRef/IR/MemoryAccessOpInterfaces.h
+2-0mlir/lib/Dialect/MemRef/IR/CMakeLists.txt
+1-0mlir/include/mlir/Dialect/MemRef/IR/CMakeLists.txt
+299-05 files

LLVM/project a12adfbllvm/lib/Target/AMDGPU GCNSubtarget.h AMDGPU.td

[NFCI][AMDGPU] Convert more `SubtargetFeatures` to use `AMDGPUSubtargetFeature` and X-macros

Extend the X-macro pattern to eliminate boilerplate for additional subtarget features.

This removes ~50 lines of repetitive member declarations and getter definitions.
DeltaFile
+123-140llvm/lib/Target/AMDGPU/GCNSubtarget.h
+100-154llvm/lib/Target/AMDGPU/AMDGPU.td
+3-3llvm/lib/Target/AMDGPU/R600Subtarget.h
+2-2llvm/lib/Target/AMDGPU/AMDGPUFeatures.td
+1-1llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+229-3005 files

LLVM/project 6367c4bcmake/Modules HandleDoxygen.cmake

[cmake][NFC] CRLF -> LF
DeltaFile
+40-40cmake/Modules/HandleDoxygen.cmake
+40-401 files

LLVM/project dd363d0llvm/lib/Transforms/Vectorize VPlanUnroll.cpp VPlan.h

[VPlan] Replace UnrollPart for VPScalarIVSteps with start index op (NFC) (#170906)

Replace the unroll part operand for VPScalarIVStepsRecipe with the start
index. This simplifies https://github.com/llvm/llvm-project/pull/170053
and is also a first step to break down the recipe into its components.

PR: https://github.com/llvm/llvm-project/pull/170906
DeltaFile
+36-5llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+17-7llvm/lib/Transforms/Vectorize/VPlan.h
+3-19llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+11-0llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+3-3llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+70-345 files

LLVM/project 8aa83e9llvm/utils/TableGen/Common CodeGenRegisters.cpp

[TableGen] Prefer base class on tied RC sizes

When searching for a matching subclass, TableGen's behavior is
non-deterministic if several classes have the same size.
Break the tie by choosing the class with the smaller BaseClassOrder.
DeltaFile
+4-1llvm/utils/TableGen/Common/CodeGenRegisters.cpp
+4-11 files