LLVM/project 5937d7f clang/lib/Sema SemaLambda.cpp, clang/test/Modules lambda-convertablity.cppm

[C++20] [Modules] Correct the behavior for adding mangling for lambda in modules

Close https://github.com/llvm/llvm-project/issues/130080
Close https://github.com/llvm/llvm-project/issues/116087

The common pattern of the problem is that a lambda is somehow leaked from a
non-inline, non-internal function in module purview, so we need to
define a mangling for it, per the discussion in
https://github.com/itanium-cxx-abi/cxx-abi/issues/186

The root cause of the issue is that we give up too quickly when
ManglingContextDecl is nullptr, even though we can still get the context
information from DC.
Delta File
+23 -0 clang/test/Modules/lambda-convertablity.cppm
+6 -5 clang/lib/Sema/SemaLambda.cpp
+29 -5 2 files

LLVM/project 365cb87 mlir/python CMakeLists.txt

try fix windows badcast
Delta File
+4 -0 mlir/python/CMakeLists.txt
+4 -0 1 file

LLVM/project 0c2f9b4 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 switch-cases-to-branch-and.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6-beta.1
Delta File
+40 -26 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+16 -24 llvm/test/CodeGen/AArch64/switch-cases-to-branch-and.ll
+56 -50 2 files

LLVM/project 693756e mlir/python CMakeLists.txt

try fix windows badcast
Delta File
+4 -0 mlir/python/CMakeLists.txt
+4 -0 1 file

LLVM/project b290a3e mlir/lib/Dialect/Linalg/Transforms Specialize.cpp, mlir/lib/Dialect/Linalg/Utils Utils.cpp

[Linalg] Add *Pooling* matchers (#172351)

-- This commit is the eighth in the series of adding matchers
for linalg.*conv*/*pool*. Refer:
https://github.com/llvm/llvm-project/pull/163724
-- In this commit all variants of Pooling ops have been added.

Signed-off-by: Abhishek Varma <abhvarma at amd.com>
Delta File
+315 -0 mlir/lib/Dialect/Linalg/Utils/Utils.cpp
+157 -0 mlir/test/Dialect/Linalg/convolution/roundtrip-convolution.mlir
+12 -0 mlir/lib/Dialect/Linalg/Transforms/Specialize.cpp
+484 -0 3 files

LLVM/project 94cb105 llvm/unittests/ExecutionEngine/Orc ReOptimizeLayerTest.cpp

[ORC] Initialize the native target in ReOptimize unit test. (#172955)

The ReOptimize unit test was accidentally depending on native target
initialization in previous tests, causing it to be skipped if run on its
own (using --gtest_filter).

Calling OrcNativeTarget::initialize() in the SetUp method of the test
fixes this.
Delta File
+3 -0 llvm/unittests/ExecutionEngine/Orc/ReOptimizeLayerTest.cpp
+3 -0 1 file

LLVM/project 32243a5 mlir/docs/DefiningDialects Operations.md, mlir/include/mlir/IR EnumAttr.td

[MLIR] Add DefaultValuedEnumAttr decorator (#172916)

Introduce DefaultValuedEnumAttr, which, similarly to DefaultValuedAttr,
decorates an enum attribute so that it takes a default value from a
specific enum case when not present. The default is constructed as the
fully-qualified enum case symbol.

In comparison to DefaultValuedAttr, this allows using a TableGen EnumCase
variable instead of a raw string.
Delta File
+7 -0 mlir/test/mlir-tblgen/op-format.mlir
+6 -0 mlir/include/mlir/IR/EnumAttr.td
+5 -0 mlir/test/lib/Dialect/Test/TestOpsSyntax.td
+5 -0 mlir/docs/DefiningDialects/Operations.md
+23 -0 4 files

LLVM/project 2c98c6e libcxx/docs/Status Cxx2cIssues.csv, libcxx/include shared_mutex

[libcxx] LWG4172 fix self-move-assignment in {unique|shared}_lock (#129542)

Fixes: https://github.com/llvm/llvm-project/issues/127861

---------

Co-authored-by: Louis Dionne <ldionne.2 at gmail.com>
Delta File
+32 -11 libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.unique/thread.lock.unique.cons/move_assign.pass.cpp
+25 -11 libcxx/test/std/thread/thread.mutex/thread.lock/thread.lock.shared/thread.lock.shared.cons/move_assign.pass.cpp
+8 -7 libcxx/include/__mutex/unique_lock.h
+3 -8 libcxx/include/shared_mutex
+1 -1 libcxx/docs/Status/Cxx2cIssues.csv
+69 -38 5 files
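
To illustrate the self-move-assignment problem this commit addresses, here is a minimal sketch of a move-and-swap assignment operator. `ToyMutex` and `ToyLock` are hypothetical stand-ins invented for this example; this is not the libc++ code, only one common way to make move assignment safe when source and destination are the same object.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a mutex; it only tracks whether it is held.
struct ToyMutex {
  bool locked = false;
  void lock() { locked = true; }
  void unlock() { locked = false; }
};

// Hypothetical, simplified lock type (not the libc++ implementation).
class ToyLock {
  ToyMutex *m_ = nullptr;
  bool owns_ = false;

public:
  ToyLock() = default;
  explicit ToyLock(ToyMutex &m) : m_(&m), owns_(true) { m.lock(); }
  ToyLock(ToyLock &&other) noexcept : m_(other.m_), owns_(other.owns_) {
    other.m_ = nullptr;
    other.owns_ = false;
  }
  // Move-and-swap: a temporary takes over other's state and then swaps with
  // *this. On self-assignment the state round-trips through the temporary,
  // so the mutex is neither unlocked nor lost.
  ToyLock &operator=(ToyLock &&other) noexcept {
    ToyLock(std::move(other)).swap(*this);
    return *this;
  }
  ~ToyLock() {
    if (owns_)
      m_->unlock();
  }

  void swap(ToyLock &other) noexcept {
    std::swap(m_, other.m_);
    std::swap(owns_, other.owns_);
  }
  bool owns_lock() const { return owns_; }
  ToyMutex *mutex() const { return m_; }
};
```

With this shape, assigning a lock to itself leaves it owning the same mutex, while assigning from a different lock releases the previously held mutex when the temporary is destroyed.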

LLVM/project d532641 libclc/clc/lib/generic/shared clc_max.inc clc_min.inc

[libclc] Improve __clc_min/max/clamp implementation (#172599)

Replace __clc_max/min with __clc_fmax/fmin in __clc_clamp. FP
__clc_min/max/clamp now lowers to @llvm.minimumnum/@llvm.maximumnum, and
integer clamp lowers to @llvm.umin/@llvm.umax. This reduces fcmp+select
chains and improves codegen. Example change to amdgcn--amdhsa.bc:
```
in function _Z5clamphhh:
    >   %4 = icmp ugt i8 %0, %2
        %4 = tail call noundef i8 @llvm.umax.i8(i8 %0, i8 %1)
    >   %6 = select i1 %4, i8 %2, i8 %5
    >   ret i8 %6
    <   %5 = tail call noundef i8 @llvm.umin.i8(i8 %2, i8 %4)
    <   ret i8 %5
in function _Z5clampddd:
  in block %3 / %3:
    >   %4 = fcmp ogt double %0, %2
    >   %5 = fcmp olt double %0, %1
    >   %6 = select i1 %5, double %1, double %0

    [5 lines not shown]
```
Delta File
+19 -1 libclc/clc/lib/generic/shared/clc_max.inc
+18 -1 libclc/clc/lib/generic/shared/clc_min.inc
+2 -3 libclc/clc/lib/generic/shared/clc_clamp.inc
+2 -0 libclc/clc/lib/generic/shared/clc_clamp.cl
+1 -0 libclc/clc/lib/generic/shared/clc_min.cl
+1 -0 libclc/clc/lib/generic/shared/clc_max.cl
+43 -5 6 files
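
As a rough host-side illustration of the unsigned integer case, the sketch below contrasts a compare-and-select clamp with the min/max form that mirrors the `umax` then `umin` sequence in the IR excerpt. The function names `clampSelect` and `clampMinMax` are hypothetical; this is plain C++, not the libclc source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Compare-and-select form: two comparisons, two selects.
inline std::uint8_t clampSelect(std::uint8_t x, std::uint8_t lo,
                                std::uint8_t hi) {
  std::uint8_t t = (x < lo) ? lo : x;
  return (t > hi) ? hi : t;
}

// min/max form: max(x, lo) first, then min with hi, matching the order of
// the umax/umin calls shown in the commit message.
inline std::uint8_t clampMinMax(std::uint8_t x, std::uint8_t lo,
                                std::uint8_t hi) {
  return std::min(std::max(x, lo), hi);
}
```

Both functions compute the same value for every input; the second simply expresses it with operations that map directly onto min/max intrinsics.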

LLVM/project 37a73d5 llvm/lib/IR Verifier.cpp, llvm/test/Verifier memprof-metadata-good.ll memprof-metadata-bad.ll

[MemProf] Update metadata verification for a single string tag (#172543)

The memprof metadata verifier supported multiple string tags, but in
reality, the other code (e.g. addCallStack) only supports a single such
tag. Update the verifier to reflect that limitation, and the associated
tests.

Fixes #157217
Delta File
+4 -11 llvm/lib/IR/Verifier.cpp
+7 -3 llvm/test/Verifier/memprof-metadata-good.ll
+10 -0 llvm/test/Verifier/memprof-metadata-bad.ll
+21 -14 3 files

LLVM/project 6c51c17 llvm/tools/llvm-ir2vec llvm-ir2vec.cpp llvm-ir2vec.h

[NFC][llvm-ir2vec] llvm_ir2vec.cpp breakup to extract a reusable header for IR2VecTool, and MIR2VecTool classes (#172304)

Refactor llvm-ir2vec: Extract reusable header for Python bindings

Separated the IR2Vec/MIR2Vec tool implementation into a header file
(`llvm-ir2vec.h`) and implementation file (`llvm-ir2vec.cpp`) to enable
reuse in Python bindings and other projects.

Changes
- **Created `llvm-ir2vec.h`**: Contains `IR2VecTool` and `MIR2VecTool`
class definitions with all implementations, making it a standalone
header-only library
- **Simplified `llvm-ir2vec.cpp`**: Now contains only command-line
interface code (options, main function, and helper functions)

Motivation
The original monolithic `.cpp` file made it impossible to use
IR2Vec/MIR2Vec functionality in Python bindings without compiling the
entire command-line tool. This refactoring enables clean separation

    [5 lines not shown]
Delta File
+481 -530 llvm/tools/llvm-ir2vec/llvm-ir2vec.cpp
+201 -0 llvm/tools/llvm-ir2vec/llvm-ir2vec.h
+682 -530 2 files

LLVM/project b4b5bfa llvm/lib/CodeGen UnreachableBlockElim.cpp

[CodeGen][NPM] Update MPDT similar to MDT after unreachable BB elimination (#172421)

After unreachable machine basic blocks are removed, MPDT should also be
updated with the latest block numbers alongside MDT.
Delta File
+21 -5 llvm/lib/CodeGen/UnreachableBlockElim.cpp
+21 -5 1 file

LLVM/project 4e89e71 llvm/lib/CodeGen CodeGenPrepare.cpp

[CodeGenPrepare][NPM] Remove incorrect LoopAnalysis preservation in CodeGenPrepare (#172418)

CodeGenPrepare modifies and restructures loops & control flow. So, it
shouldn't preserve LoopAnalysis.

The test `llvm/test/CodeGen/AMDGPU/cf-loop-on-constant.ll` shows
CodeGenPrepare modifying loop structure, hence we cannot preserve
LoopAnalysis.
Delta File
+0 -1 llvm/lib/CodeGen/CodeGenPrepare.cpp
+0 -1 1 file

LLVM/project 6a00e1d mlir/python CMakeLists.txt

try fix windows badcast
Delta File
+5 -1 mlir/python/CMakeLists.txt
+5 -1 1 file

LLVM/project b324c9f llvm/lib/Target/DirectX DXILMemIntrinsics.cpp DXILLegalizePass.cpp, llvm/test/CodeGen/DirectX legalize-memcpy.ll legalize-memset.ll

[DirectX] Move memset and memcpy handling to a new pass. NFC (#172921)

This introduces the DXILMemIntrinsics pass and moves memset and memcpy
handling from DXILLegalize into it. We need this so that memory
intrinsics are handled before the DXILResourceAccess pass, which lets us
properly deal with arrays and large structures in resources.
Delta File
+188 -0 llvm/lib/Target/DirectX/DXILMemIntrinsics.cpp
+2 -166 llvm/lib/Target/DirectX/DXILLegalizePass.cpp
+154 -0 llvm/test/CodeGen/DirectX/MemIntrinsics/memcpy.ll
+0 -154 llvm/test/CodeGen/DirectX/legalize-memcpy.ll
+100 -0 llvm/test/CodeGen/DirectX/MemIntrinsics/memset.ll
+0 -99 llvm/test/CodeGen/DirectX/legalize-memset.ll
+444 -419 6 files not shown
+482 -420 12 files

LLVM/project 5a4f9ad mlir/python CMakeLists.txt

try fix windows badcast
Delta File
+5 -1 mlir/python/CMakeLists.txt
+5 -1 1 file

LLVM/project f171b43 compiler-rt/test/asan/TestCases/Darwin atos-symbolized-recover.cpp

[Test][NFC] Update test to match new warning output (#172950)

https://github.com/llvm/llvm-project/pull/170815

rdar://166742792
Delta File
+1 -1 compiler-rt/test/asan/TestCases/Darwin/atos-symbolized-recover.cpp
+1 -1 1 file

LLVM/project 6f6e072 mlir/python CMakeLists.txt

try fix windows badcast
Delta File
+4 -0 mlir/python/CMakeLists.txt
+4 -0 1 file

LLVM/project 0e03199 llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv fixed-vectors-fp-splat-bf16.ll

[RISCV][llvm] Remove custom legalization of fixed-length vector SPLAT_VECTOR (#172870)

BUILD_VECTOR is combined into SPLAT_VECTOR if the operation action of
SPLAT_VECTOR is not Expand. However, we already have custom handling of
BUILD_VECTOR for fixed-length vectors, which uses an explicit constant VL
instead of the VLMAX it would get if lowered through SPLAT_VECTOR.
Delta File
+8 -7 llvm/test/CodeGen/RISCV/rvv/fixed-vectors-fp-splat-bf16.ll
+5 -5 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+13 -12 2 files

LLVM/project 24c7b4e mlir/include/mlir/Dialect/AMDGPU/IR AMDGPU.td, mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[mlir][amdgpu] implement amdgpu.sparse_mfma wrapper for smfmac instructions (#171968)

Signed-off-by: Eric Feng <Eric.Feng at amd.com>
Delta File
+172 -6 mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+88 -0 mlir/include/mlir/Dialect/AMDGPU/IR/AMDGPU.td
+72 -0 mlir/lib/Dialect/AMDGPU/IR/AMDGPUDialect.cpp
+64 -0 mlir/test/Dialect/AMDGPU/invalid.mlir
+63 -0 mlir/test/Conversion/AMDGPUToROCDL/sparse-mfma.mlir
+61 -0 mlir/test/Conversion/AMDGPUToROCDL/sparse-mfma-gfx950.mlir
+520 -6 6 files

LLVM/project a40f444 llvm/docs NVPTXUsage.rst, llvm/include/llvm/IR IntrinsicsNVVM.td

[NVPTX] Add support for barrier.cta.red.* instructions (#172541)

This change adds full support for the PTX `barrier.cta.red` instruction,
following the same conventions as are already used for
`barrier.cta.sync` and `barrier.cta.arrive`.

In addition, this MR removes the following intrinsics, which are no longer
needed:
* llvm.nvvm.barrier0.popc -->
  llvm.nvvm.barrier.cta.red.popc.aligned.all(0, c)
* llvm.nvvm.barrier0.and -->
  llvm.nvvm.barrier.cta.red.and.aligned.all(0, z)
* llvm.nvvm.barrier0.or -->
  llvm.nvvm.barrier.cta.red.or.aligned.all(0, z)
Delta File
+174 -0 llvm/test/CodeGen/NVPTX/barrier.ll
+59 -55 llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+43 -34 llvm/include/llvm/IR/IntrinsicsNVVM.td
+32 -7 llvm/docs/NVPTXUsage.rst
+19 -1 llvm/test/Assembler/auto_upgrade_nvvm_intrinsics.ll
+18 -0 llvm/lib/IR/AutoUpgrade.cpp
+345 -97 4 files not shown
+365 -111 10 files
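
For readers unfamiliar with the three reduction flavors named above, the following host-side sketch models what `popc`, `and`, and `or` compute over the predicates contributed by the participating threads. It is only a sequential C++ model under that assumption; `BarrierReduction` and `reducePredicates` are hypothetical names, and no synchronization is modeled.

```cpp
#include <cassert>
#include <vector>

// Results of the three reduction kinds over per-thread predicates.
struct BarrierReduction {
  int popc; // number of threads whose predicate is true
  bool all; // "and": true only if every predicate is true
  bool any; // "or": true if at least one predicate is true
};

// Sequential model: each element of preds is one thread's predicate.
inline BarrierReduction reducePredicates(const std::vector<bool> &preds) {
  BarrierReduction r{0, true, false};
  for (bool p : preds) {
    r.popc += p ? 1 : 0;
    r.all = r.all && p;
    r.any = r.any || p;
  }
  return r;
}
```

On a real device every participating thread receives the reduced value once all of them have reached the barrier; the loop here only reproduces the arithmetic.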

LLVM/project 7ee923d clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/lib/CIR/Lowering/DirectToLLVM LowerToLLVM.cpp

[CIR] Upstream convert to mask builtins in CIR codegen (#171694)

This PR is part of https://github.com/llvm/llvm-project/issues/167752.
It upstreams the codegen and tests for the convert to mask builtins
implemented in the incubator, including:

Upstream X86 mask conversion builtins from clangir:
- cvtmask2b/w/d/q*
- cvtb/w/d/q2mask* 

Upstreamed helpers:
- emitX86MaskedCompare()
- emitX86ConvertToMask()
- emitX86SExtMask()
Delta File
+130 -18 clang/test/CIR/CodeGenBuiltins/X86/avx512vldq-builtins.c
+122 -0 clang/test/CIR/CodeGenBuiltins/X86/avx512vlbw-builtins.c
+114 -0 clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+41 -42 clang/test/CIR/CodeGenBuiltins/X86/avx512dq-builtins.c
+48 -12 clang/test/CIR/CodeGenBuiltins/X86/avx512bw-builtins.c
+8 -3 clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+463 -75 6 files
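
As a scalar reference for what the two builtin families compute, the sketch below models the byte variants on a 16-element vector: vector-to-mask takes the sign bit of each element, and mask-to-vector expands each mask bit to an all-ones or all-zeros element. The helper names `bytesToMask` and `maskToBytes` are hypothetical and are not the CIR helpers listed above.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// cvtb2mask-style: bit i of the mask is the sign (most significant) bit of
// byte i.
inline std::uint16_t bytesToMask(const std::array<std::int8_t, 16> &v) {
  std::uint16_t mask = 0;
  for (int i = 0; i < 16; ++i)
    if (v[i] < 0)
      mask |= static_cast<std::uint16_t>(1u << i);
  return mask;
}

// cvtmask2b-style: byte i becomes all ones (-1) when bit i of the mask is
// set, and zero otherwise.
inline std::array<std::int8_t, 16> maskToBytes(std::uint16_t mask) {
  std::array<std::int8_t, 16> v{};
  for (int i = 0; i < 16; ++i)
    v[i] = ((mask >> i) & 1u) ? std::int8_t(-1) : std::int8_t(0);
  return v;
}
```

The two directions round-trip: converting a mask to bytes and back yields the original mask.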

LLVM/project 6e7a449 mlir/lib/ExecutionEngine ExecutionEngine.cpp

[MLIR] Enable dylib init/deinit in execution engine on AArch64 platform (#172833)

This PR enables JIT initialization for AArch64. Until now it was disabled
because of #71963, which was recently fixed by #71968.
Delta File
+9 -10 mlir/lib/ExecutionEngine/ExecutionEngine.cpp
+9 -10 1 file

LLVM/project 133253d clang/include/clang/Basic DiagnosticFrontendKinds.td, clang/lib/Frontend ASTUnit.cpp

[clang] Generalize remaining diagnostics that assume all precompiled files are pchs, NFC (#172718)

Delta File
+2 -2 clang/lib/Frontend/ASTUnit.cpp
+1 -2 clang/include/clang/Basic/DiagnosticFrontendKinds.td
+3 -4 2 files

LLVM/project bae033b clang/lib/CIR/CodeGen CIRGenExprScalar.cpp CIRGenBuilder.h, clang/test/CIR/CodeGen pointer-to-data-member.cpp

[CIR] Add support for null data member pointers (#171945)

This adds the CIR support for handling null data member pointer values.
Delta File
+40 -0 clang/test/CIR/CodeGen/pointer-to-data-member.cpp
+17 -0 clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+10 -0 clang/lib/CIR/CodeGen/CIRGenBuilder.h
+67 -0 3 files
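
For context, this is the kind of source-level construct that needs a null data member pointer value. The sketch is ordinary C++ with hypothetical names (`Point`, `readOr`); it is not taken from the commit's test file. Note that a pointer to the first member is still non-null even though its offset is zero, which is why the Itanium C++ ABI represents the null value as -1 rather than 0.

```cpp
#include <cassert>

struct Point {
  int x;
  int y;
};

// Returns the selected member, or a fallback when the member pointer is null.
inline int readOr(const Point &p, int Point::*member, int fallback) {
  return member ? p.*member : fallback;
}
```

A caller can pass `nullptr` to mean "no member selected", and `&Point::x` or `&Point::y` to pick a field.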

LLVM/project 776f593 libclc/clc/lib/generic/conversion clc_convert_float.inc clc_convert_integer.inc, libclc/opencl/lib/clspv/conversion convert_float.inc

[libclc][NFC] Move convert builtins from Python generator to .cl sources (#172634)

Remove the Python dependency for generating convert builtins, aligning
with how other builtins are defined.
In addition, our downstream target relies on this PR to override convert
implementations.
llvm-diff shows no changes in any of the bitcode files:
amdgcn--amdhsa.bc, barts-r600--.bc, cayman-r600--.bc, cedar-r600--.bc,
clspv64--.bc, clspv--.bc, cypress-r600--.bc, nvptx64--.bc,
nvptx64--nvidiacl.bc, nvptx--.bc, nvptx--nvidiacl.bc, tahiti-amdgcn--.bc
and tahiti-amdgcn-mesa-mesa3d.bc.
Delta File
+0 -550 libclc/utils/gen_convert.py
+161 -0 libclc/clc/lib/generic/conversion/clc_convert_float.inc
+146 -0 libclc/clc/lib/generic/conversion/clc_convert_integer.inc
+146 -0 libclc/clc/lib/generic/conversion/clc_convert_float2int.inc
+103 -0 libclc/clc/lib/generic/conversion/clc_convert_int2float.cl
+67 -0 libclc/opencl/lib/clspv/conversion/convert_float.inc
+623 -550 19 files not shown
+1,311 -621 25 files

LLVM/project bb993a8 llvm/include/llvm/IR RuntimeLibcalls.td, llvm/lib/Target/AArch64 AArch64ISelLowering.cpp AArch64PrologueEpilogue.cpp

RuntimeLibcalls: Add entries for stack probe functions (#167453)

Delta File
+33 -2 llvm/include/llvm/IR/RuntimeLibcalls.td
+9 -3 llvm/lib/Target/ARM/ARMFrameLowering.cpp
+7 -2 llvm/lib/Target/ARM/ARMISelLowering.cpp
+6 -2 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+6 -1 llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+4 -3 llvm/lib/Target/X86/X86ISelLowering.cpp
+65 -13 1 file not shown
+65 -19 7 files

LLVM/project a046276 libcxx/include deque string, libcxx/include/__cxx03 deque

[libc++] Add missing %{flags} substitution to clang-tidy (#171689)

Flags that should be used both for compiling and for linking are
provided through the %{flags} substitution. Our clang-tidy tests should
be using them, not only %{compile_flags}.
Delta File
+3 -3 libcxx/include/__cxx03/deque
+3 -3 libcxx/include/deque
+1 -1 libcxx/test/libcxx/clang_tidy.gen.py
+1 -1 libcxx/include/string
+8 -8 4 files

LLVM/project 58c3b22 mlir/include/mlir/Dialect/LLVMIR ROCDLOps.td, mlir/test/Dialect/LLVMIR rocdl.mlir

[mlir][rocdl] Add `s_nop` intrinsic (#172918)

Also, cleaned some whitespace in affected files.
Delta File
+46 -41 mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+21 -14 mlir/test/Target/LLVMIR/rocdl.mlir
+8 -1 mlir/test/Dialect/LLVMIR/rocdl.mlir
+75 -56 3 files

LLVM/project 747e473 llvm/lib/Target/LoongArch LoongArchMachineFunctionInfo.h

Update the incoming ByVal args signature in MFI
Delta File
+2 -2 llvm/lib/Target/LoongArch/LoongArchMachineFunctionInfo.h
+2 -2 1 file