LLVM/project f754276 mlir/lib/Conversion/RaiseWasm RaiseWasmMLIR.cpp, mlir/test/Conversion/RaiseWasm wasm-div-to-arith-div.mlir wasm-convert-to-arith-tofp.mlir

Revert "[MLIR][WASM] Introduce the RaiseWasmMLIRPass to convert WasmSSA MLIR …"

This reverts commit a38998941b2f57ffce38d6161a48d59d7d481964.
Delta      File
+0 -469    mlir/lib/Conversion/RaiseWasm/RaiseWasmMLIR.cpp
+0 -109    mlir/test/Conversion/RaiseWasm/wasm-div-to-arith-div.mlir
+0 -81     mlir/test/Conversion/RaiseWasm/wasm-convert-to-arith-tofp.mlir
+0 -80     mlir/test/Conversion/RaiseWasm/wasm-sub-to-arith-sub.mlir
+0 -79     mlir/test/Conversion/RaiseWasm/wasm-add-to-arith-add.mlir
+0 -78     mlir/test/Conversion/RaiseWasm/wasm-mul-to-arith-mul.mlir
+0 -896    34 files not shown
+1 -1,868  40 files

LLVM/project a389989 mlir/lib/Conversion/RaiseWasm RaiseWasmMLIR.cpp, mlir/test/Conversion/RaiseWasm wasm-div-to-arith-div.mlir wasm-convert-to-arith-tofp.mlir

[MLIR][WASM] Introduce the RaiseWasmMLIRPass to convert WasmSSA MLIR to core dialects (#164562)

This is following https://github.com/llvm/llvm-project/pull/154674 and
still related to
https://discourse.llvm.org/t/rfc-mlir-dialect-for-webassembly/86758.

This PR introduces the RaiseWasmMLIRPass. This pass lowers WasmSSA MLIR
to other dialects of the LLVM ecosystem (namely: arith, math, cf and
memref).
This is the first PR in a series of two or three that introduce the lowering.
As an introduction, it adds support for function calls, local and global
variables, and arithmetic operations. As explained in the
RFC, most WasmSSA operations were designed to stay close to other
dialects' semantics so that the conversion is trivial.

---------

Signed-off-by: Ferdinand Lemaire <flemairen6 at gmail.com>
Co-authored-by: Ferdinand Lemaire <ferdinand.lemaire at woven-planet.global>
Co-authored-by: Ferdinand Lemaire <flemairen6 at gmail.com>
Delta      File
+469 -0    mlir/lib/Conversion/RaiseWasm/RaiseWasmMLIR.cpp
+109 -0    mlir/test/Conversion/RaiseWasm/wasm-div-to-arith-div.mlir
+81 -0     mlir/test/Conversion/RaiseWasm/wasm-convert-to-arith-tofp.mlir
+80 -0     mlir/test/Conversion/RaiseWasm/wasm-sub-to-arith-sub.mlir
+79 -0     mlir/test/Conversion/RaiseWasm/wasm-add-to-arith-add.mlir
+78 -0     mlir/test/Conversion/RaiseWasm/wasm-mul-to-arith-mul.mlir
+896 -0    34 files not shown
+1,868 -1  40 files

LLVM/project 42794a3 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll, llvm/test/CodeGen/RISCV clmul.ll

Merge branch 'users/ikudrin/clang-findallocationfunction-simplify' into users/ikudrin/clang-cwg2282
Delta              File
+25,784 -36,416    llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+12,227 -23,140    llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+4,004 -11,142     llvm/test/CodeGen/RISCV/clmul.ll
+6,940 -6,782      llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+3,502 -9,174      llvm/test/CodeGen/X86/clmul-vector.ll
+3,985 -7,989      llvm/test/CodeGen/Thumb2/mve-clmul.ll
+56,442 -94,643    1,844 files not shown
+134,112 -159,514  1,850 files

LLVM/project 601422a llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll, llvm/test/CodeGen/RISCV clmul.ll

Merge branch 'main' into users/ikudrin/clang-findallocationfunction-simplify
Delta              File
+25,784 -36,416    llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+12,227 -23,140    llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+4,004 -11,142     llvm/test/CodeGen/RISCV/clmul.ll
+6,940 -6,782      llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+3,502 -9,174      llvm/test/CodeGen/X86/clmul-vector.ll
+3,985 -7,989      llvm/test/CodeGen/Thumb2/mve-clmul.ll
+56,442 -94,643    1,843 files not shown
+134,111 -159,513  1,849 files

LLVM/project 4cc954f clang/lib/Sema SemaExprCXX.cpp

fixup! Pass the argument list without the alignment argument to the msvc-specific fallback
Delta   File
+6 -2   clang/lib/Sema/SemaExprCXX.cpp
+6 -2   1 files

LLVM/project 4c16440 clang/test/OpenMP nvptx_teams_reduction_codegen.cpp target_teams_reduction_codegen.cpp, llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

Revert "[OpenMP][offload] Cross-team reductions with variable number of teams" (#204914)

Reverts llvm/llvm-project#195102 because of a missed debug info issue
revealed by https://lab.llvm.org/buildbot/#/builders/67/builds/7022
Delta          File
+3,642 -0      clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
+0 -2,331      clang/test/OpenMP/target_teams_reduction_codegen.cpp
+170 -156      openmp/device/src/Reduction.cpp
+73 -144       llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+60 -60        clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+60 -60        clang/test/OpenMP/teams_distribute_parallel_for_schedule_codegen.cpp
+4,005 -2,751  168 files not shown
+5,535 -4,267  174 files

LLVM/project 0862357 clang/lib/Sema SemaExprCXX.cpp

fixup! Do not try the msvc-specific fallback with the alignment argument
Delta   File
+9 -2   clang/lib/Sema/SemaExprCXX.cpp
+9 -2   1 files

LLVM/project 0d1d2f3 clang/test/OpenMP nvptx_teams_reduction_codegen.cpp target_teams_reduction_codegen.cpp, llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

Revert "[OpenMP][offload] Cross-team reductions with variable number of teams…"

This reverts commit e9acb01904be7c32e98dedee27b68f939d79549a.
Delta          File
+3,642 -0      clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
+0 -2,331      clang/test/OpenMP/target_teams_reduction_codegen.cpp
+170 -156      openmp/device/src/Reduction.cpp
+73 -144       llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+60 -60        clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+60 -60        clang/test/OpenMP/teams_distribute_parallel_for_schedule_codegen.cpp
+4,005 -2,751  168 files not shown
+5,535 -4,267  174 files

LLVM/project e9acb01 clang/test/OpenMP nvptx_teams_reduction_codegen.cpp target_teams_reduction_codegen.cpp, llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

[OpenMP][offload] Cross-team reductions with variable number of teams (#195102)

This is a part of a series of patches that rework OpenMP cross-team
reductions.

This patch changes the cross-team reduction runtime so that it no longer works
through a larger number of teams in chunks. Instead, we allocate a
suitable-sized global buffer for the team values and let all teams run
at once. The last team that finishes uses a strided loop to reduce the
team values from the global buffer.
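
The strided loop over the global buffer can be pictured with a sequential model. The sketch below is not the code in `openmp/device/src/Reduction.cpp`: `reduceTeamValues`, its parameters, and the choice of addition as the reduction operator are all assumptions made for illustration. Each simulated thread of the last team visits every `NumThreads`-th slot, and the per-thread partial results are then combined.

```cpp
#include <cstddef>
#include <vector>

// Sequential model (hypothetical names): one slot per team in TeamValues,
// NumThreads threads in the team that finishes last.
inline long reduceTeamValues(const std::vector<long> &TeamValues,
                             std::size_t NumThreads) {
  if (NumThreads == 0)
    return 0; // no threads, nothing to reduce
  std::vector<long> Partial(NumThreads, 0);
  // Thread Tid handles slots Tid, Tid + NumThreads, Tid + 2 * NumThreads, ...
  for (std::size_t Tid = 0; Tid < NumThreads; ++Tid)
    for (std::size_t I = Tid; I < TeamValues.size(); I += NumThreads)
      Partial[Tid] += TeamValues[I];
  // Combine the per-thread partial results into the final value.
  long Result = 0;
  for (long P : Partial)
    Result += P;
  return Result;
}
```

Because every slot is visited exactly once regardless of how the team count compares with the thread count, the buffer size can follow the number of teams without chunking.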

We also use `mapping::getNumberOfThreadsInBlock()` instead of
`omp_get_num_threads()` because the reduction of the team values runs
outside of the parallel region device code, which would make
`omp_get_num_threads()` always return 1. For Generic-SPMD mode, we also
want to use all available threads, which means that we need to copy the
reduction data from LDS (where it lives in that mode by default) to
scratch in codegen before calling the cross-team reduction.


    [48 lines not shown]
Delta          File
+0 -3,642      clang/test/OpenMP/nvptx_teams_reduction_codegen.cpp
+2,331 -0      clang/test/OpenMP/target_teams_reduction_codegen.cpp
+155 -169      openmp/device/src/Reduction.cpp
+144 -73       llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+60 -60        clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+60 -60        clang/test/OpenMP/teams_distribute_parallel_for_schedule_codegen.cpp
+2,750 -4,004  168 files not shown
+4,266 -5,534  174 files

LLVM/project 2678b8f llvm/lib/Target/DirectX DXILResourceAccess.cpp DXILOpLowering.cpp, llvm/test/CodeGen/DirectX/ResourceAccess load-constant-buffer-t.ll

[DirectX] Handle llvm.dx.resource.getbasepointer intrinsic in DXILResourceAccess pass (#204732)

The `llvm.dx.resource.getbasepointer` intrinsic is emitted for
`ConstantBuffer<T>` element access and needs to be translated to
`llvm.dx.resource.load.cbufferrow` calls in the `DXILResourceAccess`
pass. The handling is identical to `llvm.dx.resource.getpointer` with a
0 offset.

Fixes #204234
Delta     File
+189 -0   llvm/test/CodeGen/DirectX/ResourceAccess/load-constant-buffer-t.ll
+12 -3    llvm/lib/Target/DirectX/DXILResourceAccess.cpp
+1 -0     llvm/lib/Target/DirectX/DXILOpLowering.cpp
+202 -3   3 files

LLVM/project 4de1cb8 clang/lib/Sema SemaExprCXX.cpp, clang/test/CXX/drs cwg22xx.cpp cwg5xx.cpp

fixup! do not restrict the patch to C++20
Delta    File
+10 -16  clang/test/CXX/drs/cwg22xx.cpp
+7 -8    clang/test/SemaCXX/new-delete.cpp
+5 -8    clang/test/CXX/expr/expr.unary/expr.new/p14.cpp
+1 -6    clang/lib/Sema/SemaExprCXX.cpp
+2 -3    clang/test/CXX/drs/cwg5xx.cpp
+1 -3    clang/test/SemaCXX/std-align-val-t-in-operator-new.cpp
+26 -44  1 files not shown
+27 -45  7 files

LLVM/project 359bfe6 clang/docs LifetimeSafety.rst, clang/include/clang/Basic LangOptions.h

[LifetimeSafety] Allow configuring lifetimebound fix-it spelling (#204045)

When suggesting `[[clang::lifetimebound]]` fix-its, allow users to
provide a project-specific macro spelling with
`-lifetime-safety-lifetimebound-macro=...`.

If no spelling is configured, use a visible macro whose replacement
tokens spell the attribute, preferring the most recently defined
matching macro, and fall back to `[[clang::lifetimebound]]` or
`__attribute((lifetimebound))` otherwise.

Closes https://github.com/llvm/llvm-project/issues/200232
Delta     File
+76 -0    clang/test/Sema/LifetimeSafety/annotation-suggestions-fixits.cpp
+49 -2    clang/test/Sema/LifetimeSafety/misplaced-lifetimebound-intra-tu.cpp
+31 -6    clang/lib/Sema/SemaLifetimeSafety.h
+9 -0     clang/include/clang/Options/Options.td
+7 -1     clang/docs/LifetimeSafety.rst
+3 -0     clang/include/clang/Basic/LangOptions.h
+175 -9   6 files

LLVM/project 0928584 clang/lib/Format FormatTokenLexer.cpp FormatTokenLexer.h

[clang-format][NFC] Clean up FormatTokenLexer (#203825)
Delta    File
+11 -4   clang/lib/Format/FormatTokenLexer.cpp
+0 -1    clang/lib/Format/FormatTokenLexer.h
+11 -5   2 files

LLVM/project e47530b bolt/include/bolt/Core BinaryContext.h, bolt/lib/Passes Aligner.cpp LongJmp.cpp

[BOLT][AArch64] Align tentative layout bases using per-section alignment (#204262)

Move the `AssignSections` pass before `AlignerPass` so that it can record the
maximum code alignment per output section, then align the tentative hot/cold
section bases using the recorded alignment, which makes the tentative layout
better match the layout that is actually emitted.
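
To make the arithmetic concrete, here is a minimal sketch of the two steps: taking the maximum alignment seen in a section and rounding a tentative base address up to it. The helper names `maxSectionAlignment` and `alignUp` are hypothetical and are not BOLT APIs; the sketch also assumes alignments are powers of two.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Largest alignment requested by any function placed in one output section.
// Returns 1 (no constraint) for an empty section. Hypothetical helper.
inline std::uint64_t
maxSectionAlignment(const std::vector<std::uint64_t> &FunctionAlignments) {
  std::uint64_t Max = 1;
  for (std::uint64_t A : FunctionAlignments)
    Max = std::max(Max, A);
  return Max;
}

// Rounds Address up to the next multiple of Align.
// Assumes Align is a nonzero power of two.
inline std::uint64_t alignUp(std::uint64_t Address, std::uint64_t Align) {
  return (Address + Align - 1) & ~(Align - 1);
}
```

For example, a tentative base of `0x1004` in a section whose strictest function alignment is 64 bytes would move to `0x1040`.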
Delta    File
+24 -0   bolt/include/bolt/Core/BinaryContext.h
+20 -0   bolt/lib/Passes/Aligner.cpp
+8 -3    bolt/lib/Passes/LongJmp.cpp
+5 -3    bolt/lib/Rewrite/BinaryPassManager.cpp
+57 -6   4 files

LLVM/project b32488f clang/lib/CodeGen CGExprCXX.cpp, clang/test/CodeGen ubsan-aggregate-null-align-bounds.c

[Clang][UBSan] Use EmitCheckedLValue for C++ trivial operator= operands (#203737)

Further to https://github.com/llvm/llvm-project/pull/190739, use
EmitCheckedLValue for trivial operator= operands
* for the LHS (`lhs->` not handled yet), and
* for the RHS also for function call syntax.
Delta     File
+46 -23   clang/test/CodeGen/ubsan-aggregate-null-align-bounds.c
+27 -16   clang/lib/CodeGen/CGExprCXX.cpp
+73 -39   2 files

LLVM/project ba5384a llvm/include/llvm/Support CommandLine.h, llvm/lib/Support CommandLine.cpp

[Support] Add a parser for cl::opt<ElementCount> (#203969)

This adds command-line option parsing support for ElementCount.

This allows the following syntax:
```
  --my-option=4 ; Maps to ElementCount::getFixed(4)
  --my-option="vscale x 8" ; Maps to ElementCount::getScalable(8)
```
This is intended to unify fixed/scalable option handling in the loop
vectorizer. Currently, we have options like
'`EpilogueVectorizationForceVF`' defined as `cl::opt<unsigned>` which do
not allow specifying scalable VFs.
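
The mapping from option text to a fixed or scalable count can be sketched in standalone C++. This is not the parser added to `CommandLine.cpp`: `ParsedElementCount` and `parseElementCount` are hypothetical stand-ins for `ElementCount` and the real `cl::parser` specialization, and the strict handling of whitespace and the nine-digit limit are assumptions of this sketch.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for llvm::ElementCount (not the LLVM class).
struct ParsedElementCount {
  unsigned MinValue; // known minimum number of elements
  bool Scalable;     // true for the "vscale x N" form
};

// Accepts "N" (fixed) or "vscale x N" (scalable), where N is 1-9 decimal
// digits. Anything else yields std::nullopt.
inline std::optional<ParsedElementCount>
parseElementCount(std::string_view Arg) {
  constexpr std::string_view Prefix = "vscale x ";
  bool Scalable = false;
  if (Arg.substr(0, Prefix.size()) == Prefix) {
    Scalable = true;
    Arg.remove_prefix(Prefix.size());
  }
  // Limiting the digit count keeps the value within the range of unsigned.
  if (Arg.empty() || Arg.size() > 9)
    return std::nullopt;
  unsigned Value = 0;
  for (char C : Arg) {
    if (C < '0' || C > '9')
      return std::nullopt;
    Value = Value * 10 + static_cast<unsigned>(C - '0');
  }
  return ParsedElementCount{Value, Scalable};
}
```

With this shape, a single option type can carry both `4` and `vscale x 8`, which is the unification the commit message describes.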

Assisted-by: Codex
Delta     File
+85 -0    llvm/unittests/Support/CommandLineTest.cpp
+46 -0    llvm/lib/Support/CommandLine.cpp
+23 -0    llvm/include/llvm/Support/CommandLine.h
+154 -0   3 files

LLVM/project a8aba70 flang/lib/Lower ConvertVariable.cpp MultiImageFortran.cpp, flang/test/Lower/MIF coarray_allocation5.f90 coarray_allocation4.f90

[Flang] Standardize coarray TODO() diagnostic messages (#204708)
Delta    File
+5 -4    flang/lib/Lower/ConvertVariable.cpp
+3 -3    flang/lib/Lower/MultiImageFortran.cpp
+3 -1    flang/lib/Lower/Bridge.cpp
+1 -1    flang/test/Lower/MIF/coarray_allocation5.f90
+1 -1    flang/test/Lower/MIF/coarray_allocation4.f90
+1 -1    flang/test/Lower/MIF/coarray_allocation3.f90
+14 -11  2 files not shown
+16 -13  8 files

LLVM/project c890f4d utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[Bazel] Fixes 95e3219 (#204873)

This fixes 95e321951ad3041998e49bc0353482bcd27c65db.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
Delta   File
+1 -0   utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+1 -0   1 files

LLVM/project 5e7727d offload/ci openmp-offload-amdgpu-libc-runtime.py

Revert "Revert "[AMDGPU] Add compiler-rt checks for the GPU runtime" (#204370)"

This reverts commit 24f4fbf89d7e1c6e7b00efde469adb0a8c529cd2.
Delta   File
+7 -0   offload/ci/openmp-offload-amdgpu-libc-runtime.py
+7 -0   1 files

LLVM/project 90b2048 llvm/lib/Bitcode/Reader BitcodeReader.cpp, llvm/test/Bitcode invalid-summary-version.test

bitcode: Improve invalid summary version error (#204888)
Delta   File
+3 -4   llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+5 -0   llvm/test/Bitcode/invalid-summary-version.test
+0 -0   llvm/test/Bitcode/Inputs/invalid-summary-version.bc
+8 -4   3 files

LLVM/project f9fa598 llvm/test/CodeGen/AMDGPU rem_i128.ll div_v2i128.ll

[AMDGPU] Use explicit carry nodes for i64 wide integer lowering (#204694)

This PR switches widened i64 add/sub lowering to use explicit UADDO/USUBO carry
nodes instead of glue-based carry chains.
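
The idea of an explicit carry can be illustrated outside of SelectionDAG. The sketch below is not the AMDGPU lowering: it models a wide add as two half-width adds in which the low half returns its carry as an ordinary value that the high half consumes. The names `AddWithCarry`, `uaddo`, and `addWide` are hypothetical, and the 32-bit halves are chosen only so that the result can be checked against a native 64-bit add.

```cpp
#include <cstdint>

// Sum plus an explicit carry-out, in the spirit of a UADDO-style result.
struct AddWithCarry {
  std::uint32_t Sum;
  bool Carry;
};

// Unsigned add that reports overflow as a value rather than hidden state.
inline AddWithCarry uaddo(std::uint32_t A, std::uint32_t B) {
  std::uint32_t Sum = A + B; // wraps modulo 2^32
  return {Sum, Sum < A};     // wrapped iff the sum is smaller than an operand
}

// Wide add built from two half-width adds linked by the explicit carry.
inline std::uint64_t addWide(std::uint64_t A, std::uint64_t B) {
  AddWithCarry Lo = uaddo(static_cast<std::uint32_t>(A),
                          static_cast<std::uint32_t>(B));
  std::uint32_t Hi = static_cast<std::uint32_t>(A >> 32) +
                     static_cast<std::uint32_t>(B >> 32) +
                     (Lo.Carry ? 1u : 0u);
  return (static_cast<std::uint64_t>(Hi) << 32) | Lo.Sum;
}
```

Because the carry is a normal data dependency between the two halves, nothing forces the two adds to stay adjacent, which is the usual motivation for preferring explicit carries over glue.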
Delta          File
+1,255 -1,278  llvm/test/CodeGen/AMDGPU/rem_i128.ll
+950 -975      llvm/test/CodeGen/AMDGPU/div_v2i128.ll
+758 -780      llvm/test/CodeGen/AMDGPU/div_i128.ll
+460 -514      llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system.ll
+226 -250      llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
+192 -216      llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system_noprivate.ll
+3,841 -4,013  17 files not shown
+4,729 -4,745  23 files

LLVM/project 086f633 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.load.async.to.lds.ll

AMDGPU/GlobalISel: RegBankLegalize rules for load_async_to_lds (#204683)
Delta   File
+2 -1   llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+1 -1   llvm/test/CodeGen/AMDGPU/llvm.amdgcn.load.async.to.lds.ll
+3 -2   2 files

LLVM/project 4195b29 .github/workflows subscriber.yml

workflows/subscriber: Update to latest github automation container (#204692)

This one is about 33% smaller than the previous version.
Delta   File
+1 -1   .github/workflows/subscriber.yml
+1 -1   1 files

LLVM/project 39f8f90 llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp, llvm/test/CodeGen/SPIRV/instructions undef-composite.ll

[SPIR-V] Lower undef nested in a constant aggregate (#204377)

A constant aggregate whose element is itself an aggregate `undef` was
never lowered to a placeholder. The raw aggregate operand reached
IRTranslator on the llvm.spv.const.composite call and aborted with
"unable to translate instruction".

A similar issue was found and fixed during the SPV_KHR_poison_freeze
implementation, so instead of reinventing the wheel, unify the lowering with
that of poison.

Addresses the following observation:
https://github.com/llvm/llvm-project/pull/198037#discussion_r3304013315
Delta     File
+61 -71   llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+45 -0    llvm/test/CodeGen/SPIRV/instructions/undef-composite.ll
+106 -71  2 files

LLVM/project 6f05646 llvm/include/llvm/Transforms/Vectorize SLPVectorizer.h, llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta      File
+249 -15   llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+21 -191   llvm/test/Transforms/SLPVectorizer/X86/masked-stores.ll
+2 -1      llvm/include/llvm/Transforms/Vectorize/SLPVectorizer.h
+272 -207  3 files

LLVM/project fe9521d llvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlan.cpp

[LV] Unify header phi fixup and remove fixNonInductionPHIs (NFC). (#204886)

Unify the execute logic for VPPhi and VPWidenPHIRecipe into a shared
executePhiRecipe helper that handles both scalar and vector phis. For
header phis, only the preheader incoming value is added during execute;
the backedge is fixed up later by VPlan::execute().

This allows generalizing the VPlan::execute() fixup loop to handle all
loop headers (not just the first), removing the VPWidenPHIRecipe skip,
and eliminating fixNonInductionPHIs entirely.
Delta    File
+22 -19  llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+15 -22  llvm/lib/Transforms/Vectorize/VPlan.cpp
+0 -22   llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+37 -63  3 files

LLVM/project 7472a0e llvm/lib/IR Verifier.cpp, llvm/test/Verifier x86-amx-tile-register-index.ll

[Verifier] Verify AMX tile-register index operands are in range

AMX has 8 physical tile registers (TMM0-TMM7), so the tile-index operands
of the AMX intrinsics must be in [0, 8): operand 0 for the tile
load/store/zero intrinsics, operands 0-2 for the tdp* family.
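
The range rule itself is small enough to state as code. The following is only a sketch of the predicate, not the check added to `Verifier.cpp`; `NumAMXTileRegs` and `isValidTileIndex` are hypothetical names.

```cpp
#include <cstdint>

// Number of physical AMX tile registers, TMM0 through TMM7.
constexpr std::uint64_t NumAMXTileRegs = 8;

// True when a tile-index immediate lies in the half-open range [0, 8).
constexpr bool isValidTileIndex(std::uint64_t Index) {
  return Index < NumAMXTileRegs;
}
```

A verifier-style check would apply such a predicate to operand 0 of the tile load/store/zero intrinsics and to operands 0-2 of the tdp* family, as listed above.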
Delta    File
+30 -0   llvm/test/Verifier/x86-amx-tile-register-index.ll
+24 -0   llvm/lib/IR/Verifier.cpp
+54 -0   2 files

LLVM/project bd70fc0 llvm/lib/Bitcode/Reader BitcodeReader.cpp, llvm/test/Bitcode invalid-summary-version.test

bitcode: Improve invalid summary version error

Include the filename in the description.
Delta   File
+3 -4   llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+5 -0   llvm/test/Bitcode/invalid-summary-version.test
+0 -0   llvm/test/Bitcode/Inputs/invalid-summary-version.bc
+8 -4   3 files

LLVM/project 776cea3 llvm/test/CodeGen/AMDGPU rem_i128.ll div_v2i128.ll

[AMDGPU] Use explicit carry nodes for i64 wide integer lowering

This PR switches widened i64 add/sub lowering to use explicit UADDO/USUBO carry
nodes instead of glue-based carry chains.
Delta          File
+1,255 -1,278  llvm/test/CodeGen/AMDGPU/rem_i128.ll
+950 -975      llvm/test/CodeGen/AMDGPU/div_v2i128.ll
+758 -780      llvm/test/CodeGen/AMDGPU/div_i128.ll
+460 -514      llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system.ll
+226 -250      llvm/test/CodeGen/AMDGPU/flat_atomics_i64.ll
+192 -216      llvm/test/CodeGen/AMDGPU/flat_atomics_i64_system_noprivate.ll
+3,841 -4,013  17 files not shown
+4,729 -4,745  23 files

LLVM/project 2f0ae3a llvm/lib/Bitcode/Reader BitcodeReader.cpp, llvm/test/Bitcode invalid-summary-version.test

bitcode: Improve invalid summary version error

Include the filename in the description.
Delta   File
+5 -0   llvm/test/Bitcode/invalid-summary-version.test
+2 -1   llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+0 -0   llvm/test/Bitcode/Inputs/invalid-summary-version.bc
+7 -1   3 files