LLVM/project ff3d550clang/lib/AST/ByteCode InterpBuiltin.cpp

[clang][bytecode][NFC] Add popToUInt64() to builtin evaluation (#170164)

We often don't need the APSInt at all, so add a version that pops the
integral from the stack and just static_casts to uint64_t.
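
A minimal sketch of the idea, assuming a toy value stack: `ToyStack` below is a hypothetical stand-in and not the interpreter's real stack type, and only the name `popToUInt64` comes from the commit title.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the interpreter's value stack. The real stack in
// clang/lib/AST/ByteCode is typed and considerably more involved.
struct ToyStack {
  std::vector<int64_t> Slots;

  void push(int64_t V) { Slots.push_back(V); }

  int64_t pop() {
    assert(!Slots.empty() && "pop from empty stack");
    int64_t V = Slots.back();
    Slots.pop_back();
    return V;
  }
};

// Pop the top integral and convert it straight to uint64_t, without building
// an arbitrary-precision integer first.
inline uint64_t popToUInt64(ToyStack &Stk) {
  return static_cast<uint64_t>(Stk.pop());
}
```

A negative value wraps modulo 2^64 under the cast, which is the usual behaviour of `static_cast` to an unsigned type.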
DeltaFile
+26-26clang/lib/AST/ByteCode/InterpBuiltin.cpp
+26-261 files

LLVM/project 867d353llvm/include/llvm/Frontend/OpenMP OMPIRBuilder.h, llvm/lib/Frontend/OpenMP OMPIRBuilder.cpp

[OpenMP][flang] Support GPU team-reductions on allocatables (#169651)

Extends the work started in #165714 by supporting team reductions.
Similar to what was done in #165714, this PR introduces proper
allocations, loads, and stores for by-ref reductions in teams-related
callbacks:
* `_omp_reduction_list_to_global_copy_func`,
* `_omp_reduction_list_to_global_reduce_func`,
* `_omp_reduction_global_to_list_copy_func`, and
* `_omp_reduction_global_to_list_reduce_func`.
DeltaFile
+148-47llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp
+121-0mlir/test/Target/LLVMIR/allocatable_gpu_reduction_teams.mlir
+14-10llvm/include/llvm/Frontend/OpenMP/OMPIRBuilder.h
+2-0mlir/test/Target/LLVMIR/allocatable_gpu_reduction.mlir
+285-574 files

LLVM/project 728cadalldb/docs dil-expr-lang.ebnf, lldb/include/lldb/ValueObject DILAST.h DILParser.h

[LLDB] Add type casting to DIL, part 1 of 3. (#165199)

This is an alternative to
https://github.com/llvm/llvm-project/pull/159500, breaking that PR down
into three separate PRs, to make it easier to review.

This first PR of the three adds the basic framework for doing type
casting to the DIL code, but it does not actually do any casting: in this
PR the DIL parser only recognizes builtin type names, and the DIL
interpreter does not do anything except return the original operand (no
casting). The second and third PRs will add most of the type parsing,
and do the actual type casting, respectively.
DeltaFile
+163-4lldb/source/ValueObject/DILParser.cpp
+33-0lldb/include/lldb/ValueObject/DILAST.h
+25-3lldb/docs/dil-expr-lang.ebnf
+12-0lldb/source/ValueObject/DILEval.cpp
+6-0lldb/include/lldb/ValueObject/DILParser.h
+4-0lldb/source/ValueObject/DILAST.cpp
+243-71 files not shown
+244-77 files

LLVM/project fbdf8abllvm/test/CodeGen/AMDGPU amdgpu-codegenprepare-idiv.ll fshr.ll, llvm/test/Transforms/LoadStoreVectorizer/AMDGPU merge-vectors-complex.ll

[LSV] Merge contiguous chains across scalar types (#154069)

This change enables the LoadStoreVectorizer to merge and vectorize
contiguous chains even when their scalar element types differ, as long
as the total bitwidth matches. To do so, we rebase offsets between
chains, normalize value types to a common integer type, and insert the
necessary casts around loads and stores. This uncovers more
vectorization opportunities and explains the expected codegen updates
across AMDGPU tests.
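
As a rough illustration of normalizing mixed scalar types to a common integer type, the sketch below treats an adjacent `float` and `int32_t` as two 32-bit integer lanes read in one wide copy, then casts each lane back. All names here (`Pair`, `Lanes`, `loadAsLanes`, and the lane helpers) are hypothetical and are not taken from the LoadStoreVectorizer; the pass itself works on LLVM IR, not on C++ objects.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Two contiguous 32-bit fields with different scalar types.
struct Pair {
  float F;
  int32_t I;
};
static_assert(sizeof(Pair) == 8, "sketch assumes two packed 32-bit fields");

// The common integer representation: two 32-bit lanes.
struct Lanes {
  uint32_t Lane[2];
};

// One wide copy covering both fields, standing in for a single vector load.
inline Lanes loadAsLanes(const Pair &P) {
  Lanes L;
  std::memcpy(&L, &P, sizeof(L));
  return L;
}

// Casts inserted after the wide load to recover the original types.
inline float laneToFloat(uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

inline int32_t laneToInt(uint32_t Bits) {
  int32_t I;
  std::memcpy(&I, &Bits, sizeof(I));
  return I;
}
```

The casts only reinterpret bits, so each value round-trips unchanged as long as the lane width matches the width of the original type.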

Key changes:
- Chain merging
  - Build contiguous subchains and then merge adjacent ones when:
    - They refer to the same underlying pointer object and address space.
    - They are either all loads or all stores.
    - A constant leader-to-leader delta exists.
    - Rebasing one chain into the other's coordinate space does not overlap.
    - All elements have equal total bit width.
  - Rebase the second chain by the computed delta and append it to the

    [22 lines not shown]
DeltaFile
+834-801llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
+197-206llvm/test/CodeGen/AMDGPU/fshr.ll
+171-162llvm/test/CodeGen/AMDGPU/fdiv.ll
+258-66llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-vectors-complex.ll
+192-120llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.gfx950.ll
+141-149llvm/test/CodeGen/AMDGPU/fshl.ll
+1,793-1,50444 files not shown
+3,995-2,78250 files

LLVM/project 039f883mlir/lib/Dialect/Tensor/Transforms BufferizableOpInterfaceImpl.cpp, mlir/test/Dialect/Tensor bufferize.mlir

[mlir][tensor] Fix bug in `ConcatOpInterface` (#168676)

This PR fixes an issue in `ConcatOpInterface` where `tensor.concat`
fails when the concat dimension is dynamic while the result type is
static. The fix unifies the computation by using `OpFoldResult`,
avoiding the need to separately handle dynamic and static dimension
values. Fixes #162776.
DeltaFile
+14-40mlir/lib/Dialect/Tensor/Transforms/BufferizableOpInterfaceImpl.cpp
+33-7mlir/test/Dialect/Tensor/bufferize.mlir
+47-472 files

LLVM/project 0a03b7eclang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGenBuiltins/X86 vec-set-builtins.c

[CIR] Upstream CIR codegen for vec_set x86 builtin (#169265)

Support CIR codegen for x86 builtin vec_set.
DeltaFile
+141-0clang/test/CIR/CodeGenBuiltins/X86/vec-set-builtins.c
+20-5clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+161-52 files

LLVM/project 91531f3llvm/lib/AsmParser LLParser.cpp, llvm/lib/Bitcode/Reader BitcodeReader.cpp

[ThinLTO] Fix parsing null aliasee in alias summary (#169490)

In
https://github.com/llvm/llvm-project/commit/f8182f1aef5b6ec74cbe2c1618e759f0113921ba,
we added support for printing a "null" aliasee in AsmWriter, but the
corresponding support in LLParser was missing.
DeltaFile
+30-26llvm/test/Assembler/thinlto-summary.ll
+16-15llvm/lib/AsmParser/LLParser.cpp
+12-7llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+7-4llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+65-524 files

LLVM/project 8dc6abbmlir/lib/Analysis/Presburger Matrix.cpp

[mlir][presburger] Implement moveColumns using std::rotate (#168243)
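
A sketch of how a column move can be expressed with `std::rotate`, assuming the semantics "move the `num` columns starting at `srcPos` so that they start at `dstPos`, shifting the columns in between". The vector-of-rows layout and the free function below are illustrative assumptions, not the actual `Matrix` implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<int>;

// Move the `num` columns starting at `srcPos` so that they begin at `dstPos`.
// Columns between the source and destination slide over to fill the gap.
inline void moveColumns(std::vector<Row> &rows, std::size_t srcPos,
                        std::size_t num, std::size_t dstPos) {
  if (num == 0 || srcPos == dstPos)
    return;
  for (Row &row : rows) {
    assert(srcPos + num <= row.size() && dstPos + num <= row.size());
    auto first = row.begin();
    if (dstPos < srcPos) {
      // Rotate left: the moved block becomes the front of [dstPos, srcPos + num).
      std::rotate(first + dstPos, first + srcPos, first + srcPos + num);
    } else {
      // Rotate right: the moved block becomes the back of [srcPos, dstPos + num).
      std::rotate(first + srcPos, first + srcPos + num, first + dstPos + num);
    }
  }
}
```

A single rotate per row replaces a manual shuffle, and moving the same block back with the positions swapped restores the original order.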

DeltaFile
+15-26mlir/lib/Analysis/Presburger/Matrix.cpp
+15-261 files

LLVM/project 28e2004clang/test/SemaCUDA device-kernel-call.cu

[clang][CUDA] Clean up tests from device-side kernel call support. NFC

- Remove unused 'CHECK' from the CUDASema test
DeltaFile
+0-8clang/test/SemaCUDA/device-kernel-call.cu
+0-81 files

LLVM/project da1a887flang/test/Lower module-debug-file-loc-linux.f90, flang/test/Transforms debug-dwarf-version.fir debug-line-table-existing.fir

[flang] Enable debug test on AIX (NFC) (#169945)

DeltaFile
+1-1flang/test/Transforms/debug-dwarf-version.fir
+1-1flang/test/Transforms/debug-line-table-existing.fir
+1-1flang/test/Transforms/debug-line-table-inc-same-file.fir
+1-1flang/test/Transforms/debug-line-table-inc-file.fir
+1-1flang/test/Lower/module-debug-file-loc-linux.f90
+5-55 files

LLVM/project 744480autils/bazel overlay_directories.py configure.bzl

[bazel] Rewrite overlay handling to starlark (#170000)

Starlark is perfectly capable of doing what we need, and this avoids the
dependency on a host Python.
DeltaFile
+0-99utils/bazel/overlay_directories.py
+43-35utils/bazel/configure.bzl
+43-1342 files

LLVM/project 47436abllvm/lib/Target/AMDGPU AMDGPUCallLowering.cpp

Use MF variable
DeltaFile
+1-1llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+1-11 files

LLVM/project 10aaf8cllvm/lib/Target/AMDGPU AMDGPUCallLowering.cpp

Fix build
DeltaFile
+2-2llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+2-21 files

LLVM/project 1f794e6llvm/test/CodeGen/RISCV/rvv fminimumnum-sdnode.ll

[NFC][RISCV] Correct fminimumnum test case (#170169)

The test case mismatch was introduced in #135727
DeltaFile
+222-222llvm/test/CodeGen/RISCV/rvv/fminimumnum-sdnode.ll
+222-2221 files

LLVM/project 0e3b67fllvm/lib/Target/AMDGPU AMDGPUCallLowering.cpp

Use helper
DeltaFile
+1-3llvm/lib/Target/AMDGPU/AMDGPUCallLowering.cpp
+1-31 files

LLVM/project 2721bf1llvm/lib/Target/AMDGPU AMDGPULegalizerInfo.cpp

Remove comment
DeltaFile
+0-1llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+0-11 files

LLVM/project 755733elldb/include/lldb lldb-forward.h, lldb/include/lldb/Target ExecutionContext.h StackFrame.h

[lldb/Target] Track containing StackFrameList to avoid circular dependencies (#170226)

This change adds tracking of the StackFrameList that produced each frame
by storing a weak pointer (m_frame_list_wp) in both `StackFrame` and
`ExecutionContextRef`.

When resolving frames through `ExecutionContextRef::GetFrameSP`, the
code now first attempts to use the remembered frame list instead of
immediately calling `Thread::GetStackFrameList`. This breaks circular
dependencies that can occur during frame provider initialization, where
creating a frame provider might trigger `ExecutionContext` resolution,
which would then call back into `Thread::GetStackFrameList()`, creating
an infinite loop.

The `StackFrameList` now sets m_frame_list_wp on every frame it creates,
and a new virtual method `GetOriginatingStackFrameList` allows frames to
expose their originating list.
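
The pattern described above can be sketched with `std::weak_ptr`. The types `Frame`, `FrameList`, `FrameRef` and the function `resolve` are simplified, hypothetical stand-ins for the LLDB classes, and the fallback callable plays the role of `Thread::GetStackFrameList`.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct FrameList; // forward declaration

struct Frame {
  std::size_t Index;
  std::weak_ptr<FrameList> OriginWP; // the list that produced this frame
};

// Must be owned by a std::shared_ptr so that shared_from_this() is valid.
struct FrameList : std::enable_shared_from_this<FrameList> {
  std::vector<std::shared_ptr<Frame>> Frames;

  // Every frame created here remembers its originating list weakly.
  std::shared_ptr<Frame> createFrame() {
    auto F = std::make_shared<Frame>();
    F->Index = Frames.size();
    F->OriginWP = shared_from_this();
    Frames.push_back(F);
    return F;
  }
};

// A reference that can be resolved back to a frame later.
struct FrameRef {
  std::weak_ptr<FrameList> ListWP;
  std::size_t Index;
};

// Try the remembered list first; call the fallback only if that list is gone.
template <typename Fallback>
std::shared_ptr<Frame> resolve(const FrameRef &Ref, Fallback &&GetCurrentList) {
  std::shared_ptr<FrameList> List = Ref.ListWP.lock();
  if (!List)
    List = GetCurrentList();
  if (!List || Ref.Index >= List->Frames.size())
    return nullptr;
  return List->Frames[Ref.Index];
}
```

Because the fallback is skipped while the remembered list is alive, resolving a frame during provider setup does not re-enter the code that builds the list. The weak pointer also avoids an ownership cycle between a list and its frames.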

Signed-off-by: Med Ismail Bennani <ismail at bennani.ma>
DeltaFile
+15-2lldb/source/Target/ExecutionContext.cpp
+12-2lldb/include/lldb/Target/ExecutionContext.h
+14-0lldb/include/lldb/Target/StackFrame.h
+5-0lldb/source/Target/StackFrameList.cpp
+1-1lldb/include/lldb/Target/StackFrameList.h
+1-0lldb/include/lldb/lldb-forward.h
+48-56 files

LLVM/project 326ee7aclang-tools-extra/docs ReleaseNotes.rst, clang-tools-extra/test/clang-tidy/checkers/misc const-correctness-pointer-as-pointers.cpp

[clang-tidy] Fix false positive in `misc-const-correctness` (#170065)

Closes https://github.com/llvm/llvm-project/issues/131599 and
https://github.com/llvm/llvm-project/issues/170033
DeltaFile
+17-0clang/unittests/Analysis/ExprMutationAnalyzerTest.cpp
+15-0clang-tools-extra/test/clang-tidy/checkers/misc/const-correctness-pointer-as-pointers.cpp
+5-0clang/lib/Analysis/ExprMutationAnalyzer.cpp
+2-1clang-tools-extra/docs/ReleaseNotes.rst
+39-14 files

LLVM/project 91e8780clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/CodeGen CIRGenBuiltin.cpp

[CIR] Upstream Builtin FloorOp (#169954)

DeltaFile
+21-0clang/test/CIR/CodeGenBuiltins/builtins-floating-point.c
+10-0clang/lib/CIR/CodeGen/CIRGenBuiltin.cpp
+10-0clang/include/clang/CIR/Dialect/IR/CIROps.td
+9-0clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+50-04 files

LLVM/project 7cbf5b7clang/test/CodeGen attr-modular-format.c

Add a codegen test for the overwrite behavior
DeltaFile
+9-0clang/test/CodeGen/attr-modular-format.c
+9-01 files

LLVM/project a1c9f84clang/lib/Sema SemaDeclAttr.cpp

Minor cleanup
DeltaFile
+3-3clang/lib/Sema/SemaDeclAttr.cpp
+3-31 files

LLVM/project c067cceclang/lib/Sema SemaDeclAttr.cpp

clang-format
DeltaFile
+1-2clang/lib/Sema/SemaDeclAttr.cpp
+1-21 files

LLVM/project c6f501dcompiler-rt/test CMakeLists.txt, compiler-rt/test/xray lit.site.cfg.py.in

[XRay] Run tests inside bootstrapping build (#168378)

COMPILER_RT_STANDALONE_BUILD is set when doing a bootstrapping build
through LLVM_ENABLE_RUNTIMES with the CMake source directory being in
llvm/. This patch changes the XRay tests to also detect that we have
LLVM sources and the llvm-xray tool if we are in a bootstrapping build
through the use of the LLVM_TREE_AVAILABLE variable which is set in
runtimes/CMakeLists.txt.
DeltaFile
+7-0compiler-rt/test/CMakeLists.txt
+1-1compiler-rt/test/xray/lit.site.cfg.py.in
+8-12 files

LLVM/project 7272717clang/lib/Sema SemaDeclAttr.cpp, clang/test/Sema attr-modular-format.c

Warn about duplicate attributes
DeltaFile
+29-1clang/lib/Sema/SemaDeclAttr.cpp
+12-0clang/test/Sema/attr-modular-format.c
+41-12 files

LLVM/project 43d9ddbmlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp, mlir/test/Conversion/XeGPUToXeVM loadstore_matrix.mlir

support memref subview in xegpu to xevm type conversion
DeltaFile
+38-5mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+14-8mlir/test/Conversion/XeGPUToXeVM/loadstore_matrix.mlir
+52-132 files

LLVM/project e27dec5lldb/source/Plugins/InstrumentationRuntime/BoundsSafety InstrumentationRuntimeBoundsSafety.cpp InstrumentationRuntimeBoundsSafety.h, lldb/test/API/lang/BoundsSafety/soft_trap TestBoundsSafetyInstrumentationPlugin.py

[BoundsSafety][LLDB] Implement instrumentation plugin for -fbounds-safety soft traps (#169117)

This patch tries to upstream code landed downstream in
https://github.com/swiftlang/llvm-project/pull/11835.

This patch implements an instrumentation plugin for the
`-fbounds-safety` soft trap mode first implemented in
https://github.com/swiftlang/llvm-project/pull/11645 (rdar://158088757).
That functionality isn't supported in upstream Clang yet; however, the
instrumentation plugin can be compiled without issue, so this patch tries to
upstream it. The included tests are all disabled when the clang used for
testing doesn't support `-fbounds-safety`, which means the tests will be
skipped. However, it's fairly easy to point LLDB at a clang that does
support `-fbounds-safety`. I've done this and confirmed the tests pass.
To use a custom clang, the following can be done:

* For API tests, set the `LLDB_TEST_COMPILER` CMake cache variable to
  point to the appropriate compiler.
* For shell tests applying a patch like this can be used to set the

    [43 lines not shown]
DeltaFile
+481-0lldb/source/Plugins/InstrumentationRuntime/BoundsSafety/InstrumentationRuntimeBoundsSafety.cpp
+148-0lldb/test/API/lang/BoundsSafety/soft_trap/TestBoundsSafetyInstrumentationPlugin.py
+61-0lldb/source/Plugins/InstrumentationRuntime/BoundsSafety/InstrumentationRuntimeBoundsSafety.h
+36-0lldb/test/Shell/BoundsSafety/boundssafety_soft_trap_call_with_str_no_dbg_info_null_str.test
+34-0lldb/test/Shell/BoundsSafety/boundssafety_soft_trap_call_minimal_missing_reason.test
+34-0lldb/test/Shell/BoundsSafety/boundssafety_soft_trap_call_with_str_missing_reason.test
+794-016 files not shown
+1,086-022 files

LLVM/project 677e1d0clang/lib/CIR/CodeGen CIRGenBuiltinX86.cpp, clang/test/CIR/CodeGenBuiltins/X86 avx512vl-builtins.c avx512f-builtins.c

[CIR] Upstream gather intrinsics (#169157)

DeltaFile
+201-0clang/test/CIR/CodeGenBuiltins/X86/avx512vl-builtins.c
+191-0clang/test/CIR/CodeGenBuiltins/X86/avx512f-builtins.c
+91-1clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
+483-13 files

LLVM/project be5db33libclc/opencl/lib/generic/integer bitfield_insert.cl

[libclc] Fix bitfield_insert implementation (#170208)

The `bitfield_insert` function in the OpenCL C library had an incorrect
`__CLC_BODY` definition that included the `.inc` file for the
`__clc_bitfield_insert` declaration instead of the correct
implementation. As a result, the function was not defined at all, leading to
linker errors when trying to use it.
DeltaFile
+1-1libclc/opencl/lib/generic/integer/bitfield_insert.cl
+1-11 files

LLVM/project 64a7628clang/include/clang/CIR MissingFeatures.h, clang/lib/CIR/CodeGen CIRGenCoroutine.cpp

[CIR] Upstream Emit the resume branch for cir.await op (#169864)

This PR upstreams the emission of the `cir.await` resume branch.
Handling the case where the return value of `co_await` is not ignored is
deferred to a future PR, which will be added once `co_return` is
upstreamed. Additionally, the `forLValue` variable is always `false` in
the current implementation. When support for emitting `coro_yield` is
added, this variable will be set to `true`, so that work is also
deferred to a future PR.
DeltaFile
+39-0clang/lib/CIR/CodeGen/CIRGenCoroutine.cpp
+15-3clang/test/CIR/CodeGen/coro-task.cpp
+2-0clang/include/clang/CIR/MissingFeatures.h
+56-33 files

LLVM/project ace65a0lldb/test/Shell/helper toolchain.py

[LLDB] Update Shell lit config to handle c8031c3dd743 (#170225)

DeltaFile
+1-1lldb/test/Shell/helper/toolchain.py
+1-11 files