LLVM/project 63febe0llvm/lib/CodeGen/GlobalISel CallLowering.cpp

[GISel][CallLowering] Improve arg flags setting compile-time (#191761)

addFlagsUsingAttrFn is hot and shows up in compile-time profiles via
llvm::CallLowering::lowerCall. The culprit is the std::function callback.
Setting the flags directly from the AttributeSet instead is a 0.25%
compile-time improvement on CTMark AArch64 O0.
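
As a rough illustration of the change described above, the sketch below
contrasts querying attributes through a type-erased std::function callback
with reading them straight from an attribute set. ToyAttrSet, ToyArgFlags,
AttrKind, flagsViaCallback and flagsViaAttrSet are hypothetical stand-ins
invented for this example; they are not the LLVM types or the code in
CallLowering.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical stand-ins: NOT LLVM's Attribute, AttributeSet or ArgFlagsTy.
enum AttrKind : uint32_t {
  ZExt = 1u << 0,
  SExt = 1u << 1,
  InReg = 1u << 2,
  Nest = 1u << 3
};

struct ToyAttrSet {
  uint32_t Bits = 0;
  bool has(AttrKind K) const { return (Bits & K) != 0; }
};

struct ToyArgFlags {
  bool IsZExt = false;
  bool IsSExt = false;
  bool IsInReg = false;
  bool IsNest = false;
};

// Callback style: each attribute query is an indirect call through
// std::function, which the compiler usually cannot inline.
inline ToyArgFlags
flagsViaCallback(const std::function<bool(AttrKind)> &AttrFn) {
  ToyArgFlags F;
  F.IsZExt = AttrFn(ZExt);
  F.IsSExt = AttrFn(SExt);
  F.IsInReg = AttrFn(InReg);
  F.IsNest = AttrFn(Nest);
  return F;
}

// Direct style: the attribute set is queried without type erasure, so the
// has() calls are ordinary member calls that are easy to inline.
inline ToyArgFlags flagsViaAttrSet(const ToyAttrSet &Attrs) {
  ToyArgFlags F;
  F.IsZExt = Attrs.has(ZExt);
  F.IsSExt = Attrs.has(SExt);
  F.IsInReg = Attrs.has(InReg);
  F.IsNest = Attrs.has(Nest);
  return F;
}
```

Both functions compute the same flags; only the dispatch mechanism differs.
Whether the direct form is measurably faster in a given build depends on the
compiler and call site, so the sketch shows the shape of the change, not its
measured effect.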

https://llvm-compile-time-tracker.com/compare.php?from=d35cd21a3757ab6028024f0b47bc9d802d06eae6&to=e717c7017faf2cb386f0d02715fb55d252b3ae42&stat=instructions%3Au
DeltaFile
+26-26llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+26-261 files

LLVM/project 4720d95clang-tools-extra/clang-tidy/bugprone UncheckedOptionalAccessCheck.cpp, clang-tools-extra/test/clang-tidy/checkers/bugprone unchecked-optional-access.cpp

Fix registered matcher for bugprone-unchecked-optional-access (recent changes to libcxx) (#191681)

Further fix for #187788. The previous attempt in PR #188044 only updated
the model and model tests, but did not update the registered matcher.
DeltaFile
+28-11clang-tools-extra/test/clang-tidy/checkers/bugprone/Inputs/unchecked-optional-access/std/types/optional.h
+28-6clang-tools-extra/test/clang-tidy/checkers/bugprone/unchecked-optional-access.cpp
+8-3clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp
+3-2clang/include/clang/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.h
+3-2clang-tools-extra/clang-tidy/bugprone/UncheckedOptionalAccessCheck.cpp
+70-245 files

LLVM/project 3ecf872clang/lib/Format Format.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Detect language for file templates (#191502)

Fixes #191295.
DeltaFile
+9-1clang/lib/Format/Format.cpp
+4-0clang/unittests/Format/FormatTest.cpp
+13-12 files

LLVM/project 73c7a61clang/lib/Driver/ToolChains Hexagon.cpp, clang/test/Driver hexagon-toolchain-linux.c hexagon-toolchain-elf.c

[Hexagon] Add LTO options to Hexagon driver link args (#191336)

The Hexagon driver's constructHexagonLinkArgs() was not calling
addLTOOptions(). This meant that LTO plugin options weren't forwarded to
the linker.

This caused a crash when using ThinLTO with -fenable-matrix on
llvm-test-suite/SingleSource/UnitTests/matrix-types-spec.cpp:
LowerMatrixIntrinsicsPass did not run in the LTO backend because
-enable-matrix was not forwarded via -plugin-opt.

Add the addLTOOptions() call to both the musl and bare-metal code paths
in constructHexagonLinkArgs().
DeltaFile
+13-0clang/test/Driver/hexagon-toolchain-linux.c
+12-0clang/test/Driver/hexagon-toolchain-elf.c
+8-0clang/lib/Driver/ToolChains/Hexagon.cpp
+33-03 files

LLVM/project 4b2030fllvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fsub.ll llvm.amdgcn.reduce.fadd.ll

[AMDGPU] DPP wave reduction for double types - 2

Supported Ops: `fadd` and `fsub`
DeltaFile
+1,030-130llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+1,008-130llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+12-10llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2,050-2703 files

LLVM/project f03f6b8llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fmin.ll llvm.amdgcn.reduce.fmax.ll

[AMDGPU] DPP wave reduction for double types - 1

Supported Ops: `fmin` and `fmax`
DeltaFile
+1,112-234llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmin.ll
+1,112-234llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmax.ll
+27-13llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2,251-4813 files

LLVM/project aae5ff8llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.xor.ll llvm.amdgcn.reduce.and.ll

[AMDGPU] DPP wave reduction for long types - 3

Supported Ops: `and`, `or`, `xor`
DeltaFile
+984-132llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.xor.ll
+960-108llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.and.ll
+960-108llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.or.ll
+12-1llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2,916-3494 files

LLVM/project 8c0d49dllvm/lib/Target/AMDGPU SIISelLowering.cpp

Review comments:
use input wave instruction for checks
DeltaFile
+7-7llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+7-71 files

LLVM/project c9a7d36llvm/lib/Target/AMDGPU SIISelLowering.cpp

Update review comments
DeltaFile
+5-4llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+5-41 files

LLVM/project c9f09d3clang-tools-extra/clang-tidy/bugprone SignedBitwiseCheck.cpp, clang-tools-extra/clang-tidy/hicpp SignedBitwiseCheck.cpp

[clang-tidy] Rename hicpp-signed-bitwise to bugprone-signed-bitwise (#190449)

Part of https://github.com/llvm/llvm-project/issues/183462.

Closes https://github.com/llvm/llvm-project/issues/183465.

---------

Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
DeltaFile
+0-240clang-tools-extra/test/clang-tidy/checkers/hicpp/signed-bitwise.cpp
+240-0clang-tools-extra/test/clang-tidy/checkers/bugprone/signed-bitwise.cpp
+0-198clang-tools-extra/test/clang-tidy/checkers/hicpp/signed-bitwise-standard-types.cpp
+198-0clang-tools-extra/test/clang-tidy/checkers/bugprone/signed-bitwise-standard-types.cpp
+0-102clang-tools-extra/clang-tidy/hicpp/SignedBitwiseCheck.cpp
+102-0clang-tools-extra/clang-tidy/bugprone/SignedBitwiseCheck.cpp
+540-54016 files not shown
+765-73422 files

LLVM/project f32f6a8llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/rvv vp-splice-bf16.ll fixed-vectors-vp-splice-bf16.ll

[RISCV] Enable use of vfslide1up in lowerVPSpliceExperimental for bf16 vectors with Zvfbfa (#192169)
DeltaFile
+97-0llvm/test/CodeGen/RISCV/rvv/vp-splice-bf16.ll
+95-0llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vp-splice-bf16.ll
+7-73llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vp-splice.ll
+0-77llvm/test/CodeGen/RISCV/rvv/vp-splice.ll
+1-1llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+200-1515 files

LLVM/project 5f62baeflang/docs Directives.md, flang/include/flang/Support Fortran.h

[flang][cuda] Fix ignore_tkr(m) to also cover CUDA unified attribute (#192131)

The ignore_tkr(m) directive suppresses CUDA managed attribute checking
on dummy arguments, but it was not covering the unified attribute. This
caused a spurious error when passing a plain host array to a unified
dummy with ignore_tkr(m):
```
error: dummy argument 'x=' has ATTRIBUTES(UNIFIED) but its associated actual argument has no CUDA data attribute
```
Extend the IgnoreTKR::Managed check in AreCompatibleCUDADataAttrs to
accept Unified in addition to Managed and no-attribute.
DeltaFile
+12-1flang/test/Semantics/cuf10.cuf
+4-3flang/docs/Directives.md
+2-2flang/lib/Support/Fortran.cpp
+1-1flang/include/flang/Support/Fortran.h
+19-74 files

LLVM/project 9b8611bopenmp CMakeLists.txt

[OpenMP] Create check-openmp target for device targets (#192175)

offload/cmake/caches/AMDGPUBot.cmake enables
RUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES="openmp". In that
sub-build, the check-openmp target doesn't exist, so the build fails with
`unknown target 'check-openmp'` after 18f63d1375d0, which makes the
top-level check-openmp depend on check-openmp-amdgcn-amd-amdhsa.

In openmp, the device targets only call add_subdirectory(device), which
doesn't call construct_check_openmp_target(), so the check-openmp target
doesn't exist. `ninja check-openmp-amdgcn-amd-amdhsa` also failed with
the same error before 18f63d1375d0.

Fix by adding construct_check_openmp_target() for device targets as well.

Assisted-by: Claude Sonnet 4.6
DeltaFile
+3-3openmp/CMakeLists.txt
+3-31 files

LLVM/project 1e31171llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp

[NFC][AMDGPU] clang-format AMDGPUAsmPrinter.cpp (#192176)
DeltaFile
+28-24llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+28-241 files

LLVM/project bfa4de2llvm/lib/Target/SPIRV SPIRVInstructionSelector.cpp, llvm/test/CodeGen/SPIRV/llvm-intrinsics ctpop-vk.ll ctpop.ll

[SPIRV]Implementing PopCount for 16 and 64 bits (#191283)

`OpBitCount` only supports 32-bit types, so this patch modifies the
codegen to follow a pattern similar to `firstbithigh` and `firstbitlow`.
For 8 and 16 bits, the operands are zero-extended to 32 bits. For 64
bits, the value is bitcast into a 2xi32 type. The logic is adapted to
larger component counts as well.
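
The widening and splitting strategy described above can be sketched on
scalars in plain C++. This is only an illustration: bitCount32, popCount16
and popCount64 are hypothetical names, bitCount32 stands in for a
32-bit-only bit-count primitive, and summing the two 32-bit halves is an
assumption about how the 2xi32 result is combined (the message does not
spell that step out).

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for a primitive that only accepts 32-bit operands.
inline uint32_t bitCount32(uint32_t V) {
  uint32_t N = 0;
  while (V != 0) {
    V &= V - 1; // clear the lowest set bit
    ++N;
  }
  return N;
}

// Narrow case: zero-extend to 32 bits, then count. The extension adds
// only zero bits, so the count is unchanged.
inline uint32_t popCount16(uint16_t V) {
  return bitCount32(static_cast<uint32_t>(V));
}

// Wide case: view the 64-bit value as two 32-bit halves, count each
// half, and add the counts (assumed combination step).
inline uint32_t popCount64(uint64_t V) {
  uint32_t Lo = static_cast<uint32_t>(V);
  uint32_t Hi = static_cast<uint32_t>(V >> 32);
  return bitCount32(Lo) + bitCount32(Hi);
}
```

For vector operands the same steps would apply per component; the sketch
covers the scalar case only.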

Fix: https://github.com/llvm/llvm-project/issues/142677

---------

Co-authored-by: Joao Saffran <jderezende at microsoft.com>
DeltaFile
+267-1llvm/lib/Target/SPIRV/SPIRVInstructionSelector.cpp
+146-0llvm/test/CodeGen/SPIRV/llvm-intrinsics/ctpop-vk.ll
+1-0llvm/test/CodeGen/SPIRV/llvm-intrinsics/ctpop.ll
+414-13 files

LLVM/project c365068llvm/include/llvm/DebugInfo/GSYM GsymReader.h, llvm/lib/DebugInfo/GSYM GsymReader.cpp GsymCreator.cpp

Make GSYM 64 bit safe and add a new version 2 of the GSYM files (#190353)

# Motivation

GSYM files are approaching the point where they need 64-bit offsets. We
also want to add more global data to GSYM files. Right now the GSYM file
format is:
```
Header
AddressOffsets
AddressInfoOffsets
FileTable
StringTable
FunctionInfos
```
The `AddressOffsets`, `AddressInfoOffsets` and `FileTable` always
immediately follow the Header. The `StringTable` is pointed to by the
header, and the header uses 32-bit integers for the string table file
offset and file size. The

    [74 lines not shown]
DeltaFile
+956-391llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp
+1,135-0llvm/unittests/DebugInfo/GSYM/GSYMV2Test.cpp
+289-230llvm/lib/DebugInfo/GSYM/GsymReader.cpp
+72-148llvm/lib/DebugInfo/GSYM/GsymCreator.cpp
+157-48llvm/include/llvm/DebugInfo/GSYM/GsymReader.h
+166-0llvm/lib/DebugInfo/GSYM/GsymCreatorV2.cpp
+2,775-81739 files not shown
+4,196-95945 files

LLVM/project 4800482llvm/test/CodeGen/RISCV rv64p.ll

[RISCV] Add test showing constant materialization using pli.h/pli.w+srli/slli. NFC (#192159)
DeltaFile
+36-0llvm/test/CodeGen/RISCV/rv64p.ll
+36-01 files

LLVM/project 4994b36lld/test/ELF loongarch-emit-relocs-mark-la.s loongarch-abs64.s

update test

Created using spr 1.3.7
DeltaFile
+0-22lld/test/ELF/loongarch-emit-relocs-mark-la.s
+7-14lld/test/ELF/loongarch-abs64.s
+7-362 files

LLVM/project 089e6c3llvm/test/CodeGen/AArch64 ragreedy-csr.ll, llvm/test/CodeGen/PowerPC ctrloops-pseudo.ll sms-cpy-1.ll

fix

Created using spr 1.3.7
DeltaFile
+116-111llvm/test/CodeGen/AArch64/ragreedy-csr.ll
+37-34llvm/test/CodeGen/X86/lsr-addrecloops.ll
+21-30llvm/test/CodeGen/PowerPC/ctrloops-pseudo.ll
+19-18llvm/test/CodeGen/PowerPC/sms-cpy-1.ll
+4-4llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-invalid-ptr-extend.ll
+2-2llvm/test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll
+199-1996 files not shown
+209-20712 files

LLVM/project efef0d2llvm/test/CodeGen/AArch64 ragreedy-csr.ll, llvm/test/CodeGen/PowerPC ctrloops-pseudo.ll sms-cpy-1.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
DeltaFile
+116-111llvm/test/CodeGen/AArch64/ragreedy-csr.ll
+37-34llvm/test/CodeGen/X86/lsr-addrecloops.ll
+21-30llvm/test/CodeGen/PowerPC/ctrloops-pseudo.ll
+19-18llvm/test/CodeGen/PowerPC/sms-cpy-1.ll
+4-4llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-invalid-ptr-extend.ll
+2-2llvm/test/CodeGen/PowerPC/P10-stack-alignment.ll
+199-1996 files not shown
+209-20712 files

LLVM/project 7933be2llvm/test/CodeGen/AArch64 ragreedy-csr.ll, llvm/test/CodeGen/PowerPC ctrloops-pseudo.ll sms-cpy-1.ll

fix

Created using spr 1.3.7
DeltaFile
+116-111llvm/test/CodeGen/AArch64/ragreedy-csr.ll
+37-34llvm/test/CodeGen/X86/lsr-addrecloops.ll
+21-30llvm/test/CodeGen/PowerPC/ctrloops-pseudo.ll
+19-18llvm/test/CodeGen/PowerPC/sms-cpy-1.ll
+4-4llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-invalid-ptr-extend.ll
+2-2llvm/test/CodeGen/PowerPC/P10-stack-alignment.ll
+199-1996 files not shown
+209-20712 files

LLVM/project 9f6f26fllvm/lib/Transforms/Utils LoopUtils.cpp

[LSR][IndVarSimplify] Update assertion message (#192168)

rewriteLoopExitValues is called by both LSR and IndVarSimplify. Update
the assertion message to match this reality rather than only mentioning
IndVarSimplify.
DeltaFile
+1-1llvm/lib/Transforms/Utils/LoopUtils.cpp
+1-11 files

LLVM/project c1fc739libclc CMakeLists.txt

[libclc] Only add test folder when LLVM_INCLUDE_TESTS is ON (#191948)
DeltaFile
+3-1libclc/CMakeLists.txt
+3-11 files

LLVM/project 19d1b34libclc CMakeLists.txt

[libclc][CMake][NFC] Delete dead code LLVM_PACKAGE_VERSION (#191943)

Use of LLVM_PACKAGE_VERSION in AddLibclc.cmake was dropped by e20ae16ce672.
DeltaFile
+0-7libclc/CMakeLists.txt
+0-71 files

LLVM/project 18f63d1.ci compute_projects_test.py monolithic-linux.sh, libclc README.md

[runtimes] Aggregate per-target runtime checks in top-level check-${runtime_name} (#191743)

When a per-target runtime build exports a
check-${runtime_name}-${target} proxy, make the top-level
check-${runtime_name} target depend on it, creating
check-${runtime_name} on demand (it may not exist).

This applies regardless of whether the runtime comes from the default
LLVM_ENABLE_RUNTIMES set or from a target-specific
RUNTIMES_<target>_LLVM_ENABLE_RUNTIMES override.

This allows a single `check-${runtime_name}` command to trigger all
per-target tests for that runtime.
DeltaFile
+13-2libclc/README.md
+5-5.ci/compute_projects_test.py
+9-0llvm/runtimes/CMakeLists.txt
+1-1.ci/monolithic-linux.sh
+1-1.ci/monolithic-windows.sh
+1-1.ci/compute_projects.py
+30-106 files

LLVM/project 8265f79llvm/test/CodeGen/AArch64 itofp-bf16.ll, llvm/test/CodeGen/RISCV/rvv vfma-vp.ll

Merge branch 'main' into users/rampitec/gfx1250-true16-feature-only
DeltaFile
+4,582-5,914llvm/test/CodeGen/RISCV/rvv/vfma-vp.ll
+6,877-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-sve-instructions.s
+5,336-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-writeback.s
+0-4,851llvm/test/tools/llvm-mca/RISCV/SiFiveX390/vector-fp.s
+2,832-1,746llvm/test/CodeGen/AArch64/itofp-bf16.ll
+4,526-0llvm/test/tools/llvm-mca/RISCV/SiFiveX390/rvv/arithmetic.test
+24,153-12,5114,504 files not shown
+300,314-122,0814,510 files

LLVM/project 348061dlldb/include/lldb/Target Process.h, lldb/source/Target Thread.cpp Process.cpp

[lldb] Fix deadlock when scripted frame providers load on private state thread (#191913)

Frame providers are an overlay on top of the parent reality (the
unwinder stack). The private state thread (PST) manages the stop of that
parent reality, so the correct view for PST logic IS the parent --
providers should only be applied once the process has settled and
clients query the stopped state.

When a scripted breakpoint's `was_hit` callback calls
`EvaluateExpression` on the PST, `RunThreadPlan` spawns an override PST
(Thread B) and reassigns `m_current_private_state_thread_sp` to it. Two
threads then need to see parent frames:

- Thread B (override PST): processes stop events via
`HandlePrivateEvent` -> `ShouldStop` -> `GetStackFrameList`. If it loads
a provider, the provider's Python code can acquire locks held by Thread
A, causing a deadlock.

- Thread A (original PST): processes events inline via

    [25 lines not shown]
DeltaFile
+119-0lldb/test/API/functionalities/scripted_frame_provider/was_hit_deadlock/TestWasHitWithFrameProviderDeadlock.py
+55-17lldb/source/Target/Thread.cpp
+47-0lldb/test/API/functionalities/scripted_frame_provider/was_hit_deadlock/bkpt_resolver.py
+41-0lldb/test/API/functionalities/scripted_frame_provider/was_hit_deadlock/frame_provider.py
+29-3lldb/include/lldb/Target/Process.h
+22-3lldb/source/Target/Process.cpp
+313-232 files not shown
+335-238 files

LLVM/project 05bd49cbolt/lib/Core DIEBuilder.cpp, bolt/test/X86 dwarf5-locexpr-addrx.s

[BOLT] Fix DW_FORM_implicit_const values lost during DWARF5 rewriting

Summary:
Fix two bugs in DIEBuilder that caused DW_FORM_implicit_const values to
be zeroed out when rewriting DWARF5 debug sections (--update-debug-sections).

1. In constructDIEFast(), DWARFFormValue was constructed with just the
   form code, leaving the value at 0. For DW_FORM_implicit_const,
   extractValue() is a no-op since the value is expected to be pre-set.
   Fix: use AttrSpec.getFormValue() which initializes the value from the
   abbreviation table.

2. In assignAbbrev(), AddAttribute(Attr.getAttribute(), Attr.getForm())
   used the two-argument overload which discards the implicit_const
   value. Fix: use AddAttribute(Attr) to copy the full DIEAbbrevData.

Fixes https://github.com/llvm/llvm-project/issues/192084
DeltaFile
+2-2bolt/lib/Core/DIEBuilder.cpp
+2-0bolt/test/X86/dwarf5-locexpr-addrx.s
+4-22 files

LLVM/project 1a39343llvm/lib/Target/AMDGPU AMDGPULowerModuleLDSPass.cpp AMDGPULowerExecSync.cpp, llvm/test/CodeGen/AMDGPU lower-module-lds-link-time-global-scope.ll lower-module-lds-link-time-classify.ll

[AMDGPU] Add object linking support for LDS and named barrier lowering in the middle end (#191645)

This is the first patch in a series introducing object linking support
for AMDGPU.

This PR adds the `-amdgpu-enable-object-linking` flag to enable object
linking in the backend. It also updates the `AMDGPULowerModuleLDSPass`
and `AMDGPULowerExecSync` passes to support lowering LDS and named
barrier globals when object linking is enabled.
DeltaFile
+167-0llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp
+125-0llvm/test/CodeGen/AMDGPU/lower-module-lds-link-time-global-scope.ll
+73-0llvm/test/CodeGen/AMDGPU/lower-module-lds-link-time-classify.ll
+50-0llvm/test/CodeGen/AMDGPU/lower-module-lds-link-time-internal-multi-user.ll
+44-0llvm/lib/Target/AMDGPU/AMDGPULowerExecSync.cpp
+38-0llvm/test/CodeGen/AMDGPU/lower-module-lds-link-time-internal-func.ll
+497-03 files not shown
+540-09 files

LLVM/project 59033e1clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/non-overloaded vpaire.c, clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/overloaded vpairo.c vpaire.c

Merge branch 'users/ziqingluo/PR-172429193-pre-2' into users/ziqingluo/PR-172429193-2-split-1
DeltaFile
+6,877-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-sve-instructions.s
+5,336-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-writeback.s
+3,167-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Ultra-neon-instructions.s
+2,723-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/overloaded/vpairo.c
+2,723-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/overloaded/vpaire.c
+2,723-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvzip/policy/non-overloaded/vpaire.c
+23,549-0674 files not shown
+87,632-9,781680 files