LLVM/project 615a7e0mlir/lib/Conversion/MathToSPIRV MathToSPIRV.cpp, mlir/test/Conversion/MathToSPIRV math-to-opencl-spirv.mlir

[mlir][SPIR-V] Convert math.fpowi to spirv.CL.pown (#196701)
DeltaFile
+21-1mlir/lib/Conversion/MathToSPIRV/MathToSPIRV.cpp
+14-0mlir/test/Conversion/MathToSPIRV/math-to-opencl-spirv.mlir
+35-12 files

LLVM/project cb5d076clang/docs ClangFormatStyleOptions.rst, clang/include/clang/Format Format.h

[clang-format] Add BreakFunctionDeclarationParameters option. (#196567)

Adds an option to break function declaration parameters, always putting
them on the next line after the function's opening parenthesis.

This is an equivalent of `BreakFunctionDefinitionParameters`, but for
function declarations.

---------

Co-authored-by: Lukas Jirkovsky <lukas.jirkovsky at aveco.com>
DeltaFile
+27-0clang/unittests/Format/FormatTest.cpp
+16-0clang/include/clang/Format/Format.h
+15-0clang/docs/ClangFormatStyleOptions.rst
+9-0clang/unittests/Format/AlignBracketsTest.cpp
+6-0clang/lib/Format/TokenAnnotator.cpp
+3-0clang/lib/Format/Format.cpp
+76-02 files not shown
+79-08 files

LLVM/project a2d17c5clang/test/Preprocessor predefined-arch-macros.c

clang: Fix using -march=amdgcn in some r600 run lines
DeltaFile
+2-2clang/test/Preprocessor/predefined-arch-macros.c
+2-21 files

LLVM/project 29a9658clang/lib/Driver/ToolChains AMDGPU.cpp

clang/AMDGPU: Use all_equal instead of building a temporary set

Addresses comment on #196373
DeltaFile
+1-2clang/lib/Driver/ToolChains/AMDGPU.cpp
+1-21 files
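
A minimal sketch of the idea in plain standard C++, not the driver code itself. `allEqualViaSet` and `allEqual` are hypothetical helpers written for illustration; the second one stands in for an `all_equal`-style range helper and avoids the temporary container.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical "temporary set" pattern: copy every value into a std::set and
// check that at most one distinct value remains. Allocates per element.
bool allEqualViaSet(const std::vector<std::string> &Values) {
  std::set<std::string> Unique(Values.begin(), Values.end());
  return Unique.size() <= 1;
}

// Hypothetical all_equal-style helper: no allocation, just a search for the
// first adjacent pair that differs. An empty range counts as "all equal".
bool allEqual(const std::vector<std::string> &Values) {
  return std::adjacent_find(Values.begin(), Values.end(),
                            std::not_equal_to<>()) == Values.end();
}
```

Both helpers agree on every input; the second simply skips building and destroying the set.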

LLVM/project aa68a9cllvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-instructions.ll, llvm/test/CodeGen/AMDGPU load-atomic-global.ll

Merge branch 'main' into users/kparzysz/control-driver-warnings
DeltaFile
+4,634-367llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+3,071-1,257llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+1,660-649llvm/test/CodeGen/AArch64/bf16-instructions.ll
+1,440-725llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+1,608-0llvm/test/MC/AMDGPU/gfx13_asm_vop3p.s
+1,246-0llvm/test/CodeGen/AMDGPU/load-atomic-global.ll
+13,659-2,9981,703 files not shown
+48,886-20,0041,709 files

LLVM/project 882122fclang/lib/Driver SanitizerArgs.cpp, clang/lib/Driver/ToolChains AMDGPU.cpp AMDGPU.h

clang: Refactor handling of offload sanitizer arguments

Previously the AMDGPU toolchains hackily handled -fsanitize arguments.
They would lie and report that all host side sanitizers are available,
then TranslateArgs would filter out the device side cases that do not
work, providing diagnostics for the skipped cases. Move that logic
into the base sanitizer argument parsing.

This makes the produced diagnostics more consistent. Previously we
would get repeated warnings when a sanitizer is fully unsupported
by amdgpu, which should now be emitted once for the toolchain. These could
be further improved; we're printing the specific field of -fsanitize
in more cases where it could be skipped. In other cases we have the
opposite problem, where we aren't reporting the exact sanitizer
from the -f flag in the case that depends on a subtarget feature.

This will help fix other broken target specific flag forwarding bugs
in the future.

Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
DeltaFile
+56-47clang/lib/Driver/ToolChains/AMDGPU.cpp
+85-11clang/lib/Driver/SanitizerArgs.cpp
+7-75clang/lib/Driver/ToolChains/AMDGPU.h
+21-24clang/lib/Driver/ToolChains/HIPAMD.cpp
+17-21clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+14-14clang/test/Driver/hip-sanitize-options.hip
+200-1928 files not shown
+261-20714 files

LLVM/project ed5b3edclang/lib/Driver SanitizerArgs.cpp, clang/lib/Driver/ToolChains AMDGPU.cpp AMDGPU.h

clang: Refactor handling of offload sanitizer arguments

Previously the AMDGPU toolchains hackily handled -fsanitize arguments.
They would lie and report that all host side sanitizers are available,
then TranslateArgs would filter out the device side cases that do not
work, providing diagnostics for the skipped cases. Move that logic
into the base sanitizer argument parsing.

This makes the produced diagnostics more consistent. Previously we
would get repeated warnings when a sanitizer is fully unsupported
by amdgpu, which should now be emitted once for the toolchain. These could
be further improved; we're printing the specific field of -fsanitize
in more cases where it could be skipped. In other cases we have the
opposite problem, where we aren't reporting the exact sanitizer
from the -f flag in the case that depends on a subtarget feature.

This will help fix other broken target specific flag forwarding bugs
in the future.

Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
DeltaFile
+56-47clang/lib/Driver/ToolChains/AMDGPU.cpp
+85-11clang/lib/Driver/SanitizerArgs.cpp
+7-75clang/lib/Driver/ToolChains/AMDGPU.h
+21-24clang/lib/Driver/ToolChains/HIPAMD.cpp
+17-21clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+14-14clang/test/Driver/hip-sanitize-options.hip
+200-1928 files not shown
+261-20714 files

LLVM/project 2f49369llvm/lib/Target/AMDGPU VOP1Instructions.td VOPInstructions.td

[AMDGPU] Add VOP1 DPP8 pseudo infrastructure

Add VOP_DPP8_Pseudo/VOP1_DPP8_Pseudo classes for DPP8 instructions, similar to
the existing VOP_DPP_Pseudo/VOP1_DPP_Pseudo pattern.
DeltaFile
+18-16llvm/lib/Target/AMDGPU/VOP1Instructions.td
+25-0llvm/lib/Target/AMDGPU/VOPInstructions.td
+43-162 files

LLVM/project 2caea40libcxx/docs/Status Cxx2cIssues.csv, libcxx/include/__memory unique_ptr.h

[libc++] LWG4324: `unique_ptr<void>::operator*` is not SFINAE-friendly (#190919)

---------

Co-authored-by: Hristo Hristov <zingam at outlook.com>
DeltaFile
+13-0libcxx/test/std/utilities/smartptr/unique.ptr/unique.ptr.class/unique.ptr.observers/dereference.single.pass.cpp
+7-0libcxx/include/__memory/unique_ptr.h
+1-1libcxx/docs/Status/Cxx2cIssues.csv
+21-13 files

LLVM/project 3a7d430libcxx/include/__memory uninitialized_algorithms.h, libcxx/test/libcxx/memory uninitialized_allocator_copy_template_op_assign.pass.cpp

[libc++] Require the exact assignment expression to be trivial in __uninitialized_allocator_copy_impl

__uninitialized_allocator_copy_impl has an optimization that replaces allocator_traits::construct with std::copy for raw pointer ranges when the element type is trivially copy constructible and trivially copy assignable.

The copy-assignment trait only checks whether assignment from const T& is trivial. That is weaker than the expression used by std::copy, which evaluates *out = *in. If overload resolution selects a different non-trivial assignment operator for that expression, std::copy can call that operator on uninitialized storage.

Check is_trivially_assignable<_Out&, _In&> instead. This matches the assignment expression used by std::copy, preserves the optimized path when that assignment is actually trivial, and falls back to placement construction otherwise.

Add a regression test with a type whose defaulted copy assignment is trivial but whose templated assignment operator is selected for non-const lvalue sources.

Tested with:
~/llvm-project/build-libcxx-fresh/bin/llvm-lit ~/llvm-project/libcxx/test/libcxx/memory/uninitialized_allocator_copy_template_op_assign.pass.cpp ~/llvm-project/libcxx/test/libcxx/memory/uninitialized_allocator_copy.pass.cpp -q
DeltaFile
+75-0libcxx/test/libcxx/memory/uninitialized_allocator_copy_template_op_assign.pass.cpp
+1-1libcxx/include/__memory/uninitialized_algorithms.h
+76-12 files
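
The gap between the two traits can be reproduced outside libc++. The sketch below is an assumption-laden illustration, not the regression test from the commit: `Tracked` is a hypothetical type whose defaulted copy assignment is trivial, but whose templated `operator=` wins overload resolution for non-const lvalue sources, which is the `*out = *in` shape described above.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type for illustration only.
struct Tracked {
  int Value = 0;
  bool UsedTemplateAssign = false;

  Tracked() = default;
  Tracked(const Tracked &) = default;
  Tracked &operator=(const Tracked &) = default; // trivial copy assignment

  // Better match than operator=(const Tracked &) for a non-const lvalue
  // source, and not trivial because it is user-provided.
  template <class U>
  Tracked &operator=(U &Other) {
    Value = Other.Value;
    UsedTemplateAssign = true;
    return *this;
  }
};

// The weaker check: assignment from const T& is trivial.
static_assert(std::is_trivially_copy_assignable<Tracked>::value, "");
// The exact expression check: assignment from a non-const lvalue is not.
static_assert(!std::is_trivially_assignable<Tracked &, Tracked &>::value, "");

// Evaluates *out = *in through raw non-const pointers and reports whether the
// templated operator was the one selected.
bool assignmentUsesTemplate() {
  Tracked In;
  In.Value = 7;
  Tracked Out;
  Tracked *InPtr = &In;
  Tracked *OutPtr = &Out;
  *OutPtr = *InPtr;
  return Out.UsedTemplateAssign && Out.Value == 7;
}
```

For a type like this, a guard based only on the copy-assignment trait would take the optimized path even though the assignment actually performed is user-provided code.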

LLVM/project c2f7e98llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/X86 reduce-load-width-freeze.ll

[SelectionDAG] Don't convert sextload to zextload through a multi-use freeze (#196700)

Resolves #196590.

The patch https://github.com/llvm/llvm-project/pull/189317, which taught
DAGCombiner to look through freeze, incorrectly introduced a miscompile of
sext -> zext. This resolves the miscompile.
DeltaFile
+29-0llvm/test/CodeGen/X86/reduce-load-width-freeze.ll
+3-1llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+32-12 files
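
As background, a small sketch of why the extension kind of a load cannot be changed while other users still observe the same value. It does not reproduce the DAGCombiner logic; `signExtend8To32` and `zeroExtend8To32` are hypothetical helpers that only model the two extension kinds for an 8-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Models a sign-extending load of one byte into a 32-bit register.
uint32_t signExtend8To32(uint8_t Byte) {
  return static_cast<uint32_t>(static_cast<int32_t>(static_cast<int8_t>(Byte)));
}

// Models a zero-extending load of the same byte.
uint32_t zeroExtend8To32(uint8_t Byte) {
  return static_cast<uint32_t>(Byte);
}
```

The two agree only when the sign bit is clear, so any remaining user that expected the sign-extended result would see a different value after such a conversion.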

LLVM/project 0556796llvm/lib/CodeGen/SelectionDAG ExpandMulByConstant.cpp, llvm/test/CodeGen/Lanai constant_multiply.ll

[TargetLowering] Add target-independent mul-by-constant expansion algorithm
DeltaFile
+176-379llvm/test/CodeGen/RISCV/mul-expand.ll
+307-0llvm/lib/CodeGen/SelectionDAG/ExpandMulByConstant.cpp
+130-164llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll
+88-152llvm/test/CodeGen/RISCV/ctz_zero_return_test.ll
+73-128llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll
+82-78llvm/test/CodeGen/Lanai/constant_multiply.ll
+856-90119 files not shown
+1,357-1,40925 files

LLVM/project 6004c17clang-tools-extra/clang-tidy/modernize UseStdBitCheck.cpp, clang-tools-extra/docs/clang-tidy/checks/modernize use-std-bit.rst

[clang-tidy] Correct `std::has_one_bit` to `std::has_single_bit` in `modernize-use-std-bit` (#196721)

There is no `std::has_one_bit` in the standard library; the function that checks
whether a number is an integral power of 2 is `std::has_single_bit`.

https://en.cppreference.com/cpp/header/bit
DeltaFile
+25-25clang-tools-extra/test/clang-tidy/checkers/modernize/use-std-bit.cpp
+6-6clang-tools-extra/docs/clang-tidy/checks/modernize/use-std-bit.rst
+6-5clang-tools-extra/clang-tidy/modernize/UseStdBitCheck.cpp
+37-363 files
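
For reference, a short sketch of the correctly named function next to a common hand-written power-of-two test. The exact patterns the check matches are not listed in this entry, so `isPowerOfTwoManual` is only an assumed example of such an idiom; the `std::has_single_bit` call is guarded so the sketch still builds before C++20.

```cpp
#include <cassert>
#if defined(__has_include)
#if __has_include(<bit>)
#include <bit>
#endif
#endif

// Assumed example of a hand-written idiom: exactly one bit set.
constexpr bool isPowerOfTwoManual(unsigned X) {
  return X != 0 && (X & (X - 1)) == 0;
}

// Uses std::has_single_bit when the library provides it (C++20), otherwise
// falls back to the manual form above.
constexpr bool isPowerOfTwo(unsigned X) {
#if defined(__cpp_lib_int_pow2) && __cpp_lib_int_pow2 >= 202002L
  return std::has_single_bit(X);
#else
  return isPowerOfTwoManual(X);
#endif
}
```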

LLVM/project 64e0e10llvm/lib/CodeGen/SelectionDAG ExpandMulByConstant.cpp, llvm/test/CodeGen/Lanai constant_multiply.ll

[TargetLowering] Add target-independent mul-by-constant expansion algorithm
DeltaFile
+176-379llvm/test/CodeGen/RISCV/mul-expand.ll
+307-0llvm/lib/CodeGen/SelectionDAG/ExpandMulByConstant.cpp
+130-164llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll
+88-152llvm/test/CodeGen/RISCV/ctz_zero_return_test.ll
+73-128llvm/test/CodeGen/RISCV/srem-seteq-illegal-types.ll
+82-78llvm/test/CodeGen/Lanai/constant_multiply.ll
+856-90119 files not shown
+1,358-1,40925 files

LLVM/project f1051b2llvm/test/CodeGen/AVR constant_multiply.ll, llvm/test/CodeGen/Lanai constant_multiply.ll

pre-commit tests
DeltaFile
+334-0llvm/test/CodeGen/Lanai/constant_multiply.ll
+166-0llvm/test/CodeGen/MSP430/constant_multiply.ll
+148-0llvm/test/CodeGen/SPARC/constant_multiply.ll
+99-0llvm/test/CodeGen/AVR/constant_multiply.ll
+747-04 files

LLVM/project a2942d4llvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/PhaseOrdering/X86 horizontal-reduce-umin.ll horizontal-reduce-umax.ll

[VectorCombine] foldShuffleChainsToReduce - add support for partial vector reductions (#195119)

Extend foldShuffleChainsToReduce to recognize partial reduction patterns where only a subvector of the full vector is being reduced.

For example, a <16 x i16> vector where the shuffle chain only reduces the lower 8 elements can now be folded into:
shufflevector (extract lower <8 x i16>) + vector.reduce.smax

The detection works by noticing when the bottom-up walk through the
shuffle/op chain ends before consuming the full vector. The number of
levels visited determines the subvector size (2^levels), and an
extract_subvector + scalar reduction replaces the original chain when
profitable.

Fixes #194617
DeltaFile
+50-0llvm/test/Transforms/VectorCombine/fold-shuffle-chains-to-reduce.ll
+8-32llvm/test/Transforms/PhaseOrdering/X86/horizontal-reduce-umin.ll
+8-32llvm/test/Transforms/PhaseOrdering/X86/horizontal-reduce-umax.ll
+8-32llvm/test/Transforms/PhaseOrdering/X86/horizontal-reduce-smin.ll
+8-32llvm/test/Transforms/PhaseOrdering/X86/horizontal-reduce-smax.ll
+33-6llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+115-1346 files
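
The relationship between chain depth and subvector width can be sketched with scalars. This is an emulation under stated assumptions, not the VectorCombine code: `reduceViaShuffleChain` and `reduceSubvector` are hypothetical helpers, and signed max stands in for the reduction op.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<int16_t, 16>;

// Emulates a shuffle/smax chain with Levels steps (1 <= Levels <= 4). Each
// step combines lane I with lane I + Half, halving Half every time, so only
// the lowest 2^Levels lanes ever contribute to lane 0.
int16_t reduceViaShuffleChain(Vec16 V, unsigned Levels) {
  for (std::size_t Half = std::size_t(1) << (Levels - 1); Half >= 1; Half /= 2)
    for (std::size_t I = 0; I < Half; ++I)
      V[I] = std::max(V[I], V[I + Half]);
  return V[0];
}

// Emulates the folded form: extract the low 2^Levels lanes, then reduce them.
int16_t reduceSubvector(const Vec16 &V, unsigned Levels) {
  std::size_t Width = std::size_t(1) << Levels;
  return *std::max_element(V.begin(), V.begin() + Width);
}
```

With three levels on a 16-lane vector, both forms only look at the lower 8 lanes, matching the `<16 x i16>` example in the message.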

LLVM/project ae3d770llvm/lib/Target/BPF BPFISelLowering.cpp BPFMIPeephole.cpp, llvm/test/CodeGen/BPF many_args3.ll many_args4.ll

[BPF] Support Stack Arguments (#189060)

Currently, bpf programs and kfuncs only support 5 register parameters. As
the bpf community and its use cases keep expanding, there is a need to
extend beyond 5 register parameters by allocating additional parameters on
the stack. There are two main use cases here:
1. Currently a kfunc is limited to 5 register parameters. In some special
situations, people may want more than 5 parameters. One
example is sched_ext.
2. Allowing stack parameters makes life easier for bpf program writers, since
they do not need to carefully limit the number of parameters in their
programs.

The following is the high-level design:
- Use bpf register R11 as the frame pointer to stack parameters. This is
to avoid mixing stacks due to R10.
  - Stack parameters must be after 5 register parameters.
- All parameters should be at most 16 bytes as ByVal parameters are not
supported.

    [43 lines not shown]
DeltaFile
+199-0llvm/test/CodeGen/BPF/many_args3.ll
+64-35llvm/lib/Target/BPF/BPFISelLowering.cpp
+65-0llvm/test/CodeGen/BPF/many_args4.ll
+60-0llvm/lib/Target/BPF/BPFMIPeephole.cpp
+36-0llvm/test/CodeGen/BPF/many_args8.ll
+33-0llvm/test/CodeGen/BPF/many_args7.ll
+457-3510 files not shown
+599-4816 files

LLVM/project 062ddf5llvm/lib/Target/RISCV RISCVInstrInfoZvvmm.td RISCVInstrInfoZvvm.td

[RISCV][NFC] Rename `Zvvmm` instruction file to `Zvvm` (#196692)

Renames `RISCVInstrInfoZvvmm.td` to `RISCVInstrInfoZvvm.td` so `Zvvmm`
and `Zvvfmm` share the same IME instruction file according to the spec.
All future instructions from the `Zvvm` family will be placed here
too.

This PR is required for reviewing #196486 in order to make GitHub show
the diff correctly.
DeltaFile
+0-37llvm/lib/Target/RISCV/RISCVInstrInfoZvvmm.td
+37-0llvm/lib/Target/RISCV/RISCVInstrInfoZvvm.td
+1-1llvm/lib/Target/RISCV/RISCVInstrInfo.td
+38-383 files

LLVM/project e78381dllvm/lib/Target/AArch64/GISel AArch64LegalizerInfo.cpp, llvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-instructions.ll

[AArch64][GlobalISel] Promote BF16 G_FCMP (#196093)

This adds bf16 legalization for floating point compares.
DeltaFile
+955-205llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+758-208llvm/test/CodeGen/AArch64/bf16-instructions.ll
+432-127llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+6-1llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+2,151-5414 files

LLVM/project 5022a16clang-tools-extra/clang-tidy/google ExplicitConstructorCheck.cpp, clang-tools-extra/clang-tidy/misc ExplicitConstructorCheck.cpp

[clang-tidy] Migrate explicit-constructor check from google to misc and add relative aliases (#194807)

Fixes #126032
DeltaFile
+188-0clang-tools-extra/test/clang-tidy/checkers/misc/explicit-constructor.cpp
+0-188clang-tools-extra/test/clang-tidy/checkers/google/explicit-constructor.cpp
+139-0clang-tools-extra/clang-tidy/misc/ExplicitConstructorCheck.cpp
+0-139clang-tools-extra/clang-tidy/google/ExplicitConstructorCheck.cpp
+59-0clang-tools-extra/test/clang-tidy/checkers/misc/explicit-constructor-cxx20.cpp
+0-59clang-tools-extra/test/clang-tidy/checkers/google/explicit-constructor-cxx20.cpp
+386-38615 files not shown
+513-48221 files

LLVM/project ebf6a41clang-tools-extra/clangd .clang-format-ignore

[CI] Ignore TidyFastChecks.inc for formatter CI. NFC. (#196682)

`TidyFastChecks.inc` is generated and its contents should not be checked
by the clang-format CI workflow. Add a local `.clang-format-ignore` entry so
the PR formatting check does not report diffs for this file.

Related run:
https://github.com/llvm/llvm-project/pull/194516#issuecomment-4332061836
DeltaFile
+2-0clang-tools-extra/clangd/.clang-format-ignore
+2-01 files

LLVM/project d3a4bb0clang-tools-extra/clang-tidy/modernize UseNodiscardCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Avoid `use-nodiscard` false positives for class templates (#196661)

Do not suggest adding `[[nodiscard]]` to functions returning a class
template specialization whose primary template is already marked
`[[nodiscard]]`.

Class template specializations do not carry the `[[nodiscard]]`
attribute on their own declarations, so `modernize-use-nodiscard`
previously missed this case and emitted redundant diagnostics for return
types such as:
```cpp
template <class T>
struct [[nodiscard]] Result;

struct S {
  Result<int> f() const;
};
```
Fixes #163425.
DeltaFile
+20-0clang-tools-extra/test/clang-tidy/checkers/modernize/use-nodiscard.cpp
+5-0clang-tools-extra/clang-tidy/modernize/UseNodiscardCheck.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+30-03 files

LLVM/project 7c9f1d2clang/www cxx_dr_status.html

[clang] Update `cxx_dr_status.html` (#196702)

Updates from 2026-05-08 CWG telecon.
DeltaFile
+48-13clang/www/cxx_dr_status.html
+48-131 files

LLVM/project e361f28llvm/include/llvm/ObjectYAML BBAddrMapYAML.h ELFYAML.h, llvm/lib/ObjectYAML BBAddrMapYAML.cpp ELFYAML.cpp

[ObjectYAML][NFC] Extract BBAddrMap YAML types into shared namespace (#196019)

Move BBAddrMapEntry and PGOAnalysisMapEntry out of namespace ELFYAML
into a new format-agnostic namespace BBAddrMapYAML so that COFF
YAML support can reuse the same schema and MappingTraits.
DeltaFile
+132-0llvm/include/llvm/ObjectYAML/BBAddrMapYAML.h
+3-93llvm/include/llvm/ObjectYAML/ELFYAML.h
+73-0llvm/lib/ObjectYAML/BBAddrMapYAML.cpp
+0-51llvm/lib/ObjectYAML/ELFYAML.cpp
+6-5llvm/tools/obj2yaml/elf2yaml.cpp
+4-4llvm/lib/ObjectYAML/ELFEmitter.cpp
+218-1531 files not shown
+219-1537 files

LLVM/project d55e108llvm/lib/Target/AArch64 AArch64DeadRegisterDefinitionsPass.cpp

[AArch64][NFC] Remove unused TRI member from class (#184363)

I’ve removed the TRI member and its initialization, leaving only MRI and
TII as the stored pointers.

---------

Co-authored-by: Benjamin Maxwell <benjamin.maxwell at arm.com>
DeltaFile
+0-3llvm/lib/Target/AArch64/AArch64DeadRegisterDefinitionsPass.cpp
+0-31 files

LLVM/project 66aa157llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/X86 reduce-load-width-freeze.ll

[SelectionDAG] Don't convert sextload to zextload through a multi-use freeze
DeltaFile
+5-4llvm/test/CodeGen/X86/reduce-load-width-freeze.ll
+3-1llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+8-52 files

LLVM/project 5e7100dllvm/test/CodeGen/X86 reduce-load-width-freeze.ll

test
DeltaFile
+28-0llvm/test/CodeGen/X86/reduce-load-width-freeze.ll
+28-01 files

LLVM/project 89f9ebdllvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-v4-instructions.ll, llvm/test/CodeGen/AArch64/Atomics aarch64-atomicrmw-v8a_fp.ll aarch64-atomicrmw-lsfe.ll

[AArch64][GlobalISel] Enable BF16 legalization for fadd and friends. (#196081)

This enables bf16 promotion for the following operations in GISel,
promoting them to f32 and truncating the result back:
G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FMA, G_FSQRT, G_FMAXNUM, G_FMINNUM,
G_FMAXIMUM, G_FMINIMUM, G_FCEIL, G_FFLOOR, G_FRINT, G_FNEARBYINT,
G_INTRINSIC_TRUNC, G_INTRINSIC_ROUND, G_INTRINSIC_ROUNDEVEN
DeltaFile
+2,062-1,026llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+975-581llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+824-404llvm/test/CodeGen/AArch64/bf16-instructions.ll
+420-240llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-v8a_fp.ll
+195-90llvm/test/CodeGen/AArch64/Atomics/aarch64-atomicrmw-lsfe.ll
+34-34llvm/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir
+4,510-2,3751 files not shown
+4,516-2,3767 files
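
The promote-then-truncate scheme can be sketched on raw bit patterns. This is an illustration only, not the legalizer's code: `bf16ToF32`, `f32ToBF16`, and `bf16Add` are hypothetical helpers, and the truncation here assumes round-to-nearest-even and ignores NaN handling.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// bf16 is the upper 16 bits of an IEEE f32, so promotion is a 16-bit shift.
float bf16ToF32(uint16_t Bits) {
  uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float Result;
  std::memcpy(&Result, &Wide, sizeof(Result));
  return Result;
}

// Truncates f32 back to bf16. Assumption: round to nearest, ties to even;
// NaN payloads are not treated specially in this sketch.
uint16_t f32ToBF16(float Value) {
  uint32_t Wide;
  std::memcpy(&Wide, &Value, sizeof(Wide));
  uint32_t RoundingBias = 0x7FFFu + ((Wide >> 16) & 1u);
  return static_cast<uint16_t>((Wide + RoundingBias) >> 16);
}

// Models a promoted add: extend both operands, add in f32, truncate back.
uint16_t bf16Add(uint16_t LHS, uint16_t RHS) {
  return f32ToBF16(bf16ToF32(LHS) + bf16ToF32(RHS));
}
```

The same extend, operate, truncate shape applies to the other listed operations; only the f32 operation in the middle changes.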

LLVM/project 102ac85llvm/lib/Transforms/InstCombine InstructionCombining.cpp, llvm/test/Transforms/InstCombine fold-multi-use-select-packed-constants.ll

address review comments

Co-Authored-By: arsenm <arsenm2 at gmail.com>
DeltaFile
+10-10llvm/test/Transforms/InstCombine/fold-multi-use-select-packed-constants.ll
+5-9llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+15-192 files

LLVM/project 0da8649llvm/lib/Transforms/InstCombine InstructionCombining.cpp, llvm/test/CodeGen/AMDGPU amdgpu-simplify-libcall-pow.ll

[InstCombine] Fold binop into multi-use select when one select arm and the other operand are constant
DeltaFile
+48-48llvm/test/CodeGen/AMDGPU/amdgpu-simplify-libcall-pow.ll
+21-2llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+2-7llvm/test/Transforms/InstCombine/pr80597.ll
+2-7llvm/test/Transforms/InstCombine/pr72433.ll
+2-6llvm/test/Transforms/InstCombine/fold-multi-use-select-packed-constants.ll
+1-4llvm/test/Transforms/InstCombine/extractelement.ll
+76-741 files not shown
+77-757 files