LLVM/project 5d03beellvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp

[DAG] canCreateUndefOrPoison - out of range vector insert/extract element indices only generate poison (#196720)

Matches ValueTracking / GISel implementations - although testing options are limited until DAG has actual uses of UndefPoisonKind::UndefOnly
DeltaFile
+7-4llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+7-41 files

LLVM/project 274185fclang/include/clang/Sema Template.h, clang/lib/Sema SemaCXXScopeSpec.cpp SemaTemplateInstantiateDecl.cpp

fixup
DeltaFile
+100-126clang/lib/Sema/SemaCXXScopeSpec.cpp
+16-31clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+15-12clang/test/AST/ast-dump-templates-pattern.cpp
+8-8clang/test/SemaTemplate/instantiate-scope.cpp
+10-0clang/test/SemaTemplate/instantiation-dependence.cpp
+2-4clang/include/clang/Sema/Template.h
+151-1811 files not shown
+156-1827 files

LLVM/project e785becllvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-v4-instructions.ll, llvm/test/CodeGen/RISCV/rvv fixed-vectors-reduction-fp.ll

Rebase, address comments

Created using spr 1.3.7
DeltaFile
+7,584-740llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+6,873-0llvm/test/tools/llvm-mca/AArch64/Cortex/C1Premium-sve-instructions.s
+4,634-367llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+4,174-657llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+2,928-1,388llvm/test/CodeGen/X86/vector-reduce-smin.ll
+2,924-1,389llvm/test/CodeGen/X86/vector-reduce-smax.ll
+29,117-4,5415,248 files not shown
+204,618-66,7255,254 files

LLVM/project 2d896aaclang/test/SemaCXX GH195416.cpp

[clang][NFC] Actually add the testcase for #195416
DeltaFile
+11-0clang/test/SemaCXX/GH195416.cpp
+11-01 files

LLVM/project b9f6e39clang/test/SemaCXX GH195416.cpp

Actually add the failing testcase
DeltaFile
+11-0clang/test/SemaCXX/GH195416.cpp
+11-01 files

LLVM/project 18c2de0clang/lib/Driver/ToolChains AMDGPU.h PS4CPU.cpp

clang: Add BoundArch/OffloadKind argument to getSupportedSanitizers

Currently the AMDGPU HIP and OpenMP toolchains falsely report
that all host sanitizers are supported, and then go out of their way
to skip forwarding those to the device compiles. Add an offloading
kind argument so that in the future this can be handled in one
place in the base toolchain.

Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
DeltaFile
+11-2clang/lib/Driver/ToolChains/AMDGPU.h
+8-4clang/lib/Driver/ToolChains/PS4CPU.cpp
+6-3clang/lib/Driver/ToolChains/HIPSPV.cpp
+6-2clang/lib/Driver/ToolChains/PS4CPU.h
+5-2clang/lib/Driver/ToolChains/OpenBSD.cpp
+5-2clang/lib/Driver/ToolChains/Cuda.cpp
+41-1540 files not shown
+190-7246 files

LLVM/project 2aec801clang/lib/Driver SanitizerArgs.cpp, clang/lib/Driver/ToolChains AMDGPU.cpp AMDGPU.h

clang: Refactor handling of offload sanitizer arguments

Previously the AMDGPU toolchains hackily handled -fsanitize arguments.
They would lie and report that all host side sanitizers are available,
then TranslateArgs would filter out the device side cases that do not
work, providing diagnostics for the skipped cases. Move that logic
into the base sanitizer argument parsing.

This makes the produced diagnostics more consistent. Previously, a
sanitizer that is fully unsupported by amdgpu produced repeated
warnings; it should now be reported once for the toolchain. These could
be further improved: we print the specific field of -fsanitize
in some cases where it could be skipped. In other cases we have the
opposite problem, where we don't report the exact sanitizer
from the -f flag when support depends on a subtarget feature.

This will help fix other broken target specific flag forwarding bugs
in the future.

Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
DeltaFile
+56-47clang/lib/Driver/ToolChains/AMDGPU.cpp
+85-11clang/lib/Driver/SanitizerArgs.cpp
+7-75clang/lib/Driver/ToolChains/AMDGPU.h
+21-24clang/lib/Driver/ToolChains/HIPAMD.cpp
+17-21clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
+14-14clang/test/Driver/hip-sanitize-options.hip
+200-1928 files not shown
+261-20714 files

LLVM/project 254259bclang/docs ReleaseNotes.rst, clang/lib/Sema SemaDecl.cpp

Revert "Avoid assert in substqualifier (#182707)" (#196755)

This reverts commit e2def106757534b07a2d3ff15ddd48e14b69a66d.
DeltaFile
+0-12clang/test/SemaTemplate/GH176152.cpp
+1-3clang/lib/Sema/SemaDecl.cpp
+0-1clang/docs/ReleaseNotes.rst
+1-163 files

LLVM/project e07d245llvm/include/llvm/Support SourceMgr.h, llvm/lib/MC/MCParser AsmParser.cpp

[MCParser] .incbin: Don't retain the buffer, don't require NUL termination (#196696)

processIncbinFile uses SourceMgr::AddIncludeFile, which

* sets `RequiresNullTerminator=true` and disables `mmap` when the file
size is a multiple of the page size,
* and unnecessarily retains the throwaway buffer in `Buffers`.

Switch to OpenIncludeFile so the buffer is freed when processIncbinFile
returns, and pass RequiresNullTerminator=false. The buffer is consumed
only by emitBytes; the lexer never scans it, so it does not need a
trailing '\0' (different from #154972). Without that requirement,
MemoryBuffer mmaps the file and RSS tracks only the touched pages.
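
As a rough illustration of the `mmap` point above, the sketch below models only the single rule this message describes: a buffer that must end in a NUL byte cannot be mapped directly when the file size is an exact multiple of the page size. `canUseMmap` is a hypothetical helper written for this note; it is not LLVM's `MemoryBuffer` logic, which may apply further conditions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model (an assumption, not LLVM code).
// If the file fills its last page exactly, there is no spare zero byte
// after the data, so a caller that needs a trailing '\0' cannot use the
// mapping as-is. Without that requirement the file can always be mapped.
bool canUseMmap(std::uint64_t fileSize, std::uint64_t pageSize,
                bool requiresNullTerminator) {
  assert(pageSize != 0 && "page size must be non-zero");
  if (!requiresNullTerminator)
    return true;
  return fileSize % pageSize != 0;
}
```

Under this model, a 1 MiB blob with 4 KiB pages is not mappable when a NUL terminator is required, and becomes mappable once the requirement is dropped.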

Stress test (1000 .incbin "blob.bin", 0, 16 against a 1 MiB blob):

```
                  Maximum RSS
  Before          1042944 KiB
```

    [3 lines not shown]
DeltaFile
+7-4llvm/lib/MC/MCParser/AsmParser.cpp
+7-3llvm/lib/Support/SourceMgr.cpp
+2-1llvm/include/llvm/Support/SourceMgr.h
+16-83 files

LLVM/project d47012fclang/include/clang/Analysis/Analyses/LifetimeSafety Origins.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp Origins.cpp

[LifetimeSafety] Track per-field origins for record types
DeltaFile
+314-4clang/test/Sema/warn-lifetime-safety.cpp
+82-43clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+95-8clang/lib/Analysis/LifetimeSafety/Origins.cpp
+67-24clang/include/clang/Analysis/Analyses/LifetimeSafety/Origins.h
+21-12clang/lib/Analysis/LifetimeSafety/LiveOrigins.cpp
+4-6clang/test/Sema/warn-lifetime-safety-dangling-field.cpp
+583-972 files not shown
+587-988 files

LLVM/project 0cb103dllvm/test/Transforms/SLPVectorizer/X86 arith-mul-smulo.ll arith-sub-usubo.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+549-615llvm/test/Transforms/SLPVectorizer/X86/arith-mul-smulo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-sub-usubo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-add-saddo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-add-uaddo.ll
+449-615llvm/test/Transforms/SLPVectorizer/X86/arith-sub-ssubo.ll
+429-615llvm/test/Transforms/SLPVectorizer/X86/arith-mul-umulo.ll
+2,774-3,6904 files not shown
+3,268-3,91310 files

LLVM/project 6c979bbllvm/include/llvm/MC MCAsmInfo.h, llvm/lib/Target/X86/MCTargetDesc X86MCTargetDesc.cpp X86MCAsmInfo.cpp

[X86] Hoist ReservedIdentifiers to MCAsmInfo and shrink setup cost. NFC (#196699)

PR #186570 added a per-MCAsmInfo `StringSet<>` populated with X86
register names plus Intel-syntax keywords, which caused a minor
instructions:u increase.

Avoid heap allocation and hoist `ReservedIdentifiers` to MCAsmInfo for
other targets.

For the register-name source, prefer
`X86IntelInstPrinter::getRegisterName` over `MCRegisterInfo::getName`.
The former is a TableGen-emitted accessor into a `static const char
AsmStrs[]` pool in `X86GenAsmWriter1.inc`, populated from the lowercase
asm-name argument of each `def XX : X86Reg<"xx", ...>;` in
`X86RegisterInfo.td`.
DeltaFile
+22-33llvm/lib/Target/X86/MCTargetDesc/X86MCTargetDesc.cpp
+9-9llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.cpp
+14-0llvm/include/llvm/MC/MCAsmInfo.h
+0-5llvm/lib/Target/X86/MCTargetDesc/X86MCAsmInfo.h
+45-474 files

LLVM/project 72a340bllvm/include/llvm/Target TargetSelectionDAG.td, llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeTypes.h

[SelectionDAG] Split vector types for atomic load

Vector types that aren't widened are split
so that a single ATOMIC_LOAD is issued for the entire vector at once.
This change utilizes the load vectorization infrastructure in
SelectionDAG to group the vectors. This enables SelectionDAG
to translate vectors of type bfloat and half.
DeltaFile
+349-4llvm/test/CodeGen/X86/atomic-load-store.ll
+34-0llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+14-0llvm/include/llvm/Target/TargetSelectionDAG.td
+1-0llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+398-44 files

LLVM/project 97ba148clang/docs ReleaseNotes.rst, clang/lib/Sema SemaDecl.cpp

Revert "Avoid assert in substqualifier (#182707)"

This reverts commit e2def106757534b07a2d3ff15ddd48e14b69a66d.
DeltaFile
+0-12clang/test/SemaTemplate/GH176152.cpp
+1-3clang/lib/Sema/SemaDecl.cpp
+0-1clang/docs/ReleaseNotes.rst
+1-163 files

LLVM/project ccc2b0clibcxx/include/__memory uninitialized_algorithms.h, libcxx/test/std/containers/sequences/vector/vector.cons copy.pass.cpp

[libc++] Avoid non-trivial assignment in `__uninitialized_allocator_copy_impl`

__uninitialized_allocator_copy_impl has an optimization that replaces allocator_traits::construct with std::copy for raw pointer ranges when the element type is trivially copy constructible and trivially copy assignable.

The copy-assignment trait only checks whether assignment from const T& is trivial. That is weaker than the expression used by std::copy, which evaluates *out = *in. If overload resolution selects a different non-trivial assignment operator for that expression, std::copy can call that operator on uninitialized storage.

Const-qualify the input pointers in the optimized overload instead. This makes the std::copy expression assign from const T&, matching the existing is_trivially_copy_assignable check, preserving the optimized path when that assignment is trivial, and falling back to placement construction otherwise.

Add a vector copy-constructor regression test with a type whose defaulted copy assignment is trivial but whose templated assignment operator is selected for non-const lvalue sources.
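
The overload-resolution gap described above can be reproduced in isolation. The following is a minimal sketch, not the libc++ code or its regression test; `Elem`, `g_templateAssignCalls`, `assignFromNonConst`, and `assignFromConst` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Counts calls to the templated assignment operator below.
int g_templateAssignCalls = 0;

// Hypothetical element type: the defaulted copy assignment is trivial,
// but a non-const lvalue source binds better to the template overload.
struct Elem {
  int value = 0;
  Elem() = default;
  Elem(const Elem &) = default;
  Elem &operator=(const Elem &) = default;
  template <class T> Elem &operator=(T &) {
    ++g_templateAssignCalls;
    return *this;
  }
};

// The traits only look at construction and assignment from const Elem&.
static_assert(std::is_trivially_copy_constructible<Elem>::value, "");
static_assert(std::is_trivially_copy_assignable<Elem>::value, "");

// Mirrors `*out = *in` with a non-const source: picks the template.
void assignFromNonConst(Elem *out, Elem *in) { *out = *in; }

// Const-qualified source: picks the trivial copy assignment operator.
void assignFromConst(Elem *out, const Elem *in) { *out = *in; }
```

Calling `assignFromNonConst` increments the counter, while `assignFromConst` leaves it unchanged, which is the behaviour the const-qualification in the fix relies on.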

Tested with:
build/bin/llvm-lit -q build/runtimes/runtimes-bins/libcxx/test --filter='(vector.cons/copy.pass|uninitialized_allocator_copy\\.pass)'
build/bin/llvm-lit -q build/runtimes/runtimes-bins/libcxx/test --param std=c++20 --filter='vector.cons/copy.pass'
build/bin/llvm-lit -q build/runtimes/runtimes-bins/libcxx/test --param std=c++11 --filter='vector.cons/copy.pass'
DeltaFile
+76-1libcxx/test/std/containers/sequences/vector/vector.cons/copy.pass.cpp
+1-1libcxx/include/__memory/uninitialized_algorithms.h
+77-22 files

LLVM/project d118474libcxx/include/__memory uninitialized_algorithms.h, libcxx/test/libcxx/memory uninitialized_allocator_copy_template_op_assign.pass.cpp

[libc++] Avoid non-trivial assignment in `__uninitialized_allocator_copy_impl`

__uninitialized_allocator_copy_impl has an optimization that replaces allocator_traits::construct with std::copy for raw pointer ranges when the element type is trivially copy constructible and trivially copy assignable.

The copy-assignment trait only checks whether assignment from const T& is trivial. That is weaker than the expression used by std::copy, which evaluates *out = *in. If overload resolution selects a different non-trivial assignment operator for that expression, std::copy can call that operator on uninitialized storage.

Const-qualify the input pointers in the optimized overload instead. This makes the std::copy expression assign from const T&, matching the existing is_trivially_copy_assignable check, preserving the optimized path when that assignment is trivial, and falling back to placement construction otherwise.

Add a regression test with a type whose defaulted copy assignment is trivial but whose templated assignment operator is selected for non-const lvalue sources.

Tested with:
build/bin/llvm-lit -q build/runtimes/runtimes-bins/libcxx/test --filter='uninitialized_allocator_copy(\\.pass|_template_op_assign)'
DeltaFile
+77-0libcxx/test/libcxx/memory/uninitialized_allocator_copy_template_op_assign.pass.cpp
+1-1libcxx/include/__memory/uninitialized_algorithms.h
+78-12 files

LLVM/project 45e5bfbllvm/test/Transforms/SLPVectorizer/X86 struct-return-different-bb.ll

[SLP][NFC]Add a test with struct-returning intrinsics in different basic blocks, NFC

Pull Request: https://github.com/llvm/llvm-project/pull/196748
DeltaFile
+51-0llvm/test/Transforms/SLPVectorizer/X86/struct-return-different-bb.ll
+51-01 files

LLVM/project 9465cf9llvm/lib/Analysis ConstantFolding.cpp, llvm/lib/IR Constants.cpp ConstantFold.cpp

[RFC][NFCI][Constants] Add `Constant::isZeroValue`

The old `isZeroValue` was removed because it was functionally identical to
`Constant::isNullValue`. Currently, a "null value" in LLVM means a zero value.
We are moving toward changing the semantics of `ConstantPointerNull` to
represent a semantic null pointer instead of a zero-valued pointer. As a result,
the meaning of "null value" will also change in the future.

This PR series is the first step toward renaming the two widely used "null
value" interfaces to "zero value". As the first PR in the series, this change
adds a "new" `isZeroValue` alongside `isNullValue`, and makes `isNullValue` call
`isZeroValue` directly. Then, all uses of `isNullValue` in LLVM are replaced
with `isZeroValue`. Uses in other projects will be updated in separate PRs.

The plan is to eventually remove `isNullValue` after all uses have been
migrated.
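
The forwarding step described above, where the old name stays available and calls the new one, might look roughly like this. `ToyConstant` is a hypothetical stand-in, not `llvm::Constant`, and its zero check is deliberately trivial.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a constant with an integer bit pattern.
class ToyConstant {
  std::uint64_t Bits;

public:
  explicit ToyConstant(std::uint64_t B) : Bits(B) {}

  // New interface: true when the value is all-zero.
  bool isZeroValue() const { return Bits == 0; }

  // Old interface kept during the migration; it forwards to the new one
  // so existing callers keep their current behaviour.
  bool isNullValue() const { return isZeroValue(); }
};
```

Because the old method only forwards, callers can be switched over incrementally and the old name removed once no uses remain.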
DeltaFile
+15-15llvm/lib/Analysis/ConstantFolding.cpp
+14-14llvm/lib/IR/Constants.cpp
+11-11llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+11-9llvm/lib/IR/ConstantFold.cpp
+9-9llvm/unittests/Analysis/ValueLatticeTest.cpp
+9-9llvm/lib/Transforms/Utils/SimplifyCFG.cpp
+69-67100 files not shown
+276-265106 files

LLVM/project 7c0ae9cllvm/test/Transforms/SLPVectorizer/RISCV scalable-type-as-input.ll

[SLP][NFC]Add a test with scalable vector type in struct-returning intrinsic, NFC


Pull Request: https://github.com/llvm/llvm-project/pull/196747
DeltaFile
+32-0llvm/test/Transforms/SLPVectorizer/RISCV/scalable-type-as-input.ll
+32-01 files

LLVM/project 968430ellvm/docs AMDGPUUsage.rst, llvm/lib/Target/AMDGPU AMDGPUAsmPrinter.cpp

[AMDGPU] Add `.amdgpu.info` section for per-function metadata

AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.

This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format. Each entry is encoded as:

```
[kind: u8] [len: u8] [payload: <len> bytes]
```

A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
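
A reader for the `[kind][len][payload]` layout above could be sketched as follows. This is an illustration under assumptions, not the code from the PR: `InfoEntry` and `parseInfoEntries` are hypothetical names, and kind values are treated as opaque bytes because the message does not list them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded entry: a kind byte plus its raw payload bytes.
struct InfoEntry {
  std::uint8_t kind;
  std::vector<std::uint8_t> payload;
};

// Walks the byte stream entry by entry. Returns false if a header or
// payload is truncated; `out` then holds the entries decoded so far.
bool parseInfoEntries(const std::vector<std::uint8_t> &data,
                      std::vector<InfoEntry> &out) {
  std::size_t pos = 0;
  while (pos < data.size()) {
    if (data.size() - pos < 2)
      return false; // not enough bytes for kind + len
    InfoEntry entry;
    entry.kind = data[pos];
    const std::size_t len = data[pos + 1];
    pos += 2;
    if (data.size() - pos < len)
      return false; // payload runs past the end of the section
    const std::uint8_t *begin = data.data() + pos;
    entry.payload.assign(begin, begin + len);
    out.push_back(entry);
    pos += len;
  }
  return true;
}
```

With a one-byte length, each payload is at most 255 bytes, which fits the message's note that string data lives in a companion section.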

    [4 lines not shown]
DeltaFile
+257-0llvm/test/CodeGen/AMDGPU/lds-link-time-codegen-typeid.ll
+179-0llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp
+158-2llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+126-0llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+126-0llvm/test/MC/AMDGPU/amdgpu-info-roundtrip.s
+106-0llvm/docs/AMDGPUUsage.rst
+952-29 files not shown
+1,261-1415 files

LLVM/project 492d774mlir/include/mlir/Dialect/SPIRV/IR TargetAndABI.h, mlir/lib/Conversion/SCFToSPIRV SCFToSPIRV.cpp

[mlir][SPIR-V] Support spirv.selection_control attribute on scf.if (#196510)
DeltaFile
+22-0mlir/test/Conversion/SCFToSPIRV/if.mlir
+11-0mlir/test/Dialect/SPIRV/IR/target-and-abi.mlir
+6-2mlir/lib/Conversion/SCFToSPIRV/SCFToSPIRV.cpp
+4-0mlir/lib/Dialect/SPIRV/IR/TargetAndABI.cpp
+4-0mlir/lib/Dialect/SPIRV/IR/SPIRVDialect.cpp
+3-0mlir/include/mlir/Dialect/SPIRV/IR/TargetAndABI.h
+50-26 files

LLVM/project fab2603llvm/lib/Target/AMDGPU VOP1Instructions.td VOPInstructions.td

[AMDGPU] Add VOP1 DPP8 pseudo infrastructure

Add VOP_DPP8_Pseudo/VOP1_DPP8_Pseudo classes for DPP8 instructions, similar to
the existing VOP_DPP_Pseudo/VOP1_DPP_Pseudo pattern.
DeltaFile
+20-17llvm/lib/Target/AMDGPU/VOP1Instructions.td
+25-0llvm/lib/Target/AMDGPU/VOPInstructions.td
+45-172 files

LLVM/project 900dd1dclang/lib/Driver/ToolChains AMDGPU.cpp

clang/AMDGPU: Use all_equal instead of building a temporary set (#196742)
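
The idea in the title can be shown with standard-library stand-ins. This is a sketch only: the driver code itself is not reproduced here, `allEqualViaSet` and `allEqualDirect` are hypothetical names, and `std::adjacent_find` is used in place of the `all_equal` helper.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <set>
#include <string>
#include <vector>

// Set-based check: copies every element into a temporary container.
bool allEqualViaSet(const std::vector<std::string> &values) {
  std::set<std::string> unique(values.begin(), values.end());
  return unique.size() <= 1;
}

// Direct check: stops at the first adjacent pair that differs and
// allocates nothing.
bool allEqualDirect(const std::vector<std::string> &values) {
  return std::adjacent_find(values.begin(), values.end(),
                            std::not_equal_to<std::string>()) ==
         values.end();
}
```

Both functions agree on every input, including the empty range; the direct form just avoids the temporary.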
DeltaFile
+1-2clang/lib/Driver/ToolChains/AMDGPU.cpp
+1-21 files

LLVM/project ee29cb1clang/test/Preprocessor predefined-arch-macros.c

clang: Fix using -march=amdgcn in some r600 run lines (#196745)
DeltaFile
+2-2clang/test/Preprocessor/predefined-arch-macros.c
+2-21 files

LLVM/project 2d8bcb5llvm/lib/Transforms/Vectorize VPlanRecipes.cpp VPlanUtils.cpp

[VPlan] Lift isUsedByLoadStoreAddr into vputils, operate on VPValue(NFC) (#196415)

Extract the helper previously scoped to VPReplicateRecipe::computeCost
and make it available from VPlanUtils so other transforms can query
whether a VPValue is used as part of another load or store's address.

Also relax the input type from VPUser * to VPValue *: the worklist now
tracks VPValues directly, and traversal is gated on the user being a
VPSingleDefRecipe before walking its own users. This is NFC for the
existing caller.
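
The worklist shape described above can be sketched on a toy graph. Everything here is an assumption made for illustration: `Node`, `Kind`, and this `isUsedByLoadStoreAddr` are stand-ins, not the VPlan classes or the helper moved by this commit, and the real helper may check more cases.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy node kinds: SingleDef defines one value that can itself be used,
// LoadStore is a memory access with an address operand.
enum class Kind { SingleDef, LoadStore, Other };

struct Node {
  Kind kind = Kind::Other;
  const Node *address = nullptr;   // address operand (LoadStore only)
  std::vector<const Node *> users; // nodes that use this node's value
};

// Returns true if `value` reaches the address operand of a load or
// store, following users only through single-definition nodes.
bool isUsedByLoadStoreAddr(const Node *value) {
  std::unordered_set<const Node *> seen;
  std::vector<const Node *> worklist;
  worklist.push_back(value);
  while (!worklist.empty()) {
    const Node *cur = worklist.back();
    worklist.pop_back();
    if (!seen.insert(cur).second)
      continue; // already visited
    for (const Node *user : cur->users) {
      if (user->kind == Kind::LoadStore && user->address == cur)
        return true;
      // Gate traversal: only single-def users have users of their own.
      if (user->kind == Kind::SingleDef)
        worklist.push_back(user);
    }
  }
  return false;
}
```

Tracking values rather than users lets the query start from any value, which is what makes the helper reusable outside its original caller.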
DeltaFile
+1-51llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+51-0llvm/lib/Transforms/Vectorize/VPlanUtils.cpp
+4-0llvm/lib/Transforms/Vectorize/VPlanUtils.h
+56-513 files

LLVM/project 615a7e0mlir/lib/Conversion/MathToSPIRV MathToSPIRV.cpp, mlir/test/Conversion/MathToSPIRV math-to-opencl-spirv.mlir

[mlir][SPIR-V] Convert math.fpowi to spirv.CL.pown (#196701)
DeltaFile
+21-1mlir/lib/Conversion/MathToSPIRV/MathToSPIRV.cpp
+14-0mlir/test/Conversion/MathToSPIRV/math-to-opencl-spirv.mlir
+35-12 files

LLVM/project cb5d076clang/docs ClangFormatStyleOptions.rst, clang/include/clang/Format Format.h

[clang-format] Add BreakFunctionDeclarationParameters option. (#196567)

Adds an option to break function declaration parameters, always putting
them on the line after the function's opening parenthesis.

This is an equivalent of `BreakFunctionDefinitionParameters`, but for
function declarations.

---------

Co-authored-by: Lukas Jirkovsky <lukas.jirkovsky at aveco.com>
DeltaFile
+27-0clang/unittests/Format/FormatTest.cpp
+16-0clang/include/clang/Format/Format.h
+15-0clang/docs/ClangFormatStyleOptions.rst
+9-0clang/unittests/Format/AlignBracketsTest.cpp
+6-0clang/lib/Format/TokenAnnotator.cpp
+3-0clang/lib/Format/Format.cpp
+76-02 files not shown
+79-08 files

LLVM/project a2d17c5clang/test/Preprocessor predefined-arch-macros.c

clang: Fix using -march=amdgcn in some r600 run lines
DeltaFile
+2-2clang/test/Preprocessor/predefined-arch-macros.c
+2-21 files

LLVM/project 29a9658clang/lib/Driver/ToolChains AMDGPU.cpp

clang/AMDGPU: Use all_equal instead of building a temporary set

Addresses comment on #196373
DeltaFile
+1-2clang/lib/Driver/ToolChains/AMDGPU.cpp
+1-21 files

LLVM/project aa68a9cllvm/test/CodeGen/AArch64 bf16-v8-instructions.ll bf16-instructions.ll, llvm/test/CodeGen/AMDGPU load-atomic-global.ll

Merge branch 'main' into users/kparzysz/control-driver-warnings
DeltaFile
+4,634-367llvm/test/CodeGen/RISCV/rvv/fixed-vectors-reduction-fp.ll
+3,071-1,257llvm/test/CodeGen/AArch64/bf16-v8-instructions.ll
+1,660-649llvm/test/CodeGen/AArch64/bf16-instructions.ll
+1,440-725llvm/test/CodeGen/AArch64/bf16-v4-instructions.ll
+1,608-0llvm/test/MC/AMDGPU/gfx13_asm_vop3p.s
+1,246-0llvm/test/CodeGen/AMDGPU/load-atomic-global.ll
+13,659-2,9981,703 files not shown
+48,886-20,0041,709 files