LLVM/project 255af94clang/lib/CodeGen CGExpr.cpp, clang/test/CodeGenHLSL BoolMatrix.hlsl

[HLSL][Matrix] Update indexed matrix elements individually (#176216)

Fixes #174629

This PR is similar to #169144, but for matrices.

When storing to a matrix element or matrix row, `insertelement`
instructions have been replaced by GEPs followed by stores to individual
matrix elements. The entire matrix is no longer stored to memory at
once, which avoids data races when multiple threads write to independent
matrix elements.
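
To illustrate why a whole-matrix store can lose updates, here is a minimal C++ sketch. It is not the Clang codegen itself: `Mat2x2`, `storeWhole`, and `storeElement` are hypothetical names, the flat four-element layout is an assumption, and one interleaving of two writers is replayed sequentially instead of using real threads.

```cpp
#include <cstddef>

// Hypothetical 2x2 matrix stored as a flat array (layout is an assumption).
struct Mat2x2 {
  float elems[4];
};

// Whole-matrix lowering: take a previously loaded copy, update one lane in it
// (the analogue of insertelement), then write the entire matrix back.
constexpr void storeWhole(Mat2x2 &mem, const Mat2x2 &snapshot,
                          std::size_t idx, float value) {
  Mat2x2 tmp = snapshot;
  tmp.elems[idx] = value;
  mem = tmp;
}

// Per-element lowering: address one element (the analogue of a GEP) and
// store only that element.
constexpr void storeElement(Mat2x2 &mem, std::size_t idx, float value) {
  mem.elems[idx] = value;
}

// Replays one interleaving: both writers load before either one stores.
constexpr Mat2x2 replayWholeMatrixStores() {
  Mat2x2 mem{{0.0f, 0.0f, 0.0f, 0.0f}};
  Mat2x2 seenByA = mem; // writer A loads the matrix
  Mat2x2 seenByB = mem; // writer B loads the matrix
  storeWhole(mem, seenByA, 0, 1.0f);
  storeWhole(mem, seenByB, 3, 2.0f); // writes back a stale element 0
  return mem;
}

// The same two writes, each touching only its own element.
constexpr Mat2x2 replayElementStores() {
  Mat2x2 mem{{0.0f, 0.0f, 0.0f, 0.0f}};
  storeElement(mem, 0, 1.0f);
  storeElement(mem, 3, 2.0f);
  return mem;
}
```

In the whole-matrix replay, writer A's update to element 0 is overwritten by writer B's stale copy. In the per-element replay both updates survive.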
DeltaFile
+76-66clang/test/CodeGenHLSL/BasicFeatures/MatrixSingleSubscriptSetter.hlsl
+48-19clang/lib/CodeGen/CGExpr.cpp
+14-11clang/test/CodeGenHLSL/BasicFeatures/MatrixSingleSubscriptDynamicSwizzle.hlsl
+12-10clang/test/CodeGenHLSL/BasicFeatures/MatrixSingleSubscriptConstSwizzle.hlsl
+7-10clang/test/CodeGenHLSL/BoolMatrix.hlsl
+3-4clang/test/CodeGenHLSL/BasicFeatures/matrix-type-indexing.hlsl
+160-1206 files

LLVM/project 2042887llvm/docs MIRLangRef.rst, llvm/include/llvm/CodeGen MachineInstrBuilder.h

Reland "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176277)

This change prepares for making RegState an enum class. It:
- Updates documentation to match the order in the code.
- Brings the `get<>RegState` functions together and makes them
`constexpr`.
- Adopts the `get<>RegState` where RegStates were being chosen with
ternary operators in backend code.
- Introduces `hasRegState` to make querying RegState easier once it is
an enum class.
- Adopts `hasRegState` where the equivalent check was done with bitwise
arithmetic.
- Introduces `RegState::NoFlags`, which will be used for the lack of
flags.
- Documents that `0x1` is a reserved flag value used to detect if
someone is passing `true` instead of flags (due to implicit bool to
unsigned conversions).
- Updates two calls to `MachineInstrBuilder::addReg` which were passing
`false` to the flags operand, to no longer pass a value.
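
The general pattern behind these bullets can be sketched in C++ as follows. This is an illustration under assumptions, not the contents of `MachineInstrBuilder.h`: the enum `RegFlag`, its values other than the unused `0x1` bit, and the helpers `hasFlag`, `killFlagIf`, and `looksLikeBool` are all hypothetical names.

```cpp
#include <cstdint>

// Hypothetical flag set; names and values are illustrative only.
enum class RegFlag : std::uint32_t {
  NoFlags = 0x0,
  // 0x1 is deliberately left unassigned: `true` converts to 1, so a stray
  // bool passed where flags are expected shows up as this bit.
  Define = 0x2,
  Kill = 0x4,
  Dead = 0x8,
};

constexpr std::uint32_t raw(RegFlag f) { return static_cast<std::uint32_t>(f); }

constexpr RegFlag operator|(RegFlag a, RegFlag b) {
  return static_cast<RegFlag>(raw(a) | raw(b));
}

// Query helper that hides the bitwise arithmetic an enum class would
// otherwise force on every caller.
constexpr bool hasFlag(RegFlag value, RegFlag flag) {
  return (raw(value) & raw(flag)) != 0;
}

// Getter that replaces `cond ? Kill : 0` style ternaries at call sites.
constexpr RegFlag killFlagIf(bool cond) {
  return cond ? RegFlag::Kill : RegFlag::NoFlags;
}

// Detects a value that was probably produced by an implicit bool conversion.
constexpr bool looksLikeBool(std::uint32_t bits) { return (bits & 0x1u) != 0; }
```

With helpers like these, a call site can write `hasFlag(flags, RegFlag::Kill)` instead of masking by hand, and a reserved low bit lets a debug check catch `true` being passed as flags.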

    [6 lines not shown]
DeltaFile
+66-51llvm/include/llvm/CodeGen/MachineInstrBuilder.h
+17-17llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+15-15llvm/docs/MIRLangRef.rst
+14-10llvm/lib/CodeGen/MIRParser/MIParser.cpp
+8-9llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+8-8llvm/lib/Target/ARM/ARMExpandPseudoInsts.cpp
+128-11017 files not shown
+156-14223 files

LLVM/project 2bcd2f2clang/include/clang/Analysis/Analyses/LifetimeSafety FactsGenerator.h Loans.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

[LifetimeSafety] Track moved declarations to prevent false positives (#170007)

Prevent false positives in lifetime safety analysis when variables are
moved using `std::move`.

When a value is moved using `std::move`, ownership is transferred from
the original variable to another. The lifetime safety analysis was
previously generating false positives by warning about
use-after-lifetime when the original variable was destroyed after being
moved. This change prevents those false positives by tracking moved
declarations and exempting them from loan expiration checks.

- Added tracking for declarations that have been moved via `std::move`
in the `FactsGenerator` class
- Added a `MovedDecls` set to track moved declarations in a
flow-insensitive manner
- Implemented detection of `std::move` calls in `VisitCallExpr`
- Modified `handleLifetimeEnds` to skip loans for declarations that have
been moved
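
The bookkeeping described in the list above can be sketched in C++ roughly as follows. This is a toy model under assumptions, not the Clang implementation: `ToyLoan` and `ToyFactsGenerator` are hypothetical, and declarations are represented by plain strings instead of AST nodes.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a loan: it only remembers the owning declaration.
struct ToyLoan {
  std::string ownerDecl;
};

class ToyFactsGenerator {
public:
  // Called when a `std::move(decl)` call is seen. The set is not tied to a
  // program point, which is what makes the tracking flow-insensitive.
  void recordMove(const std::string &decl) { movedDecls.insert(decl); }

  // Called when a declaration's lifetime ends. Loans owned by a moved
  // declaration are skipped, so they never count as expired.
  void handleLifetimeEnd(const ToyLoan &loan) {
    assert(!loan.ownerDecl.empty() && "loan must name its owner");
    if (movedDecls.count(loan.ownerDecl) != 0)
      return;
    expired.push_back(loan.ownerDecl);
  }

  const std::vector<std::string> &expiredLoans() const { return expired; }

private:
  std::set<std::string> movedDecls;
  std::vector<std::string> expired;
};

// Usage: `a` was moved from, `b` was not; only `b` yields an expired loan.
inline std::vector<std::string> demoExpiredLoans() {
  ToyFactsGenerator gen;
  gen.recordMove("a");
  gen.handleLifetimeEnd(ToyLoan{"a"});
  gen.handleLifetimeEnd(ToyLoan{"b"});
  return gen.expiredLoans();
}
```

A set that ignores program order is simple, but it also exempts a declaration on paths where the move did not happen. That trade-off follows from the flow-insensitive design named in the list.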

    [17 lines not shown]
DeltaFile
+39-0clang/test/Sema/warn-lifetime-safety.cpp
+19-0clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+9-0clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+2-0clang/include/clang/Analysis/Analyses/LifetimeSafety/Loans.h
+69-04 files

LLVM/project ae99a75llvm/lib/Target/AArch64 AArch64PrologueEpilogue.cpp, llvm/test/CodeGen/AArch64 seh-minimal-prologue-epilogue.ll

[llvm][aarch64] MachO platforms do not use .seh_ (#176456)

DeltaFile
+10-8llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll
+1-1llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+11-92 files

LLVM/project 5e2f43bflang/test/Integration complex-div-to-llvm.f90

[flang][AIX] update test to handle different alignments (NFC) (#176431)

DeltaFile
+4-4flang/test/Integration/complex-div-to-llvm.f90
+4-41 files

LLVM/project ba43338clang/include/clang/Analysis/Analyses/LifetimeSafety FactsGenerator.h Loans.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

std_move false positive
DeltaFile
+39-0clang/test/Sema/warn-lifetime-safety.cpp
+19-0clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+9-0clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+2-0clang/include/clang/Analysis/Analyses/LifetimeSafety/Loans.h
+69-04 files

LLVM/project 64262d2llvm/test/MC/AMDGPU gfx10_asm_vopc_e64.s gfx10_asm_vop1.s, llvm/test/MC/Disassembler/AMDGPU gfx10_vop3c.txt gfx10_vop3.txt

Merge branch 'users/abhinavgaba/udp-fallback-3' into users/abhinavgaba/udp-fallback-4
DeltaFile
+10,845-10,844llvm/test/MC/AMDGPU/gfx10_asm_vopc_e64.s
+5,425-5,424llvm/test/MC/AMDGPU/gfx10_asm_vop1.s
+5,392-5,392llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3c.txt
+4,676-4,675llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
+4,672-4,671llvm/test/MC/AMDGPU/gfx10_asm_vop2.s
+3,733-3,733llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3.txt
+34,743-34,7393,206 files not shown
+286,986-193,8393,212 files

LLVM/project ec399efmlir/lib/Conversion/ArithAndMathToAPFloat CMakeLists.txt

[mlir] fix math-to-apfloat after #172715 (#176462)

DeltaFile
+2-0mlir/lib/Conversion/ArithAndMathToAPFloat/CMakeLists.txt
+2-01 files

LLVM/project d6653aamlir/lib/Conversion/ArithAndMathToAPFloat CMakeLists.txt

[mlir] fix math-to-apfloat after #172715
DeltaFile
+1-0mlir/lib/Conversion/ArithAndMathToAPFloat/CMakeLists.txt
+1-01 files

LLVM/project cbea563llvm/include/llvm/IR GlobalObject.h, llvm/lib/CodeGen GlobalMerge.cpp

GlobalMerge: Do not merge globals with non-dbg metadata.

As noticed during the review of #149260, this transformation
is not necessarily correct for all metadata types.

Reviewers: efriedma-quic

Pull Request: https://github.com/llvm/llvm-project/pull/175875
DeltaFile
+15-0llvm/test/Transforms/GlobalMerge/metadata2.ll
+15-0llvm/test/Transforms/GlobalMerge/metadata1.ll
+3-12llvm/lib/Transforms/IPO/ConstantMerge.cpp
+10-0llvm/lib/CodeGen/GlobalMerge.cpp
+9-0llvm/lib/IR/Globals.cpp
+2-0llvm/include/llvm/IR/GlobalObject.h
+54-126 files

LLVM/project 616af49llvm/lib/Transforms/AggressiveInstCombine AggressiveInstCombine.cpp, llvm/test/Transforms/AggressiveInstCombine/AArch64 or-load.ll

[AggressiveInstCombine] Allow load folding for root inst with multiple uses. (#176101)

The load folding optimization was very conservative: it required the
root OR instruction to have a single use. This prevented loads from
being folded when only the root had multiple uses.

For example:
  %val = or i32 ...     ; Assembles 4 bytes to i32
  %use1 = call @foo(%val)
  %use2 = call @bar(%val)
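
For readers unfamiliar with the pattern, the following C++ sketch shows the kind of byte assembly that such an OR chain expresses and the single wide load it can be replaced with. It assumes a little-endian host for the equivalence; `assembleLE`, `loadWide`, and `foldedLoadMatches` are hypothetical names, not part of the pass.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Byte-wise assembly: four narrow loads combined with shifts and ORs.
constexpr std::uint32_t assembleLE(const unsigned char *p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// One wide load of the same four bytes.
inline std::uint32_t loadWide(const unsigned char *p) {
  assert(p != nullptr);
  std::uint32_t value = 0;
  std::memcpy(&value, p, sizeof value);
  return value;
}

// Reports whether the lowest-addressed byte of an integer is its low byte.
inline bool hostIsLittleEndian() {
  const std::uint32_t one = 1;
  unsigned char first = 0;
  std::memcpy(&first, &one, 1);
  return first == 1;
}

// On a little-endian host the wide load yields the assembled value, so any
// number of users of that value can read the wide load instead.
inline bool foldedLoadMatches(const unsigned char *p) {
  return !hostIsLittleEndian() || loadWide(p) == assembleLE(p);
}
```

The number of users of the assembled value does not affect this equivalence, which is consistent with relaxing the single-use requirement on the root.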
DeltaFile
+92-21llvm/test/Transforms/AggressiveInstCombine/X86/or-load.ll
+54-0llvm/test/Transforms/AggressiveInstCombine/AArch64/or-load.ll
+40-0llvm/test/Transforms/AggressiveInstCombine/AMDGPU/fold-loads-multiple-uses.ll
+15-11llvm/lib/Transforms/AggressiveInstCombine/AggressiveInstCombine.cpp
+201-324 files

LLVM/project 42065cfclang/lib/Serialization ASTReaderDecl.cpp, clang/test/Modules pr149404-02.cppm pr172241.cppm

Revert "[Serialization] Stop demote var definition as declaration" (#176441)

Reverts #172430 (c560f1cf03aa06c0bdd00c5a9b558c16d882af6f).

Causes some failures like `error: static assertion expression is not an
integral constant expression` and `error: substitution into constraint
expression resulted in a non-constant expression` in modules builds.
Repro TBD.
DeltaFile
+0-104clang/test/Modules/pr149404-02.cppm
+0-47clang/test/Modules/pr172241.cppm
+14-0clang/lib/Serialization/ASTReaderDecl.cpp
+14-1513 files

LLVM/project 7e3255fllvm/lib/Target/AMDGPU AMDGPUPreloadKernelArguments.cpp, llvm/test/CodeGen/AMDGPU preload-kernargs-aggregates.ll

[RFC][AMDGPU] Enable simple aggregate types for kernel argument preload

This PR enables kernel argument preload for plain aggregate types.

Fixes SWDEV-575961.
DeltaFile
+421-0llvm/test/CodeGen/AMDGPU/preload-kernargs-aggregates.ll
+18-2llvm/lib/Target/AMDGPU/AMDGPUPreloadKernelArguments.cpp
+439-22 files

LLVM/project 1994152clang/include/clang/Sema SemaOpenMP.h

Remove unrelated code.
DeltaFile
+0-4clang/include/clang/Sema/SemaOpenMP.h
+0-41 files

LLVM/project cd273c6llvm/test/MC/AMDGPU gfx10_asm_vopc_e64.s gfx10_asm_vop1.s, llvm/test/MC/Disassembler/AMDGPU gfx10_vop3c.txt gfx10_vop3.txt

Merge remote-tracking branch 'upstream/main' into users/abhinavgaba/udp-fallback-3
DeltaFile
+10,845-10,844llvm/test/MC/AMDGPU/gfx10_asm_vopc_e64.s
+5,425-5,424llvm/test/MC/AMDGPU/gfx10_asm_vop1.s
+5,392-5,392llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3c.txt
+4,676-4,675llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
+4,672-4,671llvm/test/MC/AMDGPU/gfx10_asm_vop2.s
+3,733-3,733llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3.txt
+34,743-34,7393,206 files not shown
+286,986-193,8353,212 files

LLVM/project 725bb5bclang/include/clang/AST OpenMPClause.h, clang/lib/AST OpenMPClause.cpp

[OpenMP][Clang] Parsing/Sema support for `use_device_ptr(fb_preserve/fb_nullify)`. (2/4) (#170578)

Depends on #169603.
    
This is the `use_device_ptr` counterpart of #168905.
    
With OpenMP 6.1, a `fallback` modifier can be specified on the
`use_device_ptr` clause to control the behavior when a pointer lookup
fails, i.e. there is no device pointer to translate into.
    
The default is `fb_preserve` (i.e. retain the original pointer), while
`fb_nullify` means: use `nullptr` as the translated pointer.
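
The two fallback behaviors can be modeled in plain C++ as below. This is a semantic sketch only, not the OpenMP runtime or Clang code: `Fallback`, `DeviceMap`, and `translate` are hypothetical names, and the host-to-device mapping is represented by an ordinary hash map.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical model of the two modifiers (fb_preserve / fb_nullify).
enum class Fallback { Preserve, Nullify };

// Hypothetical host-pointer to device-pointer table.
using DeviceMap = std::unordered_map<const void *, void *>;

// Translates a host pointer. When the lookup fails, the fallback decides
// whether the original pointer is kept or replaced by nullptr.
inline void *translate(const DeviceMap &map, void *hostPtr,
                       Fallback fb = Fallback::Preserve) {
  auto it = map.find(hostPtr);
  if (it != map.end())
    return it->second;
  return fb == Fallback::Preserve ? hostPtr : nullptr;
}

// Usage: one mapped pointer and one unmapped pointer.
inline void exampleFallback() {
  int host = 0, device = 0, unmapped = 0;
  DeviceMap map;
  map[&host] = &device;
  assert(translate(map, &host) == &device);                // lookup succeeds
  assert(translate(map, &unmapped) == &unmapped);          // default: preserve
  assert(translate(map, &unmapped, Fallback::Nullify) == nullptr);
}
```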

Dependent PR: #173930.
DeltaFile
+37-5clang/include/clang/AST/OpenMPClause.h
+36-0clang/test/OpenMP/target_data_use_device_ptr_fallback_ast_print.cpp
+32-0clang/test/OpenMP/target_data_use_device_ptr_fallback_messages.cpp
+20-2clang/lib/Basic/OpenMPKinds.cpp
+13-5clang/lib/Sema/SemaOpenMP.cpp
+14-3clang/lib/AST/OpenMPClause.cpp
+152-157 files not shown
+203-2313 files

LLVM/project 06fd0a5llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s

Rebase

Created using spr 1.3.5
DeltaFile
+85,134-81,237llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419-41,418llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428-36,427llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175-28,174llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+26,294-24,884llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+259,799-254,48841,872 files not shown
+5,146,461-2,605,67441,878 files

LLVM/project bc8c8a4llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.5

[skip ci]
DeltaFile
+85,134-81,237llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419-41,418llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428-36,427llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175-28,174llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+26,294-24,884llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+259,799-254,48841,872 files not shown
+5,146,461-2,605,67441,878 files

LLVM/project 0ee7accclang-tools-extra/clangd/refactor/tweaks DefineOutline.cpp, clang-tools-extra/clangd/unittests/tweaks DefineOutlineTests.cpp

[clangd] Support `=default` in DefineOutline to find insertion point (#175618)

Since #128164, the DefineOutline tweak picks an insertion point by
looking at where neighboring functions are defined. That heuristic did
not yet handle `= default`. This commit adds support for it.
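
As an illustration of the code shape involved, the C++ sketch below shows a class whose neighboring member is defined out of line with `= default`. The class `Counter` and its members are hypothetical, and the header and source parts are shown in one block for brevity.

```cpp
#include <cassert>

// Hypothetical header part: declarations only.
struct Counter {
  Counter();         // defined out of line as defaulted
  void bump();       // the function being moved out of line
  int value() const; // another neighbor defined out of line
  int count = 0;
};

// Hypothetical source-file part.
Counter::Counter() = default; // neighbor definition that uses `= default`

// A definition placed between its declared neighbors keeps the source file
// in the same order as the class body.
void Counter::bump() { ++count; }

int Counter::value() const { return count; }

// Usage: the out-of-line definitions behave like inline ones.
inline void exampleCounter() {
  Counter c;
  c.bump();
  assert(c.value() == 1);
}
```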
DeltaFile
+46-18clang-tools-extra/clangd/refactor/tweaks/DefineOutline.cpp
+31-0clang-tools-extra/clangd/unittests/tweaks/DefineOutlineTests.cpp
+77-182 files

LLVM/project b318625llvm/docs NVPTXUsage.rst

[NFC][NVPTX] Reformat NVPTXUsage to use 80 col width (#176425)

DeltaFile
+565-467llvm/docs/NVPTXUsage.rst
+565-4671 files

LLVM/project 09a4058flang/lib/Parser source.cpp, flang/test/Driver/input-from-stdin debug-info-filename.f90

[flang][debuginfo] Use <stdin> for file name when reading from stdin

Currently, the DIFile debuginfo nodes use "standard input" as the file
name when compiling with -g and reading input from stdin. This has been
changed to "<stdin>" for consistency with clang and gfortran.

Fixes #60288
DeltaFile
+14-0flang/test/Driver/input-from-stdin/debug-info-filename.f90
+1-1flang/lib/Parser/source.cpp
+15-12 files

LLVM/project 88fa12aclang/include/clang/AST VTableBuilder.h, llvm/include/llvm/CGData CGDataPatchItem.h

Use `llvm::SmallVector` instead of `std::vector`
DeltaFile
+3-2clang/include/clang/AST/VTableBuilder.h
+2-2llvm/include/llvm/CGData/CGDataPatchItem.h
+2-1llvm/include/llvm/CodeGen/PBQP/Math.h
+1-1llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+8-64 files

LLVM/project 10e1bd7llvm/include/llvm/Support GenericDomTree.h

[Support][NFC] Use default move constr/assign for DomTree (#176423)

Added in 5e10e21d28496ba40ccd385740d7d1b4bb1368e4, hopefully MSVC can
generate the correct code 11 years later? (If not, we probably have
other problems, too? Unfortunately, I couldn't find information on what
the actual problem was.)

The explicit move constructor and move assignment operator are
error-prone under modification, because it is easy to forget to update
them when members change.
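
The hazard can be shown with a small C++ sketch. The types `HandWrittenMove` and `DefaultedMove` are hypothetical and unrelated to the actual `GenericDomTree.h` members; the hand-written constructor is assumed to predate the `label` member.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

struct HandWrittenMove {
  std::vector<int> nodes;
  std::string label; // member added later

  HandWrittenMove() = default;
  // Written before `label` existed and never updated, so `label` is
  // default-initialized in the destination instead of being moved.
  HandWrittenMove(HandWrittenMove &&other) : nodes(std::move(other.nodes)) {}
};

struct DefaultedMove {
  std::vector<int> nodes;
  std::string label;

  DefaultedMove() = default;
  // The compiler moves every member, including ones added later.
  DefaultedMove(DefaultedMove &&) = default;
  DefaultedMove &operator=(DefaultedMove &&) = default;
};

inline void exampleMove() {
  HandWrittenMove a;
  a.nodes = {1, 2, 3};
  a.label = "dom";
  HandWrittenMove b(std::move(a));
  assert(b.nodes.size() == 3);
  assert(b.label.empty()); // the label was silently dropped

  DefaultedMove c;
  c.nodes = {1, 2, 3};
  c.label = "dom";
  DefaultedMove d(std::move(c));
  assert(d.nodes.size() == 3);
  assert(d.label == "dom"); // the label travelled with the object
}
```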
DeltaFile
+3-35llvm/include/llvm/Support/GenericDomTree.h
+3-351 files

LLVM/project 9b409e3lld/ELF SyntheticSections.cpp, lld/test/ELF mips-tls-64.s mips-mgot.s

[ELF][Mips] Fix addend for preemptible static TLS

If the symbol is preemptible the addend should be 0, not our
definition's VA. Note that by using addAddendOnlyRelocIfNonPreemptible
the generic Elf_Rel code will ensure the VA is written out as the addend
if the symbol is non-preemptible, and so writeTo only needs to write out
the VA in the case that we don't call it (so long as we make sure to
call relocateAlloc to actually apply any such relocations).

Reviewers: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/150729
DeltaFile
+7-5lld/ELF/SyntheticSections.cpp
+1-1lld/test/ELF/mips-tls-64.s
+1-1lld/test/ELF/mips-mgot.s
+1-1lld/test/ELF/mips-tls.s
+10-84 files

LLVM/project a30120fllvm/lib/Target/X86 X86LoadValueInjectionRetHardening.cpp X86.h

[NewPM] port x86-lvi-ret to new pass manager (#176242)

DeltaFile
+31-17llvm/lib/Target/X86/X86LoadValueInjectionRetHardening.cpp
+10-2llvm/lib/Target/X86/X86.h
+2-2llvm/lib/Target/X86/X86TargetMachine.cpp
+1-2llvm/lib/Target/X86/X86CodeGenPassBuilder.cpp
+1-1llvm/lib/Target/X86/X86PassRegistry.def
+45-245 files

LLVM/project dd29183llvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp SelectionDAG.cpp, llvm/test/CodeGen/X86 combine-umax.ll combine-umin.ll

[DAG] Allow MIN/MAX signedness flip when operands are known-negative (#174469)

Extend the existing DAGCombine logic in visitIMINMAX so that signed and
unsigned MIN/MAX can be flipped not only when both operands are known
non-negative but also when both operands are known negative. This
replaces the old SignBitIsZero checks with computeKnownBits and explicit
tests for non-negative or negative operands while keeping all existing
legality and saturation gating in place. Add regression tests to cover
both the known-negative case and the known-non-negative case.
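
The arithmetic fact behind the flip can be checked with a short C++ sketch. The helpers `umaxBits` and `smaxBits` are hypothetical and operate on raw 32-bit patterns; they are not DAGCombiner code. Signed ordering equals unsigned ordering after flipping the sign bit, so when both operands have the same sign bit the two orderings agree.

```cpp
#include <cstdint>

constexpr std::uint32_t kSignBit = 0x80000000u;

// Unsigned maximum of two bit patterns.
constexpr std::uint32_t umaxBits(std::uint32_t a, std::uint32_t b) {
  return a > b ? a : b;
}

// Signed (two's complement) maximum of the same bit patterns, computed by
// flipping the sign bit and comparing as unsigned.
constexpr std::uint32_t smaxBits(std::uint32_t a, std::uint32_t b) {
  return (a ^ kSignBit) > (b ^ kSignBit) ? a : b;
}

// Both operands known negative: -5 and -2.
static_assert(umaxBits(0xFFFFFFFBu, 0xFFFFFFFEu) ==
                  smaxBits(0xFFFFFFFBu, 0xFFFFFFFEu),
              "same result when both are negative");

// Both operands known non-negative: 3 and 7.
static_assert(umaxBits(3u, 7u) == smaxBits(3u, 7u),
              "same result when both are non-negative");

// Mixed signs: -1 and 1 give different results, so no flip is possible.
static_assert(umaxBits(0xFFFFFFFFu, 1u) != smaxBits(0xFFFFFFFFu, 1u),
              "results differ when the signs differ");
```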

Fixes #174325
DeltaFile
+48-0llvm/test/CodeGen/X86/combine-umax.ll
+48-0llvm/test/CodeGen/X86/combine-umin.ll
+47-0llvm/test/CodeGen/X86/combine-smin.ll
+47-0llvm/test/CodeGen/X86/combine-smax.ll
+22-12llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+15-0llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+227-121 files not shown
+232-127 files

LLVM/project e9e0206llvm/lib/Transforms/InstCombine InstCombineSimplifyDemanded.cpp, llvm/test/Transforms/InstCombine simplify-demanded-fpclass.ll

InstCombine: Improve single-use fneg(fabs(x)) SimplifyDemandedFPClass handling

Match the multi-use case's logic for understanding no-nan/no-inf context.
Also only apply the nsz handling in the single use case. alive2 seems to treat
nsz as nondeterministic for each use.
DeltaFile
+248-12llvm/test/Transforms/InstCombine/simplify-demanded-fpclass.ll
+73-19llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp
+321-312 files

LLVM/project 99bb664clang/test/CodeGenObjC expose-direct-method-cross-linkage.m

Add linking tests
DeltaFile
+175-0clang/test/CodeGenObjC/expose-direct-method-cross-linkage.m
+175-01 files

LLVM/project c887cd2llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/PowerPC vector-popcnt-128-ult-ugt.ll

Rebase prior to landing

Created using spr 1.3.5
DeltaFile
+84,445-80,574llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+26,294-24,884llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+22,442-22,438llvm/test/CodeGen/PowerPC/vector-popcnt-128-ult-ugt.ll
+40,677-0llvm/test/CodeGen/RISCV/rvv/nontemporal-vp-scalable.ll
+17,545-20,831llvm/test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll
+25,714-0llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-cluster.ll
+217,117-148,72738,354 files not shown
+3,688,200-1,396,78338,360 files

LLVM/project 18695b2llvm/docs AIToolPolicy.md DeveloperPolicy.rst

[docs] Add human-in-the-loop policy for tool-assisted contributions (#154441)

Over the course of 2025, we observed an increase in the volume of
LLM-assisted nuisance contributions to the project. Nuisance
contributions have always been an issue for open-source projects, but
until LLMs, we made do without a formal policy banning such
contributions. However, LLMs are here, so we are adopting this policy,
abbreviated as "human in the loop", which requires that every
contribution has a human author attesting to the value of that
contribution, and that it is high enough quality that it is worth the
time it takes to review the contribution.

This policy evolved over time based on community input from the
following Discourse threads and a few area team and LLVM project council
meetings:
* [Our AI policy vs code of conduct and vs
reality](https://discourse.llvm.org/t/our-ai-policy-vs-code-of-conduct-and-vs-reality/88300)
* [[RFC] LLVM AI tool policy: start small, no slop](https://discourse.llvm.org/t/rfc-llvm-ai-tool-policy-start-small-no-slop/88476)
* [[RFC] LLVM AI tool policy: human in the

    [5 lines not shown]
DeltaFile
+181-0llvm/docs/AIToolPolicy.md
+1-26llvm/docs/DeveloperPolicy.rst
+1-0llvm/docs/Reference.rst
+183-263 files