LLVM/project e08aa00 llvm/include/llvm/MC MCSymbol.h MCGOFFStreamer.h

Fix automatically generated include lines
Delta  File
+1 -2  llvm/include/llvm/MC/MCSymbol.h
+1 -1  llvm/include/llvm/MC/MCGOFFStreamer.h
+1 -1  llvm/include/llvm/MC/MCObjectStreamer.h
+3 -4  3 files

LLVM/project c6013a1 clang/include/clang/Analysis/Analyses/LifetimeSafety Loans.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp Loans.cpp

[LifetimeSafety] Add support for tracking non-trivially destructed temporary objects (#172007)

Add support for tracking loans to temporary materializations that
require non-trivial destructors. We only support non-trivially
destructed temporaries as they have a nice end-of-life marker via the
`CFGTemporaryDtor`.

This small PR introduces the following changes:
1. AccessPaths can now also represent `MaterializeTemporaryExpr *` via
`llvm::PointerUnion`
3. `FactsGenerator::VisitMaterializeTemporaryExpr` now checks to see if
the temporary materialization is such that it requires a non-trivial
destructor (by checking for a child `CXXBindTemporaryExpr` node when all
implicit casts are stripped away), and if so: creates a Loan whose
AccessPath is a pointer to that `MaterializeTemporaryExpr`, and issues
it to the origin represented by the `MaterializeTemporaryExpr` node we
were called on. When we cannot find a child `CXXBindTemporaryExpr`, we
fall back to an `OriginFlow` as before.
4. `FactsGenerator::handleTemporaryDtor` is called from

    [11 lines not shown]
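
To illustrate the first change in the list above, here is a minimal sketch of an access path that can refer to one of two kinds of entity. It is not the code from the PR: `std::variant` stands in for `llvm::PointerUnion`, the stub types and accessor names are hypothetical, and treating the other alternative as a declaration pointer is an assumption.

```cpp
#include <cassert>
#include <variant>

// Hypothetical stand-ins for AST nodes; the real types live in clang.
struct DeclStub {};      // assumed: the kind of entity AccessPath already covered
struct TemporaryStub {}; // stands in for MaterializeTemporaryExpr

// Sketch of an access path holding exactly one of two pointer kinds.
// std::variant is used here in place of llvm::PointerUnion.
class AccessPath {
  std::variant<const DeclStub *, const TemporaryStub *> Storage;

public:
  explicit AccessPath(const DeclStub *D) : Storage(D) {
    assert(D && "access path needs a non-null declaration");
  }
  explicit AccessPath(const TemporaryStub *T) : Storage(T) {
    assert(T && "access path needs a non-null temporary");
  }

  // Returns the declaration, or nullptr if this path names a temporary.
  const DeclStub *getAsDecl() const {
    auto *P = std::get_if<const DeclStub *>(&Storage);
    return P ? *P : nullptr;
  }

  // Returns the temporary, or nullptr if this path names a declaration.
  const TemporaryStub *getAsTemporary() const {
    auto *P = std::get_if<const TemporaryStub *>(&Storage);
    return P ? *P : nullptr;
  }
};
```

A loan whose path answers non-null to `getAsTemporary()` would then be the kind that ends at the temporary's destructor rather than at a variable's scope exit.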
Delta  File
+64 -11  clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+45 -7  clang/unittests/Analysis/LifetimeSafetyTest.cpp
+37 -1  clang/test/Sema/warn-lifetime-safety.cpp
+19 -2  clang/include/clang/Analysis/Analyses/LifetimeSafety/Loans.h
+9 -1  clang/lib/Analysis/LifetimeSafety/Loans.cpp
+2 -2  clang/lib/Analysis/LifetimeSafety/Origins.cpp
+176 -24  1 file not shown
+178 -24  7 files

LLVM/project 793d9c9 flang-rt CMakeLists.txt

[flang-rt] Disable testing support for the GPU cross build (#175583)

Summary:
We're starting to provide the GPU version of the Fortran runtime with
the GPU cross-build semantics. That configuration does not support tests
yet, but the build still tries to compile the unit tests and then fails
to find gtest for the GPU. Disable testing for now so it can build.
Delta  File
+6 -1  flang-rt/CMakeLists.txt
+6 -1  1 file

LLVM/project a33654b llvm/test/Transforms/SampleProfile fn-alias.ll, llvm/test/Transforms/SampleProfile/Inputs fn-alias.prof

[SampleProf] test that calls to function aliases get profile info (#169355)

When a call is made to a function alias, in
SampleProfileLoader::generateMDProfMetadata we look up the actual call
target name in the profile to resolve the alias, in the same way as we
do for indirect calls. Add a test for this so we don't lose profile info
on these calls some day.
Delta  File
+37 -0  llvm/test/Transforms/SampleProfile/fn-alias.ll
+3 -0  llvm/test/Transforms/SampleProfile/Inputs/fn-alias.prof
+40 -0  2 files

LLVM/project 25aeffd llvm/lib/Target/AArch64 AArch64SRLTDefineSuperRegs.cpp AArch64TargetMachine.cpp, llvm/test/CodeGen/AArch64 arm64-addrmode.ll subreg-liveness-fix-subreg-to-reg-implicit-def.mir

[AArch64] Add new pass after VirtRegRewriter to add implicit-defs

When SubRegister Liveness Tracking (SRLT) is enabled, this pass adds extra
implicit-def's to instructions that define the low N bits of a GPR/FPR
register to represent that the top bits are written, because all AArch64
instructions that write the low bits of a GPR/FPR also implicitly zero the
top bits.
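
The zeroing behaviour described above can be modelled with a toy register, shown here for the GPR case only. This is an illustrative sketch, not code from the pass: `GPR64`, `writeW` and `readX` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of one 64-bit general-purpose register (hypothetical type).
struct GPR64 {
  std::uint64_t Bits = 0;
};

// Models a write to the 32-bit sub-register: the low 32 bits take the new
// value and bits [63:32] are zeroed, so the full register is redefined.
inline void writeW(GPR64 &R, std::uint32_t V) {
  R.Bits = static_cast<std::uint64_t>(V);
  assert((R.Bits >> 32) == 0 && "top bits must be zero after a 32-bit write");
}

// Reads the full 64-bit register.
inline std::uint64_t readX(const GPR64 &R) { return R.Bits; }
```

Because the old top bits never survive the write, the instruction can be treated as defining the whole 64-bit register, which is the fact the extra implicit-def records.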

These semantics are originally represented in the MIR using `SUBREG_TO_REG`,
but during register coalescing this information is lost, and when rewriting
virtual -> physical registers the implicit-defs are not added to represent
that the top bits are written.

There have been several attempts to fix this in the coalescer (#168353),
but each iteration has exposed new bugs and the patch had to be reverted.
Additionally, the concept of adding 'implicit-def' of a virtual register
during the register allocation process is particularly fragile and many
places don't expect it (for example, in `X86::commuteInstructionImpl` the
code only looks at specific operands and does not consider implicit-defs).

    [9 lines not shown]
Delta  File
+265 -0  llvm/lib/Target/AArch64/AArch64SRLTDefineSuperRegs.cpp
+40 -90  llvm/test/CodeGen/AArch64/arm64-addrmode.ll
+88 -0  llvm/test/CodeGen/AArch64/subreg-liveness-fix-subreg-to-reg-implicit-def.mir
+15 -1  llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+5 -10  llvm/test/CodeGen/AArch64/preserve_nonecc_varargs_darwin.ll
+7 -1  llvm/lib/Target/AArch64/AArch64Subtarget.h
+420 -102  7 files not shown
+432 -110  13 files

LLVM/project 9f8b5e5 llvm/lib/Target/AArch64 AArch64SRLTDefineSuperRegs.cpp, llvm/test/CodeGen/AArch64 subreg-liveness-fix-subreg-to-reg-implicit-def.mir

Address comments
Delta  File
+35 -52  llvm/lib/Target/AArch64/AArch64SRLTDefineSuperRegs.cpp
+19 -0  llvm/test/CodeGen/AArch64/subreg-liveness-fix-subreg-to-reg-implicit-def.mir
+54 -52  2 files

LLVM/project 51c93f8 llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp

[Review] typos in comment
Delta  File
+4 -3  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+4 -3  1 file

LLVM/project 0055129 flang-rt/cmake/modules HandleLibs.cmake

[flang-rt] Fix unused flag warning when compiling for the GPU (#175643)

Summary:
Because we link the `cxx` target directly, we do not need this flag.
That is also why we pass `-nostdinc++`, which makes this an unused
command-line flag, hence the warning.
Delta  File
+0 -4  flang-rt/cmake/modules/HandleLibs.cmake
+0 -4  1 file

LLVM/project d6dd604 llvm/include/llvm/MC MCObjectStreamer.h MCGOFFStreamer.h

Fix order of includes
Delta  File
+1 -1  llvm/include/llvm/MC/MCObjectStreamer.h
+1 -1  llvm/include/llvm/MC/MCGOFFStreamer.h
+1 -1  llvm/include/llvm/MC/MCSymbol.h
+3 -3  3 files

LLVM/project 38bc101 llvm/lib/Target/AArch64 AArch64LoadStoreOptimizer.cpp, llvm/test/CodeGen/AArch64 ldst-implicitop.mir

[AArch64] Let LoadStoreOptimizer handle renamable implicit-defs.

The LoadStoreOptimizer is very conservative with handling instructions
that have implicit-def operands, and only supports them for 2 instructions.
However, they can also be considered when explicitly marked as 'renamable'.
Delta  File
+29 -0  llvm/test/CodeGen/AArch64/ldst-implicitop.mir
+5 -5  llvm/lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
+34 -5  2 files

LLVM/project 1856fea llvm/include/llvm/MC MCGOFFStreamer.h MCObjectStreamer.h

Fix ABI annotations
Delta  File
+2 -1  llvm/include/llvm/MC/MCGOFFStreamer.h
+2 -1  llvm/include/llvm/MC/MCObjectStreamer.h
+2 -1  llvm/include/llvm/MC/MCSymbol.h
+6 -3  3 files

LLVM/project 125d24a llvm/lib/Target/AMDGPU GCNSchedStrategy.h GCNSchedStrategy.cpp

[AMDGPU][Scheduler] Fix compile failure due to const/sort interaction (#175755)

On some configurations, sorting `ScoredRemat` objects, which contain
const members, causes a compile failure because the objects cannot be
swapped or moved. The problem was introduced in #175050.

This removes const from those fields to address the issue. The design
will soon change anyway to not rely on sorting objects of this type, and
consts were only here for semantic clarity.
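
The interaction can be reproduced in isolation. The sketch below uses hypothetical stand-in types, not the real `ScoredRemat`: a const data member implicitly deletes the assignment operators, and `std::sort` needs to move-assign elements.

```cpp
#include <algorithm>
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical stand-in with a const field. The const member implicitly
// deletes copy and move assignment, so std::sort cannot rearrange elements.
struct ConstScored {
  const int Score;
};

// Same shape without const: assignable, and therefore sortable.
struct MutableScored {
  int Score;
};

static_assert(!std::is_move_assignable<ConstScored>::value,
              "a const member removes move assignment");
static_assert(std::is_move_assignable<MutableScored>::value,
              "non-const members keep the type assignable");

// Sorts candidates by descending score. Instantiating this with ConstScored
// instead would be rejected at compile time.
inline std::vector<MutableScored> sortByScore(std::vector<MutableScored> V) {
  auto HigherFirst = [](const MutableScored &A, const MutableScored &B) {
    return A.Score > B.Score;
  };
  std::sort(V.begin(), V.end(), HigherFirst);
  assert(std::is_sorted(V.begin(), V.end(), HigherFirst));
  return V;
}
```

Dropping `const` from the fields, as this commit does, is the least invasive fix. Sorting indices or pointers instead would keep the fields const at the cost of an extra indirection.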
Delta  File
+4 -4  llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+2 -3  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+6 -7  2 files

LLVM/project 8722171 llvm/test/CodeGen/AArch64 load-store-forwarding.ll

[AArch64][GlobalISel] Add GISel test coverage for load-store-forwarding.ll. NFC
Delta  File
+58 -23  llvm/test/CodeGen/AArch64/load-store-forwarding.ll
+58 -23  1 file

LLVM/project be70db6 llvm/lib/MC MCGOFFStreamer.cpp, llvm/lib/Target/SystemZ/MCTargetDesc SystemZHLASMAsmStreamer.cpp

Fix formatting
Delta  File
+2 -2  llvm/lib/MC/MCGOFFStreamer.cpp
+2 -1  llvm/lib/Target/SystemZ/MCTargetDesc/SystemZHLASMAsmStreamer.cpp
+4 -3  2 files

LLVM/project 72d8c9a llvm/include/llvm/Transforms/Scalar NaryReassociate.h, llvm/lib/Transforms/Scalar NaryReassociate.cpp

[NaryReassociate] Make uniformity-aware to prefer grouping uniform values
Delta  File
+97 -3  llvm/lib/Transforms/Scalar/NaryReassociate.cpp
+5 -5  llvm/test/Transforms/NaryReassociate/AMDGPU/nary-add-uniform.ll
+6 -3  llvm/test/CodeGen/AMDGPU/llc-pipeline.ll
+3 -1  llvm/include/llvm/Transforms/Scalar/NaryReassociate.h
+111 -12  4 files

LLVM/project 50d112c clang/lib/CodeGen CGExprScalar.cpp CodeGenFunction.cpp

[MC/DC] Prune MCDCLogOpStack and use CGF.isMCDCDecisionExpr. NFC (#125410)

`MCDCLogOpStack` is used only for detection of the Decision root. It can
be detected with `MCDC::State::DecisionByStmt`.
Delta  File
+14 -26  clang/lib/CodeGen/CGExprScalar.cpp
+4 -12  clang/lib/CodeGen/CodeGenFunction.cpp
+14 -0  clang/lib/CodeGen/CodeGenPGO.h
+3 -3  clang/lib/CodeGen/CodeGenFunction.h
+6 -0  clang/lib/CodeGen/CodeGenPGO.cpp
+41 -41  5 files

LLVM/project a3e9c45 llvm/test/CodeGen/RISCV fpclamptosat.ll, llvm/test/MC/AMDGPU gfx10_asm_vopc_e64.s gfx10_asm_vop1.s

Merge branch 'main' into users/chapuni/mcdc/nest/covgen
Delta  File
+10,845 -10,844  llvm/test/MC/AMDGPU/gfx10_asm_vopc_e64.s
+5,425 -5,424  llvm/test/MC/AMDGPU/gfx10_asm_vop1.s
+4,672 -4,671  llvm/test/MC/AMDGPU/gfx10_asm_vop2.s
+4,663 -4,662  llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
+3,429 -3,426  llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-sve-instructions.s
+5,392 -849  llvm/test/CodeGen/RISCV/fpclamptosat.ll
+34,426 -29,876  1,408 files not shown
+82,457 -48,322  1,414 files

LLVM/project 2757a0d llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s, llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt

Merge branch 'users/chapuni/mcdc/nest/lnot' into users/chapuni/mcdc/nest/covgen

Conflicts:
        clang/lib/CodeGen/CoverageMappingGen.cpp
Delta  File
+42,349 -42,348  llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419 -41,418  llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428 -36,427  llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175 -28,174  llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+22,708 -22,884  llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+22,276 -22,275  llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
+193,355 -193,526  3,785 files not shown
+1,251,589 -1,119,107  3,791 files

LLVM/project 91c1022 llvm/test/Transforms/NaryReassociate/AMDGPU nary-add-uniform.ll

[NaryReassociate][AMDGPU] Pre-commit test for uniformity-aware reassociation (NFC)
Delta  File
+319 -0  llvm/test/Transforms/NaryReassociate/AMDGPU/nary-add-uniform.ll
+319 -0  1 file

LLVM/project 017a27c llvm/docs AMDGPUUsage.rst

[AMDGPU][Docs] Document amdgpu-expand-waitcnt-profiling attribute (#175750)

Delta  File
+6 -0  llvm/docs/AMDGPUUsage.rst
+6 -0  1 file

LLVM/project 3267b91 llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s, llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt

Merge branch 'users/chapuni/mcdc/nest/expect' into users/chapuni/mcdc/nest/trunk

Conflicts:
        clang/docs/ReleaseNotes.rst
Delta  File
+42,349 -42,348  llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419 -41,418  llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428 -36,427  llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175 -28,174  llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+22,708 -22,884  llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+22,276 -22,275  llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
+193,355 -193,526  4,814 files not shown
+1,332,566 -1,166,069  4,820 files

LLVM/project 7af6abd llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.struct.ptr.atomic.buffer.load.ll llvm.amdgcn.struct.atomic.buffer.load.ll

[AMDGPU][GlobalISel] Add RegBankLegalize support for amd_gcn_end_cf (#175118)

Delta  File
+162 -60  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.ptr.atomic.buffer.load.ll
+162 -60  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.struct.atomic.buffer.load.ll
+144 -54  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.atomic.buffer.load.ll
+144 -54  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.raw.ptr.atomic.buffer.load.ll
+9 -6  llvm/test/CodeGen/AMDGPU/lds-global-non-entry-func.ll
+3 -1  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+624 -235  1 file not shown
+625 -236  7 files

LLVM/project 560881e llvm/test/CodeGen/RISCV fpclamptosat.ll, llvm/test/MC/AMDGPU gfx10_asm_vopc_e64.s gfx10_asm_vop1.s

Merge branch 'main' into users/chapuni/mcdc/nest/expect
Delta  File
+10,845 -10,844  llvm/test/MC/AMDGPU/gfx10_asm_vopc_e64.s
+5,425 -5,424  llvm/test/MC/AMDGPU/gfx10_asm_vop1.s
+4,672 -4,671  llvm/test/MC/AMDGPU/gfx10_asm_vop2.s
+4,663 -4,662  llvm/test/MC/AMDGPU/gfx10_asm_vop3.s
+3,429 -3,426  llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-sve-instructions.s
+5,392 -849  llvm/test/CodeGen/RISCV/fpclamptosat.ll
+34,426 -29,876  1,408 files not shown
+82,457 -48,322  1,414 files

LLVM/project 566b7db llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s, llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt

Merge branch 'users/chapuni/mcdc/nest/lnot' into users/chapuni/mcdc/nest/expect

Conflicts:
        clang/include/clang/AST/IgnoreExpr.h
        clang/lib/CodeGen/CodeGenFunction.cpp
Delta  File
+42,349 -42,348  llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419 -41,418  llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428 -36,427  llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175 -28,174  llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+22,708 -22,884  llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+22,276 -22,275  llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
+193,355 -193,526  3,786 files not shown
+1,251,604 -1,119,108  3,792 files

LLVM/project 57528aa llvm/lib/Target/AMDGPU SIRegisterInfo.td, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll load-constant-i1.ll

[AMDGPU] Have VCC as a first-class member of the SGPR pool.

Add VCC and tuples using VCC to SGPR register classes.

We already support VCC as an allocatable register for 32-bit SGPR
operands, so it seems most natural to support it for register
tuple operands as well.

s106/s107 are still not allowed as aliases of vcc_lo/hi in
AsmParser.

The names given to the VCC tuples match those produced by SP3,
though it feels like there is room for improvement.

https://github.com/llvm/llvm-project/issues/62651
Delta  File
+7,698 -7,710  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+729 -735  llvm/test/CodeGen/AMDGPU/load-constant-i1.ll
+259 -255  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+245 -249  llvm/test/CodeGen/AMDGPU/scc-clobbered-sgpr-to-vmem-spill.ll
+55 -79  llvm/test/CodeGen/AMDGPU/spill-scavenge-offset.ll
+48 -21  llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+9,034 -9,049  8 files not shown
+9,147 -9,073  14 files

LLVM/project 86c3531 llvm/lib/Target/AMDGPU GCNRegPressure.cpp

[Review] Use unified vgpr count with unified-register-file
Delta  File
+2 -2  llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+2 -2  1 file

LLVM/project 4b26751 llvm/test/CodeGen/AMDGPU swdev-549940.ll

Remove undef from test (it still preserves the test behavior before and after the fix)
Delta  File
+1 -1  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+1 -1  1 file

LLVM/project 61e1985 llvm/lib/Target/AMDGPU GCNRegPressure.cpp GCNRegPressure.h

[Review] Change constructor of RegExcess to take a pressure and a target, and rename spillsToMemory to spillsToMemoryForTargetOccupancy
Delta  File
+14 -8  llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+5 -0  llvm/lib/Target/AMDGPU/GCNRegPressure.h
+5 -0  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+24 -8  3 files

LLVM/project 36858f6 llvm/lib/Target/AMDGPU GCNRegPressure.cpp

[Review] Move the class into an anonymous namespace
Delta  File
+2 -0  llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+2 -0  1 file

LLVM/project 90c62e2 llvm/lib/Target/AMDGPU GCNRegPressure.cpp GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir swdev-549940.ll

[AMDGPU] Rematerialize VGPR candidates when SGPR spills to VGPR over the VGPR limit

Before, when selecting candidates to rematerialize, we would only
consider SGPR candidates when there was an excess of SGPR registers.

Failing to eliminate the excess would result in spills to VGPRs.
This is normally not an issue, unless spilling to VGPRs results in
excess VGPRs.

This patch does 2 things:
* It relaxes the GCNRPTarget success criteria: now we accept regions
  where we spill SGPRs to VGPRs, as long as this does not end up in
  excess VGPRs.
* It changes isSaveBeneficial to consider the excess VGPRs (which
  includes the SGPRs that would be spilled to VGPR).

With these changes, the compiler rematerializes VGPRs when the excess
SGPRs would result in VGPR excess.
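
The relaxed criterion can be sketched as a small check. This is a simplified model, not the code in `GCNRegPressure.cpp`: all type, field and function names are hypothetical, and `LanesPerVGPR` (how many spilled SGPR values one VGPR can hold) is an assumed parameter.

```cpp
#include <cassert>

// Hypothetical register-pressure summary for one scheduling region.
struct RegionPressure {
  unsigned NumSGPRs;
  unsigned NumVGPRs;
};

// Hypothetical limits for the occupancy being targeted.
struct RegLimits {
  unsigned MaxSGPRs;
  unsigned MaxVGPRs;
  unsigned LanesPerVGPR; // assumed: spilled SGPR values one VGPR can hold
};

// VGPRs over the limit once the VGPRs needed to hold spilled SGPRs are
// counted as part of the VGPR pressure.
inline unsigned excessVGPRsWithSpills(const RegionPressure &P,
                                      const RegLimits &L) {
  assert(L.LanesPerVGPR > 0 && "need at least one lane per VGPR");
  unsigned ExcessSGPRs = P.NumSGPRs > L.MaxSGPRs ? P.NumSGPRs - L.MaxSGPRs : 0;
  // Round up: a partially filled VGPR still occupies a whole register.
  unsigned SpillVGPRs = (ExcessSGPRs + L.LanesPerVGPR - 1) / L.LanesPerVGPR;
  unsigned TotalVGPRs = P.NumVGPRs + SpillVGPRs;
  return TotalVGPRs > L.MaxVGPRs ? TotalVGPRs - L.MaxVGPRs : 0;
}

// Relaxed success test: an SGPR excess is tolerated as long as spilling it
// to VGPRs does not push the region over the VGPR limit.
inline bool regionIsSatisfied(const RegionPressure &P, const RegLimits &L) {
  return excessVGPRsWithSpills(P, L) == 0;
}
```

Under this model, a region with spare VGPRs passes despite an SGPR excess, while a region already at the VGPR limit fails and reports a VGPR excess that VGPR rematerialization could then reduce.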


    [4 lines not shown]
Delta  File
+30 -30  llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+15 -9  llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+3 -1  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+1 -0  llvm/lib/Target/AMDGPU/GCNRegPressure.h
+50 -41  5 files