LLVM/project 0b49adcllvm/lib/Target/AMDGPU AMDGPUMachineFunctionInfo.cpp AMDGPUMachineFunction.cpp

[AMDGPU] Rename AMDGPUMachineFunction to AMDGPUMachineFunctionInfo. NFC. (#187276)

The class derives from MachineFunctionInfo, not MachineFunction.
DeltaFile
+237-0llvm/lib/Target/AMDGPU/AMDGPUMachineFunctionInfo.cpp
+0-235llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
+0-137llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h
+125-0llvm/lib/Target/AMDGPU/AMDGPUMachineFunctionInfo.h
+6-6llvm/lib/Target/AMDGPU/SIMachineFunctionInfo.cpp
+5-4llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+373-38216 files not shown
+399-40722 files

LLVM/project fce100ellvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize predicated-multiple-exits.ll predicated-early-exits-interleave.ll

[VPlan] Fix masked_cond expansion.

masked_cond is used to combine early-exit conditions with masks from
predication. The early-exit condition should only be evaluated if the
mask is true. Emit the mask first to avoid incorrect poison propagation.

Fixes https://github.com/llvm/llvm-project/issues/187061.
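
To illustrate why operand order matters here, the following is a minimal
sketch that models poison with `std::optional<bool>` (an empty optional
stands for poison). The names `Lane` and `logicalAnd` are hypothetical, and
the semantics are assumed to follow LLVM IR's `select i1 %a, i1 %b, i1 false`
form of a logical AND; this is not the VPlan implementation.

```cpp
#include <optional>

// Model of a single i1 lane: std::nullopt stands for poison.
using Lane = std::optional<bool>;

// Logical AND in the style of `select a, b, false`: the second operand is
// only observed when the first is true, so poison in `b` cannot leak through
// a false `a`. Poison in `a` itself always propagates.
Lane logicalAnd(Lane a, Lane b) {
  if (!a)
    return std::nullopt; // poison condition gives a poison result
  return *a ? b : Lane(false);
}
```

With the mask as the first operand, `logicalAnd(mask, cond)` yields `false`
for a masked-off lane even when `cond` is poison. With the operands swapped,
the poison would reach the result.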
DeltaFile
+24-24llvm/test/Transforms/LoopVectorize/predicated-multiple-exits.ll
+8-8llvm/test/Transforms/LoopVectorize/predicated-early-exits-interleave.ll
+5-5llvm/test/Transforms/LoopVectorize/predicated-single-exit.ll
+1-1llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+38-384 files

LLVM/project d1e625cclang-tools-extra/docs ReleaseNotes.rst, clang-tools-extra/docs/clang-tidy/checks/bugprone unchecked-optional-access.rst

[clang-tidy] `bugprone-unchecked-optional-access`: Add support for GTest asserts like `ASSERT_TRUE` and `ASSERT_FALSE` (#186363)

Resolves https://github.com/llvm/llvm-project/issues/181737

Addresses false positives reported in
https://github.com/llvm/llvm-project/issues/181737 .
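
The kind of code affected can be sketched as follows. `SKETCH_ASSERT_TRUE`
is a hypothetical stand-in for GTest's `ASSERT_TRUE`, not the real macro: it
only mimics the property that matters to the check, namely that control
leaves the enclosing function when the condition is false. The `-1` return
value is an assumption made so the sketch compiles without GTest.

```cpp
#include <optional>

// Hypothetical stand-in for a fatal assertion macro: returns from the
// enclosing function when the condition does not hold.
#define SKETCH_ASSERT_TRUE(cond)                                               \
  do {                                                                         \
    if (!(cond))                                                               \
      return -1;                                                               \
  } while (false)

int readChecked(const std::optional<int> &opt) {
  SKETCH_ASSERT_TRUE(opt.has_value());
  // Only reachable when `opt` holds a value, so the dereference is safe.
  return *opt;
}
```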

This PR is heavily inspired by
https://github.com/llvm/llvm-project/pull/170947 .

Many thanks to @fmayer for the prior work.

---------

Co-authored-by: EugeneZelenko <eugene.zelenko at gmail.com>
DeltaFile
+90-0clang/lib/Analysis/FlowSensitive/Models/UncheckedOptionalAccessModel.cpp
+67-0clang/unittests/Analysis/FlowSensitive/UncheckedOptionalAccessModelTest.cpp
+19-0clang-tools-extra/docs/clang-tidy/checks/bugprone/unchecked-optional-access.rst
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+181-04 files

LLVM/project aeff312llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir

[AMDGPU] Remastered 1 test now that TargetOccupancy is clamped.
DeltaFile
+2-2llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+2-21 files

LLVM/project 5a5c317mlir/include/mlir-c/Target ExportSMTLIB.h, mlir/include/mlir/Target/SMTLIB ExportSMTLIB.h

[MLIR][Python] Add optional emit reset to exportSMTLIB (#187366)

Previously, MLIR's Python binding `smt.export_smtlib(...)` always
emitted `(reset)` at the end of the SMT-LIB string as a solver terminator.
This PR adds an option to suppress the trailing `(reset)`, since downstream
users such as the Python z3 module don't need it.
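
The behavior described above can be sketched as follows. `exportCommands`
and its `emitReset` parameter are hypothetical names, and the default of
`true` is an assumption chosen to preserve the old behavior; the actual C
API and Python signatures may differ.

```cpp
#include <string>
#include <vector>

// Hypothetical exporter: joins already-rendered SMT-LIB commands and
// optionally appends a trailing `(reset)`.
std::string exportCommands(const std::vector<std::string> &commands,
                           bool emitReset = true) {
  std::string out;
  for (const std::string &cmd : commands) {
    out += cmd;
    out += '\n';
  }
  if (emitReset)
    out += "(reset)\n"; // legacy terminator, now opt-out
  return out;
}
```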
DeltaFile
+12-14mlir/lib/CAPI/Target/ExportSMTLIB.cpp
+9-8mlir/lib/Bindings/Python/DialectSMT.cpp
+8-3mlir/test/CAPI/smt.c
+4-4mlir/include/mlir-c/Target/ExportSMTLIB.h
+2-1mlir/lib/Target/SMTLIB/ExportSMTLIB.cpp
+2-0mlir/include/mlir/Target/SMTLIB/ExportSMTLIB.h
+37-306 files

LLVM/project 360fab6llvm/lib/Target/RISCV RISCVSchedSiFive7.td RISCVInstrPredicates.td, llvm/test/tools/llvm-mca/RISCV/Inputs mul-div-rv32.s

[RISCV] Fix IDiv/IRem scheduling data for RV32 cores that use the SiFive7 model (#187331)

The integer division and remainder instructions on a 32-bit core that
uses the SiFive7 scheduling model should have the same latency and
throughput as their word counterparts on a 64-bit SiFive7 core.

This patch fixes those scheduling entries by adding a new SchedPred that
predicates on `Feature64Bit` to toggle the SchedVariant that is attached
to the affected integer division / remainder instructions.
DeltaFile
+59-0llvm/test/tools/llvm-mca/RISCV/SiFive7/mul-div-rv32.test
+15-6llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
+10-0llvm/test/tools/llvm-mca/RISCV/Inputs/mul-div-rv32.s
+4-0llvm/lib/Target/RISCV/RISCVInstrPredicates.td
+88-64 files

LLVM/project 2be4a9bllvm/test/Transforms/LoopVectorize predicated-multiple-exits.ll

[LV] Add predicated early-exit tests showing poison prop issue. (NFC)

Add tests showing incorrect poison propagation from
https://github.com/llvm/llvm-project/issues/187061.
DeltaFile
+135-0llvm/test/Transforms/LoopVectorize/predicated-multiple-exits.ll
+135-01 files

LLVM/project d226f1bllvm/test/CodeGen/AMDGPU memory-legalizer-flat-singlethread.ll memory-legalizer-private-workgroup.ll

[AMDGPU] Regenerate codegen tests to check extra stuff at end of line (#187325)

Regenerate checks after two recent commits that caused extra stuff to be
added at the end of assembly lines, so the existing checks did not fail.

- #179414 added "nv" to loads and stores on GFX1250.
- #185774 added "msbs" comments on setreg instructions.
DeltaFile
+336-336llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-singlethread.ll
+336-336llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+336-336llvm/test/CodeGen/AMDGPU/memory-legalizer-private-wavefront.ll
+336-336llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+336-336llvm/test/CodeGen/AMDGPU/memory-legalizer-local-workgroup.ll
+336-336llvm/test/CodeGen/AMDGPU/memory-legalizer-local-wavefront.ll
+2,016-2,016150 files not shown
+12,910-12,686156 files

LLVM/project 6c371eelibc/src/__support/annex_k libc_constraint_handler.h

fix format
DeltaFile
+1-1libc/src/__support/annex_k/libc_constraint_handler.h
+1-11 files

LLVM/project b39cdc0libc/config/linux/aarch64 entrypoints.txt, libc/config/linux/x86_64 entrypoints.txt

[libc][stdlib][annex_k] Add ignore_handler_s.
DeltaFile
+22-0libc/src/stdlib/ignore_handler_s.h
+16-0libc/src/stdlib/ignore_handler_s.cpp
+13-0libc/src/stdlib/CMakeLists.txt
+9-0libc/include/stdlib.yaml
+2-1libc/config/linux/x86_64/entrypoints.txt
+1-0libc/config/linux/aarch64/entrypoints.txt
+63-12 files not shown
+65-18 files

LLVM/project d3328e6libc/config/linux/aarch64 entrypoints.txt, libc/config/linux/x86_64 entrypoints.txt

[libc][stdlib][annex_k] Add set_constraint_handler_s.
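
For readers unfamiliar with C11 Annex K, the following is a rough sketch of
how the handler machinery fits together. Everything lives in a hypothetical
`sketch` namespace and is not the LLVM libc code. Using the abort handler as
the default is an assumption (C11 leaves the default
implementation-defined), and the sketch is not thread-safe.

```cpp
#include <cstdio>
#include <cstdlib>

namespace sketch {
using errno_t = int;
using constraint_handler_t = void (*)(const char *msg, void *ptr,
                                      errno_t error);

// Does nothing, so the function that detected the violation simply returns
// its error code to the caller.
void ignore_handler(const char *, void *, errno_t) {}

// Reports the violation on stderr and aborts the program.
void abort_handler(const char *msg, void *, errno_t) {
  std::fprintf(stderr, "constraint violation: %s\n", msg ? msg : "(null)");
  std::abort();
}

// Storage for the currently installed handler (assumed default: abort).
constraint_handler_t &current_handler() {
  static constraint_handler_t handler = abort_handler;
  return handler;
}

// Installs `handler` and returns the previously installed one. A null
// argument restores the default handler.
constraint_handler_t set_constraint_handler(constraint_handler_t handler) {
  constraint_handler_t previous = current_handler();
  current_handler() = handler ? handler : abort_handler;
  return previous;
}
} // namespace sketch
```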
DeltaFile
+28-0libc/src/stdlib/set_constraint_handler_s.cpp
+21-0libc/src/stdlib/set_constraint_handler_s.h
+11-0libc/src/stdlib/CMakeLists.txt
+7-0libc/include/stdlib.yaml
+1-0libc/config/linux/x86_64/entrypoints.txt
+1-0libc/config/linux/aarch64/entrypoints.txt
+69-01 files not shown
+70-07 files

LLVM/project db0a7balibc/include stdlib.yaml, libc/src/__support/annex_k abort_handler_s.h CMakeLists.txt

[libc][annex_k] Add abort_handler_s.
DeltaFile
+43-0libc/src/__support/annex_k/abort_handler_s.h
+22-0libc/src/stdlib/abort_handler_s.h
+20-0libc/src/stdlib/abort_handler_s.cpp
+14-2libc/include/stdlib.yaml
+12-0libc/src/__support/annex_k/CMakeLists.txt
+10-0libc/src/stdlib/CMakeLists.txt
+121-25 files not shown
+126-211 files

LLVM/project 34310ablibc/src/__support/annex_k constraint_macros.h CMakeLists.txt

[libc][annex_k] Add libc_constraint_handler macros.
DeltaFile
+44-0libc/src/__support/annex_k/constraint_macros.h
+9-0libc/src/__support/annex_k/CMakeLists.txt
+53-02 files

LLVM/project 4b6a61dlibc/src/__support/annex_k libc_constraint_handler.h CMakeLists.txt

[libc][annex_k] Add libc_constraint_handler.
DeltaFile
+26-0libc/src/__support/annex_k/libc_constraint_handler.h
+9-0libc/src/__support/annex_k/CMakeLists.txt
+35-02 files

LLVM/project d1f6cb9libc/include/llvm-libc-types CMakeLists.txt

change location
DeltaFile
+1-2libc/include/llvm-libc-types/CMakeLists.txt
+1-21 files

LLVM/project 96299d8flang-rt/test lit.site.cfg.py.in, flang-rt/test/Driver safe-trampoline-gnustack.f90

[flang] Disable trampoline test for PPC (NFC) (#187194)
DeltaFile
+1-0flang-rt/test/lit.site.cfg.py.in
+1-0flang-rt/test/Driver/safe-trampoline-gnustack.f90
+2-02 files

LLVM/project af23906libc/hdr/types constraint_handler_t.h CMakeLists.txt, libc/include CMakeLists.txt stdlib.yaml

[libc][annex_k] Add constraint_handler_t.
DeltaFile
+21-0libc/include/llvm-libc-types/constraint_handler_t.h
+18-0libc/hdr/types/constraint_handler_t.h
+9-0libc/hdr/types/CMakeLists.txt
+2-0libc/include/llvm-libc-types/CMakeLists.txt
+1-0libc/include/CMakeLists.txt
+1-0libc/include/stdlib.yaml
+52-06 files

LLVM/project d4b86e5llvm/include/llvm/Analysis IVUsers.h, llvm/lib/Transforms/Scalar LoopStrengthReduce.cpp

[LSR] skip ephemeral IV users when collecting IV chains (#187282)

IVUsers records ephemeral values used only by `llvm.assume` as IV
operands in the Processed set. As a result, `CollectChains` picks them
up and builds unnecessary increment chains. Fix this by checking
`IVUsers::isEphemeral` before collecting the chains.

Fixes #187270
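
The shape of the fix can be sketched with plain containers. The function
name and the use of strings for IR values are hypothetical stand-ins; the
real code works on `Instruction` pointers and queries `IVUsers::isEphemeral`.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model: `processed` holds every IV user that was recorded,
// `ephemeral` holds the ones that only feed llvm.assume.
std::vector<std::string>
collectChainCandidates(const std::vector<std::string> &processed,
                       const std::unordered_set<std::string> &ephemeral) {
  std::vector<std::string> candidates;
  for (const std::string &user : processed) {
    if (ephemeral.count(user))
      continue; // ephemeral users should not seed an increment chain
    candidates.push_back(user);
  }
  return candidates;
}
```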
DeltaFile
+67-0llvm/test/Transforms/LoopStrengthReduce/X86/iv-chain-assume-ephemeral.ll
+4-0llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
+2-0llvm/include/llvm/Analysis/IVUsers.h
+73-03 files

LLVM/project 170b87dlibc/hdr/types errno_t.h CMakeLists.txt, libc/include/llvm-libc-types CMakeLists.txt errno_t.h

apply suggestions from code review
DeltaFile
+14-2libc/hdr/types/errno_t.h
+1-1libc/include/llvm-libc-types/CMakeLists.txt
+2-0libc/include/llvm-libc-types/errno_t.h
+0-1libc/hdr/types/CMakeLists.txt
+17-44 files

LLVM/project 403ae32libc/hdr/types rsize_t.h CMakeLists.txt, libc/include/llvm-libc-types rsize_t.h CMakeLists.txt

[libc][annex_k] Add rsize_t.
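
As background, C11 Annex K defines `rsize_t` as the same type as `size_t`
and has its functions treat sizes above `RSIZE_MAX` as constraint
violations. A minimal sketch follows, using a hypothetical `sketch`
namespace and assuming the limit is `SIZE_MAX >> 1`, which is one of the
values the standard suggests; the value LLVM libc picks may differ.

```cpp
#include <cstddef>
#include <cstdint>

namespace sketch {
using rsize_t = std::size_t;

// Assumed limit for illustration only.
constexpr rsize_t kRsizeMax = SIZE_MAX >> 1;

// Sizes above the limit usually come from a negative value converted to an
// unsigned type, so they are rejected rather than used.
constexpr bool isValidRsize(rsize_t n) { return n <= kRsizeMax; }
} // namespace sketch
```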
DeltaFile
+23-0libc/hdr/types/rsize_t.h
+18-0libc/include/llvm-libc-types/rsize_t.h
+8-0libc/hdr/types/CMakeLists.txt
+1-0libc/include/llvm-libc-types/CMakeLists.txt
+50-04 files

LLVM/project 43cec7bmlir/tools/mlir-tblgen OpPythonBindingGen.cpp EnumPythonBindingGen.cpp

refactor to use external storage
DeltaFile
+9-6mlir/tools/mlir-tblgen/OpPythonBindingGen.cpp
+2-2mlir/tools/mlir-tblgen/EnumPythonBindingGen.cpp
+11-82 files

LLVM/project c630b09clang/lib/CIR/CodeGen CIRGenExprScalar.cpp

[CIR][NFC] Remove NYI checks in ternary with cleanup (#186870)

We added those checks to emit an error message when CleanupScopeOp is
used in this edge case, until it was fixed. It has now been fixed, so
we don't need to keep the NYI checks.
DeltaFile
+0-8clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+0-81 files

LLVM/project 50fcaffllvm/lib/Target/AMDGPU GCNSubtarget.h GCNSchedStrategy.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

[AMDGPU] Updated getMaxNumAGPRs to use getMaxNumVectorRegs.

Removed the other variants of getMaxNumAGPRs, so with this patch there
is only one way to get the maximum number of AGPRs. If the client
provides a target occupancy, that value is used. Otherwise, the
function-level attributes for waves-per-eu are used. In both
cases, the utility uses getMaxNumVectorRegs.
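
The selection between the two occupancy sources can be sketched as below.
`effectiveOccupancy` and its parameters are hypothetical names; the sketch
only shows the "explicit value wins, otherwise fall back to the attribute"
rule, not how the register limit itself is computed.

```cpp
#include <optional>

// Hypothetical helper: choose the occupancy that the register-limit query
// should be based on.
unsigned effectiveOccupancy(std::optional<unsigned> targetOccupancy,
                            unsigned wavesPerEUFromAttr) {
  // A client-provided target takes precedence over the function attribute.
  return targetOccupancy.value_or(wavesPerEUFromAttr);
}
```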
DeltaFile
+330-330llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+0-19llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+4-3llvm/lib/Target/AMDGPU/GCNSubtarget.h
+0-7llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+2-2llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+2-2llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+338-3636 files

LLVM/project 55719b6llvm/test/CodeGen/AMDGPU llvm.amdgcn.mfma.ll llvm.amdgcn.sched.group.barrier.ll

[AMDGPU] Adds AGPR pressure during candidate init in GCN scheduler.

Scheduling heuristics will now automatically consider AGPR pressure.
AGPRExcessLimit and AGPRCriticalLimit are added. Some of the VGPR
bias and error limits are reused. The added helpers mostly mirror the
existing VGPR logic. A ConsiderAGPR boolean controls whether AGPRs
should be factored in at all during candidate initialization, e.g.
on targets with allocatable AGPRs.

Verified that updated LIT tests use AGPRs.

Originally Authored-by: Nicholas Baron
(https://github.com/llvm/llvm-project/pull/150288)

Modified-by: Dhruva Chakrabarti

Assisted-by: Cursor
DeltaFile
+546-687llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.ll
+390-374llvm/test/CodeGen/AMDGPU/llvm.amdgcn.sched.group.barrier.ll
+337-388llvm/test/CodeGen/AMDGPU/mfma-cd-select.ll
+181-181llvm/test/CodeGen/AMDGPU/agpr-csr.ll
+120-112llvm/test/CodeGen/AMDGPU/mfma-no-register-aliasing.ll
+69-72llvm/test/CodeGen/AMDGPU/spill-agpr.ll
+1,643-1,8149 files not shown
+1,850-1,95915 files

LLVM/project 03f488allvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp, llvm/test/MC/AArch64 global-tagging.ll

[AsmPrinter][MTE] Support memtag-globals for all AArch64 targets (#187065)

This change ensures that all AArch64 targets can use memtag globals, not
only Android.
DeltaFile
+7-3llvm/test/MC/AArch64/global-tagging.ll
+2-2llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+9-52 files

LLVM/project f441b9dllvm/lib/Target/AMDGPU GCNSubtarget.cpp GCNSubtarget.h

[AMDGPU][NFC] If outside range, clamp target occupancy to nearest endpoint.
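
Clamping to the nearest endpoint can be sketched with `std::clamp` (C++17).
`clampOccupancy` and its parameter names are hypothetical, and the sketch
assumes `minOcc <= maxOcc`, which `std::clamp` requires.

```cpp
#include <algorithm>

// Hypothetical helper: values below the range map to `minOcc`, values above
// it map to `maxOcc`, and in-range values are returned unchanged.
unsigned clampOccupancy(unsigned target, unsigned minOcc, unsigned maxOcc) {
  return std::clamp(target, minOcc, maxOcc);
}
```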
DeltaFile
+9-5llvm/lib/Target/AMDGPU/GCNSubtarget.cpp
+4-1llvm/lib/Target/AMDGPU/GCNSubtarget.h
+13-62 files

LLVM/project da47edellvm/include/llvm/CodeGen RegisterScavenging.h, llvm/lib/Target/AArch64 AArch64FrameLowering.cpp

[AArch64] Fix register scavenger crash when merging MTE stack tags (#186934)

When `-fsanitize=memtag-stack` is enabled, `TagStoreEdit::emitLoop`
optimizes contiguous ST2Gi instructions into an STGloop. Because this
runs during PEI (post-register allocation), it spawns two new virtual
registers: BaseReg and SizeReg.

Under high register pressure (e.g., Swift async continuation thunks
where almost all registers are kept live), the Register Scavenger must
rely on emergency spill slots to assign physical registers to BaseReg
and SizeReg.

Previously, the compiler assumed at most one emergency spill slot was
needed. If PEI found an unused Callee-Saved Register (`ExtraCSSpill`),
it bypassed allocating an emergency slot entirely. If no CSRs were free,
it allocated exactly one slot. Because STGloop requires TWO scratch
locations, the scavenger would crash trying to fulfill the second
allocation.


    [11 lines not shown]
DeltaFile
+551-0llvm/test/CodeGen/AArch64/memtag-stg-loop-reg-pressure.mir
+28-0llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+2-0llvm/include/llvm/CodeGen/RegisterScavenging.h
+581-03 files

LLVM/project 7f1be73llvm/include/llvm/Remarks RemarkStreamer.h, llvm/lib/CodeGen/AsmPrinter AsmPrinter.cpp

[spr] initial version

Created using spr 1.3.8-wip
DeltaFile
+5-6llvm/lib/Remarks/RemarkStreamer.cpp
+5-5llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+3-3llvm/test/CodeGen/X86/remarks-section.ll
+3-1llvm/include/llvm/Remarks/RemarkStreamer.h
+16-154 files

LLVM/project 003a819llvm/lib/CodeGen MachineCopyPropagation.cpp, llvm/test/CodeGen/X86 machine-copy-prop.mir

[MCP] Never eliminate frame-setup/destroy instructions

Presumably targets only insert frame instructions which are significant,
and there may be effects MCP doesn't model. Similar to reserved registers,
this is probably overly conservative, but as this causes no codegen change
in any lit test I think it is benign.

The motivation is just to clean up #183149 for AMDGPU, as we can spill
to physical registers, and currently have to spill the EXEC mask purely
to enable debug-info.

Change-Id: I9ea4a09b34464c43322edd2900361bf635efd9f7
DeltaFile
+22-0llvm/test/CodeGen/X86/machine-copy-prop.mir
+2-2llvm/lib/CodeGen/MachineCopyPropagation.cpp
+24-22 files

LLVM/project d597e9allvm/lib/CodeGen MachineCopyPropagation.cpp

[MCP][NFC] Opinionated refactoring

There are a few minor inconsistencies across the pass which I found mildly
distracting:

* The use of `Def`/`Dest`/`Dst` to refer to the same thing
* Inconsistent declaration order of `Dst`/`Src` vs `Src`/`Dst`
* Lots of `->getReg()->asMCReg()`, and uses of `Register` when the pass
  is always running after RA anyway.
* Some places explicitly `assert(isCopyInstr)` while others just deref
  the `optional`.

Standardize on `Dst`/`Src` to match the metaphor and ordering of
`DestSourcePair`.

Assume `std::optional::operator*` will assert in any reasonable
implementation, even though this may technically be undefined behavior.
With asserts disabled, it would be undefined behavior anyway.
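
For contrast, the only checked access that the C++ standard guarantees is
`value()`, which throws `std::bad_optional_access` on an empty optional.
Dereferencing an empty optional with `operator*` has no such guarantee. The
sketch below uses a hypothetical function name and is unrelated to the pass
itself.

```cpp
#include <optional>

// Returns true if accessing an empty optional through value() throws.
bool valueThrowsOnEmpty() {
  std::optional<int> empty;
  try {
    (void)empty.value(); // checked access: throws when there is no value
  } catch (const std::bad_optional_access &) {
    return true;
  }
  return false;
}
```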


    [11 lines not shown]
DeltaFile
+164-195llvm/lib/CodeGen/MachineCopyPropagation.cpp
+164-1951 files