LLVM/project e6f0dfbllvm/lib/Target/AMDGPU/Disassembler CMakeLists.txt

AMDGPU: Add TargetParser to disassembler dependencies

Should fix build failure after #203979, but should be reverted
in #204150
DeltaFile
+1-0llvm/lib/Target/AMDGPU/Disassembler/CMakeLists.txt
+1-01 files

LLVM/project 0b0e0b4lldb/test/API/commands/target/create-deps TestTargetCreateDeps.py, lldb/test/API/functionalities/always-run-threads TestAlwaysRunThreadNames.py

[lldb][test] Skip unsupported tests on WebAssembly (#204245)

Mark more tests that rely on features unavailable on wasm32-wasip1 (or
in LLDB's Wasm support): expression evaluation (skipIfWasm), shared
libraries (skipIfTargetDoesNotSupportSharedLibraries), threads
(skipIfTargetDoesNotSupportThreads), and llvm-strip --keep-symbol, which
the Wasm object format doesn't support. Where a test also has supported,
passing cases, the decorator is applied per method.

The "expression" category is already skipped for Wasm, but that only
covers commands/expression/*, where the category is set by a
"categories" file. The tests in this PR live elsewhere and merely use
expression evaluation incidentally, so they aren't in that category and
need skipIfWasm directly.
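
As a rough illustration of decorating a single method rather than a whole class, the sketch below uses the standard-library `unittest.skipIf` as a stand-in for decorators such as skipIfWasm. `TARGET_IS_WASM`, `skip_if_wasm`, and the test class are hypothetical names; the real decorators live in lldbsuite and are not reproduced here.

```python
import unittest

# Assumption: hypothetical flag standing in for "the target is wasm32-wasip1".
TARGET_IS_WASM = True

# Stand-in for a decorator such as skipIfWasm, built on the standard library.
skip_if_wasm = unittest.skipIf(TARGET_IS_WASM, "needs expression evaluation")


class StructTypesSketch(unittest.TestCase):
    def test_breakpoint_only(self):
        # No expression evaluation involved, so this case still runs on Wasm.
        self.assertEqual(1 + 1, 2)

    @skip_if_wasm
    def test_uses_expression(self):
        # Only this method is decorated; the rest of the class keeps running.
        self.fail("would evaluate an expression here")


def run_sketch():
    """Run the sketch suite and return (skipped count, failure count)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StructTypesSketch)
    result = unittest.TestResult()
    suite.run(result)
    return len(result.skipped), len(result.failures)
```

With the flag set, `run_sketch()` reports one skipped test and no failures, which mirrors the intent of per-method decoration: the supported case keeps its coverage.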
DeltaFile
+2-1lldb/test/API/lang/c/register_variables/test.c
+1-1lldb/test/API/lang/c/struct_types/TestStructTypes.py
+2-0lldb/test/API/python_api/unnamed_symbol_lookup/TestUnnamedSymbolLookup.py
+1-0lldb/test/API/tools/lldb-dap/module/TestDAP_module.py
+1-0lldb/test/API/commands/target/create-deps/TestTargetCreateDeps.py
+1-0lldb/test/API/functionalities/always-run-threads/TestAlwaysRunThreadNames.py
+8-210 files not shown
+18-216 files

LLVM/project e443271clang/lib/Driver/ToolChains Clang.cpp, clang/test/Driver objc-constant-literals.m

[Driver] Re-enable -fobjc-constant-literals by default (#204208)

This reverts 4d154f6ea5eb ([Driver] Disable -fobjc-constant-literals by
default (#195000)), which was a temporary measure to unblock a project
that the original constant-literal change (#185130) broke.

For background on the feature and the discussion that led to disabling
and then re-enabling it, see
https://github.com/llvm/llvm-project/pull/185130.

rdar://179823193
DeltaFile
+4-4clang/test/Driver/objc-constant-literals.m
+1-1clang/lib/Driver/ToolChains/Clang.cpp
+5-52 files

LLVM/project 790dee3clang/test/Analysis/Scalable/PointerFlow multi-dim-pointer-flow-constraint.test

[SSAF][WPA] Add a lit test for the WPA improvement of #198889 (#204018)

This commit adds a lit test, which is an example of the issue solved by
#198889 and was discovered independently when applying the analysis to a
real project.

rdar://179754164
DeltaFile
+41-0clang/test/Analysis/Scalable/PointerFlow/multi-dim-pointer-flow-constraint.test
+41-01 files

LLVM/project 4f8ee48llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

rebase

Created using spr 1.3.8-beta.1
DeltaFile
+38,494-84,026llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388-22,086llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087-24,391llvm/test/CodeGen/RISCV/clmul.ll
+10,473-12,572llvm/test/CodeGen/RISCV/clmulr.ll
+10,281-12,374llvm/test/CodeGen/RISCV/clmulh.ll
+8,361-8,920llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084-164,3697,614 files not shown
+712,306-458,3407,620 files

LLVM/project 6d4d7eallvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
DeltaFile
+38,494-84,026llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388-22,086llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087-24,391llvm/test/CodeGen/RISCV/clmul.ll
+10,473-12,572llvm/test/CodeGen/RISCV/clmulr.ll
+10,281-12,374llvm/test/CodeGen/RISCV/clmulh.ll
+8,361-8,920llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084-164,3697,614 files not shown
+712,311-458,3457,620 files

LLVM/project 107e314llvm/test/CodeGen/AMDGPU llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll, llvm/test/CodeGen/X86 fptosi-sat-vector-512.ll fptoui-sat-vector-512.ll

rebase after factoring out cleanup commit

Created using spr 1.3.8-beta.1
DeltaFile
+7,323-0llvm/test/CodeGen/X86/fptosi-sat-vector-512.ll
+6,132-0llvm/test/CodeGen/X86/fptoui-sat-vector-512.ll
+5,788-1llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll
+4,289-1,259llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+3,840-1,215llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,473-0llvm/test/CodeGen/X86/fptosi-sat-vector-256.ll
+30,845-2,475681 files not shown
+57,240-7,629687 files

LLVM/project 9230b21llvm/test/CodeGen/AMDGPU llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll, llvm/test/CodeGen/X86 fptosi-sat-vector-512.ll fptoui-sat-vector-512.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
DeltaFile
+7,323-0llvm/test/CodeGen/X86/fptosi-sat-vector-512.ll
+6,132-0llvm/test/CodeGen/X86/fptoui-sat-vector-512.ll
+5,788-1llvm/test/CodeGen/AMDGPU/llvm.amdgcn.mfma.scale.f32.32x32x64.f8f6f4.ll
+4,289-1,259llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+3,840-1,215llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+3,473-0llvm/test/CodeGen/X86/fptosi-sat-vector-256.ll
+30,845-2,475689 files not shown
+57,307-7,694695 files

LLVM/project 9006a2cllvm/lib/Target/RISCV RISCVRegisterInfo.td, llvm/test/MC/RISCV rv32c-invalid.s rv64c-invalid.s

[RISC-V][MC] Improve the diagnostic for invalid compressed register number

Instead of a generic `invalid operand for instruction`, print
`register must be a GPR from x8 to x15`.
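
To make the difference between the two diagnostics concrete, here is a minimal sketch of a range check that picks the more specific message. It is not how the assembler implements it (the actual change is a one-line TableGen edit); `check_compressed_reg` is a hypothetical helper, and it only understands `xN` names, not ABI names such as `s0`.

```python
import re
from typing import Optional

# Compressed (RVC) encodings with a 3-bit register field address x8..x15.
RVC_FIRST, RVC_LAST = 8, 15


def check_compressed_reg(operand: str) -> Optional[str]:
    """Return a diagnostic for a rejected operand, or None if it is accepted."""
    match = re.fullmatch(r"x(\d+)", operand)
    if match is None or int(match.group(1)) > 31:
        # Not a GPR at all: only the generic message applies.
        return "invalid operand for instruction"
    if not RVC_FIRST <= int(match.group(1)) <= RVC_LAST:
        # A real GPR, but outside the range the compressed field can encode.
        return "register must be a GPR from x8 to x15"
    return None
```

For example, `check_compressed_reg("x5")` yields the specific message, while `check_compressed_reg("x9")` returns `None`.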

Pull Request: https://github.com/llvm/llvm-project/pull/204237
DeltaFile
+12-12llvm/test/MC/RISCV/rv32c-invalid.s
+4-4llvm/test/MC/RISCV/rv64c-invalid.s
+2-2llvm/test/MC/RISCV/rvc-hints-invalid.s
+2-2llvm/test/MC/RISCV/xqcibm-invalid.s
+1-0llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+21-205 files

LLVM/project 6b52ab2llvm/lib/Target/RISCV RISCVInstrInfoC.td RISCVInstrInfoXqci.td

[RISC-V] Rename GPRCMem operand to BasePtrC. NFC

This is in preparation for https://github.com/llvm/llvm-project/pull/177073
where these operands can refer to either a GPR or YGPR depending on the
current HwMode.

Pull Request: https://github.com/llvm/llvm-project/pull/204241
DeltaFile
+30-30llvm/lib/Target/RISCV/RISCVInstrInfoC.td
+25-25llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+18-18llvm/lib/Target/RISCV/RISCVInstrInfoZc.td
+16-16llvm/lib/Target/RISCV/RISCVInstrInfoXwch.td
+6-6llvm/lib/Target/RISCV/RISCVInstrInfoZclsd.td
+1-1llvm/lib/Target/RISCV/RISCVInstrInfo.td
+96-966 files

LLVM/project 6e8c2dcclang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat TUSummary.cpp LUSummaryEncoding.cpp, clang/unittests/ScalableStaticAnalysisFramework EntityLinkerTest.cpp

Revert "Reland "[clang][ssaf] Track target triple in TU and LU summaries"" (#204236)

Reverts llvm/llvm-project#204218

Fails amdgpu buildbots
DeltaFile
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/TUSummary.cpp
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/LUSummaryEncoding.cpp
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/LUSummary.cpp
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/TUSummaryEncoding.cpp
+9-12clang/unittests/ScalableStaticAnalysisFramework/EntityLinkerTest.cpp
+0-16clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/JSONFormatImpl.cpp
+13-124170 files not shown
+175-566176 files

LLVM/project 3aea0a0llvm/utils/gn/secondary/clang/lib/CodeGen BUILD.gn

[gn build] Port 930a46d23799 (#204244)
DeltaFile
+1-0llvm/utils/gn/secondary/clang/lib/CodeGen/BUILD.gn
+1-01 files

LLVM/project 4fd2a5cllvm/lib/BinaryFormat Magic.cpp, llvm/unittests/BinaryFormat TestFileMagic.cpp

[BinaryFormat] Fix UBSan negative-shift in identify_magic Mach-O path (#204122)

`identify_magic()` read the Mach-O filetype field at offset 12 by
left-shifting four bytes of the input StringRef:
```
    type = Magic[12] << 24 | Magic[13] << 12 | Magic[14] << 8 | Magic[15];
```
`StringRef::operator[]` returns a `char`, which is signed on many targets.
When a header byte has its high bit set (e.g. 0xFA == -6 after promotion to
int), the expression "-6 << 24" is undefined behavior; even where `char` is
unsigned, a byte like 0xFF promotes to an int whose "<< 24" overflows a
signed int. UBSan trapped this on a crafted input found by
lldb-target-fuzzer:

```
    Magic.cpp:177:26: runtime error: left shift of negative value -6
    SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior
    libFuzzer: deadly signal
```

    [26 lines not shown]
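
The sign-extension problem can be modelled outside C++. The sketch below is an illustration only: Python integers have arbitrary precision, so it cannot reproduce the undefined behavior itself, and since the rest of the message is elided it does not claim to be the commit's fix. `as_signed_char` and `read_be32_unsigned` are hypothetical helpers showing the negative operand and one common remedy, combining the bytes as unsigned values.

```python
def as_signed_char(byte: int) -> int:
    """Model promoting a byte through a signed 8-bit char to int."""
    return byte - 256 if byte >= 0x80 else byte


def read_be32_unsigned(data: bytes, offset: int) -> int:
    """Combine four bytes as unsigned values, most significant first."""
    value = 0
    for byte in data[offset:offset + 4]:
        value = (value << 8) | byte  # bytes objects yield ints in 0..255
    return value


# Twelve padding bytes, then a four-byte field whose first byte is 0xFA.
header = bytes(12) + bytes([0xFA, 0x00, 0x00, 0x02])

promoted = as_signed_char(header[12])      # the negative operand from the log
filetype = read_be32_unsigned(header, 12)  # no negative operand is involved
```

Here `promoted` is -6, matching the value in the UBSan report, while `filetype` is 0xFA000002.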
DeltaFile
+2-2llvm/lib/BinaryFormat/Magic.cpp
+4-0llvm/unittests/BinaryFormat/TestFileMagic.cpp
+6-22 files

LLVM/project 1915af8llvm/lib/Target/AMDGPU/Disassembler AMDGPUDisassembler.cpp, llvm/test/MC/AMDGPU amdgcn_target_directive_from_eflags.s

AMDGPU: Teach disassembler to produce target id directives (#203979)

Inspect the binary's e_flags to reproduce the .amdgcn_target directive.
This is a step towards round-trip disassembly without depending
on command line state specifying the subtarget. I wasn't sure
where to put the emission to ensure it is always emitted. I
also do not know why it's OK to just write to outs(), but that's
what the other directives here were doing.

Co-Authored-By: Claude Opus 4.6 <noreply at anthropic.com>
DeltaFile
+72-0llvm/test/MC/AMDGPU/amdgcn_target_directive_from_eflags.s
+53-0llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+4-4llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s
+4-4llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s
+3-3llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-vgpr.s
+3-3llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx90a.s
+139-1410 files not shown
+161-2616 files

LLVM/project c8d735dlibcxx/include/__numeric pstl.h transform_reduce.h, libcxx/test/libcxx/numerics nodiscard.verify.cpp

[libc++][numeric] Applied `[[nodiscard]]` to `<numeric>` (#202770)

https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant

Applied `[[nodiscard]]` to the remaining functions.

Towards: https://github.com/llvm/llvm-project/issues/172124

This PR also eliminates
libcxx/test/libcxx/numerics/nodiscard.verify.cpp, which was misplaced
and contained unrelated tests. These tests were moved to where they
belong. This refactoring is included in this PR as the change is also
related to `<numeric>`.

Reference:
- https://eel.is/c++draft/numeric.ops.overview
- https://eel.is/c++draft/numeric.ops

---------

Co-authored-by: Hristo Hristov <zingam at outlook.com>
DeltaFile
+131-0libcxx/test/libcxx/numerics/numeric.ops/nodiscard.verify.cpp
+0-48libcxx/test/libcxx/numerics/nodiscard.verify.cpp
+6-6libcxx/include/__numeric/pstl.h
+4-4libcxx/include/__numeric/transform_reduce.h
+4-4libcxx/include/__numeric/reduce.h
+2-2libcxx/include/__numeric/accumulate.h
+147-643 files not shown
+155-689 files

LLVM/project 0677ebellvm/lib/Transforms/Vectorize VPlanRecipes.cpp

[VPlan] Use getSingleUser to improve code (NFC) (#203882)
DeltaFile
+3-7llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+3-71 files

LLVM/project 94f6b80llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp SIISelLowering.cpp

[AMDGPU] Guard more intrinsics with target features
DeltaFile
+1-51llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+0-42llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0-24llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+15-2llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-store.ll
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-load.ll
+24-12712 files not shown
+45-14318 files

LLVM/project 53695bbllvm/lib/Target/RISCV RISCVInstrInfoC.td RISCVInstrInfoXqci.td

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
DeltaFile
+30-30llvm/lib/Target/RISCV/RISCVInstrInfoC.td
+25-25llvm/lib/Target/RISCV/RISCVInstrInfoXqci.td
+18-18llvm/lib/Target/RISCV/RISCVInstrInfoZc.td
+16-16llvm/lib/Target/RISCV/RISCVInstrInfoXwch.td
+6-6llvm/lib/Target/RISCV/RISCVInstrInfoZclsd.td
+1-1llvm/lib/Target/RISCV/RISCVInstrInfo.td
+96-966 files

LLVM/project e9faee6lldb/packages/Python/lldbsuite/test dotest.py

[lldb][test] Skip watchpoint and expression tests on WebAssembly (#204235)

WebAssembly has no watchpoint support (Process/wasm reports no
watchpoints; the stop reason comes back as a plain signal) and cannot
JIT or interpret expressions (ProcessWasm sets CanJIT to false). Teach
the existing per-platform category checks about wasm so the whole
"watchpoint" and "expression" categories are skipped, rather than
decorating each test individually.
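
A minimal sketch of the category-level approach, assuming a hypothetical lookup table rather than the actual checks in dotest.py: once a platform is known not to support a category, every test in that category is skipped without touching the test files.

```python
# Assumption: hypothetical table of categories a platform cannot run.
UNSUPPORTED_CATEGORIES = {
    "wasm": {
        "watchpoint": "watchpoints are not supported on WebAssembly",
        "expression": "expressions cannot be JITed or interpreted",
    },
}


def categories_to_skip(platform: str) -> dict:
    """Return {category: reason} for every category the platform cannot run."""
    return dict(UNSUPPORTED_CATEGORIES.get(platform, {}))


def should_run(test_categories: set, platform: str) -> bool:
    """A test runs only if none of its categories is skipped for the platform."""
    return not (test_categories & set(categories_to_skip(platform)))
```

A test tagged `{"watchpoint"}` is filtered out for `"wasm"` but still runs on a platform with no entry in the table.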
DeltaFile
+15-0lldb/packages/Python/lldbsuite/test/dotest.py
+15-01 files

LLVM/project 1190787clang/lib/CodeGen CodeGenAction.cpp, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[RFC][CodeGen] Add generic target feature checks for intrinsics

This PR adds target-independent infrastructure for annotating LLVM intrinsics
with required subtarget feature expressions.

It introduces a TargetFeatures string field to intrinsic TableGen records.
TableGen emits an intrinsic-to-feature mapping table.

Both SelectionDAG and GlobalISel now perform this check before lowering target
intrinsics. This allows targets to opt in by annotating intrinsic definitions
directly, rather than adding custom checks during lowering, legalization, or
instruction selection.

This PR uses one AMDGPU intrinsic as an example.
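
The idea can be sketched as a table lookup that runs before lowering. Everything below is an assumption for illustration: the intrinsic names are made up, and the expression syntax (a comma-separated list of features that must all be present) is a simplification, since the message does not spell out the real syntax.

```python
# Assumption: hypothetical stand-in for the TableGen-emitted mapping table.
INTRINSIC_FEATURES = {
    "llvm.example.fancy.load": "feature-a,feature-b",
    "llvm.example.plain.add": "",
}


def missing_features(intrinsic: str, subtarget_features: set) -> list:
    """Return the required features the subtarget lacks, in sorted order."""
    expr = INTRINSIC_FEATURES.get(intrinsic, "")
    required = {name for name in expr.split(",") if name}
    return sorted(required - subtarget_features)


def check_before_lowering(intrinsic: str, subtarget_features: set) -> None:
    """Raise a diagnostic-like error instead of lowering an unsupported call."""
    missing = missing_features(intrinsic, subtarget_features)
    if missing:
        raise ValueError(
            f"intrinsic {intrinsic} requires target features: {', '.join(missing)}"
        )
```

Because the check is driven by the table, a target opts in by annotating the intrinsic definition; no per-intrinsic code is needed in lowering, legalization, or selection.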
DeltaFile
+96-3llvm/lib/MC/MCSubtargetInfo.cpp
+37-0clang/lib/CodeGen/CodeGenAction.cpp
+36-0llvm/lib/IR/DiagnosticInfo.cpp
+33-1llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp
+28-0llvm/test/TableGen/intrinsic-target-features.td
+25-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+255-414 files not shown
+391-920 files

LLVM/project 7e26ecbllvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp SIISelLowering.cpp

[AMDGPU] Guard more intrinsics with target features
DeltaFile
+1-51llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+0-42llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0-24llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+14-0llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-load.ll
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-store.ll
+23-12512 files not shown
+44-14118 files

LLVM/project b2c9be7clang/lib/CodeGen CodeGenAction.cpp, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[RFC][CodeGen] Add generic target feature checks for intrinsics

This PR adds target-independent infrastructure for annotating LLVM intrinsics
with required subtarget feature expressions.

It introduces a TargetFeatures string field to intrinsic TableGen records.
TableGen emits an intrinsic-to-feature mapping table.

Both SelectionDAG and GlobalISel now perform this check before lowering target
intrinsics. This allows targets to opt in by annotating intrinsic definitions
directly, rather than adding custom checks during lowering, legalization, or
instruction selection.

This PR uses one AMDGPU intrinsic as an example.
DeltaFile
+96-3llvm/lib/MC/MCSubtargetInfo.cpp
+37-0clang/lib/CodeGen/CodeGenAction.cpp
+36-0llvm/lib/IR/DiagnosticInfo.cpp
+33-1llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp
+32-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+28-0llvm/test/TableGen/intrinsic-target-features.td
+262-414 files not shown
+398-920 files

LLVM/project cab48fallvm/lib/Target/RISCV RISCVRegisterInfo.td, llvm/test/MC/RISCV rv32c-invalid.s rv64c-invalid.s

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
DeltaFile
+12-12llvm/test/MC/RISCV/rv32c-invalid.s
+4-4llvm/test/MC/RISCV/rv64c-invalid.s
+2-2llvm/test/MC/RISCV/rvc-hints-invalid.s
+2-2llvm/test/MC/RISCV/xqcibm-invalid.s
+1-0llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+21-205 files

LLVM/project 4033370clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat LUSummaryEncoding.cpp TUSummary.cpp, clang/unittests/ScalableStaticAnalysisFramework EntityLinkerTest.cpp

Revert "Reland "[clang][ssaf] Track target triple in TU and LU summaries" (#2…"

This reverts commit 9434d4ab865319c443826c2eb408329d0011dc71.
DeltaFile
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/LUSummaryEncoding.cpp
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/TUSummary.cpp
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/LUSummary.cpp
+1-24clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/TUSummaryEncoding.cpp
+9-12clang/unittests/ScalableStaticAnalysisFramework/EntityLinkerTest.cpp
+0-16clang/lib/ScalableStaticAnalysisFramework/Core/Serialization/JSONFormat/JSONFormatImpl.cpp
+13-124170 files not shown
+175-566176 files

LLVM/project fee56f1llvm/lib/Target/AMDGPU/AsmParser AMDGPUAsmParser.cpp, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.cpp AMDGPUBaseInfo.h

AMDGPU: Refactor AMDGPUTargetID to not store MCSubtargetInfo

Store the triple string and GPUKind instead. The dependence
on checking AMDHSA seems like an anti-feature, but maintain the
behavior of not printing the modifiers for other OSes. Start
parsing the target ID instead of performing a direct string
comparison. Also improve test coverage for the treatment of the
environment component of the triple. The main behavioral change
is this will now produce normalized triples in the output and
diagnostics. Practically, this means all of the places that
currently emit "--" will be expanded into "-unknown-".
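
A simplified sketch of the visible effect on triples, assuming a hypothetical helper `fill_empty_components`. Real triple normalization in LLVM does more than this (it can also reorder components), and the helper is meant for the triple alone, not for a string with a target ID appended.

```python
def fill_empty_components(triple: str) -> str:
    """Replace empty triple components with "unknown" (simplified sketch)."""
    return "-".join(part or "unknown" for part in triple.split("-"))
```

For instance, `fill_empty_components("amdgcn--amdhsa")` gives `"amdgcn-unknown-amdhsa"`, while a triple with no empty components is returned unchanged.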

Co-Authored-By: Claude Opus 4.6 <noreply at anthropic.com>
DeltaFile
+79-36llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+36-10llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+27-1llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+16-0llvm/test/MC/AMDGPU/amdgcn-target-directive-triple-env.s
+5-5llvm/test/MC/AMDGPU/hsa-diag-v4.s
+4-4llvm/test/MC/AMDGPU/isa-version-pal.s
+167-5615 files not shown
+197-7621 files

LLVM/project 4e5d348llvm/lib/Target/AMDGPU/Disassembler AMDGPUDisassembler.cpp, llvm/test/MC/AMDGPU amdgcn_target_directive_from_eflags.s

AMDGPU: Teach disassembler to produce target id directives

Inspect the binary's e_flags to reproduce the .amdgcn_target directive.
This is a step towards round-trip disassembly without depending
on command line state specifying the subtarget. I wasn't sure
where to put the emission to ensure it is always emitted. I
also do not know why it's OK to just write to outs(), but that's
what the other directives here were doing.

Co-Authored-By: Claude Opus 4.6 <noreply at anthropic.com>
DeltaFile
+72-0llvm/test/MC/AMDGPU/amdgcn_target_directive_from_eflags.s
+53-0llvm/lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp
+4-4llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx10.s
+4-4llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s
+3-3llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-vgpr.s
+3-3llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-sgpr.s
+139-1410 files not shown
+161-2616 files

LLVM/project e8b2205llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU vgpr-excess-threshold-percent.ll vgpr-excess-threshold-percent-invalid.ll

[AMDGPU] Add flag to control VGPR pressure limits (#203797)

The RP trackers don't accurately measure the RA problem, and can
underestimate the number of registers required. Currently, for VGPR
pressure, we account for these inaccuracies using VGPRLimitBias and
ErrorMargin. These are used to reduce the VGPRCriticalLimit /
VGPRExcessLimit. During scheduling, we check RP against these limits,
and if we start to see RP exceeding these limits, we will trigger RP
reduction heuristics (when deciding which instructions to schedule
next). Thus VGPRLimitBias + ErrorMargin effectively reduce the amount of
allowable RP during scheduling, as a means to compensate for RP tracker
inaccuracies. Currently, ErrorMargin is set to 3, and VGPRLimitBias is
set to 0.

However, the degree of inaccuracy tends to scale with the number of
registers we have available for allocation. In other words, the RP
trackers' inaccuracy is better expressed as a percent of the register
budget, rather than some literal value. This PR adds some functionality
to express this inaccuracy compensation as a percent - and exposes a

    [7 lines not shown]
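
The arithmetic behind the argument can be sketched as follows. The values 3 and 0 come from the message above; the percent formula and both function names are assumptions, since the part of the message describing the actual flag is not shown.

```python
ERROR_MARGIN = 3     # value quoted in the commit message
VGPR_LIMIT_BIAS = 0  # value quoted in the commit message


def limit_with_literal_margin(vgpr_limit: int) -> int:
    """Reduce the limit by a fixed register count, whatever the budget size."""
    return max(vgpr_limit - VGPR_LIMIT_BIAS - ERROR_MARGIN, 0)


def limit_with_percent_margin(vgpr_limit: int, percent: int) -> int:
    """Reduce the limit by a share of the budget (hypothetical formula)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return vgpr_limit * (100 - percent) // 100
```

With a literal margin, budgets of 64 and 256 registers both lose 3 registers (61 and 253). With a hypothetical 5 percent margin they become 60 and 243, so the compensation grows with the budget, which is the scaling the message argues for.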
DeltaFile
+194-0llvm/test/CodeGen/AMDGPU/vgpr-excess-threshold-percent.ll
+44-2llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+13-0llvm/test/CodeGen/AMDGPU/vgpr-excess-threshold-percent-invalid.ll
+251-23 files

LLVM/project b3d0487llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp SIInstrInfo.h

[AMDGPU] NFC: Obviously show isVALU includes LDSDMA instructions (#203548)

In https://reviews.llvm.org/D124472 we started labelling LDSDMA as VALU
-- this was due to SPG stating that these instructions act as both
memory + VALU instructions.

This is buried in the isVALU methods - I'd argue that most users without
knowledge of this characteristic would not expect this behavior, and
looking at the implementation of these methods, there is nothing that
would suggest this behavior. This PR forces users to confront this
characteristic and decide if that is what they want to do for their
use case.

I've personally seen at least two bugs in upstream code caused by this,
and have seen it cause problems a dozen or more times in downstream code /
in WIP things.
DeltaFile
+61-49llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+19-7llvm/lib/Target/AMDGPU/SIInstrInfo.h
+11-8llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3-5llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+4-4llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+3-3llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+101-767 files not shown
+114-8713 files

LLVM/project 698648fllvm/include/llvm/Transforms/Vectorize/SandboxVectorizer DependencyGraph.h Scheduler.h, llvm/lib/Transforms/Vectorize/SandboxVectorizer DependencyGraph.cpp Scheduler.cpp

[SandboxVec][DAG] Implement UnscheduledPreds API (#201240)

Mirroring UnscheduledSuccs, this patch adds an UnscheduledPreds DAG node
counter that counts how many predecessors are not scheduled yet.

It also renames the existing ready() to readyBottomUp() to help us
differentiate between the two variants that are now available.
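
A minimal sketch of the two counters on a toy DAG node, not the SandboxVectorizer classes themselves. `ready_bottom_up` mirrors the renamed readyBottomUp(); `ready_top_down` is a hypothetical name for the predecessor-based variant, which the message does not name. Edges are assumed to be added before anything is scheduled.

```python
class Node:
    """Toy DAG node with counters for not-yet-scheduled neighbours."""

    def __init__(self, name: str):
        self.name = name
        self.preds = []
        self.succs = []
        self.scheduled = False
        self.unscheduled_preds = 0
        self.unscheduled_succs = 0

    def add_succ(self, succ: "Node") -> None:
        """Add the edge self -> succ and bump both counters."""
        self.succs.append(succ)
        succ.preds.append(self)
        self.unscheduled_succs += 1
        succ.unscheduled_preds += 1

    def ready_bottom_up(self) -> bool:
        # Bottom-up scheduling: every successor has already been scheduled.
        return self.unscheduled_succs == 0

    def ready_top_down(self) -> bool:
        # Hypothetical top-down counterpart: every predecessor is scheduled.
        return self.unscheduled_preds == 0

    def mark_scheduled(self) -> None:
        """Schedule this node and decrement its neighbours' counters."""
        self.scheduled = True
        for pred in self.preds:
            pred.unscheduled_succs -= 1
        for succ in self.succs:
            succ.unscheduled_preds -= 1
```

On a chain `a -> b -> c`, only `c` is ready bottom-up and only `a` is ready top-down; scheduling `c` makes `b` ready bottom-up without walking the graph.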
DeltaFile
+145-9llvm/unittests/Transforms/Vectorize/SandboxVectorizer/DependencyGraphTest.cpp
+26-4llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.h
+20-4llvm/lib/Transforms/Vectorize/SandboxVectorizer/DependencyGraph.cpp
+7-0llvm/unittests/Transforms/Vectorize/SandboxVectorizer/SchedulerTest.cpp
+3-3llvm/lib/Transforms/Vectorize/SandboxVectorizer/Scheduler.cpp
+1-1llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/Scheduler.h
+202-216 files

LLVM/project 1b4d463llvm/test/ExecutionEngine/MCJIT frem.ll, llvm/tools/lli CMakeLists.txt

[MCJIT] Fix frem.ll test failure with LLVM_ENABLE_RPMALLOC on Windows (#200319)

When compiled with `LLVM_ENABLE_RPMALLOC`, `lli.exe` links statically to
the runtime. With `LLVM_EXPORT_SYMBOLS_FOR_PLUGINS` enabled, `lli.exe`
exports a subset of symbols from the runtime library, but not all. In
particular, `printf()` is exported from the application binary, but
`fflush()` and `exit()` are not. For a JITted module, unresolved
external symbols are loaded either from the application or dynamic
libraries, in this case, from `msvcrt.dll`. The `MCJIT/frem.ll` test
attempts to flush the output, but because the functions resolve to
different CRT instances, the output data is lost.

The patch avoids the test failure by disabling exporting symbols from
`lli.exe` when it is linked with the static runtime library.
DeltaFile
+15-1llvm/tools/lli/CMakeLists.txt
+0-2llvm/test/ExecutionEngine/MCJIT/frem.ll
+15-32 files