LLVM/project 0fdc8fc llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Add comment

Change-Id: I2180bba631fe4a01ed3c3fbcfa8c19cbefa84133
Delta File
+1-0 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+1-0 1 files

LLVM/project d462e46 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

clang-format

Change-Id: I534b1a979f55339a814ef3416c2492252845add5
Delta File
+6-3 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+6-3 1 files

LLVM/project 35372bc llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Make fence heuristic work bottom-up

Change-Id: I629cbc8905b87a962e8b123287e5f60a3154df6b
Delta File
+19-17 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+19-17 1 files

LLVM/project ad4d966 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Add back tryLatency

Change-Id: I12d4f255c48ed77ba927eb3b192e5903f1f5e24f
Delta File
+7-1 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+7-1 1 files

LLVM/project aa5ebd8 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

Address comments from https://github.com/llvm/llvm-project/pull/188658

Change-Id: Ia94c567a753168c1ffa16dc5d91195e7dd0ba044
Delta File
+114-114 llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+3-3 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+117-117 2 files

LLVM/project 74a99d7 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.h

Add a comment

Change-Id: I447f7f1fb185b18924cfd98249b5a0a05fef2484
Delta File
+7-0 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+7-0 1 files

LLVM/project d538e7b llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp AMDGPUCoExecSchedStrategy.h, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

[AMDGPU] Add MemoryPipeline scheduling to Coexec sched

Change-Id: I52c476834155823d1ba998cdbbcb3ad6a7e6f2f5
Delta File
+323-0 llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+77-23 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+18-0 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+418-23 3 files

LLVM/project 2673e59 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Use static_cast

Change-Id: Ibec2cf245d5ac213ef0cc4292ba80cb983a58692
Delta File
+12-7 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+12-7 1 files

LLVM/project 7c043b7 mlir/include/mlir/Dialect/XeGPU/IR XeGPUTypes.td, mlir/lib/Conversion/VectorToXeGPU VectorToXeGPU.cpp

[MLIR][XeGPU][VectorToXeGPU] Fixed lowering of transfer_read/write for rank > 2 (#193308)

If rank > 2, load gather/store scatter are used.
Increased value type rank to 8.
Delta File
+56-37 mlir/test/Conversion/VectorToXeGPU/transfer-read-to-xegpu.mlir
+43-21 mlir/test/Conversion/VectorToXeGPU/transfer-write-to-xegpu.mlir
+4-2 mlir/lib/Conversion/VectorToXeGPU/VectorToXeGPU.cpp
+2-2 mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td
+105-62 4 files

LLVM/project 0dc6e8c llvm/lib/Target/AMDGPU/MCTargetDesc AMDGPUMCExpr.cpp

[AMDGPU][NFC] Refactor TryGetMCExprValue into evaluateMCExprs helper (#193859)

Replace the duplicated `TryGetMCExprValue` lambda in
`evaluateExtraSGPRs`, `evaluateTotalNumVGPR`, `evaluateAlignTo`, and
`evaluateOccupancy` with a shared static helper `evaluateMCExprs` that
takes an `initializer_list` of `uint64_t` references, enabling callers
to write:

```cpp
uint64_t VCCUsed, FlatScrUsed, XNACKUsed;
if (!evaluateMCExprs(Args, Asm, {VCCUsed, FlatScrUsed, XNACKUsed}))
  return false;
```
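
A minimal standalone sketch of the call shape above, assuming the helper receives its outputs as `std::reference_wrapper<uint64_t>` (a `std::initializer_list` cannot hold plain references). `ToyExpr`, `evaluateAll`, and `anyExtraRegisterUsed` are hypothetical stand-ins for illustration, not the LLVM code:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <initializer_list>
#include <optional>
#include <vector>

// Hypothetical stand-in for an MC expression: it either folds to a constant
// or cannot be evaluated yet.
struct ToyExpr {
  std::optional<uint64_t> Value;
};

// Writes the value of Exprs[I] into the I-th output reference. Returns false
// as soon as one expression cannot be evaluated; outputs written before that
// point keep their new values.
static bool
evaluateAll(const std::vector<ToyExpr> &Exprs,
            std::initializer_list<std::reference_wrapper<uint64_t>> Outs) {
  assert(Exprs.size() == Outs.size() && "expected one output per expression");
  std::size_t I = 0;
  for (std::reference_wrapper<uint64_t> Out : Outs) {
    if (!Exprs[I].Value)
      return false;
    Out.get() = *Exprs[I].Value;
    ++I;
  }
  return true;
}

// Caller written in the same shape as the snippet in the commit message.
static bool anyExtraRegisterUsed(const std::vector<ToyExpr> &Args,
                                 bool &Result) {
  uint64_t VCCUsed = 0, FlatScrUsed = 0, XNACKUsed = 0;
  if (!evaluateAll(Args, {VCCUsed, FlatScrUsed, XNACKUsed}))
    return false;
  Result = VCCUsed || FlatScrUsed || XNACKUsed;
  return true;
}
```

The braced list works because each `uint64_t` lvalue converts implicitly to a `std::reference_wrapper<uint64_t>`, so callers do not need to spell out `std::ref`.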

Split out from #192306 per reviewer feedback.

This PR was created with the help of GitHub Copilot Claude Opus.

---------

Co-authored-by: Copilot <copilot at github.com>
Delta File
+22-50 llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCExpr.cpp
+22-50 1 files

LLVM/project 7e04ce9 lldb/test/API/functionalities/multi-breakpoint TestMultiBreakpoint.py

fixup! address review feedback
Delta File
+12-0 lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+12-0 1 files

FreeNAS/freenas f0437b6 src/middlewared/middlewared/plugins/network_ global_config.py, tests/api2 test_service_announcement.py

Fix tests
Delta File
+33-8 tests/api2/test_service_announcement.py
+22-9 src/middlewared/middlewared/plugins/network_/global_config.py
+55-17 2 files

NetBSD/pkgsrc-wip 69465bd dnsdist Makefile COMMIT_MSG, powerdns PLIST

powerdns, powerdns-recursor, dnsdist: remove, updated in pkgsrc
Delta File
+0-299 powerdns-recursor/distinfo
+0-100 powerdns-recursor/cargo-depends.mk
+0-75 dnsdist/Makefile
+0-61 powerdns/PLIST
+0-53 dnsdist/COMMIT_MSG
+0-51 powerdns-recursor/Makefile
+0-639 25 files not shown
+3-1,046 31 files

LLVM/project 0a83196 llvm/test/CodeGen/AMDGPU schedule-amdgpu-tracker-physreg.ll

[AMDGPU] Adjusted GCN tracker option after rebasing on top of
users/dhruvachak/add_physical_to_gcn_trackers_after_rename.
Delta File
+1-1 llvm/test/CodeGen/AMDGPU/schedule-amdgpu-tracker-physreg.ll
+1-1 1 files

LLVM/project be81a90 llvm/test/CodeGen/AMDGPU bf16.ll minimumnum.bf16.ll

[AMDGPU] Regenerated tests after rebasing.
Delta File
+8,626-9,213 llvm/test/CodeGen/AMDGPU/bf16.ll
+2,364-2,564 llvm/test/CodeGen/AMDGPU/minimumnum.bf16.ll
+2,314-2,514 llvm/test/CodeGen/AMDGPU/maximumnum.bf16.ll
+1,340-1,343 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,244-1,250 llvm/test/CodeGen/AMDGPU/llvm.exp10.f64.ll
+1,109-1,102 llvm/test/CodeGen/AMDGPU/llvm.exp.f64.ll
+16,997-17,986 60 files not shown
+26,853-27,931 66 files

LLVM/project 93ba1d4 llvm/test/CodeGen/AMDGPU call-argument-types.ll amdgcn.bitcast.768bit.ll

[AMDGPU] Regenerated tests after rebasing on top of
users/dhruvachak/add_physical_to_gcn_trackers_after_rename.
Delta File
+284-572 llvm/test/CodeGen/AMDGPU/call-argument-types.ll
+227-225 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.768bit.ll
+180-196 llvm/test/CodeGen/AMDGPU/gfx-callable-return-types.ll
+171-143 llvm/test/CodeGen/AMDGPU/agpr-copy-no-free-registers.ll
+131-80 llvm/test/CodeGen/AMDGPU/bf16.ll
+88-94 llvm/test/CodeGen/AMDGPU/a-v-global-atomicrmw.ll
+1,081-1,310 9 files not shown
+1,477-1,701 15 files

LLVM/project 38fdc71 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

[AMDGPU] Enabled GCN trackers (amdgpu-use-amdgpu-trackers) by default.

The LIT tests have been generally updated in one of the following ways:
(1) If the above option was not present and the test was auto-generated,
the test has now been regenerated.
(2) If the above option was not present and the test was not
auto-generated, added the option -amdgpu-use-amdgpu-trackers=0 so as to
preserve any specific attributes the test was already checking.
(3) If the above option was present in a test, then its value has been
updated to reflect the change in the default.

Currently, there are 4 tests in category (2). They are:
CodeGen/AMDGPU/
  addrspacecast.ll
  schedule-regpressure-limit.ll
  schedule-regpressure-limit2.ll
  sema-v-unsched-bundle.ll

There are 8 tests in category (3). They are:

    [15 lines not shown]
Delta File
+77,782-77,355 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+13,255-13,280 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+9,928-9,400 llvm/test/CodeGen/AMDGPU/bf16.ll
+4,484-4,395 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+3,842-3,812 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.256bit.ll
+3,802-3,690 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+113,093-111,932 155 files not shown
+169,016-166,668 161 files

LLVM/project 1f80bc7 llvm/lib/Target/AMDGPU GCNRegPressure.cpp

clang-format fix.
Delta File
+3-4 llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+3-4 1 files

LLVM/project 6744fde llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Add comment

Change-Id: I2180bba631fe4a01ed3c3fbcfa8c19cbefa84133
Delta File
+1-0 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+1-0 1 files

LLVM/project 8203e76 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.h

Add a comment

Change-Id: I447f7f1fb185b18924cfd98249b5a0a05fef2484
Delta File
+7-0 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+7-0 1 files

LLVM/project 64b74c4 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

clang-format

Change-Id: I534b1a979f55339a814ef3416c2492252845add5
Delta File
+6-3 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+6-3 1 files

LLVM/project 2bfdfe2 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Make fence heuristic work bottom-up

Change-Id: I629cbc8905b87a962e8b123287e5f60a3154df6b
Delta File
+19-17 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+19-17 1 files

LLVM/project 6edc82a llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp AMDGPUCoExecSchedStrategy.h, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

[AMDGPU] Add MemoryPipeline scheduling to Coexec sched

Change-Id: I52c476834155823d1ba998cdbbcb3ad6a7e6f2f5
Delta File
+323-0 llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+77-23 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+18-0 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+418-23 3 files

LLVM/project 63e3675 llvm/include/llvm/IR IntrinsicsDirectX.td, llvm/test/Transforms/DirectX getpointer-sink-behavior.ll

[DirectX] Denote `dx.resource.getpointer` with `IntrInaccessibleMemOnly` and `IntrReadMem` (#193593)

`IntrConvergent` was originally added to `dx.resource.getpointer` to
prevent optimization passes (`SimplifyCFG`, `GVN`) from sinking the
intrinsic out of control flow branches, which would create phi nodes on
the returned pointer.

Using `IntrInaccessibleMemOnly` and `IntrReadMem` semantics still
prevents passes from merging or sinking identical calls across branches.
However, this allows the call to be moved within a single control flow
path.

Updates relevant tests and adds a new test to demonstrate a now legal
potential optimization.

This was discovered when
https://github.com/llvm/llvm-project/pull/188792 caused the following
failure:
https://github.com/llvm/llvm-project/actions/runs/24577221310/job/71865579618.

    [5 lines not shown]
Delta File
+31-0 llvm/test/Transforms/DirectX/getpointer-sink-behavior.ll
+3-3 llvm/test/Transforms/SimplifyCFG/DirectX/no-sink-dxgetpointer.ll
+3-2 llvm/test/Transforms/GVN/no-sink-dxgetpointer.ll
+1-1 llvm/include/llvm/IR/IntrinsicsDirectX.td
+38-6 4 files

LLVM/project 466f439 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Add back tryLatency

Change-Id: I12d4f255c48ed77ba927eb3b192e5903f1f5e24f
Delta File
+7-1 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+7-1 1 files

LLVM/project 3e26f22 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU coexec-sched-effective-stall.mir

Address comments from https://github.com/llvm/llvm-project/pull/188658

Change-Id: Ia94c567a753168c1ffa16dc5d91195e7dd0ba044
Delta File
+114-114 llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+3-3 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+117-117 2 files

LLVM/project df359d8 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 slp-fma-loss.ll

[SLP] Skip FMulAdd conversion for alt-shuffle FAdd/FSub nodes (#193960)

isAddSubLikeOp() admits alt-shuffle nodes that mix FAdd and FSub, so
transformNodes() was marking them with CombinedOp = FMulAdd. The cost
model then priced the node as a single llvm.fmuladd vector intrinsic,
but emission for an alt shuffle still goes through the ShuffleVector
path and produces fmul + fadd + fsub + shufflevector, which the backend
cannot fuse into a single fmuladd. The resulting under-count made SLP
choose the vector form over the scalar form even when the scalar form
lowers to real FMAs (e.g. fmadd + fmsub on AArch64).
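
The under-count described above can be sketched with a toy cost comparison. This is not SLPVectorizer code: `ToyNode`, the function names, and the unit costs (one per instruction, scalar cost of 2 for fmadd + fmsub) are all hypothetical and chosen only to show why pricing an alt-shuffle node as one fmuladd tips the decision.

```cpp
// Toy model only: names and unit costs are hypothetical, not SLPVectorizer code.
struct ToyNode {
  bool IsAltShuffle; // lanes mix FAdd and FSub
};

// Vector instructions this toy node lowers to: an alt-shuffle node emits
// fmul + fadd + fsub + shufflevector, any other add/sub-like node one fmuladd.
static int emittedCost(const ToyNode &N) { return N.IsAltShuffle ? 4 : 1; }

// Cost the model charges. Without the skip, every add/sub-like node is priced
// as a single fmuladd; with it, alt-shuffle nodes are priced as emitted.
static int modeledCost(const ToyNode &N, bool SkipAltShuffle) {
  if (SkipAltShuffle && N.IsAltShuffle)
    return emittedCost(N);
  return 1;
}

// Vectorize only when the modeled vector cost beats the scalar cost.
static bool prefersVector(const ToyNode &N, int ScalarCost,
                          bool SkipAltShuffle) {
  return modeledCost(N, SkipAltShuffle) < ScalarCost;
}
```

With these assumed costs, `prefersVector(ToyNode{true}, 2, false)` picks the vector form even though it emits four instructions, while passing `true` for `SkipAltShuffle` keeps the scalar form.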
Delta File
+12-12 llvm/test/Transforms/SLPVectorizer/AArch64/slp-fma-loss.ll
+2-1 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+14-13 2 files

LLVM/project ccdba25 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp

Remove unused function

Change-Id: I9f2de1497f793d2848dedaf645e21e07a4ba82d6
Delta File
+2-62 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+2-62 1 files

LLVM/project 2be3852 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU coexec-scheduler.ll

Merge conflict

Change-Id: I24f471688f9d0604b45e95a4fa4da85fb0d9ed76
Delta File
+23-22 llvm/test/CodeGen/AMDGPU/coexec-scheduler.ll
+29-5 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+52-27 2 files

LLVM/project 9c4fb32 llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.h AMDGPUCoExecSchedStrategy.cpp

Formatting

Change-Id: I3d89fba145471141ef945b1de15330caa245e82d
Delta File
+4-4 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+4-3 llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+8-7 2 files