LLVM/project 0717455 llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU wmma-hazards-gfx1250-w32.mir wmma-coexecution-valu-hazards.mir

[AMDGPU] Handle gfx1251 wmma hazard

The generic target is also affected, in a pessimistic way.
Delta  File
+1,537 -0  llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+895 -0  llvm/test/CodeGen/AMDGPU/wmma-coexecution-valu-hazards.mir
+42 -0  llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1251-w32.mir
+31 -8  llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+7 -1  llvm/lib/Target/AMDGPU/AMDGPU.td
+2,512 -9  5 files

LLVM/project 2edc546 clang/include/clang/Basic BuiltinsAMDGPUDocs.td BuiltinsAMDGPU.td, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Builtin support for wmma_f64_16x16x4_f64
Delta  File
+19 -0  clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1251-wmma-w32.cl
+17 -0  clang/test/SemaOpenCL/builtins-amdgcn-error-gfx1251-wmma-w32-param.cl
+15 -0  clang/include/clang/Basic/BuiltinsAMDGPUDocs.td
+6 -0  clang/include/clang/Basic/BuiltinsAMDGPU.td
+5 -0  clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+2 -2  clang/test/CodeGenCXX/dynamic-cast-address-space.cpp
+64 -2  3 files not shown
+68 -4  9 files

LLVM/project fe76f30 llvm/lib/Target/AMDGPU SISchedule.td GCNProcessors.td

[AMDGPU] Add gfx1251 speed model

Adjust the generic speed model to account for the slowest target.
Delta  File
+60 -5  llvm/lib/Target/AMDGPU/SISchedule.td
+2 -2  llvm/lib/Target/AMDGPU/GCNProcessors.td
+62 -7  2 files

LLVM/project e5ff461 llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUISelDAGToDAG.cpp

[AMDGPU] Intrinsic and codegen for wmma_f64_16x16x4_f64
Delta  File
+145 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1251.w32.ll
+144 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1251.w32.ll
+61 -0  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.gfx1251.w32.ll
+23 -0  llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
+7 -0  llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
+5 -1  llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+385 -1  2 files not shown
+390 -1  8 files

LLVM/project 097a528 llvm/lib/Target/AMDGPU VOP3PInstructions.td AMDGPU.td, llvm/test/MC/AMDGPU gfx1251_asm_wmma_w32.s gfx1251_asm_wmma_w32_err.s

[AMDGPU] MC support for v_wmma_f64_16x16x4_f64
Delta  File
+49 -0  llvm/test/MC/AMDGPU/gfx1251_asm_wmma_w32.s
+29 -0  llvm/test/MC/Disassembler/AMDGPU/gfx1251_dasm_wmma_w32.txt
+7 -0  llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+7 -0  llvm/test/MC/AMDGPU/gfx1251_asm_wmma_w32_err.s
+5 -0  llvm/lib/Target/AMDGPU/AMDGPU.td
+97 -0  5 files

LLVM/project 91b9f3f llvm/lib/Target/AMDGPU EvergreenInstructions.td AMDGPUInstrInfo.td

AMDGPU: Remove AMDGPUbfm

It wasn't actually used. We select [SV]_BFM_B32 by directly matching
shift-based patterns.

commit-id:b5cd6327
Delta  File
+1 -4  llvm/lib/Target/AMDGPU/EvergreenInstructions.td
+0 -3  llvm/lib/Target/AMDGPU/AMDGPUInstrInfo.td
+1 -2  llvm/lib/Target/AMDGPU/SOPInstructions.td
+2 -9  3 files

FreeBSD/ports c6d5546 misc/py-anthropic distinfo Makefile

misc/py-anthropic: update to 0.109.1
Delta  File
+3 -3  misc/py-anthropic/distinfo
+1 -1  misc/py-anthropic/Makefile
+4 -4  2 files

LLVM/project 1272df2 llvm/test/Transforms/SLPVectorizer/X86 runtime-alias-checks.ll

[SLP][NFC] Add tests with non-movable calls, NFC



Pull Request: https://github.com/llvm/llvm-project/pull/203140
Delta  File
+240 -1  llvm/test/Transforms/SLPVectorizer/X86/runtime-alias-checks.ll
+240 -1  1 files

OpenBSD/ports dEYkKn0 databases/duckdb Makefile distinfo, databases/duckdb/patches patch-CMakeLists_txt

   databases/duckdb: update to 1.5.3

   with feedback from sthen@, ok rsadowski@
Version  Delta  File
1.7  +4 -4  databases/duckdb/Makefile
1.4  +4 -3  databases/duckdb/patches/patch-CMakeLists_txt
1.7  +2 -2  databases/duckdb/distinfo
1.6  +1 -0  databases/duckdb/pkg/PLIST
+11 -9  4 files

FreeBSD/ports 4bc8af5 games/veloren-weekly distinfo Makefile

games/veloren-weekly: update to s20260610

Changes:        https://gitlab.com/veloren/veloren/-/compare/addd09fb764...30dc4ff7f7
(cherry picked from commit 138cf1688788ba93cbd00a35be679a88fe1fb9e4)
Delta  File
+3 -3  games/veloren-weekly/distinfo
+2 -2  games/veloren-weekly/Makefile
+5 -5  2 files

FreeBSD/ports 138cf16 games/veloren-weekly distinfo Makefile

games/veloren-weekly: update to s20260610

Changes:        https://gitlab.com/veloren/veloren/-/compare/addd09fb764...30dc4ff7f7
Delta  File
+3 -3  games/veloren-weekly/distinfo
+2 -3  games/veloren-weekly/Makefile
+5 -6  2 files

FreeBSD/ports 49db657 graphics/mesa-devel distinfo Makefile, graphics/mesa-devel/files patch-suffix

graphics/mesa-devel: update to 26.1.b.2867

Changes:        https://gitlab.freedesktop.org/mesa/mesa/-/compare/095e4f5f1bb...fd616bab71a
Delta  File
+8 -2  graphics/mesa-devel/files/patch-suffix
+3 -3  graphics/mesa-devel/distinfo
+2 -3  graphics/mesa-devel/Makefile
+13 -8  3 files

FreeBSD/ports 8d3ed72 benchmarks/clpeak distinfo Makefile

benchmarks/clpeak: update to 2.0.12

Changes:        https://github.com/krrishnarraj/clpeak/releases/tag/2.0.11
Changes:        https://github.com/krrishnarraj/clpeak/releases/tag/2.0.12
Reported by:    GitHub (watch releases)
Delta  File
+3 -3  benchmarks/clpeak/distinfo
+1 -1  benchmarks/clpeak/Makefile
+4 -4  2 files

NetBSD/src 1lEjdDC share/man/man4/man4.evbarm awge.4

   awge.4: add definite article
Version  Delta  File
1.5  +2 -1  share/man/man4/man4.evbarm/awge.4
+2 -1  1 files

NetBSD/src rcfeU2N sys/arch/sgimips/hpc hpc.c

   Remove magic constants set in the cx56 driver's descriptor
   in favour of the hpcreg.h definitions.

   Confusingly, the MI driver apparently swaps the meanings of DO
   and DI in the chip's datasheet.
Version  Delta  File
1.74  +8 -9  sys/arch/sgimips/hpc/hpc.c
+8 -9  1 files

LLVM/project 3c7cea8 llvm/include/llvm/Target/GlobalISel Combine.td, llvm/test/CodeGen/AArch64/GlobalISel combine-or-and-xor.ll combine-or-and-xor.mir

Revert "[GlobalISel] Add `or_and_xor_to_or` pattern from SelectionDAG" (#203136)

Reverts llvm/llvm-project#201108
Delta  File
+0 -213  llvm/test/CodeGen/AArch64/GlobalISel/combine-or-and-xor.ll
+0 -206  llvm/test/CodeGen/AArch64/GlobalISel/combine-or-and-xor.mir
+1 -40  llvm/include/llvm/Target/GlobalISel/Combine.td
+1 -459  3 files

NetBSD/src eHBQwSG sys/arch/sgimips/hpc hpcreg.h, sys/arch/sgimips/sgimips arcemu.h

   Fix EEPROM reading on IP12: DELAY isn't available as early as
   needed, so roll our own using the most pessimistic timings
   (that is, busy loop enough for the fastest pre-ARCS CPU).

   Clean up the EEPROM twiddling code and remove magic constants
   while here.
Version  Delta  File
1.15  +48 -29  sys/arch/sgimips/sgimips/arcemu.h
1.22  +8 -1  sys/arch/sgimips/hpc/hpcreg.h
+56 -30  2 files

LLVM/project 8b625b2 llvm/test/MC/RISCV rv32c-invalid.s xqcibm-invalid.s

[RISC-V] Add --implicit-check-not="error:" to a few tests

Ensures that the test checks for every error emitted by llvm-mc. To do this
we have to move the CHECK lines to the next line rather than the same line
since otherwise we get a false-positive match.

This adds a few missing CHECK lines in the xqcibm-invalid test and is needed
to minimize the diff in one of my subsequent commits.

Pull Request: https://github.com/llvm/llvm-project/pull/203091
Delta  File
+112 -57  llvm/test/MC/RISCV/rv32c-invalid.s
+44 -23  llvm/test/MC/RISCV/xqcibm-invalid.s
+36 -19  llvm/test/MC/RISCV/rv64c-invalid.s
+20 -11  llvm/test/MC/RISCV/rvc-hints-invalid.s
+212 -110  4 files

LLVM/project 9617b2a mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToLaneDistribute.cpp, mlir/test/Dialect/XeGPU sg-to-lane-distribute-unit.mlir

[MLIR][XeGPU] Support partial subgroup lane distribution for convert_layout (#201667)

Add lowering support in XeGPUSgToLaneDistribute for values that are
distributed across only a fraction of the subgroup.

- SgToLaneConvertLayout now lowers a rank-2 xegpu.convert_layout that
  shrinks the lane layout along the outer (distributed) dimension while
  keeping lane_data unchanged (e.g. [16, 1] -> [8, 1]). The partial-subgroup
  case is detected directly in the pattern: equal order, rank 2, unit inner
  lane layout, and a genuinely distributed outer lane layout (> 1, which also
  rules out the degenerate [1, 1] layout). Because the data is no longer
  replicated in every lane, it is gathered across lanes and the distributed
  outer dimension is doubled when the lane count is halved.

- The cross-lane gather is factored into a dedicated helper,
  shuffleDataAsLaneLayoutChange(): it bitcasts the source to i32, issues
  gpu.shuffle up to fetch the values from the dropped lanes, and concatenates

    [9 lines not shown]
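
The partial-subgroup detection listed in the first bullet can be sketched as a standalone predicate. This is a minimal Python illustration, not the MLIR pattern itself: the function names, the list-based layout representation, and the reading that the destination outer lane count must be greater than 1 and smaller than the source's are all assumptions.

```python
def is_partial_subgroup_convert(src_layout, dst_layout, src_data, dst_data,
                                src_order, dst_order):
    """Hypothetical check mirroring the conditions named in the commit message."""
    if src_order != dst_order:                        # equal order
        return False
    if len(src_layout) != 2 or len(dst_layout) != 2:  # rank 2
        return False
    if src_data != dst_data:                          # lane_data unchanged
        return False
    if src_layout[1] != 1 or dst_layout[1] != 1:      # unit inner lane layout
        return False
    # Outer lane layout is genuinely distributed (> 1) and shrinks.
    return 1 < dst_layout[0] < src_layout[0]


def outer_growth_factor(src_layout, dst_layout):
    """Factor by which each remaining lane's outer dimension grows (assumed)."""
    return src_layout[0] // dst_layout[0]
```

For the [16, 1] -> [8, 1] example from the message, the predicate holds and the growth factor is 2, matching "doubled when the lane count is halved".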
Delta  File
+112 -6  mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToLaneDistribute.cpp
+68 -0  mlir/test/Dialect/XeGPU/sg-to-lane-distribute-unit.mlir
+180 -6  2 files

LLVM/project afe5014 llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

rebase, update name of internal feature flag

Created using spr 1.3.8-beta.1
Delta  File
+38,494 -84,026  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388 -22,086  llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087 -24,391  llvm/test/CodeGen/RISCV/clmul.ll
+10,473 -12,572  llvm/test/CodeGen/RISCV/clmulr.ll
+10,281 -12,374  llvm/test/CodeGen/RISCV/clmulh.ll
+8,361 -8,920  llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084 -164,369  5,547 files not shown
+533,116 -396,875  5,553 files

LLVM/project 47a1d53 llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
Delta  File
+38,494 -84,026  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388 -22,086  llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087 -24,391  llvm/test/CodeGen/RISCV/clmul.ll
+10,473 -12,572  llvm/test/CodeGen/RISCV/clmulr.ll
+10,281 -12,374  llvm/test/CodeGen/RISCV/clmulh.ll
+8,361 -8,920  llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,084 -164,369  5,540 files not shown
+533,158 -396,844  5,546 files

OpenBSD/ports 38n79d4 devel/goreleaser distinfo modules.inc

   Update to goreleaser-2.16.0

   From Laurent Cheylus (maintainer)
Version  Delta  File
1.30  +1,418 -1,454  devel/goreleaser/distinfo
1.30  +543 -562  devel/goreleaser/modules.inc
1.35  +1 -1  devel/goreleaser/Makefile
+1,962 -2,017  3 files

NetBSD/src hBG6xM1 sys/arch/sgimips/conf files.sgimips, sys/arch/sgimips/sgimips prom_vectors.S prom.c

   Ensure that curlwp is saved and restored across calls into
   pre-ARCS PROM routines. This only matters for console output,
   but do the same for shutdown/restart entrypoints as well.

   While here, preformat strings before invoking the PROM's
   printf routine, since we don't know what its formatting and
   argument limitations are.
Version  Delta  File
1.1  +99 -0  sys/arch/sgimips/sgimips/prom_vectors.S
1.1  +38 -0  sys/arch/sgimips/sgimips/prom.c
1.1  +25 -0  sys/arch/sgimips/sgimips/prom.h
1.25  +3 -8  sys/arch/sgimips/sgimips/arcemu.c
1.55  +3 -1  sys/arch/sgimips/conf/files.sgimips
+168 -9  5 files

LLVM/project 0417b78 llvm/lib/Target/AMDGPU AMDGPURewriteAGPRCopyMFMA.cpp, llvm/test/CodeGen/AMDGPU rewrite-vgpr-mfma-to-agpr-spill-joint-dom-mir.mir rewrite-vgpr-mfma-to-agpr-spill-joint-dom.ll

[2/2][AMDGPU] Insert IMPLICIT_DEF to provide a reaching def for unspilled reloads

Depends on https://github.com/llvm/llvm-project/pull/198472

PR #198472 skips unspilling a slot if a spill reload is reachable from
entry along a path that does not contain a spill store. This patch builds
on it by finding a basic block where an IMPLICIT_DEF can be inserted to
provide a reaching definition on all paths to such reloads, allowing the
unspill to proceed. This new def may extend the rewritten vreg's live
range, so extra interference checks are performed over the extended region
to pick an appropriate physical register.

For the joint-dominance tests, an IMPLICIT_DEF insertion block is found,
but no physical register is interference-free over the extended range,
so the unspill is conservatively skipped.

Assisted-by: Cursor/Claude Opus
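
The reachability condition described above can be sketched at basic-block granularity. This is a simplified Python illustration, not the pass's code: the function name, the dict-based CFG, and the assumption that a defining block (spill store or inserted IMPLICIT_DEF) defines the slot before any reload in that block are all hypothetical.

```python
from collections import deque


def reload_reachable_without_def(cfg, entry, def_blocks, reload_block):
    """True if reload_block can be reached from entry without crossing a def.

    cfg maps a block name to its successor names; def_blocks holds blocks
    that contain a spill store or an inserted IMPLICIT_DEF (assumed model).
    """
    if entry in def_blocks:
        return False  # every path starts with a definition
    seen = {entry}
    work = deque([entry])
    while work:
        block = work.popleft()
        if block == reload_block:
            return True
        for succ in cfg.get(block, ()):
            # Paths are cut at defining blocks: past them a def reaches.
            if succ not in seen and succ not in def_blocks:
                seen.add(succ)
                work.append(succ)
    return False


# Diamond CFG: the spill store sits only on the "a" side.
cfg = {"entry": ["a", "b"], "a": ["join"], "b": ["join"], "join": []}
unsafe = reload_reachable_without_def(cfg, "entry", {"a"}, "join")
# Modelling an IMPLICIT_DEF in "b" gives every path a reaching definition.
fixed = reload_reachable_without_def(cfg, "entry", {"a", "b"}, "join")
```

In this model `unsafe` is true and `fixed` is false. The sketch leaves out the interference checks over the extended live range that the patch also performs.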
Delta  File
+101 -10  llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+9 -4  llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-spill-joint-dom-mir.mir
+9 -4  llvm/test/CodeGen/AMDGPU/rewrite-vgpr-mfma-to-agpr-spill-joint-dom.ll
+119 -18  3 files

LLVM/project af2e3e7 lldb/test/API/macosx/thread-names TestInterruptThreadNames.py

Revert "[lldb][test] Increase polling in TestInterruptThreadNames.py (#201554)" (#203126)

This reverts commit fdfd1c1344187d64b63504ea8e3662ae4936503a.

The Intel mac CI bot is timing out often with these new timeouts and
we're getting failing runs. Raphael will adjust and re-land.
Delta  File
+9 -2  lldb/test/API/macosx/thread-names/TestInterruptThreadNames.py
+9 -2  1 files

LLVM/project e408c75 mlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp, mlir/test/Conversion/XeGPUToXeVM loadstore_nd.mlir

[MLIR][XeGPU] Extend 8-bit load_nd support in XeVM lowering (#201645)

A 2D block load of an 8-bit element type with shape 32x16 is supported by
the OpenCL API:
```
 void intel_sub_group_2d_block_read_transform_8b_32r16x1c(   // reads eight uints
    global void* base_address,
    int width, int height, int pitch, int2 coord, private uint* destination);
```
This API is for loads with a transform/VNNI request.
OpenCL does not provide a load API for the same vector type without a
transform request, but the value returned is identical for this special
vector type, <32x16x"8b">.
This PR adds support for this vector type with no transform request.
Delta  File
+27 -0  mlir/test/Conversion/XeGPUToXeVM/loadstore_nd.mlir
+10 -1  mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+37 -1  2 files

LLVM/project 990543b mlir/lib/Dialect/XeGPU/Transforms XeGPUPeepHoleOptimizer.cpp XeGPUArrayLengthOptimization.cpp, mlir/test/Dialect/XeGPU peephole-optimize.mlir

[MLIR][XeGPU] Enable peephole optimization for the CRI target (#201655)

Enable the XeGPU transpose peephole and array-length optimizations for
the Crescent Island (cri) target alongside pvc and bmg. Skip sub-byte (<
8-bit) element types in array-length optimizations, which are not yet
supported.

Add tests in peephole-optimize.mlir covering the cri target and the
array-length optimization rejecting sub-byte element types.
Delta  File
+61 -0  mlir/test/Dialect/XeGPU/peephole-optimize.mlir
+9 -5  mlir/lib/Dialect/XeGPU/Transforms/XeGPUPeepHoleOptimizer.cpp
+3 -0  mlir/lib/Dialect/XeGPU/Transforms/XeGPUArrayLengthOptimization.cpp
+73 -5  3 files

FreeBSD/src 64b053f tests/sys/posixshm memfd_test.c

memfd_test: skip hugetlb testcase when large page requests are not supported

Fixes this CI test failure: https://ci.freebsd.org/view/Test/job/FreeBSD-main-riscv64-test/16606/testReport/junit/sys.posixshm/memfd_test/hugetlb/

Reviewed by:    kevans
MFC after:      3 days
Differential Revision:  https://reviews.freebsd.org/D57289
Delta  File
+5 -2  tests/sys/posixshm/memfd_test.c
+5 -2  1 files

LLVM/project f5e3252 llvm/include/llvm/Passes PassBuilder.h, llvm/lib/Passes PassBuilder.cpp

[Passes] Enhance `--print-pipeline-passes` (#202892)

Allow users to specify the output format, making the pipeline output more
palatable to FileCheck. Currently, it only supports the `text` and `tree`
formats.

Fixes #200926.
Delta  File
+412 -367  llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+367 -347  llvm/test/CodeGen/X86/llc-pipeline-npm.ll
+58 -4  llvm/lib/Passes/PassBuilder.cpp
+19 -1  llvm/include/llvm/Passes/PassBuilder.h
+14 -0  llvm/test/tools/opt/print-pipeline-passes.ll
+2 -1  llvm/tools/llc/NewPMDriver.cpp
+872 -720  1 files not shown
+873 -721  7 files

LLVM/project 0a82646 clang/docs SanitizerSpecialCaseList.rst ReleaseNotes.rst, llvm/lib/Support SpecialCaseList.cpp

[SpecialCaseList] Add backward compatible dot-slash handling

This PR is preparation for:
* https://github.com/llvm/llvm-project/pull/167283

The new behavior is controlled by the `Version` field in the special
case list file.

- Version 1 and 2: Path is matched as-is, regardless of presence of "./".
- Version 3, 5 and higher: Paths with leading dot-slash are canonicalized
  to paths without dot-slash before matching. This means that a rule
  like `src=./foo` will never match, and `src=foo` will match both
  `foo` and `./foo`. (Version 3 never became default but has this behavior).
- Version 4: Transitionary version. Paths are matched both ways
  (canonicalized and non-canonicalized) to maintain backward compatibility.
  If a match only works with the old behavior (non-canonicalized), a warning
  is emitted.

This change allows for a gradual transition to the new behavior, while

    [6 lines not shown]
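
The version-dependent matching rules listed above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the `SpecialCaseList` implementation: the function names are hypothetical, rules are compared by exact string equality rather than glob matching, and only a single leading "./" is stripped.

```python
def strip_dot_slash(path):
    """Drop one leading "./" (assumed form of the canonicalization)."""
    return path[2:] if path.startswith("./") else path


def match_path(rule, path, version):
    """Return (matched, warn) for a rule against a path at a given Version."""
    as_is = rule == path
    canonical = rule == strip_dot_slash(path)
    if version in (1, 2):
        # Old behavior: the path is matched exactly as written.
        return as_is, False
    if version == 4:
        # Transitional: accept either form, warn if only the old one matches.
        return (as_is or canonical), (as_is and not canonical)
    # Version 3, 5 and higher: only the canonicalized path is matched.
    return canonical, False
```

Under this model, `match_path("./foo", "./foo", 5)` does not match, while `match_path("./foo", "./foo", 4)` matches and sets the warning flag.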
Delta  File
+49 -1  llvm/unittests/Support/SpecialCaseListTest.cpp
+42 -6  llvm/lib/Support/SpecialCaseList.cpp
+21 -0  clang/docs/SanitizerSpecialCaseList.rst
+8 -0  clang/docs/ReleaseNotes.rst
+120 -7  4 files