LLVM/project 8b10fdf mlir/lib/Dialect/Bufferization/Transforms StaticMemoryPlannerAnalysis.cpp, mlir/test/Dialect/Bufferization/Transforms static-memory-planner-arena-arg.mlir

[mlir][bufferization] Add error for memref return types

Add validation to reject functions with memref return types, as static
memory planning is incompatible with returning memrefs. In allocate mode,
the arena is freed at function exit, making returned memrefs invalid. In
arg mode, returning a memref from the input arena violates typical memory
ownership patterns.

When a function has memref return types, the pass:
1. Emits a clear error message
2. Fails gracefully without transforming the function
3. Preserves the original IR

This prevents silent bugs where returned memrefs would point to freed or
external memory.

Changes:
- Add return type validation at start of runOnOperation()
- Check all result types for MemRefType

    [2 lines not shown]
Delta  File
+12 -0  mlir/lib/Dialect/Bufferization/Transforms/StaticMemoryPlannerAnalysis.cpp
+11 -0  mlir/test/Dialect/Bufferization/Transforms/static-memory-planner-arena-arg.mlir
+23 -0  2 files

LLVM/project f30ab18 mlir/include/mlir/Dialect/Bufferization/Transforms Passes.td, mlir/lib/Dialect/Bufferization/Transforms StaticMemoryPlannerAnalysis.cpp

[mlir][bufferization] Add arena-mode pass option (allocate vs arg)

Add arena-mode pass option to control how the shared arena is obtained:
- 'allocate' (default): Creates arena via memref.alloc within the function
- 'arg': Uses function's first argument as the pre-allocated arena

The 'arg' mode is useful when the arena is pre-allocated externally and
passed to the function, enabling use cases like pre-allocated scratch
buffers or memory pools.

In 'arg' mode, the pass validates that:
1. The context is a function operation
2. The function has at least one argument
3. The first argument is memref<...xi8>

If validation fails, the pass emits an error and fails gracefully.

Changes:
- Add arena-mode option to Passes.td with default 'allocate'

    [3 lines not shown]
Delta  File
+43 -12  mlir/lib/Dialect/Bufferization/Transforms/StaticMemoryPlannerAnalysis.cpp
+31 -18  mlir/include/mlir/Dialect/Bufferization/Transforms/Passes.td
+34 -0  mlir/test/Dialect/Bufferization/Transforms/static-memory-planner-arena-arg.mlir
+108 -30  3 files

LLVM/project 0d2a04c mlir/lib/Dialect/Bufferization/Transforms StaticMemoryPlannerAnalysis.cpp, mlir/test/Dialect/Bufferization/Transforms static-memory-planner-analysis.mlir

[mlir][bufferization] Convert arena to i8 byte buffer with memref.view

Change the arena from typed (e.g., memref<Nxf32>) to a generic i8 byte buffer
(memref<Nxi8>). This allows a single arena to hold allocations of different
element types (f32, i64, i16, etc.).

Use memref.view to create typed views into the i8 arena at computed byte
offsets. This is the standard MLIR pattern for type-agnostic memory buffers.

Changes:
- Arena is now memref<totalSizexi8> instead of element-typed
- Use memref.view instead of memref.subview + reinterpret_cast
- Byte offsets passed directly to memref.view via arith.constant
- Update all tests to reflect i8 arena + view pattern

Example transformation:
  Before: memref.alloc() : memref<1024xf32>
  After:  %arena = memref.alloc() : memref<4096xi8>
          %c0 = arith.constant 0 : index
          %view = memref.view %arena[%c0][] : memref<4096xi8> to memref<1024xf32>
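
The arena size in the example follows from the element count and element width: 1024 f32 elements at 4 bytes each need 4096 bytes. A minimal sketch of that arithmetic, assuming each element is rounded up to whole bytes (the helper name `bufferSizeInBytes` is hypothetical and not taken from the pass, and sub-byte element types may be handled differently in practice):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: bytes needed for `numElements` elements of
// `elementBitWidth` bits each, rounding each element up to whole bytes.
inline uint64_t bufferSizeInBytes(uint64_t numElements,
                                  unsigned elementBitWidth) {
  assert(elementBitWidth > 0 && "element type must have a nonzero width");
  const uint64_t bytesPerElement = (elementBitWidth + 7) / 8;
  return numElements * bytesPerElement;
}
```

With these assumptions, `bufferSizeInBytes(1024, 32)` yields 4096, matching the `memref<4096xi8>` arena above.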
Delta  File
+38 -19  mlir/test/Dialect/Bufferization/Transforms/static-memory-planner-analysis.mlir
+23 -26  mlir/lib/Dialect/Bufferization/Transforms/StaticMemoryPlannerAnalysis.cpp
+61 -45  2 files

LLVM/project ac3e313 mlir/lib/Dialect/Bufferization/Transforms StaticMemoryPlannerAnalysis.cpp

[mlir][bufferization] Extract memory planning into pure function

Separate memory planning logic from IR transformation by introducing
trivialMemoryPlanner() - a pure function that computes buffer offsets
without depending on MLIR operations.

Changes:
- Add Alloc structure for allocation-independent planning
- Implement trivialMemoryPlanner(arenaAlignment, allocs) -> offsets
- Refactor runOnOperation() to use the planning function
- Planning logic is now testable independently of MLIR

This architecture enables plugging in different allocation strategies
(firstFit, bestFit) without modifying IR transformation code.
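
To illustrate what a planner that is independent of MLIR operations might look like, here is a minimal sketch. It assumes a bump-style strategy (each buffer is placed at the next suitably aligned offset, with no reuse), which may not match the actual trivialMemoryPlanner(). The `Alloc` fields, the `isPowerOfTwo` helper, and the name `trivialMemoryPlannerSketch` are all hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the commit's Alloc structure.
struct Alloc {
  uint64_t sizeInBytes;
  uint64_t alignment; // assumed to be a nonzero power of two
};

inline bool isPowerOfTwo(uint64_t x) { return x != 0 && (x & (x - 1)) == 0; }

// Bump-style planner sketch: place each allocation at the next offset that
// satisfies its alignment, with no reuse of space.
inline std::vector<uint64_t>
trivialMemoryPlannerSketch(uint64_t arenaAlignment,
                           const std::vector<Alloc> &allocs) {
  assert(isPowerOfTwo(arenaAlignment) &&
         "arena alignment must be a power of two");
  std::vector<uint64_t> offsets;
  offsets.reserve(allocs.size());
  uint64_t cursor = 0;
  for (const Alloc &alloc : allocs) {
    // Offsets are relative to the arena base, so the base must be at least
    // as aligned as every allocation placed inside it.
    assert(isPowerOfTwo(alloc.alignment) && alloc.alignment <= arenaAlignment);
    cursor = (cursor + alloc.alignment - 1) & ~(alloc.alignment - 1);
    offsets.push_back(cursor);
    cursor += alloc.sizeInBytes;
  }
  return offsets;
}
```

Because the function only sees sizes and alignments, it can be unit-tested with plain vectors, and a different strategy could be swapped in behind the same signature.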
Delta  File
+60 -15  mlir/lib/Dialect/Bufferization/Transforms/StaticMemoryPlannerAnalysis.cpp
+60 -15  1 files

LLVM/project 8d00692 mlir/lib/Dialect/Bufferization/Transforms StaticMemoryPlannerAnalysis.cpp, mlir/test/Dialect/Bufferization/Transforms static-memory-planner-analysis.mlir

[mlir][bufferization] Add alignment support to static memory planner

Track alignment requirements from memref.alloc operations and ensure
offsets are properly padded to meet alignment constraints. The arena
allocation receives the maximum alignment of all transformed allocations.

Changes:
- Add alignment field to AllocationCandidate structure
- Compute sizes in bytes to handle alignment padding correctly
- Implement alignOffset() helper to pad offsets to alignment boundaries
- Set arena alignment attribute to maximum required alignment
- Add test demonstrating alignment padding with 64 and 128-byte requirements

This ensures correctness for SIMD and other alignment-sensitive operations.
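
The padding step can be sketched as a round-up to the next multiple of the alignment. The commit names an alignOffset() helper but does not show its signature, so the one below is an assumption, as is the requirement that the alignment be a power of two:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signature: round `offset` up to the next multiple of
// `alignment`. Assumes `alignment` is a nonzero power of two.
inline uint64_t alignOffset(uint64_t offset, uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0 &&
         "alignment must be a power of two");
  return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, a 100-byte buffer at offset 0 followed by a buffer that requires 128-byte alignment places the second buffer at `alignOffset(100, 128)`, which is 128, leaving 28 bytes of padding.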
Delta  File
+44 -12  mlir/lib/Dialect/Bufferization/Transforms/StaticMemoryPlannerAnalysis.cpp
+28 -3  mlir/test/Dialect/Bufferization/Transforms/static-memory-planner-analysis.mlir
+72 -15  2 files

LLVM/project 9f4543a llvm/lib/Target/AMDGPU AMDGPUHWEvents.h SIInsertWaitcnts.cpp

[AMDGPU][InsertWaitCnts] Make HWEvent a BitMask

Follow up from comments on https://github.com/llvm/llvm-project/pull/202886

Make HWEvent a bitmask by default instead of having both the enum and a separate HWEventSet. This streamlines the code a bit and opens the possibility of adding "modifiers" to events; for example, I imagine we could now fold "VMemType" into the events.
We already do this with things like SMEM_GROUP. At least now it's baked into the design.

I opted for a bit more verbosity by taking inspiration from FastMathFlags (FMF): instead of exposing a raw enum, I wrap it in a class with helper functions. The downside is having to reimplement all the little bitwise ops, but the result is a cleaner, simpler interface than a raw enum (class) with many helper functions. I initially tried that, but I recoiled at the sight of things like `contains(A, B)`, which isn't very clear, while `A.contains(B)` is self-explanatory.

Since HWEvent is a bitmask, I also implemented a simple iterator over all set bits of the mask, which is useful because some APIs in InsertWaitCnt handle one event at a time.
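
As an illustration of the design described above (a class wrapping the raw bits, a member `contains`, and an iterator over set bits), here is a minimal sketch. The class name `EventMask`, its `Kind` values, and the `countEvents` consumer are hypothetical and do not reflect the actual HWEvent definition:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bitmask wrapper; the event kinds are illustrative only.
class EventMask {
public:
  enum class Kind : uint32_t { A = 1u << 0, B = 1u << 1, C = 1u << 2 };

  EventMask() = default;
  EventMask(Kind k) : bits(static_cast<uint32_t>(k)) {}

  // True if every bit set in `other` is also set in this mask.
  bool contains(EventMask other) const {
    return (bits & other.bits) == other.bits;
  }
  bool empty() const { return bits == 0; }
  EventMask operator|(EventMask o) const { return fromRaw(bits | o.bits); }
  EventMask operator&(EventMask o) const { return fromRaw(bits & o.bits); }
  bool operator==(EventMask o) const { return bits == o.bits; }
  bool operator!=(EventMask o) const { return bits != o.bits; }

  // Forward iterator that yields one single-bit mask per set bit,
  // starting from the lowest bit.
  class iterator {
    uint32_t remaining;

  public:
    explicit iterator(uint32_t r) : remaining(r) {}
    EventMask operator*() const {
      assert(remaining != 0 && "dereferencing end iterator");
      return fromRaw(remaining & (~remaining + 1)); // isolate lowest set bit
    }
    iterator &operator++() {
      remaining &= remaining - 1; // clear lowest set bit
      return *this;
    }
    bool operator!=(const iterator &o) const {
      return remaining != o.remaining;
    }
  };
  iterator begin() const { return iterator(bits); }
  iterator end() const { return iterator(0); }

private:
  static EventMask fromRaw(uint32_t r) {
    EventMask m;
    m.bits = r;
    return m;
  }
  uint32_t bits = 0;
};

// Example consumer that handles one event at a time.
inline unsigned countEvents(EventMask mask) {
  unsigned n = 0;
  for (EventMask single : mask) {
    (void)single;
    ++n;
  }
  return n;
}
```

With this shape, `(EventMask(EventMask::Kind::A) | EventMask::Kind::C).contains(EventMask::Kind::A)` reads left to right, which is the readability argument made for `A.contains(B)`.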
Delta  File
+96 -94  llvm/lib/Target/AMDGPU/AMDGPUHWEvents.h
+73 -79  llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+28 -34  llvm/lib/Target/AMDGPU/AMDGPUHWEvents.def
+30 -32  llvm/lib/Target/AMDGPU/AMDGPUHWEvents.cpp
+227 -239  4 files

FreeNAS/freenas 9943660 debian/debian control

Explicitly list packages that were previously installed only because they were "recommended"
Delta  File
+1 -0  debian/debian/control
+1 -0  1 files

LLVM/project 2948907 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.wmma.imod.gfx1251.w32.ll llvm.amdgcn.wmma.gfx1251.w32.ll

AMDGPU/GlobalISel: RegBankLegalize rules for gfx1251 wmma intrinsics (#203558)
Delta  File
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1251.w32.ll
+2 -0  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.gfx1251.w32.ll
+1 -1  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1251.w32.ll
+5 -3  4 files

LLVM/project cae502d llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange inner-preheader-multi-entry-phi.ll

[LoopInterchange] Reject inner preheader PHIs with non-identical incoming values (#203842)

When the outer loop header branches to the inner loop preheader via duplicate edges (e.g. br i1 %c, label %inner.ph, label %inner.ph), the preheader can contain PHI nodes with more than one incoming entry for the same predecessor. The transform eliminates these PHIs by substituting each with its first incoming value, but the existing assert required exactly one incoming value and would fire on such input.

Relax the assert to accept any PHI where all incoming values are identical. A PHI with distinct values for the same predecessor is rejected by the IR verifier, so only the identical-value case can arise in practice.
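
The relaxed condition amounts to "every incoming value equals the first one". A minimal sketch of that check on a stand-in data structure, assuming values are compared by identity (here modelled as integer ids); `PhiSketch` and `allIncomingIdentical` are hypothetical names, not LLVM APIs:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for a PHI node: one value id per incoming edge.
struct PhiSketch {
  std::vector<int> incomingValues;
};

// True if substituting the PHI with its first incoming value is safe,
// that is, if all incoming values are the same value.
inline bool allIncomingIdentical(const PhiSketch &phi) {
  assert(!phi.incomingValues.empty() &&
         "a PHI needs at least one incoming value");
  const int first = phi.incomingValues.front();
  return std::all_of(phi.incomingValues.begin(), phi.incomingValues.end(),
                     [first](int v) { return v == first; });
}
```

The previous single-entry requirement is the special case where the list has exactly one element; the duplicate-edge case adds a second entry carrying the same value.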

Fixes #203466
Delta  File
+72 -0  llvm/test/Transforms/LoopInterchange/inner-preheader-multi-entry-phi.ll
+2 -2  llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+74 -2  2 files

LLVM/project 076695c llvm/test/CodeGen/AMDGPU llvm.log.ll llvm.log10.ll, llvm/test/CodeGen/AMDGPU/GlobalISel sdiv.i64.ll srem.i64.ll

AMDGPU/GlobalISel: Switch some tests to -new-reg-bank-select (#203557)
Delta  File
+1,134 -744  llvm/test/CodeGen/AMDGPU/llvm.log.ll
+1,134 -744  llvm/test/CodeGen/AMDGPU/llvm.log10.ll
+808 -685  llvm/test/CodeGen/AMDGPU/GlobalISel/sdiv.i64.ll
+806 -681  llvm/test/CodeGen/AMDGPU/GlobalISel/srem.i64.ll
+555 -426  llvm/test/CodeGen/AMDGPU/llvm.exp10.ll
+554 -425  llvm/test/CodeGen/AMDGPU/llvm.exp.ll
+4,991 -3,705  7 files not shown
+6,219 -4,969  13 files

LLVM/project 69926b4 llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU scalar-float-sop2.ll

AMDGPU/GlobalISel: Fix regBankLegalize rules for uniform cvt_pkrtz (#203283)
Delta  File
+84 -192  llvm/test/CodeGen/AMDGPU/scalar-float-sop2.ll
+6 -1  llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+90 -193  2 files

LLVM/project 7347524 llvm/test/Transforms/LoopInterchange inner-preheader-single-phi.ll

[LoopInterchange][NFC] autogenerate checks in inner-preheader-single-phi (#203843)
Delta  File
+34 -1  llvm/test/Transforms/LoopInterchange/inner-preheader-single-phi.ll
+34 -1  1 files

LLVM/project f5b589e clang/test/CodeGen/AArch64/sme2-intrinsics acle_sme2_tmop.c, llvm/include/llvm/IR IntrinsicsAArch64.td

[AArch64][SME] Split FP8 FTMOPA intrinsics (#203310)

Introduce separate FP8 FTMOPA intrinsics for ZA16 and ZA32:

    llvm.aarch64.sme.fp8.ftmopa.za16
    llvm.aarch64.sme.fp8.ftmopa.za32

The FP8 FTMOPA forms need to model their FPMR dependency, so they should
not share the same intrinsic definitions as the non-FP8 FTMOPA forms.

Update the Clang SME builtin definitions and AArch64 instruction
patterns to use the new intrinsics, and add AutoUpgrade support for the
previous FP8-shaped llvm.aarch64.sme.ftmopa.* spellings so existing IR
and bitcode continue to work.

This was split out from #154144 because the intrinsic upgrade needs to
be handled separately to avoid breaking existing bitcode.
Delta  File
+21 -0  llvm/test/Bitcode/upgrade-sme2-fp8-intrinsics-tmop.ll
+19 -0  llvm/lib/IR/AutoUpgrade.cpp
+14 -0  llvm/include/llvm/IR/IntrinsicsAArch64.td
+4 -4  clang/test/CodeGen/AArch64/sme2-intrinsics/acle_sme2_tmop.c
+2 -2  llvm/test/CodeGen/AArch64/sme2-intrinsics-tmop.ll
+2 -2  llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+62 -8  2 files not shown
+64 -10  8 files

OpenBSD/src YH1QVTW usr.bin/tmux cmd-split-window.c window.c

   Add -B to new-pane to select the floating pane border.
Version  Delta  File
1.134  +21 -5  usr.bin/tmux/cmd-split-window.c
1.338  +13 -1  usr.bin/tmux/window.c
1.137  +7 -2  usr.bin/tmux/screen-redraw.c
1.1090  +6 -2  usr.bin/tmux/tmux.1
1.217  +2 -2  usr.bin/tmux/options-table.c
1.1352  +2 -1  usr.bin/tmux/tmux.h
+51 -13  6 files

OpenBSD/src DUKcQWB usr.sbin/rpki-client http.c

   Clear last_modified after each response on a persistent HTTP connection

   (In case a later response doesn't contain its own "Last-Modified" header field.)

   Reported by Ties de Kock.

   OK tb@ claudio@
Version  Delta  File
1.104  +4 -2  usr.sbin/rpki-client/http.c
+4 -2  1 files

OPNSense/plugins 23ecd4f dns/rfc2136 pkg-descr Makefile, dns/rfc2136/src/etc/inc/plugins.inc.d rfc2136.inc

dns/rfc2136: sync with master
Delta  File
+10 -0  dns/rfc2136/pkg-descr
+7 -3  dns/rfc2136/src/www/services_rfc2136_edit.php
+4 -1  dns/rfc2136/src/etc/inc/plugins.inc.d/rfc2136.inc
+1 -2  dns/rfc2136/Makefile
+22 -6  4 files

OPNSense/plugins 3ab2cd1 security/stunnel Makefile, security/stunnel/src/opnsense/mvc/app/controllers/OPNsense/Stunnel ServicesController.php

security/stunnel: sync with master
Delta  File
+20 -20  security/stunnel/src/opnsense/scripts/stunnel/generate_certs.php
+5 -12  security/stunnel/src/opnsense/mvc/app/models/OPNsense/Stunnel/Stunnel.xml
+1 -1  security/stunnel/src/opnsense/mvc/app/controllers/OPNsense/Stunnel/ServicesController.php
+1 -1  security/stunnel/Makefile
+27 -34  4 files

OPNSense/plugins 73ff960 www/cache Makefile

www/cache: chicken and egg dependency test
Delta  File
+1 -3  www/cache/Makefile
+1 -3  1 files

FreeBSD/src f930d8a usr.sbin/nfsd pnfs.4

pnfs.4: Fix a typo in the manual page

- s/Wihout/Without/

MFC after:      5 days
Delta  File
+1 -1  usr.sbin/nfsd/pnfs.4
+1 -1  1 files

FreeBSD/ports 4c18ec5 sysutils/mdfried distinfo Makefile.crates

sysutils/mdfried: Update to 0.22.2

- Add option PDF (default ON) to enable support for PDF files
- Add a list of mdfried features to pkg-descr

Reported by:    "github-actions[bot]" <notifications at github.com>
Delta  File
+75 -5  sysutils/mdfried/distinfo
+36 -1  sysutils/mdfried/Makefile.crates
+14 -0  sysutils/mdfried/pkg-descr
+8 -1  sysutils/mdfried/Makefile
+133 -7  4 files

LLVM/project 6a0b6a1 clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-2velem.c v8.2a-neon-intrinsics.c

[CIR][AArch64] Lower NEON laneq FMA builtins (#202337)

Lower additional AArch64 NEON laneq fused multiply-accumulate builtins
in CIR.

This covers:
- `BI__builtin_neon_vfmaq_laneq_v`
  - `vfmaq_laneq_f16`
  - `vfmaq_laneq_f32`
  - `vfmaq_laneq_f64`
- `BI__builtin_neon_vfmad_laneq_f64`
  - `vfmad_laneq_f64`

For `vfmaq_laneq_v`, the lowering bitcasts the operands, splats the
selected lane source, and emits the `llvm.fma` intrinsic with the
operand order matching classic AArch64 CodeGen.

For `vfmad_laneq_f64`, the lowering extracts the selected lane from the
`float64x2_t` source and emits scalar `llvm.fma.f64`.

    [7 lines not shown]
Delta  File
+104 -2  clang/test/CodeGen/AArch64/neon/fused-multiply.c
+0 -76  clang/test/CodeGen/AArch64/neon-2velem.c
+24 -2  clang/test/CodeGen/AArch64/neon/fused-multiple-fullfp16.c
+19 -2  clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+0 -20  clang/test/CodeGen/AArch64/v8.2a-neon-intrinsics.c
+0 -11  clang/test/CodeGen/AArch64/neon-scalar-x-indexed-elem.c
+147 -113  6 files

FreeBSD/src 10b1a35 usr.sbin/mixer mixer.8

mixer.8: Fix a typo in the manual page

- s/thet/the/

MFC after:      5 days
Delta  File
+1 -1  usr.sbin/mixer/mixer.8
+1 -1  1 files

LLVM/project 49375fe clang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CodeGen/AArch64 neon-intrinsics.c

[CIR][AArch64] Lower NEON subtraction intrinsics (#202857)

### summary

Part of: https://github.com/llvm/llvm-project/issues/185382

- Add CIR lowering for the scalar AArch64 NEON subtraction builtins
`vsubd_s64` and `vsubd_u64`.
- Verify that the remaining signed, unsigned, and floating-point
`vsub/vsubq` intrinsics are correctly expanded through arm_neon.h and
emitted as `cir.sub`.
Delta  File
+266 -0  clang/test/CodeGen/AArch64/neon/subtraction.c
+1 -221  clang/test/CodeGen/AArch64/neon-intrinsics.c
+1 -0  clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+268 -221  3 files

FreeBSD/src d9e0452 usr.sbin/jail jail.8

jail.8: Fix two typos in the manual page

- s/Similarily/Similarly/
- s/passtrough/passthrough/

MFC after:      5 days
Delta  File
+2 -2  usr.sbin/jail/jail.8
+2 -2  1 files

OpenBSD/ports nC8n4gP devel/py-wcwidth distinfo Makefile, devel/py-wcwidth/pkg PLIST

   update to py3-wcwidth-0.8.1
Version  Delta  File
1.14  +74 -0  devel/py-wcwidth/pkg/PLIST
1.20  +2 -2  devel/py-wcwidth/distinfo
1.33  +1 -1  devel/py-wcwidth/Makefile
+77 -3  3 files

OpenBSD/ports wQWZBtR devel/py-uharfbuzz Makefile distinfo

   update to py3-uharfbuzz-0.55.0
Version  Delta  File
1.4  +2 -2  devel/py-uharfbuzz/Makefile
1.4  +2 -2  devel/py-uharfbuzz/distinfo
+4 -4  2 files

OPNSense/plugins def849a www/cache Makefile

www/cache: fix this
Delta  File
+1 -1  www/cache/Makefile
+1 -1  1 files

OpenBSD/ports KqG2SPy sysutils/firmware/mwx Makefile, sysutils/firmware/mwx/pkg PLIST

   Add mwx(4) MT7920 firmware

   ok phessler@ claudio@
Version  Delta  File
1.3  +3 -0  sysutils/firmware/mwx/Makefile
1.3  +2 -0  sysutils/firmware/mwx/pkg/PLIST
+5 -0  2 files

OpenBSD/ports cnUweZD audio/p5-Audio-Scan Makefile distinfo, audio/p5-Audio-Scan/patches patch-Scan_xs

   update to p5-Audio-Scan-1.13
Version  Delta  File
1.24  +5 -4  audio/p5-Audio-Scan/Makefile
1.9  +2 -2  audio/p5-Audio-Scan/distinfo
1.7  +1 -1  audio/p5-Audio-Scan/pkg/PLIST
1.2  +0 -0  audio/p5-Audio-Scan/patches/patch-Scan_xs
+8 -7  4 files

OPNSense/plugins 1286db0 security/netbird Makefile pkg-descr, security/netbird/src/opnsense/mvc/app/views/OPNsense/Netbird status.volt

security/netbird: sync with master
Delta  File
+5 -10  security/netbird/src/opnsense/mvc/app/views/OPNsense/Netbird/status.volt
+1 -1  security/netbird/Makefile
+1 -1  security/netbird/pkg-descr
+7 -12  3 files