[MLIR] [Python] Relaxed `list` to `Sequence` in most parameter types (#188543)
Using `Sequence` frees users from the need to cast to `list` in cases
where the underlying API does not really care about the type of the
container.
Note that accepting an `nb::sequence` is marginally slower than
accepting `nb::list` directly, because `__getitem__`, `__len__`, etc.
need to go through an extra layer of indirection. However, I expect the
performance difference to be negligible.
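As a rough illustration of the user-facing effect (a sketch only:
`make_shape` is a hypothetical stand-in, not a function from the MLIR
Python bindings), a parameter annotated as `Sequence` accepts a list, a
tuple, or a range without a cast:

```python
from collections.abc import Sequence


def make_shape(dims: Sequence[int]) -> str:
    """Hypothetical helper: joins dimensions with 'x', e.g. '2x3x4'."""
    # Only len() and iteration are used, so any Sequence works.
    if len(dims) == 0:
        return "scalar"
    return "x".join(str(d) for d in dims)


# A list, a tuple, and a range are all valid Sequence arguments.
from_list = make_shape([2, 3, 4])
from_tuple = make_shape((2, 3, 4))
from_range = make_shape(range(2, 5))
```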
libclc: Update erf
This was originally ported from rocm device libs in
c374cb76f467f01a3f60740703f995a0e1f7a89a. Merge in more
recent changes. Also enables vectorization.
[AMDGPU][True16] Generate correct reg size for reg_sequence16 in wmma src mod select (#187629)
When an f16 from a true16 instruction is passed to a wmma, the src mod
selection tries to pack it into a v4f16 using v_perm_b32. In true16
mode this causes an issue because v_perm_b32 takes a vgpr32. Create a
vgpr_32 for the 16-bit src before passing it to v_perm_b32 in true16
mode so that the register sizes match.
Ideally we should use reg_sequence to replace v_perm_b32 in true16
mode. However, that currently leads to bad code quality: with the
current optimizations it only produces better code when .hi16 is
selected in the vector shuffle. This will be fixed once the register
allocator and coalescer can remove the extra mov.
doc: Fix warnings pkglint points out about CHANGES-2026
To the best of my understanding:
- sqlite3 3.52.0 was withdrawn so the changes were a downgrade,
- remove duplicate lines
- correct committer
- fix some malformed lines / version numbers
[Flang-RT] Support building no library (#187868)
Allow setting both FLANG_RT_ENABLE_SHARED and FLANG_RT_ENABLE_STATIC to
OFF at the same time.
This is extracted out of #171515 to make that PR a little smaller. By
itself it makes little sense: if you build neither the `.a` nor the
`.so`, you are not building anything. But with #171515, the module
files are still built, which allows building the module files without
the library. This is mostly intended for GPGPU targets where building
the library is not always needed, but the module files are.
NAS-140367 / 26.0.0-BETA.2 / Allow the trains to be marked as unstable to make "the system can be upgraded only to the next train" constraint feasible (by themylogin) (by bugclerk) (#18563)
Original PR: https://github.com/truenas/middleware/pull/18541
---------
Co-authored-by: themylogin <themylogin at gmail.com>
NAS-140367 / 27.0.0-BETA.1 / Allow the trains to be marked as unstable to make "the system can be upgraded only to the next train" constraint feasible (by themylogin) (by bugclerk) (#18564)
Original PR: https://github.com/truenas/middleware/pull/18541
---------
Co-authored-by: themylogin <themylogin at gmail.com>
[HLSL][DXIL][SPIRV] QuadReadAcrossY intrinsic support (#187440)
This PR adds QuadReadAcrossY intrinsic support in HLSL with codegen for
both DirectX and SPIRV backends. Resolves
https://github.com/llvm/llvm-project/issues/99176.
- [x] Implement `QuadReadAcrossY` clang builtin,
- [x] Link `QuadReadAcrossY` clang builtin with `hlsl_intrinsics.h`
- [x] Add sema checks for `QuadReadAcrossY` to
`CheckHLSLBuiltinFunctionCall` in `SemaChecking.cpp`
- [x] Add codegen for `QuadReadAcrossY` to `EmitHLSLBuiltinExpr` in
`CGBuiltin.cpp`
- [x] Add codegen tests to
`clang/test/CodeGenHLSL/builtins/QuadReadAcrossY.hlsl`
- [x] Add sema tests to
`clang/test/SemaHLSL/BuiltIns/QuadReadAcrossY-errors.hlsl`
- [x] Create the `int_dx_QuadReadAcrossY` intrinsic in
`IntrinsicsDirectX.td`
- [x] Create the `DXILOpMapping` of `int_dx_QuadReadAcrossY` to `123` in
[9 lines not shown]
[mlir][linalg] Fix crash in greedy fusion when producer is fused into multiple consumers (#188561)
When the same producer is fused into multiple consumers in
fuseLinalgOpsGreedily, the second fusion can't find the original op in
the linalgOps vector (already replaced by the first fusion). llvm::find
returns end(), and writing to *end() caused an out-of-bounds stack write
that corrupted the adjacent OpBuilder's context pointer, leading to a
crash.
Fix by checking that find() returned a valid iterator before updating.
Fixes #122247
Assisted-by: Claude Code
cad/route-rnd: [NEW PORT] Flexible, modular autorouter for Printed Circuit Boards
route-rnd is a Free Software flexible, modular autorouter for Printed Circuit
Boards
- modular, supports different routing algorithms
- fits well in a UNIXy workflow
- the designed-for-simplicity file format makes it easy to interface
- fully CLI, no GUI required
- active development, frequent releases
- Free Software license (GNU GPLv2+)
WWW: http://www.repo.hu/projects/route-rnd/
Approved by: db@, yuri@ (Mentors, implicit)
[MLIR] [Mem2Reg] Quick fix for dominance info invalidation (#188518)
We have identified a problem with DominanceInfo caching in Mem2Reg. It
appears to also be subject to incorrect cache hits when regions are
deleted, causing sporadic bugs that are difficult to test for.
This quick fix invalidates the cached information for every region that
could have become stale. It tries not to be too eager by only
invalidating regions that are exposed to a `finalizePromotion` call.
Ultimately it would be nice to have the ability to move the cached
information from one region to the next, but this is currently not
supported by DominanceInfo.
I was not able to produce a test for this as it is very sporadic. We
would need to be testing for a case where a region is re-allocated at
the same address as a previously erased region. If you know how to make
this sort of behavior reproducible, I would be interested. Otherwise
this change may have to land without a test.
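The failure mode described above can be sketched with a toy model. This
is not MLIR's DominanceInfo: `ToyDominanceCache` and the fixed `ADDRESS`
are hypothetical, and the address reuse is simulated explicitly because
real reuse depends on the allocator.

```python
class ToyDominanceCache:
    """Toy cache keyed by a region's address (not MLIR's DominanceInfo)."""

    def __init__(self):
        self._by_address = {}

    def get(self, address, region_name):
        # Compute on a miss; on a hit, return whatever is stored for the address.
        if address not in self._by_address:
            self._by_address[address] = f"dominance info for {region_name}"
        return self._by_address[address]

    def invalidate(self, address):
        # Drop the entry so a later region at this address is recomputed.
        self._by_address.pop(address, None)


ADDRESS = 0x1000  # simulated address, assumed to be reused after erasure

# Without invalidation: region B reuses A's address and gets A's stale info.
stale_cache = ToyDominanceCache()
stale_cache.get(ADDRESS, "region A")
stale_result = stale_cache.get(ADDRESS, "region B")

# With invalidation when region A is erased: region B is recomputed.
fixed_cache = ToyDominanceCache()
fixed_cache.get(ADDRESS, "region A")
fixed_cache.invalidate(ADDRESS)
fixed_result = fixed_cache.get(ADDRESS, "region B")
```

In this model the stale lookup silently returns region A's entry for
region B, which is the kind of incorrect cache hit the quick fix avoids
by invalidating eagerly.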