LLVM/project 540ea54llvm/lib/Transforms/Vectorize VPlanTransforms.cpp, llvm/test/Transforms/LoopVectorize pr128062-interleaved-accesses-narrow-group.ll

Revert "[VPlan] Extend interleave-group-narrowing to WidenCast" (#186072)

This reverts commit bd5f9384 (#183204) to buy us time to investigate an
AArch64 SVE-fixed-length buildbot miscompile.

Ref: https://lab.llvm.org/buildbot/#/builders/143/builds/14601
DeltaFile
+20-20llvm/test/Transforms/LoopVectorize/AArch64/transform-narrow-interleave-to-widen-memory-with-wide-ops-and-casts.ll
+26-3llvm/test/Transforms/LoopVectorize/pr128062-interleaved-accesses-narrow-group.ll
+8-9llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+54-323 files

LLVM/project 69e0768flang/lib/Parser openmp-parsers.cpp

Move deletions to the beginning of the file
DeltaFile
+13-11flang/lib/Parser/openmp-parsers.cpp
+13-111 files

LLVM/project efd20a3llvm/test/CodeGen/AMDGPU maximumnum.ll minimumnum.ll

[AMDGPU] Codegen for min/max instructions for gfx1170 (#185625)

gfx1170 does not have s_minimum/maximum_f16/f32 instructions, so a new
feature `SALUMinimumMaximumInsts` is added for gfx12+ subtargets.
DeltaFile
+1,240-0llvm/test/CodeGen/AMDGPU/maximumnum.ll
+1,204-0llvm/test/CodeGen/AMDGPU/minimumnum.ll
+811-0llvm/test/CodeGen/AMDGPU/fminimum3.ll
+811-0llvm/test/CodeGen/AMDGPU/fmaximum3.ll
+678-0llvm/test/CodeGen/AMDGPU/vector-reduce-fmax.ll
+678-0llvm/test/CodeGen/AMDGPU/vector-reduce-fmin.ll
+5,422-024 files not shown
+10,350-31530 files

LLVM/project a372ecalibclc/clc/lib/generic/math clc_maxmag.inc clc_maxmag.cl

libclc: Improve minmag and maxmag (#186092)

Gives slightly better codegen.
DeltaFile
+4-7libclc/clc/lib/generic/math/clc_maxmag.inc
+2-8libclc/clc/lib/generic/math/clc_maxmag.cl
+2-8libclc/clc/lib/generic/math/clc_minmag.cl
+4-6libclc/clc/lib/generic/math/clc_minmag.inc
+12-294 files

NetBSD/pkgsrc oD0ZGXlmath/R Makefile PLIST

   (math/R) Regular practice on PLIST.Darwin, thanks jperkin@
VersionDeltaFile
1.281+1-6math/R/Makefile
1.45+1-2math/R/PLIST
1.15+2-1math/R/PLIST.Darwin
+4-93 files

LLVM/project 5fc04b9flang/test/Semantics/OpenMP resolve07.f90

Use test_symbols.py instead of test_errors.py
DeltaFile
+20-20flang/test/Semantics/OpenMP/resolve07.f90
+20-201 files

LLVM/project 0fb8f7fllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/X86 sdiv_fix_sat.ll udiv_fix_sat.ll

[DAG] Fold away identity FSHL and FSHR patterns (#185667)

Fold away identity FSHL and FSHR patterns

This came up in #185175; this seems to be the cleanest way to get rid of
the pattern.

Alive2 proofs:
`fshl(lshr(x, amnt), shl(c, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/AEzthY
`fshl(lshr(x, amnt), fshl(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/oDpaqF
`fshl(lshr(x, amnt), fshr(x, _, amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/aCxQch
`fshl(fshr(_, x, amnt), shl(c, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/89NQME
`fshl(fshr(_, x, amnt), fshl(x, _, BW - amnt), amnt) -> x`:
✓ https://alive2.llvm.org/ce/z/KdR3Mp
`fshl(fshr(_, x, amnt), fshr(x, _, amnt), amnt) -> x`:

    [24 lines not shown]
DeltaFile
+75-86llvm/test/CodeGen/X86/sdiv_fix_sat.ll
+13-27llvm/test/CodeGen/X86/udiv_fix_sat.ll
+11-25llvm/test/CodeGen/X86/load-local-v3i129.ll
+26-0llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+125-1384 files
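
The first and third identities listed in the FSHL/FSHR commit above can be checked by brute force at a small bit width. The sketch below is not the DAGCombiner code. It models `fshl` and `fshr` on 8-bit integers following the LLVM funnel-shift semantics, and it assumes that the `c` operand in the first pattern is the same value `x` and that `amnt` lies strictly between 0 and BW. The helper names and the 8-bit width are choices made for illustration only.

```python
BW = 8                     # bit width used for the exhaustive check (assumption)
MASK = (1 << BW) - 1


def fshl(a, b, s):
    """Funnel shift left: top BW bits of the concatenation a:b shifted left by s."""
    s %= BW
    return ((((a << BW) | b) << s) >> BW) & MASK


def fshr(a, b, s):
    """Funnel shift right: low BW bits of the concatenation a:b shifted right by s."""
    s %= BW
    return (((a << BW) | b) >> s) & MASK


def identities_hold():
    """Check two of the listed folds for every 8-bit x and every 0 < amnt < BW."""
    for x in range(1 << BW):
        for amnt in range(1, BW):
            hi = x >> amnt                      # lshr(x, amnt)
            lo = (x << (BW - amnt)) & MASK      # shl(x, BW - amnt)
            if fshl(hi, lo, amnt) != x:
                return False
            # The second fshr operand is a don't-care ("_" in the commit message).
            for dont_care in (0x00, 0x5A, MASK):
                if fshl(hi, fshr(x, dont_care, amnt), amnt) != x:
                    return False
    return True
```

Calling `identities_hold()` walks all 256 values of `x`, so it is only a sanity check at one width, whereas the Alive2 links in the commit message cover the general case.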

LLVM/project bb733c6libclc/clc/lib/generic/math clc_maxmag.inc clc_maxmag.cl

libclc: Improve minmag and maxmag

Gives slightly better codegen.
DeltaFile
+4-7libclc/clc/lib/generic/math/clc_maxmag.inc
+2-8libclc/clc/lib/generic/math/clc_maxmag.cl
+2-8libclc/clc/lib/generic/math/clc_minmag.cl
+4-6libclc/clc/lib/generic/math/clc_minmag.inc
+12-294 files

LLVM/project f3752dcclang/lib/CodeGen CGHLSLBuiltins.cpp, clang/lib/Sema SemaHLSL.cpp

[HLSL] Implement Texture2D::Load methods and builtin (#185708)

Implements the Texture2D::Load methods. A new HLSL builtin is added to
implement the methods. The HLSL builtin is lowered to the
resource_load_level intrinsic.

We chose to have a single operand hold both the coordinate and the
level in the builtin, as is done in the Load method itself. This makes
the external sema source simpler, because it is easier to split the
vector during codegen than in sema.

Assisted-by: Gemini
DeltaFile
+238-0clang/test/CodeGenHLSL/resources/Texture2D-Load.hlsl
+53-14clang/lib/CodeGen/CGHLSLBuiltins.cpp
+46-0clang/lib/Sema/SemaHLSL.cpp
+43-0clang/test/SemaHLSL/Texture2D-Load-errors.hlsl
+34-0clang/test/AST/HLSL/Texture2D-vector-AST.hlsl
+34-0clang/test/AST/HLSL/Texture2D-scalar-AST.hlsl
+448-147 files not shown
+490-1913 files

LLVM/project 1f66208flang/lib/Semantics resolve-directives.cpp

Update comment
DeltaFile
+1-1flang/lib/Semantics/resolve-directives.cpp
+1-11 files

LLVM/project fc5ca83mlir/include/mlir/Dialect/LLVMIR LLVMOps.td, mlir/lib/Dialect/LLVMIR/IR LLVMDialect.cpp

[mlir][LLVM] Add support for `ptrtoaddr` (#185104)

The `ptrtoaddr` op is akin to `ptrtoint` with some important
differences:
* It does not capture the provenance of the pointer, meaning the pointer
does not escape and a subsequent `inttoptr` does not produce a legal
pointer. LLVM can then assume the pointer never escaped, which helps
alias analysis.
* It does not support arbitrary integer types, but only exactly the
integer type that is equal in width to the pointer type as specified by
the data layout.

This PR adds the op to the MLIR dialect and adds the corresponding
verification for the datalayout property.
DeltaFile
+44-0mlir/include/mlir/Dialect/LLVMIR/LLVMOps.td
+18-0mlir/test/Dialect/LLVMIR/invalid.mlir
+18-0mlir/test/Target/LLVMIR/llvmir.mlir
+16-0mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp
+6-1mlir/test/Target/LLVMIR/Import/instructions.ll
+4-0mlir/test/Dialect/LLVMIR/roundtrip.mlir
+106-16 files

LLVM/project 3a8eabebolt/include/bolt/Passes LongJmp.h, bolt/test/AArch64 compare-and-branch-reorder-blocks.S

[BOLT][AArch64] Support block reordering beyond 1KB for FEAT_CMPBR. (#185443)

Currently LongJmpPass::relaxLocalBranches bails early if the estimated
size of a binary function is less than 32KB, assuming that the shortest
branches are 16 bits. FEAT_CMPBR branches are shorter than that, so the
fixup value for the cold branch target may go out of range if the
function is larger than 1KB.

I am decreasing ShortestJumpSpan from 32KB to 1KB, since FEAT_CMPBR
branches are 11 bits.
DeltaFile
+17-6bolt/test/AArch64/compare-and-branch-reorder-blocks.S
+1-1bolt/include/bolt/Passes/LongJmp.h
+18-72 files
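
The relationship between offset width and reach in the LongJmp commit above can be reproduced with a little arithmetic. This is a sketch that assumes the quoted bit counts describe signed byte offsets. `span_bytes` and `fits` are hypothetical helpers, not BOLT functions.

```python
def span_bytes(offset_bits):
    """Distance in bytes reachable in one direction by a signed offset of this width."""
    return 1 << (offset_bits - 1)


def fits(offset, offset_bits):
    """True if a signed byte offset is representable in offset_bits bits."""
    limit = span_bytes(offset_bits)
    return -limit <= offset < limit


KB = 1024
old_span = span_bytes(16)   # 16-bit offsets: 32 * KB
new_span = span_bytes(11)   # 11-bit offsets: 1 * KB
```

Under that assumption, a function estimated at 2KB passes the old 32KB early-exit check even though an 11-bit offset cannot span it, which matches the motivation given for lowering ShortestJumpSpan.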

LLVM/project 753db4blibc/utils/hdrgen/tests/expected_output test_small_proxy.h

[libc] Fix hdrgen test test_small_proxy.h (#185890)

The expected output was outdated as it did not contain the macro
definitions.

This patch fixes the issue.
DeltaFile
+6-0libc/utils/hdrgen/tests/expected_output/test_small_proxy.h
+6-01 files

LLVM/project 6fc7b3fllvm/test/CodeGen/X86 2011-12-15-vec_shift.ll

[X86] 2011-12-15-vec_shift.ll - regenerate test checks (#186077)
DeltaFile
+58-10llvm/test/CodeGen/X86/2011-12-15-vec_shift.ll
+58-101 files

LLVM/project 85e542flibclc/clc/lib/generic/math clc_fdim.cl clc_fdim.inc

libclc: Improve fdim handling (#186085)

The maxnum is somewhat overconstraining. This gives slightly
better codegen and avoids the noise from the select and convert,
and saves the cost of materializing the nan literal.
DeltaFile
+2-8libclc/clc/lib/generic/math/clc_fdim.cl
+1-3libclc/clc/lib/generic/math/clc_fdim.inc
+3-112 files

LLVM/project ea86511libclc/clc/include/clc/math clc_nextup.h clc_nextdown.h, libclc/clc/lib/generic/math clc_nextafter.cl clc_nextafter.inc

libclc: Replace nextafter implementation (#186082)

Use a more straightforward version which allows
optimizations to delete the edge case checks, and also
codegens better. Implement in terms of new nextup and nextdown
helper functions, which are IEEE functions, and usable in other
functions.
DeltaFile
+7-72libclc/clc/lib/generic/math/clc_nextafter.cl
+18-0libclc/clc/include/clc/math/clc_nextup.h
+18-0libclc/clc/include/clc/math/clc_nextdown.h
+17-0libclc/clc/lib/generic/math/clc_nextafter.inc
+17-0libclc/clc/lib/generic/math/clc_nextdown.inc
+15-0libclc/clc/lib/generic/math/clc_nextdown.cl
+92-725 files not shown
+132-7211 files
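
One way to picture the structure described in the nextafter commit above is a `nextafter` that only dispatches to `nextup` and `nextdown`. The sketch below works on Python doubles by stepping the IEEE-754 bit pattern. It is an illustration under that assumption, not the libclc OpenCL source, and the function bodies were written for this example.

```python
import math
import struct


def nextup(x):
    """Smallest double greater than x (IEEE-754 nextUp)."""
    if math.isnan(x) or x == math.inf:
        return x
    if x == 0.0:                       # covers +0.0 and -0.0
        return 5e-324                  # smallest positive subnormal double
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    # Positive values grow with the bit pattern; negative ones shrink in magnitude.
    bits += 1 if x > 0 else -1
    return struct.unpack("<d", struct.pack("<q", bits))[0]


def nextdown(x):
    """Largest double less than x, expressed through nextup."""
    return -nextup(-x)


def nextafter(x, y):
    """Next representable double after x in the direction of y."""
    if math.isnan(x) or math.isnan(y):
        return x + y                   # propagate NaN
    if x == y:
        return y
    return nextup(x) if y > x else nextdown(x)
```

With this shape the edge cases (zero, infinity, NaN) live in one helper, and `nextafter` itself reduces to a comparison and a call.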

LLVM/project 3c7f70blibclc/clc/lib/generic/math clc_fmod.cl

libclc: Replace fmod implementation with elementwise builtin (#186083)

This corresponds to frem, which for whatever reason is a first-class
IR instruction. The backend has a heroic freestanding
implementation that should be nearly identical to what was here.
DeltaFile
+3-180libclc/clc/lib/generic/math/clc_fmod.cl
+3-1801 files

LLVM/project a5c35cblibclc/clc/include/clc/math clc_nextup.h clc_nextdown.h, libclc/clc/lib/generic/math clc_nextafter.cl clc_nextdown.inc

libclc: Replace nextafter implementation

Use a more straightforward version which allows
optimizations to delete the edge case checks, and also
codegens better. Implement in terms of new nextup and nextdown
helper functions, which are IEEE functions, and usable in other
functions.
DeltaFile
+7-72libclc/clc/lib/generic/math/clc_nextafter.cl
+18-0libclc/clc/include/clc/math/clc_nextup.h
+18-0libclc/clc/include/clc/math/clc_nextdown.h
+17-0libclc/clc/lib/generic/math/clc_nextdown.inc
+17-0libclc/clc/lib/generic/math/clc_nextafter.inc
+15-0libclc/clc/lib/generic/math/clc_nextdown.cl
+92-725 files not shown
+132-7211 files

FreeBSD/ports 213b265math/R-cran-units distinfo Makefile

math/R-cran-units: Update to 1.0-1

ChangeLog: https://cran.r-project.org/web/packages/units/news/news.html
DeltaFile
+3-3math/R-cran-units/distinfo
+2-2math/R-cran-units/Makefile
+5-52 files

NetBSD/pkgsrc LcHTey5doc CHANGES-2026

   Updated devel/py-scikit-build, graphics/py-imageio
VersionDeltaFile
1.1703+3-1doc/CHANGES-2026
+3-11 files

NetBSD/pkgsrc 2UzH2gigraphics/py-imageio distinfo Makefile

   py-imageio: updated to 2.37.3

   2.37.3

   Bug

   Update dependencies

   Maint

   Bump psf/black (dev dependency) to fix security vulnerability
VersionDeltaFile
1.18+4-4graphics/py-imageio/distinfo
1.28+2-2graphics/py-imageio/Makefile
+6-62 files

LLVM/project 11ac230libclc/clc/lib/generic/math clc_fdim.cl clc_fdim.inc

libclc: Improve fdim handling

The maxnum is somewhat overconstraining. This gives slightly
better codegen and avoids the noise from the select and convert,
and saves the cost of materializing the nan literal.
DeltaFile
+2-8libclc/clc/lib/generic/math/clc_fdim.cl
+1-3libclc/clc/lib/generic/math/clc_fdim.inc
+3-112 files

NetBSD/pkgsrc TeOP5mNdevel/py-scikit-build distinfo Makefile, devel/py-scikit-build/patches patch-skbuild_platform__specifics_platform__factory.py

   py-scikit-build: updated to 0.19.0

   0.19.0

   This release updates for changes in setuptools and CMake 4, and drops Python 3.7.

   Features

   * Drop Python 3.7 in :pr:`1134`

   Bug fixes

   * Update for newer setuptools in :pr:`1120`
   * ``setuptools_wrap.py``: parse ``CMAKE_ARGS`` with ``shlex.split`` like elsewhere by :user:`haampie` in :pr:`1126`
   * Drop ``dry-run`` (removed in setuptools) in :pr:`1166`
   * Ensure generic f2py executable is looked up first by :user:`smiet` in :pr:`1111`

   Testing


    [6 lines not shown]
VersionDeltaFile
1.8+5-5devel/py-scikit-build/distinfo
1.9+3-2devel/py-scikit-build/Makefile
1.4+2-2devel/py-scikit-build/patches/patch-skbuild_platform__specifics_platform__factory.py
+10-93 files
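
The `CMAKE_ARGS` fix in the release notes above comes down to shell-style tokenization. Below is a small sketch of the difference between `str.split` and `shlex.split`. The argument string is a made-up example, and this is not the scikit-build code itself.

```python
import shlex

# Hypothetical CMAKE_ARGS value containing a quoted argument with a space.
cmake_args = '-DCMAKE_BUILD_TYPE=Release -DGREETING="hello world" -G Ninja'

# Plain whitespace splitting breaks the quoted value into two tokens.
naive = cmake_args.split()

# shlex.split honours shell quoting and keeps the value together.
parsed = shlex.split(cmake_args)
```

Here `naive` has five tokens with stray quote characters, while `parsed` has four, with `-DGREETING=hello world` kept as a single argument.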

LLVM/project c7dccd5clang/test/CodeGen arm-bf16-getset-intrinsics.c, clang/test/CodeGen/AArch64 bf16-getset-intrinsics.c

[Clang][AArch64] Remove duplicate CodeGen test for bf16 get/set intrinsics

The following test files contain identical test bodies (aside from the
RUN lines):

  * clang/test/CodeGen/AArch64/bf16-getset-intrinsics.c
  * clang/test/CodeGen/arm-bf16-getset-intrinsics.c

The differences in the RUN lines do not appear to be relevant for the
tested functionality. This change keeps a single test file and
simplifies its RUN lines to match the generic style used in
clang/test/CodeGen/AArch64/neon.

This also moves toward unifying and reusing RUN lines across tests.
DeltaFile
+0-175clang/test/CodeGen/arm-bf16-getset-intrinsics.c
+1-2clang/test/CodeGen/AArch64/bf16-getset-intrinsics.c
+1-1772 files

LLVM/project ab6bb1bcompiler-rt/lib/builtins/arm addsf3.S, compiler-rt/test/builtins/Unit addsf3_test.c

compiler-rt/arm: Check for overflow when adding float denorms (#185245)

When the sum of two subnormal values is not itself subnormal, we need to
set the exponent to one.

Test case:

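#include <stdio.h>

/* x is subnormal in single precision, while x + x is a normal number. */
static volatile float x = 0x1.362b4p-127;
/* x2 holds the expected sum as a compile-time constant. */
static volatile float x2 = 0x1.362b4p-127 * 2;

int
main (void)
{
        printf("x %a x2 %a x + x %a\n", x, x2, x + x);
        /* Exit status 0 when the runtime addition matches the constant. */
        return x2 == x + x ? 0 : 1;
}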

Signed-off-by: Keith Packard <keithp at keithp.com>
DeltaFile
+96-0compiler-rt/test/builtins/Unit/addsf3_test.c
+9-0compiler-rt/lib/builtins/arm/addsf3.S
+105-02 files
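
The test case in the addsf3 commit above can be inspected at the bit level. The following sketch uses `struct` to view single-precision encodings. It only illustrates why the result needs an exponent field of one and does not model the ARM assembly. The helper names are hypothetical.

```python
import struct


def f32_bits(value):
    """Return the IEEE-754 single-precision encoding of value as an integer."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def exponent_field(bits):
    """Extract the 8-bit biased exponent from a single-precision encoding."""
    return (bits >> 23) & 0xFF


x = float.fromhex("0x1.362b4p-127")   # the subnormal input from the commit message
x_bits = f32_bits(x)                  # exponent field 0, fraction holds the value
sum_bits = f32_bits(x + x)            # the doubled value is a normal number
```

For this input the fraction of `x` is at least half of the subnormal range, so doubling it carries into bit 23 and the correctly rounded sum has a biased exponent of 1 rather than 0.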

FreeBSD/ports af6c87edevel/R-cran-listenv distinfo Makefile

devel/R-cran-listenv: Update to 0.10.1

ChangeLog: https://cran.r-project.org/web/packages/listenv/news/news.html
DeltaFile
+3-3devel/R-cran-listenv/distinfo
+1-1devel/R-cran-listenv/Makefile
+4-42 files

LLVM/project 9a8147bllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/test/CodeGen/AArch64 neon-abd.ll

Revert "[SDAG] (abs (add nsw a, -b)) -> (abds a, b)" (#175801) (#186068)

Reverts llvm/llvm-project#175801 while the #185467 miscompilation is being investigated
DeltaFile
+0-54llvm/test/CodeGen/AArch64/neon-abd.ll
+3-36llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+3-902 files

LLVM/project c611f7dlibclc/clc/lib/generic/math clc_fmod.cl

libclc: Replace fmod implementation with elementwise builtin

This corresponds to frem, which for whatever reason is a first-class
IR instruction. The backend has a heroic freestanding
implementation that should be nearly identical to what was here.
DeltaFile
+3-180libclc/clc/lib/generic/math/clc_fmod.cl
+3-1801 files

NetBSD/pkgsrc-wip 1312aafpy-copyparty TODO

py-copyparty: Add references to CVE-2026-3210[89]
DeltaFile
+1-1py-copyparty/TODO
+1-11 files

OpenBSD/src jA2kj7tsys/dev/pci if_bnxt.c

   Add support for BCM575xx devices, variously known as Thor or P5.
   There are a few significant differences to earlier devices.

   The nic now requires some host memory to use as backing store for its queues,
   and for now we're overallocating to some extent.  It's not a noticeable amount
   of memory for a system with one of these nics in it, so this isn't a huge
   concern.

   P5 devices have notification queues to act as an indirection between tx/rx
   completion rings and msi-x vectors.  We set up one per queue and statically
   map them to msi-x vectors in turn according to the intrmap.

   The doorbell structures are now 64 bits, and all written to through the same
   memory address.

   Ring groups are not used, so the functions to allocate and free ring groups
   don't do anything for P5 devices; instead, rings are directly associated
   with each other on creation, and aggregation rings are identified by a
   different ring type.

    [3 lines not shown]
VersionDeltaFile
1.68+741-97sys/dev/pci/if_bnxt.c
+741-971 files