LLVM/project c324769llvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU machine-scheduler-rematerialization-scoring.mir machine-scheduler-sink-trivial-remats-attr.mir

Re-apply "[AMDGPU][Scheduler] Scoring system for rematerializations (#175050)"

This re-applies commit f21e3593371c049380f056a539a1601a843df558 along
with the compile-failure fix introduced in
8ab79377740789f6a34fc6f04ee321a39ab73724 before the initial patch was
reverted, plus fixes for the previously observed assert failure.

We were hitting the assert in the HIP Blender due to a combination of
two issues that could happen when rematerializations are being rolled
back.

1. Small changes in slot indices (while preserving instruction order)
   compared to the pre-re-scheduling state meant that we have to
   re-compute live ranges for all register operands of rolled back
   rematerializations. This was not being done before.
2. Re-scheduling can move registers that were rematerialized at
   arbitrary positions in their respective regions while their opcode
   is set to DBG_VALUE, even before their read operands are defined.
   This makes re-scheduling reverts mandatory before rolling back

    [4 lines not shown]
DeltaFile
+507-291llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+523-0llvm/test/CodeGen/AMDGPU/machine-scheduler-rematerialization-scoring.mir
+194-194llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+238-31llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats.mir
+208-51llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+5-5llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-debug.mir
+1,675-5721 files not shown
+1,676-5737 files

LLVM/project 50c01fallvm/lib/CodeGen LiveIntervals.cpp

[CodeGen][NPM] dump slot index info with -debug while running LiveIntervals (#173488)

Matches the legacy pass manager. Tests such as "CodeGen/AMDGPU/liveness.mir"
and "CodeGen/AMDGPU/phys-partial-liveness.mir" use this.
DeltaFile
+4-2llvm/lib/CodeGen/LiveIntervals.cpp
+4-21 files

LLVM/project bf9b8dellvm/lib/Target/AMDGPU AMDGPURegBankLegalizeHelper.cpp AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel smul.ll

[AMDGPU][GlobalISel] Add RegBankLegalize support for G_AMDGPU_S_MUL_* (#176842)

DeltaFile
+194-0llvm/test/CodeGen/AMDGPU/GlobalISel/smul.ll
+19-0llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeHelper.cpp
+5-0llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+2-0llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.h
+220-04 files

LLVM/project 75167f1llvm/tools/lto CMakeLists.txt

[Darwin] CMake warning when building sanitized libLTO on Darwin with system sanitizer library (#176976)

Due to a system security policy, libLTO built with `LLVM_USE_SANITIZER`
and a toolchain (i.e. Xcode) sanitizer library cannot be loaded into the
toolchain `ld`. This only affects Darwin.

This adds a warning when users try to do this, and suggests a workaround
(use just-built sanitizer libraries).

This affected the lldb-cmake-sanitized job:
https://github.com/llvm/llvm-zorg/commits/main/zorg/jenkins/jobs/jobs/lldb-cmake-sanitized

rdar://168502870
DeltaFile
+9-0llvm/tools/lto/CMakeLists.txt
+9-01 files

LLVM/project a304968llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel combine-short-clamp.ll

[AMDGPU][GlobalISel] Add RegBankLegalize rules for SMED3 and CVT_PK_I16_I32 (#176596)

These opcodes are created together for the i64->i16 signed clamp
pattern.
DeltaFile
+62-4llvm/test/CodeGen/AMDGPU/GlobalISel/combine-short-clamp.ll
+11-0llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+73-42 files

LLVM/project ced1c00llvm/lib/Target/AMDGPU AMDGPUTargetMachine.cpp, llvm/test/CodeGen/AMDGPU llc-pipeline-npm.ll

[AMDGPU][NPM] Enable "AMDGPURewriteAGPRCopyMFMAPass" (#173487)

DeltaFile
+420-418llvm/test/CodeGen/AMDGPU/llc-pipeline-npm.ll
+2-0llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
+422-4182 files

LLVM/project e02a55cclang/lib/Basic/Targets WebAssembly.h, clang/test/CodeGenCXX wasm-reftypes-mangle.cpp

[Clang][WebAssembly] Fix crash when using __funcref in C++ code (#176237)

Enable address space map mangling for the WebAssembly target. This fixes
a crash in the Itanium name mangler when trying to mangle types with the
wasm_funcref address space qualifier in C++ mode.

Fixes #176154
DeltaFile
+15-1clang/test/CodeGenCXX/wasm-reftypes-mangle.cpp
+1-0clang/lib/Basic/Targets/WebAssembly.h
+16-12 files

LLVM/project 46ca94allvm/lib/Target/AMDGPU GCNSchedStrategy.cpp GCNSchedStrategy.h, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-debug.mir machine-scheduler-sink-trivial-remats.mir

[AMDGPU][Scheduler] Revert all regions when remat fails to increase occ.

When the rematerialization stage fails to increase occupancy in all
regions, the current implementation only reverts the effect of
re-scheduling in regions in which the increased occupancy target could
not be achieved. However, given that re-scheduling with a higher
occupancy target puts more pressure on the scheduler to achieve lower
maximum RP at the cost of potentially lower ILP as well, region
schedules made with higher occupancy targets are generally less
desirable if the whole function is not able to meet that target.
Therefore, if at least one region cannot reach its target, it makes
sense to revert re-scheduling in all affected regions to go back to
a schedule that was made with a lower occupancy target.

This implements such logic for the rematerialization stage, and adds a
test to showcase that re-scheduling is indeed interrupted/reverted as
soon as a re-scheduled region that does not meet the increased target
occupancy is encountered.
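
The revert-all policy described above can be sketched roughly as follows. This is a minimal illustration, not the code from GCNSchedStrategy.cpp: `Region`, `achievedOcc`, `rescheduled`, and `rescheduleOrRevertAll` are hypothetical names, and "re-scheduling" is reduced to setting a flag.

```cpp
#include <vector>

// Hypothetical stand-in for a scheduling region.
struct Region {
  unsigned achievedOcc; // occupancy this region reaches after re-scheduling
  bool rescheduled;     // true while the higher-occupancy schedule is in place
};

// Re-schedule regions one by one against targetOcc. As soon as one region
// misses the target, stop and revert every region re-scheduled so far.
// Returns true only if all regions met the target.
bool rescheduleOrRevertAll(std::vector<Region> &regions, unsigned targetOcc) {
  for (Region &r : regions) {
    r.rescheduled = true; // stand-in for the actual re-scheduling
    if (r.achievedOcc < targetOcc) {
      for (Region &done : regions)
        done.rescheduled = false; // stand-in for restoring the old schedule
      return false;
    }
  }
  return true;
}
```

With regions reaching occupancies 4, 2, and 4 against a target of 3, the second region interrupts the loop and all three end up with their original schedule.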


    [5 lines not shown]
DeltaFile
+118-0llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-debug.mir
+58-17llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+15-15llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats.mir
+28-1llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+219-334 files

LLVM/project a579d96llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll

[AMDGPU][Scheduler] Simplify scheduling revert logic

When scheduling must be reverted for a region, the current
implementation re-orders non-debug instructions and debug instructions
separately; the former in a first pass and the latter in a second pass
handled by a generic machine scheduler helper whose state is tied to the
current region being scheduled, in turn limiting the revert logic to
only work on the active scheduling region.

This makes the revert logic work in a single pass for all MIs, and
removes the restriction that it works exclusively on the active
scheduling region. The latter enables future use cases such as
reverting scheduling of multiple regions at once.

While the instruction order produced should be identical to what it was
before, small changes in slot indices of re-scheduled MIs yield
different RA decisions and significant test churn.
DeltaFile
+73,637-73,895llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+11,371-11,469llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+6,062-6,086llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+4,853-4,900llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+3,612-3,687llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+2,602-2,619llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+102,137-102,65632 files not shown
+112,262-112,66038 files

FreeNAS/freenas 63f2a5esrc/middlewared/middlewared/plugins pwenc.py filesystem.py, src/middlewared/middlewared/utils pwenc.py

Improve pwenc handling

Add a common pwenc_rename function that resets caches on
config upload parsing and other places where we replace the
pwenc file. This also ensures that we never have a partially-
written pwenc file (for example sent by remote HA node).

When we rename / replace the pwenc file we'll keep a backup
of the old one so that we have the potential to rebuild the
old config if needed.
DeltaFile
+71-13src/middlewared/middlewared/utils/pwenc.py
+58-3src/middlewared/middlewared/plugins/pwenc.py
+27-18src/middlewared/middlewared/plugins/filesystem.py
+19-5src/middlewared/middlewared/plugins/config.py
+9-1src/middlewared/middlewared/plugins/failover.py
+184-405 files

FreeBSD/src 98cb487share/man/man7 simd.7

simd.7: Add Arm MOPS memcpy(), memmove() and memset() to manpage

Fixes:                  41ccf82b29f3 (Use MOPS implementations)
Reviewed by:            ziaee
Sponsored by:           Arm Ltd
Differential Revision:  https://reviews.freebsd.org/D54812
DeltaFile
+5-4share/man/man7/simd.7
+5-41 files

FreeBSD/ports 7748e00net-im/iamb distinfo Makefile.crates

net-im/iamb: Update to 0.0.11
DeltaFile
+1,031-759net-im/iamb/distinfo
+514-378net-im/iamb/Makefile.crates
+1-2net-im/iamb/Makefile
+1,546-1,1393 files

LLVM/project 7e01b33llvm/lib/Target/PowerPC PPCInstrAltivec.td PPCISelLowering.cpp, llvm/test/CodeGen/PowerPC vavg.ll

[PPC] Fix suspicious AltiVec VAVG patterns (#176891)

The existing ((X+Y+1)>>1) patterns didn't correctly handle overflow, like
the VAVG instructions would

Remove the old patterns and correctly mark the altivec VAVGS/VAVGU
patterns as matching the ISD::AVGCEIL opcodes - the generic DAG folds
will handle everything else

I've updated the vavg.ll tests to correctly match ISD::AVGCEILS/U patterns
and added the old tests as negative "overflow" patterns that shouldn't
fold to VAVG instructions

Fixes #174718
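
To illustrate the overflow issue, here is a small sketch on unsigned 8-bit elements. The helper names `avgNarrow` and `avgCeilU` are made up for this example and are not part of the patch. Evaluating (X+Y+1)>>1 in the element type wraps, while a rounding-up average that keeps the extra carry bit does not.

```cpp
#include <cstdint>

// (X + Y + 1) >> 1 evaluated in the 8-bit element type: the sum wraps
// modulo 256 before the shift, so large inputs give the wrong result.
constexpr std::uint8_t avgNarrow(std::uint8_t x, std::uint8_t y) {
  std::uint8_t sum = static_cast<std::uint8_t>(x + y + 1); // truncated here
  return static_cast<std::uint8_t>(sum >> 1);
}

// Rounding-up average computed in a wider type, so the carry bit is kept.
constexpr std::uint8_t avgCeilU(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>((static_cast<std::uint16_t>(x) + y + 1) >> 1);
}

static_assert(avgNarrow(10, 20) == 15, "small inputs agree");
static_assert(avgCeilU(10, 20) == 15, "small inputs agree");
static_assert(avgCeilU(255, 255) == 255, "carry bit is preserved");
static_assert(avgNarrow(255, 255) == 127, "8-bit sum wrapped before the shift");
```

The two functions agree whenever X+Y+1 fits in the element type, which is why the narrow pattern is only a valid match when overflow can be ruled out.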
DeltaFile
+169-106llvm/test/CodeGen/PowerPC/vavg.ll
+12-12llvm/lib/Target/PowerPC/PPCInstrAltivec.td
+2-0llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+183-1183 files

LLVM/project 26f9624llvm/include/llvm/CodeGen BasicBlockSectionsProfileReader.h, llvm/lib/CodeGen CodeGen.cpp MachineUniformityAnalysis.cpp

[LLVM][CodeGen] Remove pass initialization calls from pass constructors (#173061)

- Remove pass initialization calls from pass constructors.
- For some passes, add the initialization to `initializeCodeGen` or
`initializeGlobalISel`.
- Remove redundant initializations from llc and X86 target for some
passes.
DeltaFile
+17-0llvm/lib/CodeGen/CodeGen.cpp
+3-7llvm/lib/CodeGen/GlobalISel/CSEInfo.cpp
+2-8llvm/include/llvm/CodeGen/BasicBlockSectionsProfileReader.h
+2-7llvm/lib/CodeGen/MachineUniformityAnalysis.cpp
+2-7llvm/lib/CodeGen/MachineBlockPlacement.cpp
+2-7llvm/lib/CodeGen/TailDuplication.cpp
+28-36117 files not shown
+155-379123 files

FreeNAS/freenas 857fbddsrc/middlewared/middlewared/plugins pwenc.py config.py, src/middlewared/middlewared/utils pwenc.py

Flake8 fixes
DeltaFile
+9-16src/middlewared/middlewared/utils/pwenc.py
+3-10src/middlewared/middlewared/plugins/pwenc.py
+6-2src/middlewared/middlewared/plugins/config.py
+1-2src/middlewared/middlewared/plugins/filesystem.py
+19-304 files

LLVM/project f41767dllvm/test/Transforms/LoopVectorize/AArch64 replicating-load-store-costs-apple.ll

[LV] Add replicating load/store cost tests for Apple CPUs.

Add dedicated tests to check replicating load/store costs on Apple CPUs.
DeltaFile
+813-0llvm/test/Transforms/LoopVectorize/AArch64/replicating-load-store-costs-apple.ll
+813-01 files

LLVM/project d836261clang/include/clang/Basic BuiltinsX86_64.td BuiltinsX86.td, clang/lib/AST ExprConstant.cpp

[X86][Clang] allow CRC32 intrinsics to be used in constexpr (#173908)

Mostly inspired by https://github.com/llvm/llvm-project/pull/152971

The CRC32 implementation uses the reversed polynomial, which does not
match the formulation in the Intel manual. It can be changed to a
canonical implementation if required (if there is a canonical
implementation we should use, please attach a link).
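
For readers unfamiliar with the reversed-polynomial form, the following is a minimal constexpr sketch of a bitwise CRC-32C step. It is not the code from the patch; `crc32cByte` and `crc32c` are hypothetical names. The constant 0x82F63B78 is the bit-reflected form of the Castagnoli polynomial 0x1EDC6F41, and the initial value and final XOR in `crc32c` are the usual CRC-32C conventions, assumed here only so the result can be compared against the standard check value.

```cpp
#include <cstddef>
#include <cstdint>

// One byte of CRC-32C, processed least-significant bit first with the
// reflected Castagnoli polynomial.
constexpr std::uint32_t crc32cByte(std::uint32_t crc, std::uint8_t byte) {
  crc ^= byte;
  for (int i = 0; i < 8; ++i)
    crc = (crc >> 1) ^ ((crc & 1u) ? 0x82F63B78u : 0u);
  return crc;
}

// Full CRC-32C over a buffer, with the conventional initial value and
// final XOR applied around the per-byte step.
constexpr std::uint32_t crc32c(const char *data, std::size_t len) {
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < len; ++i)
    crc = crc32cByte(crc, static_cast<std::uint8_t>(data[i]));
  return crc ^ 0xFFFFFFFFu;
}

// Standard CRC-32C check value for the ASCII string "123456789".
static_assert(crc32c("123456789", 9) == 0xE3069283u, "CRC-32C check value");
```

Because everything is constexpr, the check runs at compile time, which is the kind of use the commit enables for the intrinsics themselves.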

Closes #168881 
Part of #30794
DeltaFile
+34-0clang/lib/AST/ExprConstant.cpp
+33-0clang/lib/AST/ByteCode/InterpBuiltin.cpp
+10-8clang/lib/Headers/crc32intrin.h
+16-0clang/test/CodeGen/X86/sse42-builtins.c
+1-1clang/include/clang/Basic/BuiltinsX86_64.td
+1-1clang/include/clang/Basic/BuiltinsX86.td
+95-106 files

LLVM/project 9c65795llvm/lib/CodeGen ReachingDefAnalysis.cpp

[ReachingDefAnalysis] Detect entry block correctly. (#176803)

Previously, a block was deemed an entry block if it did not have any
predecessors, which is wrong: the entry block can be a loop header.
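
A toy CFG shows why the predecessor test misclassifies such a block. This is only a sketch: `Block`, `isEntryByPreds`, `isEntryByPosition`, and `makeLoopHeaderEntry` are hypothetical, and treating index 0 as the function's entry block is an assumption of the example, not a description of the actual one-line fix.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CFG node: only the predecessor list matters here.
struct Block {
  std::vector<std::size_t> preds;
};

// Old-style check: "entry" means "has no predecessors".
bool isEntryByPreds(const std::vector<Block> &cfg, std::size_t b) {
  return cfg[b].preds.empty();
}

// Position-based check: the entry is the first block (assumed to be index 0).
bool isEntryByPosition(const std::vector<Block> &, std::size_t b) {
  return b == 0;
}

// Block 0 is the entry and also a loop header: block 1 branches back to it.
std::vector<Block> makeLoopHeaderEntry() {
  return {Block{{1}}, Block{{0}}};
}
```

For this CFG, `isEntryByPreds(cfg, 0)` is false even though block 0 is where execution starts, while `isEntryByPosition(cfg, 0)` is true.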
DeltaFile
+1-1llvm/lib/CodeGen/ReachingDefAnalysis.cpp
+1-11 files

LLVM/project f0a05b6clang/lib/CIR/CodeGen CIRGenExprAggregate.cpp CIRGenExprScalar.cpp, clang/test/CIR/CodeGen builtin-bit-cast.cpp

Address comments from Andy
DeltaFile
+7-8clang/lib/CIR/CodeGen/CIRGenExprAggregate.cpp
+2-8clang/lib/CIR/CodeGen/CIRGenExprScalar.cpp
+4-0clang/test/CIR/CodeGen/builtin-bit-cast.cpp
+13-163 files

FreeBSD/ports 94eb339security/vuxml/vuln 2026.xml

security/vuxml: document gitlab vulnerabilities
DeltaFile
+37-0security/vuxml/vuln/2026.xml
+37-01 files

LLVM/project 03af5b7llvm/lib/Target/AMDGPU AMDGPU.td, llvm/test/CodeGen/AArch64 nontemporal-store.ll nontemporal.ll

rebase after #177094

Created using spr 1.3.8-beta.1
DeltaFile
+1,104-628llvm/test/CodeGen/RISCV/rvv/setcc-fp.ll
+453-975llvm/test/CodeGen/X86/avgceils.ll
+1,197-0llvm/test/CodeGen/AArch64/nontemporal-store.ll
+293-824llvm/lib/Target/AMDGPU/AMDGPU.td
+259-819llvm/test/CodeGen/X86/avgceilu.ll
+0-798llvm/test/CodeGen/AArch64/nontemporal.ll
+3,306-4,044847 files not shown
+33,702-14,408853 files

LLVM/project b1bef31clang/test/CodeGen/LoongArch/lasx builtin-alias.c builtin.c, llvm/lib/Target/AMDGPU AMDGPU.td

rebase

Created using spr 1.3.7
DeltaFile
+1,104-628llvm/test/CodeGen/RISCV/rvv/setcc-fp.ll
+733-733clang/test/CodeGen/LoongArch/lasx/builtin-alias.c
+733-733clang/test/CodeGen/LoongArch/lasx/builtin.c
+293-824llvm/lib/Target/AMDGPU/AMDGPU.td
+386-305llvm/test/CodeGen/RISCV/rvv/vfmadd-constrained-sdnode.ll
+203-298llvm/test/CodeGen/Mips/msa/f16-llvm-ir.ll
+3,452-3,521588 files not shown
+20,452-10,172594 files

LLVM/project b02d239clang/test/CodeGen/LoongArch/lasx builtin.c builtin-alias.c, llvm/lib/Target/AMDGPU AMDGPU.td

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
DeltaFile
+1,104-628llvm/test/CodeGen/RISCV/rvv/setcc-fp.ll
+733-733clang/test/CodeGen/LoongArch/lasx/builtin.c
+733-733clang/test/CodeGen/LoongArch/lasx/builtin-alias.c
+293-824llvm/lib/Target/AMDGPU/AMDGPU.td
+386-305llvm/test/CodeGen/RISCV/rvv/vfmadd-constrained-sdnode.ll
+203-298llvm/test/CodeGen/Mips/msa/f16-llvm-ir.ll
+3,452-3,521588 files not shown
+20,452-10,172594 files

FreeBSD/ports 1a79ea7devel/glab distinfo Makefile

devel/glab: update to 1.81.0

Changes:        https://gitlab.com/gitlab-org/cli/-/releases
DeltaFile
+5-5devel/glab/distinfo
+2-3devel/glab/Makefile
+2-0devel/glab/pkg-plist
+9-83 files

LLVM/project e2d7cd6clang/test/CodeGen/LoongArch/lasx builtin.c builtin-alias.c, clang/test/CodeGen/RISCV riscv32-abi.c bitint.c

[IR] Make dead_on_return attribute optionally sized

This patch makes the dead_on_return parameter attribute optionally require a number
of bytes to be passed in to specify the number of bytes known to be dead
upon function return/unwind. This is aimed at enabling annotating the
this pointer in C++ destructors with dead_on_return in clang. We need
this to handle cases like the following:

```
struct X {
  int n;
  ~X() {
    this[n].n = 0;
  }
};
void f() {
  X xs[] = {42, -1};
}
```

    [13 lines not shown]
DeltaFile
+733-733clang/test/CodeGen/LoongArch/lasx/builtin.c
+733-733clang/test/CodeGen/LoongArch/lasx/builtin-alias.c
+37-37clang/test/CodeGen/RISCV/riscv32-abi.c
+38-16llvm/lib/AsmParser/LLParser.cpp
+24-24clang/test/CodeGen/RISCV/bitint.c
+47-0llvm/include/llvm/IR/Attributes.h
+1,612-1,54399 files not shown
+2,014-1,874105 files

LLVM/project 224f2bdllvm/test/MC/RISCV invalid-attribute.s

[RISC-V] Avoid using an allocated extension name in invalid-attribute.s

The Y extension has been allocated and will no longer trigger an error
once https://github.com/llvm/llvm-project/pull/176870 lands. Use 't' to
test this case instead which is still marked as reserved and as far as I
know is not currently reserved for any future extensions.

Pull Request: https://github.com/llvm/llvm-project/pull/177094
DeltaFile
+5-2llvm/test/MC/RISCV/invalid-attribute.s
+5-21 files

FreeNAS/freenas 6595aeesrc/middlewared/middlewared/plugins pwenc.py filesystem.py, src/middlewared/middlewared/utils pwenc.py

Address review

* move the update DB handling to original location
* raise CallError if filesystem.file_receive is called for pwenc
* add dedicated pwenc.replace method that is only valid if
  called by remote node in truenas ha.
DeltaFile
+49-32src/middlewared/middlewared/plugins/pwenc.py
+15-33src/middlewared/middlewared/plugins/filesystem.py
+38-0src/middlewared/middlewared/plugins/config.py
+32-3src/middlewared/middlewared/utils/pwenc.py
+9-1src/middlewared/middlewared/plugins/failover.py
+143-695 files

LLVM/project 2708629clang/lib/CodeGen/Targets AMDGPU.cpp SPIR.cpp, clang/test/CodeGenHIP device-function-huge-arg-ret.hip

[AMDGPU][SPIRV] Correctly lower huge device function arguments (#176921)

In the ABIInfo implementations for both the SPIRV and AMDGPU targets,
the lowering of arguments too large to fit into registers is currently
prone to integer overflows when determining the number of needed
registers for the arguments. This causes arguments so large that they
need more registers than an `unsigned` can represent to look like they
fit into the available registers. To avoid this, the function for
determining the required number of registers is changed to return a
64-bit unsigned integer value instead.
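
The failure mode can be reproduced in isolation. The sketch below is not the ABIInfo code: `numRegsNarrow`, `numRegsWide`, `fitsInRegs`, the 32-bit register width, and the budget of 16 registers are all assumptions made for illustration.

```cpp
#include <cstdint>

// Register count truncated to 32 bits, as a narrow return type would do.
constexpr std::uint32_t numRegsNarrow(std::uint64_t sizeInBits) {
  return static_cast<std::uint32_t>((sizeInBits + 31) / 32);
}

// Same computation, but the count is kept as a 64-bit value.
constexpr std::uint64_t numRegsWide(std::uint64_t sizeInBits) {
  return (sizeInBits + 31) / 32;
}

// Does the argument fit into the remaining register budget?
constexpr bool fitsInRegs(std::uint64_t needed, std::uint32_t available) {
  return needed <= available;
}

// An argument of 2^37 bits needs 2^32 registers, which truncates to 0.
constexpr std::uint64_t HugeBits = std::uint64_t{1} << 37;

static_assert(numRegsNarrow(HugeBits) == 0, "count wrapped to zero");
static_assert(fitsInRegs(numRegsNarrow(HugeBits), 16), "wrongly looks like it fits");
static_assert(!fitsInRegs(numRegsWide(HugeBits), 16), "64-bit count is rejected");
```

Widening the return type is enough here because the size in bits is already a 64-bit quantity; only the narrowing step loses information.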

Note that the SPIR-V target currently trips the verifier due to a check
that arguments passed by value don't exceed the representable size. This
also affects other targets, such as x86 and is outside the scope of
these changes.
See https://github.com/llvm/llvm-project/issues/118207.

---------

Signed-off-by: Steffen Holst Larsen <HolstLarsen.Steffen at amd.com>
Co-authored-by: Steffen Holst Larsen <HolstLarsen.Steffen at amd.com>
DeltaFile
+34-0clang/test/CodeGenHIP/device-function-huge-arg-ret.hip
+8-8clang/lib/CodeGen/Targets/AMDGPU.cpp
+8-8clang/lib/CodeGen/Targets/SPIR.cpp
+50-163 files

LLVM/project be6967cllvm/include/llvm/CodeGenTypes MachineValueType.h, llvm/include/llvm/IR DerivedTypes.h Type.h

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
DeltaFile
+35-0llvm/include/llvm/IR/DerivedTypes.h
+26-5llvm/lib/IR/Type.cpp
+17-10llvm/include/llvm/IR/Type.h
+11-0llvm/include/llvm/CodeGenTypes/MachineValueType.h
+6-0llvm/lib/CodeGen/ValueTypes.cpp
+3-0llvm/lib/IR/AsmWriter.cpp
+98-157 files not shown
+109-1513 files

FreeNAS/freenas 4dc760atests/api2 test_300_nfs.py

Improve resiliency in the tests.
DeltaFile
+169-56tests/api2/test_300_nfs.py
+169-561 files