LLVM/project 2b839f6 llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchMachineFunctionInfo.h, llvm/test/CodeGen/LoongArch musttail.ll tail-calls.ll

[LoongArch] Enable tail calls for sret and byval functions (#168506)

Allow tail calls for functions returning via sret when the caller's sret
pointer can be reused. Also support tail calls for byval arguments.
    
The previous restriction requiring exact match of caller and callee
arguments is relaxed: tail calls are allowed as long as the callee does
not use more stack space than the caller.
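
The relaxed rule can be pictured with a small model. This is only a sketch and not the code in LoongArchISelLowering.cpp: the `CallInfo` struct, its fields, and `mayTailCall` are hypothetical names, and the real lowering checks more conditions than the two described in this message.

```cpp
#include <cstdint>

// Hypothetical summary of how a function uses the stack for its arguments.
struct CallInfo {
  std::uint64_t stackBytes; // stack space used for arguments (assumed unit: bytes)
  bool returnsViaSRet;      // result is written through a hidden sret pointer
};

// Toy model of the two conditions named in the commit message.
bool mayTailCall(const CallInfo &caller, const CallInfo &callee,
                 bool callerSRetPointerReusable) {
  // The callee must not need more stack space than the caller has.
  if (callee.stackBytes > caller.stackBytes)
    return false;
  // An sret callee needs the caller's own sret pointer to write into.
  if (callee.returnsViaSRet &&
      !(caller.returnsViaSRet && callerSRetPointerReusable))
    return false;
  return true;
}
```

Under this model a caller with 16 bytes of stack arguments may tail-call a callee that needs 8, but not the other way round.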

Fixes #168152
Delta     File
+566 -0   llvm/test/CodeGen/LoongArch/musttail.ll
+79 -25   llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+14 -0    llvm/lib/Target/LoongArch/LoongArchMachineFunctionInfo.h
+4 -9     llvm/test/CodeGen/LoongArch/tail-calls.ll
+663 -34  4 files

LLVM/project 7cbc8a4 llvm/unittests/IR MetadataTest.cpp

clang format
Delta   File
+6 -3   llvm/unittests/IR/MetadataTest.cpp
+6 -3   1 files

LLVM/project c282cb5 llvm/unittests/IR MetadataTest.cpp

Revert "Drop the summation unittest since it's already covered by the gvn lit tests"

This reverts commit fb0d7df21794ab50eaab4cb6e249679089a5a501.
Delta   File
+26 -0  llvm/unittests/IR/MetadataTest.cpp
+26 -0  1 files

NetBSD/src TkQMElh libexec/ld.elf_so xmalloc.c

   Simplify the aligned malloc/free code.
Version  Delta    File
1.25     +11 -12  libexec/ld.elf_so/xmalloc.c
         +11 -12  1 files

LLVM/project d9c523e clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
Delta   File
+84 -0  clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8 -0   clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4 -0   clang/include/clang/Basic/BuiltinsAMDGPU.def
+96 -0  3 files

LLVM/project 4885673 llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation for wave reduction intrinsics
Delta    File
+118 -2  llvm/docs/AMDGPUUsage.rst
+118 -2  1 files

LLVM/project 8e64fd7 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use enum values for src modifiers.
Delta   File
+8 -8   llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+8 -8   1 files

LLVM/project a0e0775 llvm/lib/Target/RISCV RISCVInstrInfoP.td

[RISCV] Add isCommutable=1 to some binary P extension instructions. (#175692)

This allows MachineCSE to commute these instructions if it would allow
CSE.
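
Why a commutable flag helps CSE can be shown with a toy model. This is a sketch under stated assumptions, not MachineCSE itself: `Inst` and `countRedundant` are hypothetical, and the sketch canonicalizes operand order up front instead of trying a commuted form on a failed lookup.

```cpp
#include <map>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical three-address instruction: dst = opcode(lhs, rhs).
struct Inst {
  int opcode;
  int dst, lhs, rhs;
  bool commutable; // stand-in for the isCommutable flag
};

// Toy CSE: counts instructions that recompute an earlier expression.
// Operands of commutable instructions are sorted first, so "op a, b"
// and "op b, a" produce the same lookup key.
int countRedundant(const std::vector<Inst> &insts) {
  std::map<std::tuple<int, int, int>, int> seen; // expression -> first dst
  int redundant = 0;
  for (Inst i : insts) {
    if (i.commutable && i.lhs > i.rhs)
      std::swap(i.lhs, i.rhs);
    auto key = std::make_tuple(i.opcode, i.lhs, i.rhs);
    if (!seen.emplace(key, i.dst).second)
      ++redundant; // an equivalent expression was already computed
  }
  return redundant;
}
```

With the flag set, the pair `op r1, r2` and `op r2, r1` counts as one redundant instruction; with it cleared, the two are treated as unrelated.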
Delta     File
+104 -92  llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+104 -92  1 files

LLVM/project 9642bc5 .github/workflows prune-unused-branches.py

feedback

Created using spr 1.3.7
Delta   File
+0 -4   .github/workflows/prune-unused-branches.py
+0 -4   1 files

FreeNAS/freenas 9596b48 src/middlewared/middlewared/plugins sysdataset.py

Remove caching from sysdataset plugin

Retrieving the underlying dataset name for the sysdataset path
now takes only two syscalls (statx + statmount) instead of reading
the entire /proc/self/mountinfo contents, so this extra caching
is actually hurting us now.
Delta   File
+4 -49  src/middlewared/middlewared/plugins/sysdataset.py
+4 -49  1 files

FreeNAS/freenas 84878f7 src/middlewared/middlewared/plugins sysdataset.py, src/middlewared/middlewared/plugins/system_dataset mount.py

Rework system dataset migration to be less bad

This commit reworks how we migrate the system datasets so that
it's somewhat less racy and uses kernel APIs for this.

On migration:
1. build new mount tree in middleware run dir
2. sync data from old to new
3. move new under old
4. move old to middleware rundir
5. restart services
6. cleanup
Delta      File
+281 -363  src/middlewared/middlewared/plugins/sysdataset.py
+76 -1     src/middlewared/middlewared/utils/mount.py
+67 -0     src/middlewared/middlewared/plugins/system_dataset/mount.py
+7 -1      src/middlewared/middlewared/plugins/zfs/mount_events.py
+431 -365  4 files

LLVM/project fb0d7df llvm/unittests/IR MetadataTest.cpp

Drop the summation unittest since it's already covered by the gvn lit tests
Delta   File
+0 -26  llvm/unittests/IR/MetadataTest.cpp
+0 -26  1 files

FreeBSD/doc 92ff616 documentation/content/en/articles/pam _index.adoc

articles/pam: Increment number of control flags

Reviewed by:    ziaee
Pull Request:   https://github.com/freebsd/freebsd-src/pull/558
Delta   File
+1 -1   documentation/content/en/articles/pam/_index.adoc
+1 -1   1 files

LLVM/project fccfd89 llvm/lib/IR Metadata.cpp

Move the check after merging for calls to simplify the condition
Delta   File
+3 -6   llvm/lib/IR/Metadata.cpp
+3 -6   1 files

LLVM/project 1c88701 llvm/lib/IR Metadata.cpp, llvm/unittests/IR MetadataTest.cpp

[Metadata][profcheck] Handle identical MDNodes in getMergedProfMetadata

This fixes a bug where !prof metadata was dropped from SelectInsts when GVN simplified/merged them.
Guarded by -profcheck-disable-metadata-fixes. Exposed by the tests in
Transforms/SampleProfile.
Delta   File
+50 -0  llvm/unittests/IR/MetadataTest.cpp
+12 -0  llvm/lib/IR/Metadata.cpp
+62 -0  2 files

LLVM/project e4b8d8a libcxx/include istream, libcxx/test/libcxx/input.output/iostream.format nodiscard.verify.cpp

[libc++][istream] Removed `[[nodiscard]]` from `peek()` (#175591)

Calling `peek()` after constructing a stream is something one can use to
make the stream ignore empty inputs:

```
#include <cstdio>
#include <sstream>

int main() {
  std::istringstream s;
  s.peek();
  while (s && !s.eof()) {
    char c;
    s >> c;
    printf("not eof; read \'%c\' (%d)\n", c, c);
  }
}
```


    [2 lines not shown]
Delta   File
+0 -3   libcxx/test/libcxx/input.output/iostream.format/nodiscard.verify.cpp
+1 -1   libcxx/include/istream
+1 -4   2 files

LLVM/project 070b3e9 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll ran-out-of-sgprs-allocation-failure.mir

[InlineSpiller][AMDGPU] Implement subreg reload during RA spill

Currently, when a virtual register is partially used, the
entire tuple is restored from the spilled location, even if
only a subset of its sub-registers is needed. This patch
introduces support for partial reloads by analyzing actual
register usage and restoring only the required sub-registers.
This improvement enhances register allocation efficiency,
particularly for cases involving tuple virtual registers.
For AMDGPU, this change brings considerable improvements
in workloads that involve matrix operations, large vectors,
and complex control flows.
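
The idea of a partial reload can be sketched as follows. This is an illustrative model only, not the InlineSpiller implementation: `Reload`, `planReloads`, the lane bitmask, and the one-slot-per-lane layout are all assumptions made for the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one sub-register restore.
struct Reload {
  unsigned lane;        // sub-register index to restore
  std::uint64_t offset; // byte offset of that lane within the spill slot
};

// Given a bitmask of the lanes a use actually reads, plan one reload per
// used lane instead of restoring the whole tuple. Assumes at most 32 lanes
// laid out contiguously, laneBytes bytes each.
std::vector<Reload> planReloads(std::uint32_t usedLanes, unsigned numLanes,
                                std::uint64_t laneBytes) {
  std::vector<Reload> plan;
  for (unsigned lane = 0; lane < numLanes && lane < 32; ++lane)
    if (usedLanes & (std::uint32_t{1} << lane))
      plan.push_back({lane, lane * laneBytes});
  return plan;
}
```

For a four-lane tuple with 4-byte lanes where only lane 2 is read, the plan holds a single reload at byte offset 8 rather than four.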
Delta          File
+3,429 -4,107  llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+81 -102       llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir
+91 -0         llvm/test/CodeGen/AMDGPU/skip-partial-reload-for-16bit-regaccess.mir
+35 -56        llvm/test/CodeGen/AMDGPU/identical-subrange-spill-infloop.ll
+40 -40        llvm/test/CodeGen/AMDGPU/ra-inserted-scalar-instructions.mir
+26 -52        llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+3,702 -4,357  20 files not shown
+3,964 -4,524  26 files

LLVM/project 12d1aa0 llvm/test/CodeGen/AMDGPU regpressure-mitigation-with-subreg-reload.mir

compacted the virt-reg numbers
Delta    File
+14 -14  llvm/test/CodeGen/AMDGPU/regpressure-mitigation-with-subreg-reload.mir
+14 -14  1 files

LLVM/project c74c50e llvm/lib/Target/AMDGPU SIRegisterInfo.cpp SIRegisterInfo.h

[AMDGPU] Put back ProperlyAlignedRC helper functions

Put back the functions that were recently deleted
because they were found unused. They are needed for
implementing subreg reload during RA.
Delta   File
+22 -0  llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+5 -0   llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+27 -0  2 files

LLVM/project 74adfaf llvm/test/CodeGen/AMDGPU regpressure-mitigation-with-subreg-reload.mir

[AMDGPU] Test precommit for subreg reload

This test currently fails due to insufficient
registers during allocation. Once the subreg
reload is implemented, it will begin to pass
as the partial reload helps mitigate register
pressure.
Delta   File
+37 -0  llvm/test/CodeGen/AMDGPU/regpressure-mitigation-with-subreg-reload.mir
+37 -0  1 files

LLVM/project 4343cad llvm/include/llvm/CodeGen LiveRangeEdit.h, llvm/lib/CodeGen LiveRangeEdit.cpp

[CodeGen] Enhance createFrom for sub-reg aware cloning

Instead of just cloning the virtual register, this
function now creates a new virtual register derived
from a subregister class of the original value.
Delta   File
+9 -1   llvm/lib/CodeGen/LiveRangeEdit.cpp
+5 -2   llvm/include/llvm/CodeGen/LiveRangeEdit.h
+14 -3  2 files

LLVM/project a8e5297 llvm/lib/Target/AMDGPU AMDGPURewriteAGPRCopyMFMA.cpp

suggestions incorporated.
Delta   File
+2 -2   llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+2 -2   1 files

LLVM/project 1687f98 llvm/include/llvm/CodeGen TargetRegisterInfo.h, llvm/lib/CodeGen TargetRegisterInfo.cpp

[AMDGPU] Make AMDGPURewriteAGPRCopyMFMA aware of subreg reload

AMDGPURewriteAGPRCopyMFMA pass is currently not subreg-aware.
In particular, the logic that optimizes spills into COPY
instructions assumes full register reloads. This becomes
problematic when the reload instruction partially restores
a tuple register. This patch introduces the necessary changes
to make this pass subreg-aware, for a future patch that
implements subreg reload during RA.
Delta   File
+41 -1  llvm/lib/Target/AMDGPU/AMDGPURewriteAGPRCopyMFMA.cpp
+10 -0  llvm/lib/CodeGen/TargetRegisterInfo.cpp
+3 -0   llvm/include/llvm/CodeGen/TargetRegisterInfo.h
+54 -1  3 files

LLVM/project 144d8a2 llvm/lib/Target/AMDGPU SIInstrInfo.h

fixed a comment.
Delta   File
+1 -1   llvm/lib/Target/AMDGPU/SIInstrInfo.h
+1 -1   1 files

LLVM/project bd28c6a llvm/include/llvm/IR DebugInfoFlags.def, llvm/test/Assembler debug-info.ll disubprogram.ll

[DebugInfo] Add a new DI flag to record if the name of a template function/type has been simplified (1/3). (#175130)

This flag is used during debug info generation in the LLVM backend to
guide the selective generation of template parameters in the skeleton
CU, as described in [this
RFC](https://discourse.llvm.org/t/rfc-debuginfo-selectively-generate-template-parameters-in-the-skeleton-cu/89395).
Delta   File
+5 -2   llvm/test/Assembler/debug-info.ll
+5 -2   llvm/test/Assembler/disubprogram.ll
+2 -1   llvm/include/llvm/IR/DebugInfoFlags.def
+12 -5  3 files

LLVM/project bb008e7 llvm/utils git-llvm-push

[llvm][utils] Make git-llvm-push set the skip-precommit-approval label (#174833)

The skip-precommit-approval label is intended for simple PRs that don't
require approval. To reduce the volume of notifications, label all PRs
created using the git-llvm-push script with the skip-precommit-approval
label.

Fixes #174825
Delta   File
+33 -0  llvm/utils/git-llvm-push
+33 -0  1 files

LLVM/project 7cae925 llvm/lib/Target/AMDGPU SIRegisterInfo.cpp SIInstrInfo.cpp

moved the implementation to SIInstrInfo.
Delta      File
+1 -149    llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+146 -0    llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3 -0      llvm/lib/Target/AMDGPU/SIInstrInfo.h
+0 -2      llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+150 -151  4 files

LLVM/project 2f15df2 llvm/lib/Target/AMDGPU SIRegisterInfo.cpp SIRegisterInfo.h

[AMDGPU] Make getNumSubRegsForSpillOp externally available (NFC).
Delta   File
+3 -3   llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+2 -0   llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+5 -3   2 files

LLVM/project 3ae9fa5 llvm/test/CodeGen/AMDGPU ran-out-of-sgprs-allocation-failure.mir ra-inserted-scalar-instructions.mir

[AMDGPU] Introduce Offset field in SGPR spill Pseudos

Currently, SGPR spill pseudo-instructions lack
an offset field to represent non-zero stack offsets.
This patch introduces an additional offset field to
SGPR spill pseudo-instructions and updates all
relevant passes that handle spill lowering to support
this new field. This field is essential for a future
patch that implements subreg reload of tuple registers
from their stack location during RA.
Delta      File
+29 -29    llvm/test/CodeGen/AMDGPU/ran-out-of-sgprs-allocation-failure.mir
+22 -22    llvm/test/CodeGen/AMDGPU/ra-inserted-scalar-instructions.mir
+16 -16    llvm/test/CodeGen/AMDGPU/remat-sop.mir
+14 -14    llvm/test/CodeGen/AMDGPU/remat-smrd.mir
+9 -9      llvm/test/CodeGen/AMDGPU/sgpr-spill.mir
+8 -8      llvm/test/CodeGen/AMDGPU/sgpr-spill-wrong-stack-id.mir
+98 -98    35 files not shown
+167 -165  41 files

LLVM/project 1ba3503 llvm/lib/Target/AMDGPU SIRegisterInfo.cpp

incorporated review comments.
Delta   File
+3 -3   llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp
+3 -3   1 files