LLVM/project 74e588b bolt/include/bolt/Profile DataAggregator.h DataReader.h, bolt/lib/Profile DataAggregator.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
Delta  File
+160 -132  bolt/lib/Profile/DataAggregator.cpp
+24 -13  bolt/include/bolt/Profile/DataAggregator.h
+4 -2  bolt/include/bolt/Profile/DataReader.h
+3 -0  bolt/test/X86/pre-aggregated-perf.test
+191 -147  4 files

LLVM/project 2f1ef5c bolt/include/bolt/Profile DataAggregator.h, bolt/lib/Profile DataAggregator.cpp

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.8-beta.1

[skip ci]
Delta  File
+124 -120  bolt/lib/Profile/DataAggregator.cpp
+13 -10  bolt/include/bolt/Profile/DataAggregator.h
+2 -0  bolt/test/X86/pre-aggregated-perf.test
+139 -130  3 files

LLVM/project d59fbce llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

address comments

Created using spr 1.3.8-beta.1
Delta  File
+38,494 -84,026  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388 -22,086  llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087 -24,391  llvm/test/CodeGen/RISCV/clmul.ll
+10,473 -12,572  llvm/test/CodeGen/RISCV/clmulr.ll
+10,287 -12,385  llvm/test/CodeGen/RISCV/clmulh.ll
+8,361 -8,920  llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,090 -164,380  7,534 files not shown
+633,695 -449,907  7,540 files

LLVM/project 90765da llvm/test/CodeGen/RISCV clmul.ll clmulr.ll, llvm/test/CodeGen/RISCV/rvv clmulh-sdnode.ll clmul-sdnode.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
Delta  File
+38,494 -84,026  llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+22,388 -22,086  llvm/test/CodeGen/RISCV/rvv/clmul-sdnode.ll
+19,087 -24,391  llvm/test/CodeGen/RISCV/clmul.ll
+10,473 -12,572  llvm/test/CodeGen/RISCV/clmulr.ll
+10,287 -12,385  llvm/test/CodeGen/RISCV/clmulh.ll
+8,361 -8,920  llvm/test/CodeGen/RISCV/rvv/expandload.ll
+109,090 -164,380  7,533 files not shown
+633,598 -449,804  7,539 files

LLVM/project 44ab570 llvm/lib/CodeGen MachineInstrBundle.cpp, llvm/test/CodeGen/AMDGPU finalizebundle.mir hard-clauses-gfx1250.mir

[CodeGen] Treat Reg uses which are partially defined within bundle as internal read
Delta  File
+76 -0  llvm/test/CodeGen/AMDGPU/finalizebundle.mir
+43 -9  llvm/lib/CodeGen/MachineInstrBundle.cpp
+42 -0  llvm/test/CodeGen/AMDGPU/hard-clauses-gfx1250.mir
+161 -9  3 files

LLVM/project 636740f llvm/test/CodeGen/AMDGPU si-insert-hard-clause-bundle-fail.ll

add test
Delta  File
+59 -0  llvm/test/CodeGen/AMDGPU/si-insert-hard-clause-bundle-fail.ll
+59 -0  1 file

LLVM/project 7b479aa flang/test/Lower/OpenACC acc-loop-exit.f90

Re-introduce removed test
Delta  File
+41 -0  flang/test/Lower/OpenACC/acc-loop-exit.f90
+41 -0  1 file

LLVM/project efb038f flang/lib/Lower/OpenMP OpenMP.cpp, flang/test/Lower/OpenMP nothing.f90

[Flang][OpenMP] Prevent TODO abort on nothing directive (#202679)

Since `nothing` is a no-op directive (OpenMP 5.2, 8.4), handle it during
lowering instead of falling through to the generic unimplemented
utility-directive path and triggering a TODO abort.
Delta  File
+12 -3  flang/lib/Lower/OpenMP/OpenMP.cpp
+8 -0  flang/test/Lower/OpenMP/nothing.f90
+1 -1  flang/test/Lower/OpenMP/Todo/error.f90
+21 -4  3 files

LLVM/project 452f59c llvm/lib/Transforms/Utils LoopUnroll.cpp, llvm/test/Transforms/LoopUnroll runtime-unroll-reductions-min-max.ll

Reapply "[LoopUnroll] Support parallel reductions for minmax" (#201010)

Reapplies 1e79ea1f5b3e (#182473) reverted by 56ccbc253150 (#200892). The
revert was due to a profcheck failure: prof-verify reported "select
annotation missing" on the combine select createMinMaxOp emits for FP
fcmp+select min/max.

This patch fixes it by marking the branch weights of newly inserted
selects as explicitly unknown.
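
For readers unfamiliar with the term, a "parallel reduction" in an unrolled loop can be sketched at the source level as below. This is only an illustration, not the LoopUnroll implementation: the function names are hypothetical, and it uses integer `std::min` rather than the FP fcmp+select form the patch handles, to sidestep NaN ordering questions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Baseline: one accumulator, so every iteration depends on the previous one.
int minSerial(const std::vector<int> &V) {
  int M = std::numeric_limits<int>::max();
  for (int X : V)
    M = std::min(M, X);
  return M;
}

// Hand-unrolled by 2 with two independent accumulators (hypothetical sketch).
// The two min chains do not depend on each other inside the loop.
int minUnrolledBy2(const std::vector<int> &V) {
  int M0 = std::numeric_limits<int>::max();
  int M1 = std::numeric_limits<int>::max();
  std::size_t I = 0;
  for (; I + 1 < V.size(); I += 2) {
    M0 = std::min(M0, V[I]);
    M1 = std::min(M1, V[I + 1]);
  }
  // Remainder iteration when the element count is odd.
  for (; I < V.size(); ++I)
    M0 = std::min(M0, V[I]);
  // The partial results are combined once, after the loop.
  return std::min(M0, M1);
}
```

The final combine step is the source-level analogue of the combine select that the commit message mentions.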
Delta  File
+906 -0  llvm/test/Transforms/LoopUnroll/runtime-unroll-reductions-min-max.ll
+8 -7  llvm/lib/Transforms/Utils/LoopUnroll.cpp
+914 -7  2 files

LLVM/project 89a5c69 clang/lib/CodeGen/Targets RISCV.cpp

[RISCV] Return the type from detectVLSCCEligibleStruct instead of using an output argument. NFC (#203423)

We can replace the previous bool return with the type and use nullptr for
false.
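
A minimal sketch of this refactoring pattern, using hypothetical stand-in names (`VecType`, `findVecTypeOld`, `findVecType`) rather than the actual clang types or the real `detectVLSCCEligibleStruct` signature:

```cpp
#include <cassert>

// Hypothetical stand-in for a type object; not a clang class.
struct VecType {
  unsigned NumElts;
};

static const VecType KnownTypes[] = {{2}, {4}, {8}};

// Old shape: bool return plus an output argument.
bool findVecTypeOld(unsigned NumElts, const VecType *&Result) {
  for (const VecType &T : KnownTypes) {
    if (T.NumElts == NumElts) {
      Result = &T;
      return true;
    }
  }
  return false;
}

// New shape: return the pointer directly; nullptr plays the role of false.
const VecType *findVecType(unsigned NumElts) {
  for (const VecType &T : KnownTypes)
    if (T.NumElts == NumElts)
      return &T;
  return nullptr;
}
```

A caller can then write `if (const VecType *T = findVecType(4)) { ... }`, which removes the separate out-variable declaration.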
Delta  File
+13 -17  clang/lib/CodeGen/Targets/RISCV.cpp
+13 -17  1 file

LLVM/project 8679ab6 libc/test/src/math RoundToIntegerTest.h

[libc] [math] Fix build bot failure introduced by unit test in PR #201154 (#203457)

The root cause is that the unit test header
`libc/test/src/math/RoundToIntegerTest.h` includes `<cfenv>`, which
requires the macro `__GLIBC_PREREQ` to be defined. In that riscv32
runtime, the macro appears not to be defined.

Removing the include works fine and resolves the failure.
Delta  File
+0 -1  libc/test/src/math/RoundToIntegerTest.h
+0 -1  1 file

LLVM/project f3f7317 llvm/lib/Target/AMDGPU VOP1Instructions.td, llvm/test/CodeGen/AMDGPU bf16-math.ll

[AMDGPU] Add MC clamp support for bf16 trans instructions (#203433)

Based on recent gfx1250 sp3 update. Refer to DEGFXSP3-664
Delta  File
+0 -40  llvm/test/MC/AMDGPU/gfx1250_asm_vop1_err.s
+32 -0  llvm/test/MC/Disassembler/AMDGPU/gfx1250_dasm_vop3_from_vop1.txt
+24 -0  llvm/test/MC/AMDGPU/gfx1250_asm_vop3_from_vop1-fake16.s
+24 -0  llvm/test/MC/AMDGPU/gfx1250_asm_vop3_from_vop1.s
+1 -4  llvm/test/CodeGen/AMDGPU/bf16-math.ll
+1 -1  llvm/lib/Target/AMDGPU/VOP1Instructions.td
+82 -45  6 files

LLVM/project c1991da mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Update nvvm.barrier.arrive Op (#202608)

This change updates the `nvvm.barrier.arrive` Op to lower using
intrinsics instead of inline PTX. It also adds a new `aligned` attribute
to the Op to lower to both aligned and unaligned forms.

PTX Spec Reference:
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-bar
Delta  File
+18 -0  mlir/test/Target/LLVMIR/nvvm/barrier.mlir
+7 -11  mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+14 -0  mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+0 -13  mlir/test/Conversion/NVVMToLLVM/nvvm-to-llvm.mlir
+39 -24  4 files

LLVM/project 62847ab clang/include/clang/CIR/Dialect/IR CIROps.td, clang/lib/CIR/Lowering/DirectToLLVM LowerToLLVM.cpp

[CIR] Support lowering trivial zero/one-result operations via TableGen (#203183)

### summary

Lowering of zero-result operations was already supported in this PR:
https://github.com/llvm/llvm-project/pull/202273

In this PR, operations with zero or one result are now lowered
automatically via TableGen. This helps reduce the size of the file
`clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp`

#### test

The existing lowering tests should cover this PR, so no new tests were
added.

Assisted-by: Claude Opus 4.8
Delta  File
+0 -82  clang/lib/CIR/Lowering/DirectToLLVM/LowerToLLVM.cpp
+20 -0  clang/include/clang/CIR/Dialect/IR/CIROps.td
+20 -82  2 files

LLVM/project 2a6cfc5 compiler-rt/test/fuzzer merge-posix.test features_dir.test

[Fuzzer] Make two tests compatible with the internal shell. (#203448)

1. Remove redundant parentheses that broke the internal shell's parsing
logic.
2. Use env when specifying environment variables.
3. Rewrite a bash one-line loop in Python.
Delta  File
+2 -2  compiler-rt/test/fuzzer/merge-posix.test
+1 -1  compiler-rt/test/fuzzer/features_dir.test
+3 -3  2 files

LLVM/project 3397c37 llvm/lib/Target/RISCV RISCVFrameLowering.cpp, llvm/test/CodeGen/RISCV stack-probing-dynamic-nonentry.ll

Inline stack probes immediately after `allocateStack` in `eliminateCallFramePseudoInstr` (#195456)

[ Upstream commit 589faedadf141e5e63f7a1e92a0327fc9bdc9b09 ]

Revert `bltu` in probing loops to `blt` because commit
f162be248636046a20e71209e139347e084b637a isn't applied on release/22.x
yet.

Link: https://github.com/llvm/llvm-project/pull/192485 ("[RISCV] Use
 unsigned comparison for stack clash probing loop")

---

This PR adds a call to `inlineStackProbe` immediately after
`allocateStack` in `eliminateCallFramePseudoInstr`. This allows code
generation for stack probe pseudoinstructions in non-entry BBs.

Fixes #195454.
Delta  File
+115 -0  llvm/test/CodeGen/RISCV/stack-probing-dynamic-nonentry.ll
+1 -0  llvm/lib/Target/RISCV/RISCVFrameLowering.cpp
+116 -0  2 files

LLVM/project beb2614 utils/bazel/llvm-project-overlay/libc BUILD.bazel

[bazel][libc] Fix 582643f1ec62d0c81d97afcf1b741babb3152728 (#203449)

Add dep for dyadic float -> attributes
Delta  File
+1 -0  utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+1 -0  1 file

LLVM/project 0591eef llvm/lib/Transforms/InstCombine InstCombineCalls.cpp, llvm/test/Transforms/InstCombine assume.ll

[InstCombine] Move noundef assume bundles on loads into metadata (#203395)
Delta  File
+24 -0  llvm/test/Transforms/InstCombine/assume.ll
+9 -0  llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+33 -0  2 files

LLVM/project 8bba386 llvm/test/CodeGen/AMDGPU amdgpu-inline.ll alloca.ll, llvm/test/Transforms/OpenMP custom_state_machines_pre_lto.ll spmdization_kernel_env_dep.ll

AMDGPU/Tests: Remove redundant explicit data layouts from AMDGPU tests

These all look like either cargo culting of outdated requirements or
test cases that were not fully reduced. Since the data layout evolves
over time with new address spaces being added, it seems good practice to
avoid hard-coding it in tests that don't need it.

commit-id:1f845f5e
Delta  File
+5 -5  llvm/test/CodeGen/AMDGPU/amdgpu-inline.ll
+4 -4  llvm/test/CodeGen/AMDGPU/alloca.ll
+3 -3  llvm/test/Transforms/OpenMP/custom_state_machines_pre_lto.ll
+2 -2  llvm/test/CodeGen/AMDGPU/amdgpu.private-memory.ll
+2 -2  llvm/test/CodeGen/AMDGPU/amdgpu-alias-analysis.ll
+1 -2  llvm/test/Transforms/OpenMP/spmdization_kernel_env_dep.ll
+17 -18  24 files not shown
+25 -42  30 files

LLVM/project 69371e6 utils/bazel/llvm-project-overlay/libc BUILD.bazel

[bazel][libc] Fix 8acfc364e9f788367ff0beab5c76a3527a689a0b (#203443)

Add extra htons yaml deps
Delta  File
+8 -2  utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+8 -2  1 file

LLVM/project bb87842 clang/test/Format lit.local.cfg

clang-format/test: Anchor the empty .clang-format-ignore to test_exec_root

The test suite's lit.local.cfg creates an empty .clang-format-ignore at
config discovery time to protect the multiple-inputs[-inplace].cpp tests
that work on files in temporary locations.

This file should be written to where the tests execute instead of the
CWD during config discovery. The CWD might not even be an ancestor of
where the tests execute, and it might be the repository root which does
have a .clang-format-ignore that is incorrectly clobbered without this
change.

An alternative would be to just fix the tests that need to be protected,
but having a blanket guard like this does seem like a reasonable thing
to do.

Fixes: 915de1a5889c ("Generate empty .clang-format-ignore before running tests (#136154)")
commit-id:fe858dac
Delta  File
+3 -3  clang/test/Format/lit.local.cfg
+3 -3  1 file

LLVM/project 56f8fbb clang/include/clang/Basic BuiltinsRISCV.td, clang/lib/CodeGen/TargetBuiltins RISCV.cpp

[RISCV][P-ext] Support Packed Averaging Addition and Subtraction intrinsics (#203147)
Delta  File
+444 -0  clang/test/CodeGen/RISCV/rvp-intrinsics.c
+168 -0  llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+132 -0  cross-project-tests/intrinsic-header-tests/riscv_packed_simd.c
+72 -0  llvm/test/CodeGen/RISCV/rvp-simd-32.ll
+58 -0  clang/lib/CodeGen/TargetBuiltins/RISCV.cpp
+30 -0  clang/include/clang/Basic/BuiltinsRISCV.td
+904 -0  4 files not shown
+992 -0  10 files

LLVM/project 15fdc79 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel][DirectX] Fix 2bccbf23edddf216ef060d34443f60f644d0fb06 (#203442)

Add new dep on MC
Delta  File
+1 -0  utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+1 -0  1 file

LLVM/project b9704de llvm/lib/Passes PassBuilder.cpp

[PassBuilder] Table-drive pass name printing (#202656)

Replace the macro-expanded raw_ostream operations in
PassBuilder::printPassNames with static pass-name arrays and two shared
noinline printing loops. Preserve the generated category order and the
exact spelling of parameterized pass names.

The change only executes when a client requests the pass-name listing;
normal pipeline parsing and optimization do not access the new tables or
helpers.

A stripped opt binary shrinks from 115,493,720 to 115,394,640 bytes,
saving 99,080 bytes. The linked __TEXT section shrinks by 98,304 bytes.

Work towards #202616

AI tool disclosure: Co-authored with OpenAI Codex.
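
The table-driven approach described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PassBuilder code: the pass names, the `ParamPass` struct, the helper names, the `SKETCH_NOINLINE` macro, the `name<params>` output format and the use of `std::ostream` in place of `raw_ostream` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical portability macro; noinline keeps one copy of each loop.
#if defined(__GNUC__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

// Hypothetical entry for a pass that takes parameters.
struct ParamPass {
  const char *Name;
  const char *Params;
};

// Static tables replace one stream-insertion sequence per pass.
static const char *const PlainPasses[] = {"pass-a", "pass-b"};
static const ParamPass ParamPasses[] = {{"pass-c", "opt1;opt2"}};

// Shared loop 1: plain pass names.
SKETCH_NOINLINE static void printPlain(std::ostream &OS, const char *Header,
                                       const char *const *Names,
                                       std::size_t N) {
  OS << Header << ":\n";
  for (std::size_t I = 0; I != N; ++I)
    OS << "  " << Names[I] << '\n';
}

// Shared loop 2: parameterized pass names.
SKETCH_NOINLINE static void printParam(std::ostream &OS, const char *Header,
                                       const ParamPass *Passes,
                                       std::size_t N) {
  OS << Header << ":\n";
  for (std::size_t I = 0; I != N; ++I)
    OS << "  " << Passes[I].Name << '<' << Passes[I].Params << ">\n";
}

std::string listPassNames() {
  std::ostringstream OS;
  printPlain(OS, "Module passes", PlainPasses,
             sizeof(PlainPasses) / sizeof(PlainPasses[0]));
  printParam(OS, "Parameterized passes", ParamPasses,
             sizeof(ParamPasses) / sizeof(ParamPasses[0]));
  return OS.str();
}
```

The code-size win comes from each additional pass costing only a table entry instead of another inlined sequence of stream operations.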
Delta  File
+91 -23  llvm/lib/Passes/PassBuilder.cpp
+91 -23  1 file

LLVM/project 8433cf6 llvm/lib/Target/AMDGPU SIInstructions.td AMDGPULegalizerInfo.cpp, llvm/test/CodeGen/AMDGPU packed-fp64.ll

[AMDGPU] Make v2f64 fneg legal on gfx1251 (#203427)
Delta  File
+48 -0  llvm/test/CodeGen/AMDGPU/packed-fp64.ll
+21 -0  llvm/lib/Target/AMDGPU/SIInstructions.td
+16 -1  llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+7 -0  llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+3 -2  llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+1 -1  llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+96 -4  6 files

LLVM/project 7dcd1d2 clang/lib/ScalableStaticAnalysisFramework/Analyses SSAFAnalysesCommon.h, clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow PointerFlowExtractor.cpp

Revert "[SSAF][Extractor] Make hard errors in PointerFlow and UnsafeBufferUsage Extractors quiet (#201953)" (#203432)

This reverts commit 9f1e08fa8ed7bcf4b7cfaf9eaaa7c23a2d3ed347.

It causes a build error:
https://lab.llvm.org/buildbot/#/builders/2/builds/53597.
The use of 'setCurrentDebugType' should be guarded by '#ifndef NDEBUG'
Delta  File
+2 -43  clang/unittests/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowTest.cpp
+12 -15  clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowExtractor.cpp
+0 -26  clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+11 -11  clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageExtractor.cpp
+0 -9  clang/lib/ScalableStaticAnalysisFramework/Analyses/SSAFAnalysesCommon.h
+25 -104  5 files

LLVM/project 9574637 clang/lib/Headers riscv_packed_simd.h, clang/test/CodeGen/RISCV rvp-intrinsics.c

[Clang][RISCV] packed comparison intrinsics (#203191)

Add header wrappers for pmseq/pmsne/pmslt[u]/pmsgt[u]/pmsge[u]/pmsle[u]
as element-wise vector comparisons cast to the unsigned result type.
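
As a rough illustration of "element-wise vector comparison cast to the unsigned result type", the sketch below uses the GCC/Clang vector extension. The typedefs and wrapper names are hypothetical and are not the ones declared in `riscv_packed_simd.h`. It assumes the usual vector-extension semantics, where a true lane compares to all-ones.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 4 x 8-bit packed types built on the GCC/Clang vector extension.
typedef std::int8_t i8x4 __attribute__((vector_size(4)));
typedef std::uint8_t u8x4 __attribute__((vector_size(4)));

// Element-wise equality: the comparison yields a signed vector with -1 in
// true lanes and 0 in false lanes; the cast reinterprets it as unsigned.
inline u8x4 packedCmpEq(i8x4 A, i8x4 B) { return (u8x4)(A == B); }

// Element-wise signed less-than, same pattern.
inline u8x4 packedCmpLt(i8x4 A, i8x4 B) { return (u8x4)(A < B); }
```

With this shape, a true lane reads back as `0xFF` and a false lane as `0` in the unsigned result.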
Delta  File
+1,444 -0  clang/test/CodeGen/RISCV/rvp-intrinsics.c
+444 -0  cross-project-tests/intrinsic-header-tests/riscv_packed_simd.c
+71 -0  clang/lib/Headers/riscv_packed_simd.h
+1,959 -0  3 files

LLVM/project d543c49 clang/lib/Driver/ToolChains Darwin.cpp, clang/test/Driver darwin-objc-selector-stubs.m

[clang][Darwin] Disable ObjC class selector stubs when using LLD (#203388)

LLD does not support ObjC class selector stubs yet (which requires
synthesizing `objc_msgSendClass$...` stubs). This change disables
`-fobjc-msgsend-class-selector-stubs` by default when the linker is LLD.
Ref: https://github.com/llvm/llvm-project/issues/203385
Delta  File
+5 -1  clang/lib/Driver/ToolChains/Darwin.cpp
+2 -1  clang/test/Driver/darwin-objc-selector-stubs.m
+7 -2  2 files

LLVM/project 6aeb74b clang/lib/ScalableStaticAnalysisFramework/Analyses SSAFAnalysesCommon.h, clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow PointerFlowExtractor.cpp

Revert "[SSAF][Extractor] Make hard errors in PointerFlow and UnsafeBufferUsage Extractors quiet (#201953)"

This reverts commit 9f1e08fa8ed7bcf4b7cfaf9eaaa7c23a2d3ed347.

It causes a build error: https://lab.llvm.org/buildbot/#/builders/2/builds/53597
The use of 'setCurrentDebugType' should be guarded by '#ifndef NDEBUG'
Delta  File
+2 -43  clang/unittests/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowTest.cpp
+12 -15  clang/lib/ScalableStaticAnalysisFramework/Analyses/PointerFlow/PointerFlowExtractor.cpp
+0 -26  clang/unittests/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageTest.cpp
+11 -11  clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsageExtractor.cpp
+0 -9  clang/lib/ScalableStaticAnalysisFramework/Analyses/SSAFAnalysesCommon.h
+25 -104  5 files

LLVM/project d583701 libc/config/baremetal config.json

[libc] Enable baremetal printf float320 (#203421)

For memory-constrained baremetal devices, using float320 seems a
reasonable default.
Delta  File
+3 -0  libc/config/baremetal/config.json
+3 -0  1 file