LLVM/project cfca635llvm/test/CodeGen/AArch64 sve-fptosi-sat.ll, llvm/test/CodeGen/ARM fptoui-sat-scalar.ll

[SelectionDAG] Fix fptoui.sat expansion using minnum/maxnum (#180178)

fptoui.sat can currently use a minnum/maxnum based expansion, which
relies on NaNs not being propagated. Specifically, it relies on
minnum(maxnum(NaN, 0), MAX) to return 0. However, if the input is sNaN,
then maxnum(sNaN, 0) is allowed to return qNaN, in which case the final
result will be MAX rather than 0.

This PR makes the following changes:

* Support the fold for minimumnum/maximumnum, which guarantees that NaN
is not propagated even for sNaN, so it can use the old lowering. Test
this using Hexagon which has legal minimumnum but illegal minnum.
* For the minnum/maxnum case, remove the special unsigned case and
instead always insert the explicit NaN check. In that case the NaN
propagation semantics don't matter.
* This also means that we can support this expansion for
minimum/maximum.
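
The NaN problem described above can be illustrated with a small model. This is only a sketch in Python, not the SelectionDAG code: the helper names are hypothetical, `max_propagating_nan` stands in for the case where maxnum(sNaN, 0) returns qNaN, and the NaN check is done before the conversion because Python cannot convert NaN to an integer (the real lowering is assumed to select on the result instead).

```python
import math

def max_ignoring_nan(a, b):
    """Model of a max that never propagates NaN (maximumnum-like)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)

def min_ignoring_nan(a, b):
    """Model of a min that never propagates NaN (minimumnum-like)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)

def max_propagating_nan(a, b):
    """Model of the bad case: a NaN input yields a NaN result."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return max(a, b)

def fptoui_sat_clamp_only(x, bits, inner_max):
    """Clamp to [0, MAX] and convert, relying on NaN not being propagated."""
    limit = float(2**bits - 1)
    clamped = min_ignoring_nan(inner_max(x, 0.0), limit)
    return int(clamped)

def fptoui_sat_checked(x, bits, inner_max):
    """Same clamp, plus an explicit NaN check, so NaN semantics no longer matter."""
    if math.isnan(x):
        return 0
    return fptoui_sat_clamp_only(x, bits, inner_max)
```

With `max_ignoring_nan` the clamp-only version maps NaN to 0, but with `max_propagating_nan` it yields MAX (255 for 8 bits); the checked version returns 0 in both cases.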
DeltaFile
+566-306llvm/test/CodeGen/Thumb2/mve-fptoui-sat-vector.ll
+212-132llvm/test/CodeGen/ARM/fptoui-sat-scalar.ll
+114-177llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
+247-0llvm/test/CodeGen/Hexagon/fptoi.sat.ll
+93-145llvm/test/CodeGen/AArch64/sve-fptosi-sat.ll
+86-152llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
+1,318-9126 files not shown
+1,657-1,39912 files

LLVM/project 5618119clang/test/OpenMP irbuilder_nested_parallel_for.c, llvm/test/MC/AMDGPU gfx13_asm_vopd3.s gfx13_asm_vop2.s

Merge branch 'main' into users/jmmartinez/spirv/reenable_float_controls2
DeltaFile
+16,004-0llvm/test/MC/AMDGPU/gfx13_asm_vopd3.s
+1,703-1,703clang/test/OpenMP/irbuilder_nested_parallel_for.c
+2,807-0llvm/test/MC/AMDGPU/gfx13_asm_vop2.s
+2,642-0llvm/test/MC/Disassembler/AMDGPU/gfx13_dasm_vop2.txt
+1,246-1,232llvm/test/MC/AMDGPU/gfx12_asm_vopcx.s
+2,269-0llvm/test/MC/AMDGPU/gfx13_asm_vop3_from_vop2.s
+26,671-2,9351,361 files not shown
+87,706-37,6511,367 files

LLVM/project 457625fllvm/lib/Target/SPIRV SPIRVEmitNonSemanticDI.cpp, llvm/test/CodeGen/SPIRV/debug-info debug-function.ll

[SPIRV] Add support for emitting DebugFunction debug info instructions

This commit adds support for emitting SPIRV DebugFunction and
DebugFunctionDefinition instructions for function definitions.
DeltaFile
+218-0llvm/lib/Target/SPIRV/SPIRVEmitNonSemanticDI.cpp
+40-0llvm/test/CodeGen/SPIRV/debug-info/debug-function.ll
+258-02 files

LLVM/project d82d261orc-rt/include/orc-rt ResourceManager.h SimpleNativeMemoryMap.h, orc-rt/lib/executor SimpleNativeMemoryMap.cpp Session.cpp

[orc-rt] Rename ResourceManager detach/shutdown. NFCI. (#183285)

These methods are called by the session in the event of a detach or
shutdown. The new names reflect their roles as event handlers.
DeltaFile
+4-4orc-rt/include/orc-rt/ResourceManager.h
+3-3orc-rt/unittests/SimpleNativeMemoryMapTest.cpp
+3-2orc-rt/lib/executor/SimpleNativeMemoryMap.cpp
+2-2orc-rt/include/orc-rt/SimpleNativeMemoryMap.h
+2-2orc-rt/unittests/SessionTest.cpp
+1-1orc-rt/lib/executor/Session.cpp
+15-146 files

LLVM/project f55a5cfclang/test/OpenMP irbuilder_nested_parallel_for.c nested_loop_codegen.cpp, llvm/test/Transforms/OpenMP parallel_region_merging.ll

[OpenMP] Only generate call to __kmpc_global_thread_num when needed (#182669)

This patch is a small optimization to only generate a call to
__kmpc_global_thread_num if the result is actually used.
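
One generic way to get "emit only when used" behaviour is lazy emission with caching. The sketch below is an assumption about the general pattern, not a description of how this patch implements it; the class name and the emitted string are hypothetical.

```python
class LazyThreadNum:
    """Hypothetical helper: records the runtime call only on first use."""

    def __init__(self):
        self.emitted = []    # stand-in for the instruction stream
        self._value = None   # cached result of the call, once emitted

    def get(self):
        # Emit the call the first time the thread number is requested,
        # then reuse the cached value for every later request.
        if self._value is None:
            self.emitted.append("call @__kmpc_global_thread_num")
            self._value = "%tid"
        return self._value
```

If `get()` is never called, nothing is emitted; repeated calls emit the runtime call once.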
DeltaFile
+1,703-1,703clang/test/OpenMP/irbuilder_nested_parallel_for.c
+402-402clang/test/OpenMP/nested_loop_codegen.cpp
+249-249clang/test/OpenMP/parallel_codegen.cpp
+185-182clang/test/OpenMP/cancel_codegen.cpp
+84-79llvm/test/Transforms/OpenMP/parallel_region_merging.ll
+17-16clang/test/OpenMP/taskgroup_codegen.cpp
+2,640-2,63112 files not shown
+2,680-2,68118 files

LLVM/project 91d5e9eclang/lib/CodeGen CGOpenCLRuntime.h CGOpenCLRuntime.cpp

[CGOpenCLRuntime] Remove dead code (#183093)

This drops one getPointerType() overload which accepted a name, which is
no longer used since the opaque pointers migration. The fallback code
path always returns a plain pointer now.

Also drop all the virtual qualifiers. Nothing inherits from this class.
Any customization is implemented via TargetCodeGenInfo hooks in the
implementation.
DeltaFile
+9-13clang/lib/CodeGen/CGOpenCLRuntime.h
+0-8clang/lib/CodeGen/CGOpenCLRuntime.cpp
+9-212 files

LLVM/project eda52e3llvm/lib/Transforms/Scalar LoopInterchange.cpp

address review comments
DeltaFile
+16-2llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+16-21 files

LLVM/project 3129c44llvm/test/Transforms/LoopInterchange profitability-instorder.ll

[LoopInterchange] Add a test for simple profitable case (NFC)
DeltaFile
+180-0llvm/test/Transforms/LoopInterchange/profitability-instorder.ll
+180-01 files

LLVM/project 5a22643llvm/lib/Transforms/Scalar LoopInterchange.cpp, llvm/test/Transforms/LoopInterchange profitability-instorder.ll interchangeable-outerloop-multiple-indvars.ll

[LoopInterchange] Fix instorder profitability check
DeltaFile
+50-41llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+40-30llvm/test/Transforms/LoopInterchange/profitability-instorder.ll
+1-1llvm/test/Transforms/LoopInterchange/interchangeable-outerloop-multiple-indvars.ll
+91-723 files

LLVM/project 678aaa7llvm/utils/release github-upload-release.py

[llvm][release] Note that some packages have 2 signature files (#183266)

For example in the latest release, there is:
LLVM-22.1.0-Linux-ARM64.tar.xz

Which has 2 signature files:
LLVM-22.1.0-Linux-ARM64.tar.xz.jsonl
LLVM-22.1.0-Linux-ARM64.tar.xz.sig

jsonl comes from the GitHub build and the sig is uploaded by the release
manager.
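
As a minimal illustration of the naming shown above (a hypothetical helper, not code from github-upload-release.py), the two signature file names can be derived from the package name:

```python
def expected_signature_files(package_name):
    """Return the two signature file names for a release package.

    Hypothetical helper: .jsonl is assumed to come from the GitHub build,
    .sig from the release manager, as described in the commit message.
    """
    return [package_name + ".jsonl", package_name + ".sig"]
```

For `LLVM-22.1.0-Linux-ARM64.tar.xz` this yields the two names listed in the message.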
DeltaFile
+1-1llvm/utils/release/github-upload-release.py
+1-11 files

LLVM/project db5ffb0llvm/include/llvm/ExecutionEngine/Orc WaitingOnGraph.h

[ORC] WaitingOnGraph perf: faster dependence propagation. (#183272)

This commit replaces the core dependence propagation algorithm in
WaitingOnGraph to avoid worst-case behavior in the common case where
dependence graphs are sparse. This algorithm showed up as the underlying
cause of the bug in https://github.com/llvm/llvm-project/issues/179611.

For each call to MaterializationResponsibility::notifyEmitted,
WaitingOnGraph would build the transitive closure of all SuperNodes
whose "waiting on" relationships were affected by the newly emitted
symbols, then propagate any remaining unemitted dependencies through
this transitive closure graph. This approach is simple, but pushes the
algorithm towards n^2 complexity even for sparse dependence graphs.

The new propagation algorithm:
1. Inverts the edge direction in the SymbolDependenceMap data structure:
SymbolDepMap[SN] now contains the set of SuperNodes that depend on SN,
rather than the set that SN depends upon.


    [11 lines not shown]
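
Step 1 (inverting the edge direction) can be sketched as follows. This is a simplified Python model using plain dicts and sets, not the WaitingOnGraph data structures; the function name is hypothetical, and the remaining steps, which are elided above, are not reconstructed here.

```python
from collections import defaultdict

def invert_dependence_map(depends_on):
    """Turn {node: nodes it depends on} into {node: nodes that depend on it}."""
    dependents = defaultdict(set)
    for node, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(node)
    return dict(dependents)
```

With the inverted map, the nodes affected by a newly emitted node can be looked up directly instead of being discovered by scanning every node's dependency set.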
DeltaFile
+100-57llvm/include/llvm/ExecutionEngine/Orc/WaitingOnGraph.h
+100-571 files

LLVM/project bf15949llvm/lib/Target/SPIRV SPIRVEmitNonSemanticDI.cpp, llvm/test/CodeGen/SPIRV/debug-info debug-function.ll

[SPIRV] Add support for emitting DebugFunction debug info instructions

This commit adds support for emitting SPIRV DebugFunction and
DebugFunctionDefinition instructions for function definitions.
DeltaFile
+219-0llvm/lib/Target/SPIRV/SPIRVEmitNonSemanticDI.cpp
+40-0llvm/test/CodeGen/SPIRV/debug-info/debug-function.ll
+259-02 files

LLVM/project d25b7f7llvm/lib/Target/ARM ARMFastISel.cpp

[NFC][CodeGen] Add Register guard to ARMMaterializeFP. (#182559)

This does not directly fix any issue because the implementation
indirectly ensures the correct behaviour. However, all the other
"<Tgt>Materialize" functions (Int and FP across all targets, including
ARMMaterializeInt) have explicit Register guards, so for peace of mind I
figured it's worth adding them.
DeltaFile
+3-0llvm/lib/Target/ARM/ARMFastISel.cpp
+3-01 files

LLVM/project 641c32ellvm/test/Transforms/LoopInterchange phi-ordering.ll

[LoopInterchange] Fix test phi-ordering.ll (NFC) (#181989)

I found that the test phi-ordering.ll is a bit fragile and can fail on
unrelated changes. Also, this test is not consistent with the
following comment, which is at the top of the file:

```
;; Checks the order of the inner phi nodes does not cause havoc.
;; The inner loop has a reduction into c. The IV is not the first phi.
```

After examining the change history, I found that the original intent of
this test was effectively lost in
https://github.com/llvm/llvm-project/commit/c8bd6ea35e459169cbd401372e81168ed8482536.
A workaround was introduced later in
https://github.com/llvm/llvm-project/commit/eac34875109898ac01985f4afa937eec30c1c387
to preserve the test output, but this seems to have made the test more
complicated.


    [5 lines not shown]
DeltaFile
+34-32llvm/test/Transforms/LoopInterchange/phi-ordering.ll
+34-321 files

LLVM/project d81c6b5clang/test/CodeGen arm_acle.c builtins-arm64.c, clang/test/Sema/AArch64 pcdphint-atomic-store.c

fixup! Fix more PR comments
DeltaFile
+19-9clang/test/Sema/AArch64/pcdphint-atomic-store.c
+8-6llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+10-0clang/test/CodeGen/arm_acle.c
+0-9llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+5-0clang/test/CodeGen/builtins-arm64.c
+0-4llvm/include/llvm/IR/IntrinsicsAArch64.td
+42-282 files not shown
+44-328 files

LLVM/project 2a2f433clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup! Fix issues Kerry raised in PR
DeltaFile
+10-23clang/lib/Sema/SemaARM.cpp
+16-11clang/test/Sema/AArch64/pcdphint-atomic-store.c
+5-12clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-5clang/include/clang/Basic/DiagnosticSemaKinds.td
+32-514 files

LLVM/project eea3d5eclang/include/clang/Basic BuiltinsAArch64.def, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup!

More small issues tidied, and remove gating.
DeltaFile
+6-2clang/test/Sema/AArch64/pcdphint-atomic-store.c
+2-2clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-1clang/include/clang/Basic/BuiltinsAArch64.def
+0-2clang/lib/Headers/arm_acle.h
+1-1clang/lib/Sema/SemaARM.cpp
+1-1clang/test/CodeGen/AArch64/pcdphint-atomic-store.c
+11-96 files

LLVM/project 101d95ellvm/include/llvm/IR IntrinsicsAArch64.td, llvm/lib/Target/AArch64 AArch64InstrFormats.td

fixup! remove mayLoad/mayStore as suggested by Kerry
DeltaFile
+0-5llvm/lib/Target/AArch64/AArch64InstrFormats.td
+1-1llvm/include/llvm/IR/IntrinsicsAArch64.td
+1-62 files

LLVM/project 50b13c3llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

fixup! Fix tests
DeltaFile
+2-0llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+2-01 files

LLVM/project 4e68717clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/test/CodeGen/AArch64 pcdphint-atomic-store.c

fixup! Ensure stshh always immediately precedes a store instruction
DeltaFile
+82-0llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+50-13clang/test/CodeGen/AArch64/pcdphint-atomic-store.c
+62-0llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+20-26clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+15-0llvm/lib/Target/AArch64/AArch64InstrInfo.td
+8-3llvm/lib/Target/AArch64/AArch64InstrFormats.td
+237-423 files not shown
+248-499 files

LLVM/project e041f65clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup! Fix Kerry's CR comments and add negative test for "must be an integer type"
DeltaFile
+16-6llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+3-7clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+3-3clang/lib/Sema/SemaARM.cpp
+5-0clang/test/Sema/AArch64/pcdphint-atomic-store.c
+3-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+1-1clang/lib/Headers/arm_acle.h
+31-176 files

LLVM/project 4720046clang/lib/Sema SemaARM.cpp, llvm/lib/Target/AArch64 AArch64InstrInfo.td AArch64InstrFormats.td

fixup! Address more helpful review comments from Kerry
DeltaFile
+160-0llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+4-4llvm/lib/Target/AArch64/AArch64InstrInfo.td
+1-4clang/lib/Sema/SemaARM.cpp
+0-5llvm/lib/Target/AArch64/AArch64InstrFormats.td
+165-134 files

LLVM/project c8fbd20clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup! Improve error diagnostics, and other cleanups
DeltaFile
+12-0llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+4-2clang/lib/Sema/SemaARM.cpp
+2-1clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-1clang/test/Sema/AArch64/pcdphint-atomic-store.c
+1-1clang/include/clang/Basic/DiagnosticSemaKinds.td
+2-0clang/lib/Headers/arm_acle.h
+22-56 files

LLVM/project e9a8d60clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/lib/Sema SemaARM.cpp

[AArch64][clang][llvm] Add ACLE `stshh` atomic store builtin

Add `__arm_atomic_store_with_stshh` implementation as defined
in the ACLE. Validate that the arguments passed are correct, and
lower it to the stshh intrinsic plus an atomic store with the
allowed orderings.

Gate this on FEAT_PCDPHINT so that availability matches
hardware support for the `STSHH` instruction. Use an i64
immediate and side-effect modeling to satisfy tablegen and decoding.
DeltaFile
+140-0clang/lib/Sema/SemaARM.cpp
+48-0clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+31-0clang/test/CodeGen/AArch64/pcdphint-atomic-store.c
+29-0clang/test/Sema/AArch64/pcdphint-atomic-store.c
+13-0llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
+10-2llvm/lib/Target/AArch64/AArch64InstrFormats.td
+271-25 files not shown
+298-211 files

LLVM/project 207e214clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/test/Sema/AArch64 pcdphint-atomic-store.c

fixup!

A few small tidyups
DeltaFile
+7-6clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+4-4llvm/lib/Target/AArch64/AArch64InstrFormats.td
+4-0clang/test/Sema/AArch64/pcdphint-atomic-store.c
+15-103 files

LLVM/project 58f4da4llvm/lib/Transforms/Utils SimplifyLibCalls.cpp, llvm/test/Transforms/InstCombine pow-1.ll

[SimplifyLibCalls] Avoid simplifying pow(x, 2.0) -> x * x with math-errno. (#183099)

It came up in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=123826 that
GCC was simplifying pow(x, 2.0) -> x * x, even when doing so caused
-fmath-errno to be ignored. This patch fixes a similar bug in LLVM.

For ConstantFolding of powf expressions that may raise exceptions,
see #183102.
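
The observable difference between the library call and the multiply can be shown by analogy in Python, where `math.pow` reports a range error as `OverflowError` while plain float multiplication silently returns infinity. This is only an analogy for errno being set under -fmath-errno, not a model of the LLVM transform; the function names are hypothetical.

```python
import math

def pow_libcall(x):
    """Square via the library call; report whether a range error was signalled."""
    try:
        return math.pow(x, 2.0), False
    except OverflowError:
        # Python surfaces the C library's range error as an exception.
        return math.inf, True

def pow_as_multiply(x):
    """Square via multiplication; overflow gives inf with no error signalled."""
    return x * x
```

For `x = 1e200` the library call signals an error while the multiply returns `inf` silently, which is the information the simplification would drop.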
DeltaFile
+8-38llvm/test/Transforms/InstCombine/pow-1.ll
+1-1llvm/lib/Transforms/Utils/SimplifyLibCalls.cpp
+9-392 files

LLVM/project ae0978bllvm/lib/Transforms/IPO FunctionAttrs.cpp, llvm/test/Transforms/FunctionAttrs nofpclass-callsite-prop.ll nonnull.ll

FunctionAttrs: Propagate nofpclass from callsite arguments

Follow along with the nonnull handling. This is essentially the same,
except it can union with an existing attribute.

I'm wondering if getParamNoFPClass should have an AllowUndefOrPoison
argument to check noundef like nonnull. None of the uses of hasNonNullAttr
use this with true though, so maybe both should just check noundef.
DeltaFile
+83-0llvm/test/Transforms/FunctionAttrs/nofpclass-callsite-prop.ll
+26-13llvm/lib/Transforms/IPO/FunctionAttrs.cpp
+1-1llvm/test/Transforms/FunctionAttrs/nonnull.ll
+110-143 files

LLVM/project c75d1eaclang/lib/CIR/CodeGen CIRGenBuiltinAArch64.cpp, clang/test/CIR/CodeGenBuiltins/AArch64 acle_sve_dup.c

[CIR][AArch64] Add lowering + tests for predicated SVE svdup_lane builtins

This PR adds CIR lowering + tests for SVE `svdup_lane` builtins on
AArch64. The corresponding ACLE intrinsics are documented at:
https://developer.arm.com/architectures/instruction-sets/intrinsics
DeltaFile
+157-0clang/test/CIR/CodeGenBuiltins/AArch64/acle_sve_dup.c
+19-4clang/lib/CIR/CodeGen/CIRGenBuiltinAArch64.cpp
+176-42 files

LLVM/project d8f2334llvm/test/CodeGen/PowerPC clmul-vector.ll, llvm/test/CodeGen/RISCV clmul.ll clmulr.ll

Merge branch 'main' into users/kasuga-fj/loop-interchange-fix-test-phi-ordering
DeltaFile
+25,051-14,920llvm/test/CodeGen/RISCV/clmul.ll
+16,004-0llvm/test/MC/AMDGPU/gfx13_asm_vopd3.s
+13,198-0llvm/test/CodeGen/RISCV/clmulr.ll
+12,863-0llvm/test/CodeGen/RISCV/clmulh.ll
+8,874-0llvm/test/CodeGen/PowerPC/clmul-vector.ll
+3,298-3,437llvm/test/CodeGen/X86/vector-interleaved-store-i32-stride-7.ll
+79,288-18,3573,483 files not shown
+270,101-96,6143,489 files

LLVM/project 2fd0817llvm/lib/Target/AArch64 AArch64InstrInfo.cpp

[AArch64] Report accurate sizes for MOVaddr and MOVimm pseudos

getInstSizeInBytes returned the default 4 bytes for MOVaddr*,
MOVi32imm and MOVi64imm pseudos, which doesn't reflect their
expanded size. Compute the real sizes: 8 or 12 bytes for MOVaddr*
(depending on MO_TAGGED), and the actual expansion count for
MOVi32imm/MOVi64imm using AArch64_IMM::expandMOVImm.
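
The size computation can be sketched with a deliberately naive model. The instruction count below uses one MOVZ/MOVK per non-zero 16-bit chunk, which is not what AArch64_IMM::expandMOVImm does (it can pick shorter sequences); the function names are hypothetical, and the mapping of MO_TAGGED to the 12-byte case is an assumption.

```python
def naive_mov_imm_insn_count(imm, bits=64):
    """Count MOVZ/MOVK instructions in a naive chunk-by-chunk expansion."""
    imm &= (1 << bits) - 1  # treat the immediate as an unsigned bit pattern
    chunks = [(imm >> shift) & 0xFFFF for shift in range(0, bits, 16)]
    # At least one instruction is needed, even for zero.
    return max(1, sum(1 for chunk in chunks if chunk != 0))

def pseudo_size_in_bytes(insn_count):
    """Each AArch64 instruction is 4 bytes."""
    return 4 * insn_count

def movaddr_size_in_bytes(tagged):
    """8 or 12 bytes; assumes the tagged form is the one needing a third instruction."""
    return 12 if tagged else 8
```

Under this model a 32-bit immediate with both halves non-zero takes two instructions, so it reports 8 bytes rather than the default 4.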
DeltaFile
+22-0llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+22-01 files