LLVM/project b6e4d27mlir/include/mlir/Interfaces/Utils MemorySlotUtils.h, mlir/lib/Dialect/SCF/IR MemorySlot.cpp

[MLIR][Mem2Reg] Extract shared utilities for PromotableRegionOpInterface (#188514)

The `PromotableRegionOpInterface` implementations use two helpers that
are likely useful for other dialects implementing this interface as
well:
- `updateTerminator`: Appends the reaching definition as an operand to a
block's terminator, falling back to a default when the block has no
entry (e.g. dead code).
- `replaceWithNewResults`: Clones an operation with additional result
types while preserving its regions, then replaces the original.
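
The fallback behavior described for `updateTerminator` can be sketched
with a toy model. This is not the MLIR API: `Value`, `Block`, and
`updateTerminatorSketch` are hypothetical stand-ins that only mirror the
"append the reaching definition, else a default" rule from the
description above.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Toy stand-ins (NOT the MLIR types): a value is just a name and a
// block is an id plus the operand list of its terminator.
using Value = std::string;
struct Block {
  int id;
  std::vector<Value> terminatorOperands;
};

// Append the reaching definition recorded for `block` to its terminator.
// If nothing was recorded for the block (e.g. it is dead code), append
// `defaultValue` instead.
void updateTerminatorSketch(
    Block &block, const std::unordered_map<int, Value> &reachingAtBlockEnd,
    const Value &defaultValue) {
  auto it = reachingAtBlockEnd.find(block.id);
  block.terminatorOperands.push_back(
      it != reachingAtBlockEnd.end() ? it->second : defaultValue);
}
```

A block with a map entry gets that entry appended; a block without one
gets the default, so every terminator ends up with the same number of
extra operands.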

This PR extracts them into a common utility header so that downstream
dialects can reuse them directly.
I'm open to discussion about the location of these utilities.
DeltaFile
+188-0mlir/unittests/Interfaces/MemorySlotUtilsTest.cpp
+21-63mlir/lib/Dialect/SCF/IR/MemorySlot.cpp
+51-0mlir/lib/Interfaces/Utils/MemorySlotUtils.cpp
+36-0mlir/include/mlir/Interfaces/Utils/MemorySlotUtils.h
+15-0mlir/lib/Interfaces/Utils/CMakeLists.txt
+7-0mlir/unittests/Interfaces/CMakeLists.txt
+318-631 files not shown
+319-637 files

LLVM/project 2cf73afllvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-crossing-siv-addrec-wrap.ll weak-crossing-siv-delta-signed-min.ll

fix testcases after rebase and address conflict resolution issue
DeltaFile
+0-24llvm/lib/Analysis/DependenceAnalysis.cpp
+7-17llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-addrec-wrap.ll
+2-2llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-delta-signed-min.ll
+1-1llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-overflow.ll
+1-1llvm/test/Analysis/DependenceAnalysis/WeakCrossingSIV.ll
+1-1llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-large-btc.ll
+12-466 files

LLVM/project 06725d7llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp, llvm/lib/CodeGen/SelectionDAG LegalizeIntegerTypes.cpp

[GISel] Keep non-negative info in SUB(CTLZ) (#189314)

Implement non-negative value tracking for SUB-CTLZ chains in GlobalISel,
matching the behavior previously added to SelectionDAG.
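
As an illustration of why a SUB fed by a CTLZ can be known non-negative,
here is a minimal sketch. It assumes the pattern in question is a narrow
count-leading-zeros computed on a wider type, which is not confirmed by
the commit message; `clz32` and `clz8ViaWiden` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros on 32 bits; returns 32 for zero.
unsigned clz32(uint32_t v) {
  unsigned n = 0;
  for (uint32_t bit = 0x80000000u; bit != 0 && !(v & bit); bit >>= 1)
    ++n;
  return n;
}

// 8-bit CTLZ computed on a zero-extended 32-bit value. The upper 24
// bits are always zero, so the wide count is at least 24 and the
// subtraction can never go below zero.
unsigned clz8ViaWiden(uint8_t v) {
  unsigned wide = clz32(static_cast<uint32_t>(v));
  assert(wide >= 24u);
  return wide - 24u;
}
```

Because the subtrahend is bounded by the minimum possible count, a
compiler that tracks this fact can treat the result as non-negative.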

Additionally, refactor the SelectionDAG implementation from the previous
patch to improve performance and code density.

Related to https://github.com/llvm/llvm-project/issues/136516 and
https://github.com/llvm/llvm-project/pull/186338#discussion_r2980420174
DeltaFile
+12-28llvm/test/CodeGen/AArch64/cls.ll
+4-4llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-ctlz.mir
+4-4llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-ctlz-rv64.mir
+4-4llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-ctlz-rv32.mir
+6-1llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+3-4llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
+33-459 files not shown
+49-6215 files

LLVM/project e61d016llvm/lib/IR AutoUpgrade.cpp, llvm/test/Bitcode amdgpu-wmma-drop-ab-mods-upgrade.ll

[AMDGPU] Drop A and B neg modifier from amdgcn_wmma_bf16_16x16x32_bf16

Fixes: LCOMPILER-1673
DeltaFile
+6-46llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imod.gfx1250.w32.ll
+10-10llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+13-0llvm/test/Bitcode/amdgpu-wmma-drop-ab-mods-upgrade.ll
+7-3llvm/lib/IR/AutoUpgrade.cpp
+4-4llvm/test/CodeGen/AMDGPU/llvm.amdgcn.wmma.imm.gfx1250.w32.ll
+3-5mlir/include/mlir/Dialect/LLVMIR/ROCDLOps.td
+43-6810 files not shown
+60-8816 files

LLVM/project 26e0d15llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/PhaseOrdering/X86 avg.ll

[SLP] Prefer to trim equal-cost alternate-shuffle subtrees

If the trimming candidate subtree is rooted at an alternate-shuffle node
with binary ops, and this subtree has the same cost as the buildvector
node, prefer the buildvector node to avoid runtime perf regressions from
shuffle/extra-operation overhead that the cost model may underestimate.
Skip trimming if the subtree contains ExtractElement nodes, since those
operate on already-materialized vectors, which may reduce
vector-to-scalar code movement and perform better.

Reviewers: hiraditya, bababuck, fhahn, RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/188272
DeltaFile
+21-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+10-12llvm/test/Transforms/SLPVectorizer/AArch64/unprofitable-alternate-subtree.ll
+7-7llvm/test/Transforms/PhaseOrdering/X86/avg.ll
+38-223 files

LLVM/project 55f79faclang/test/OpenMP target_teams_distribute_parallel_for_simd_schedule_codegen.cpp teams_distribute_parallel_for_simd_schedule_codegen.cpp, libc/AOR_v20.02/math/test/traces sincosf.txt exp.txt

Merge branch 'main' into users/amehsan/weakc-delta-overflow
DeltaFile
+0-31,999libc/AOR_v20.02/math/test/traces/sincosf.txt
+0-16,000libc/AOR_v20.02/math/test/traces/exp.txt
+6,911-6,946llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+6,432-6,562llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll
+5,294-4,814clang/test/OpenMP/target_teams_distribute_parallel_for_simd_schedule_codegen.cpp
+5,238-4,758clang/test/OpenMP/teams_distribute_parallel_for_simd_schedule_codegen.cpp
+23,875-71,07910,718 files not shown
+673,313-424,81210,724 files

LLVM/project 804ece6llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis WeakCrossingSIV.ll weak-crossing-siv-addrec-wrap.ll

[DA] Require `nsw` for AddRecs in the WeakCrossing SIV test (#185041)

Before the algorithm in the weak crossing SIV test starts, we need to
ensure both addrecs are `nsw`.
DeltaFile
+2-2llvm/test/Analysis/DependenceAnalysis/WeakCrossingSIV.ll
+1-3llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-addrec-wrap.ll
+3-0llvm/lib/Analysis/DependenceAnalysis.cpp
+6-53 files

LLVM/project 6021270utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[Bazel] Fixes 04785ad (#189456)

This fixes 04785adec34ddf9a6ec47f10da5b2b7fe8c9f9c8.

Co-authored-by: Google Bazel Bot <google-bazel-bot at google.com>
DeltaFile
+1-0utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+1-01 files

LLVM/project e77104bclang/lib/Driver/ToolChains Clang.cpp

clang: Use MakeArgStringRef more often

Avoid an intermediate copy by using MakeArgStringRef. Also
make better use of Twine with MakeArgString.
DeltaFile
+9-12clang/lib/Driver/ToolChains/Clang.cpp
+9-121 files

LLVM/project b660fe1clang/lib/Driver/ToolChains Clang.cpp

clang: Reorder linker aux-triple handling

Move the IsCuda check out from the IsCuda || isHIP block. Keep
this from splitting the aux-triple handling for future convenience.
DeltaFile
+20-19clang/lib/Driver/ToolChains/Clang.cpp
+20-191 files

LLVM/project 23f95fallvm/include/llvm/ABI FunctionInfo.h

[LLVM] Fix invalid shadowed type name
DeltaFile
+7-7llvm/include/llvm/ABI/FunctionInfo.h
+7-71 files

LLVM/project 15a7c45libc/shared/math asinbf16.h, libc/src/__support/math asinbf16.h

[libc][math][c23] Add asinbf16 math function (#184170)

Co-authored-by: bassiounix <muhammad.m.bassiouni at gmail.com>
DeltaFile
+95-0libc/src/__support/math/asinbf16.h
+43-0libc/test/src/math/asinbf16_test.cpp
+41-0libc/test/src/math/smoke/asinbf16_test.cpp
+23-0libc/shared/math/asinbf16.h
+22-0utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+21-0libc/src/math/asinbf16.h
+245-027 files not shown
+355-833 files

LLVM/project f10dccdmlir/lib/Dialect/SparseTensor/IR/Detail LvlTypeParser.cpp DimLvlMapParser.cpp

[MLIR][SparseTensor] Add #undef FAILURE_IF_FAILED and ERROR_IF (#188685)

Both DimLvlMapParser.cpp and LvlTypeParser.cpp define FAILURE_IF_FAILED
and ERROR_IF macros that are never undefined, which can leak into
subsequent translation units in unity builds. Add #undef at the end of
each file. See
https://discourse.llvm.org/t/rfc-enabling-unity-build/90306 for more
info.
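
The pattern can be sketched as follows. The macro and function names
here are hypothetical, not the ones from the SparseTensor parsers: a
file-local helper macro is defined, used, and then undefined at the end
of the file so that it cannot leak into whatever source file is
concatenated after it in a unity build.

```cpp
#include <string>

// Hypothetical file-local helper macro.
#define RETURN_FALSE_IF(cond)                                                  \
  do {                                                                         \
    if (cond)                                                                  \
      return false;                                                            \
  } while (false)

// Parses a single decimal digit; returns false on any other input.
bool parseDigit(const std::string &s, int &out) {
  RETURN_FALSE_IF(s.size() != 1);
  RETURN_FALSE_IF(s[0] < '0' || s[0] > '9');
  out = s[0] - '0';
  return true;
}

// End of file: drop the macro so later code in the same (unity)
// translation unit does not see it.
#undef RETURN_FALSE_IF

#ifdef RETURN_FALSE_IF
#error "RETURN_FALSE_IF leaked past the end of this file"
#endif
```

Without the `#undef`, a later file that defines a macro of the same name
with a different body would trigger a redefinition diagnostic, or worse,
silently pick up the wrong definition.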

"clauded" not coded
DeltaFile
+3-0mlir/lib/Dialect/SparseTensor/IR/Detail/LvlTypeParser.cpp
+3-0mlir/lib/Dialect/SparseTensor/IR/Detail/DimLvlMapParser.cpp
+6-02 files

LLVM/project 03869c7mlir/lib/Dialect/SparseTensor/Transforms/Utils SparseTensorIterator.cpp LoopEmitter.cpp

[MLIR][SparseTensor] Add missing #undef REMUI and DIVUI (#188686)

LoopEmitter.cpp and SparseTensorIterator.cpp define REMUI and DIVUI
macros but the existing #undef block at the end of each file omits them.
This can leak the macros into subsequent translation units in unity
builds. See https://discourse.llvm.org/t/rfc-enabling-unity-build/90306
for more info.

"clauded" not coded
DeltaFile
+2-0mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorIterator.cpp
+2-0mlir/lib/Dialect/SparseTensor/Transforms/Utils/LoopEmitter.cpp
+4-02 files

LLVM/project 0d2c59aclang/lib/Headers gpuintrin.h nvptxintrin.h, clang/test/Headers gpuintrin_lang.c

[Clang] Fix constant bit widths in gpuintrin.h (#189387)

Summary:
The `ull` suffix can mean 128 bits on some architectures. Replace this
with the `stdint.h` constructor to be certain.
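
A minimal sketch of the idea, assuming the "constructor" refers to the
`UINT64_C`-style macros from `stdint.h` (the commit message does not name
them). `laneBit` and `allLanes` are hypothetical names chosen for
illustration.

```cpp
#include <cstdint>

// UINT64_C(x) expands to a constant of type uint_least64_t, so the
// width of the constant follows the 64-bit type rather than whatever
// the target picks for `unsigned long long`.
constexpr uint64_t laneBit(unsigned lane) {
  // Precondition: lane < 64.
  return UINT64_C(1) << lane;
}

// All 64 lanes set.
constexpr uint64_t allLanes = ~UINT64_C(0);
```

Writing the constants this way keeps the intended width visible at the
use site instead of relying on a literal suffix.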
DeltaFile
+12-13clang/lib/Headers/gpuintrin.h
+3-0clang/test/Headers/Inputs/include/stdint.h
+1-1clang/lib/Headers/nvptxintrin.h
+1-0clang/test/Headers/gpuintrin_lang.c
+17-144 files

LLVM/project 7364203llvm/lib/Target/AMDGPU AMDGPUCoExecSchedStrategy.cpp AMDGPUCoExecSchedStrategy.h, llvm/test/CodeGen/AMDGPU coexec-scheduler.ll coexec-sched-effective-stall.mir

Reapply "[AMDGPU] Add HWUI pressure heuristics to coexec strategy (#184929)" (#189121)

Reland https://github.com/llvm/llvm-project/pull/184929 after fixing
some issues in the NDEBUG builds.

3a640ee is unchanged from the previously approved PR; the unreviewed
portion of this PR is 9cabd8d.
DeltaFile
+606-0llvm/test/CodeGen/AMDGPU/coexec-scheduler.ll
+423-23llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.cpp
+288-2llvm/lib/Target/AMDGPU/AMDGPUCoExecSchedStrategy.h
+5-5llvm/test/CodeGen/AMDGPU/coexec-sched-effective-stall.mir
+1,322-304 files

LLVM/project a6ffdb5clang/lib/Headers gpuintrin.h, clang/test/Headers gpuintrin.c

[Clang] Improve scan in gpuintrin.h (#189381)

Summary:
Right now the scan checks to avoid the unspecified behavior in
`clzg(0)`. This is used as the source to the shuffle instruction, but
the argument is discarded at zero anyway. So, we simply pass unspecified
behavior to shuffle and then discard it. This should be fine. The scan
routines are expected to be optimal.

Also renames `sum` to `add`.
DeltaFile
+148-172clang/test/Headers/gpuintrin.c
+6-7clang/lib/Headers/gpuintrin.h
+2-2libc/src/__support/GPU/utils.h
+156-1813 files

LLVM/project c7ba9bbflang/lib/Semantics check-omp-loop.cpp, llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Address comments

Created using spr 1.3.7
DeltaFile
+13-6llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+0-9flang/lib/Semantics/check-omp-loop.cpp
+13-152 files

LLVM/project 651b61fllvm/test/CodeGen/Hexagon gen-pred-andn-orn.ll sched-timing-classes.ll

[Hexagon] Add coverage tests for CodeGen analysis and optimization passes (#183952)

Add tests targeting Hexagon CodeGen analysis and optimization passes:

- gen-pred-andn-orn.ll: HexagonGenPredicate pass exercising andn/orn
logical operations, cmp-zero conversion paths, deeper predicate chains,
and byte comparison classification.

- memcpy-likely-aligned.ll: HexagonSelectionDAGInfo exercising the
aligned memcpy specialization path.

- constprop-fp-cmp.ll: HexagonConstPropagation exercising floating-
point comparison constant folding paths.

- sched-timing-classes.ll: Scheduling timing class coverage for various
Hexagon instruction classes.
DeltaFile
+129-0llvm/test/CodeGen/Hexagon/gen-pred-andn-orn.ll
+78-0llvm/test/CodeGen/Hexagon/sched-timing-classes.ll
+27-20llvm/test/CodeGen/Hexagon/memcpy-likely-aligned.ll
+45-0llvm/test/CodeGen/Hexagon/constprop-fp-cmp.ll
+279-204 files

LLVM/project ba22818lld/ELF/Arch Hexagon.cpp, lld/test/ELF hexagon-thunk-range-plt.s hexagon-thunks-packets.s

[lld][Hexagon] Fix out-of-range PLT branch thunks (#186545)

Linking large Hexagon binaries (e.g. ASan runtime with >8 MiB of text)
fails with R_HEX_B22_PCREL / R_HEX_PLT_B22_PCREL relocation overflow on
calls to PLT entries, even though the thunk infrastructure exists and
needsThunks is set.

needsThunk() always used s.getVA() to compute the branch destination,
even for PLT calls where the actual destination is the PLT entry. This
meant the distance check used the wrong address and failed to create
thunks when the PLT entry was out of B22_PCREL range.

Fix by using s.getPltVA() when expr == R_PLT_PC. Also override
getThunkSectionSpacing() so ThunkSections are pre-created at appropriate
intervals for large binaries.
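
The distance check can be sketched as below. This is not the lld code:
`SymbolAddrs`, `needsThunkSketch`, and `kB22Range` are hypothetical, and
the +/- 8 MiB range is an assumption based on a 22-bit signed word
offset.

```cpp
#include <cstdint>

// Hypothetical stand-in for the two addresses a symbol can resolve to.
struct SymbolAddrs {
  uint64_t va;    // address of the symbol itself
  uint64_t pltVA; // address of its PLT entry
};

// Assumed reach of a B22_PCREL branch: 22-bit signed offset in 4-byte
// words, i.e. [-8 MiB, +8 MiB).
constexpr int64_t kB22Range = int64_t(1) << 23;

// Decide whether a branch at `branchAddr` needs a thunk. The key point
// is that calls routed through the PLT must measure the distance to the
// PLT entry, not to the symbol.
bool needsThunkSketch(bool callGoesThroughPlt, uint64_t branchAddr,
                      const SymbolAddrs &s) {
  uint64_t dest = callGoesThroughPlt ? s.pltVA : s.va;
  int64_t offset = static_cast<int64_t>(dest - branchAddr);
  return offset < -kB22Range || offset >= kB22Range;
}
```

With a symbol close to the caller but a PLT entry far away, checking
`va` reports "in range" while checking `pltVA` correctly asks for a
thunk.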
DeltaFile
+38-51lld/test/ELF/hexagon-thunk-range-plt.s
+42-38lld/test/ELF/hexagon-thunks-packets.s
+9-11lld/test/ELF/hexagon-thunks.s
+12-2lld/ELF/Arch/Hexagon.cpp
+101-1024 files

LLVM/project 04785adllvm/include/llvm/ABI FunctionInfo.h, llvm/lib/ABI FunctionInfo.cpp CMakeLists.txt

[LLVMABI] Create ABI Utils (#185105)

This PR introduces `ABIFunctionInfo` and surrounding utility helpers,
and is part of the set of breakout PRs to upstream the LLVM ABI lowering
library prototyped in https://github.com/llvm/llvm-project/pull/140112.

`ABIFunctionInfo` is directly analogous to `CGFunctionInfo` from Clang's
existing CodeGen pipeline, and represents an ABI lowered view of the
function signature, decoupled from both the Clang AST and LLVM IR.

`ABIArgInfo` encodes lowering decisions and currently supports
Direct, Extend, Indirect, and Ignore, which are required for our initial
goal of implementing x86-64 SysV and BPF, but this will change as the
library grows to represent more targets that need them.

This PR is a direct precursor to the implementation of `ABIInfo` in the
library, as demonstrated in the PR linked above.
DeltaFile
+269-0llvm/include/llvm/ABI/FunctionInfo.h
+31-0llvm/lib/ABI/FunctionInfo.cpp
+1-0llvm/lib/ABI/CMakeLists.txt
+301-03 files

LLVM/project 14ab059llvm/lib/Target/AMDGPU AMDGPUTargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AMDGPU log10.ll log.ll

[AMDGPU][TTI] Update cost model for transcendental instructions to be more precise (#189430)

Introduce `getTransInstrCost` instead of `getQuarterRateInstrCost` for transcendental ops
DeltaFile
+380-248llvm/test/Analysis/CostModel/AMDGPU/log10.ll
+380-248llvm/test/Analysis/CostModel/AMDGPU/log.ll
+218-218llvm/test/Analysis/CostModel/AMDGPU/sqrt.ll
+202-202llvm/test/Analysis/CostModel/AMDGPU/log2.ll
+236-104llvm/test/Analysis/CostModel/AMDGPU/sin.ll
+83-10llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
+1,499-1,0301 files not shown
+1,501-1,0307 files

LLVM/project 5c9440fflang/lib/Semantics check-omp-loop.cpp

[flang][OpenMP] Remove misplaced comment, NFC (#189449)

Remove the seemingly random comment listing clauses allowed on a DO
construct. The nearby code has nothing to do with clauses.
DeltaFile
+0-9flang/lib/Semantics/check-omp-loop.cpp
+0-91 files

LLVM/project 5d836e9llvm/test/Analysis/DependenceAnalysis weak-crossing-siv-addrec-wrap.ll WeakCrossingSIV.ll

update testcases
DeltaFile
+9-18llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-addrec-wrap.ll
+1-1llvm/test/Analysis/DependenceAnalysis/WeakCrossingSIV.ll
+10-192 files

LLVM/project 9766d6dclang-tools-extra/test/clang-doc enum.cpp, clang/lib/Sema HLSLBuiltinTypeDeclBuilder.cpp

Fix formatting

Created using spr 1.3.7
DeltaFile
+464-226clang-tools-extra/test/clang-doc/enum.cpp
+296-190llvm/test/CodeGen/AMDGPU/llvm.amdgcn.tanh.ll
+428-0llvm/test/Transforms/InstCombine/fcmp-select-sign.ll
+199-191libc/test/shared/shared_math_test.cpp
+299-44clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp
+330-0clang/test/CodeGenObjC/expose-direct-method.m
+2,016-651333 files not shown
+8,094-2,924339 files

LLVM/project ef93c9ellvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 scatter-vectorize-reorder-non-empty.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+346-5llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+8-9llvm/test/Transforms/SLPVectorizer/X86/scatter-vectorize-reorder-non-empty.ll
+354-142 files

LLVM/project 6c5f280flang/lib/Semantics check-omp-loop.cpp

[flang][OpenMP] Remove misplaced comment, NFC

Remove the seemingly random comment listing clauses allowed on a DO
construct. The nearby code has nothing to do with clauses.
DeltaFile
+0-9flang/lib/Semantics/check-omp-loop.cpp
+0-91 files

LLVM/project 0b500d5llvm/include/llvm/Support KnownFPClass.h, llvm/lib/Analysis ValueTracking.cpp

[Support] Move `KnownFPClass` inference from `KnownBits` to Support (#189414)

Move the logic for inferring `KnownFPClass` from known bits into the
Support library so that it can be reused, e.g., for analogous value
tracking functions in SelectionDAG.
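
To give a feel for this kind of inference, here is a toy sketch for
IEEE-754 binary32. It does not use the LLVM `KnownBits` or
`KnownFPClass` types; `KnownBits32`, `FPFacts`, and `inferFromKnownBits`
are hypothetical and only encode facts that follow directly from the
bit layout.

```cpp
#include <cstdint>

// Toy known-bits record: a set bit in `zero` means that bit of the
// value is known to be 0, a set bit in `one` means it is known to be 1.
struct KnownBits32 {
  uint32_t zero;
  uint32_t one;
};

// A few floating-point facts derivable from the bit pattern alone.
struct FPFacts {
  bool signKnownClear;       // sign bit is known to be 0
  bool neverInfOrNaN;        // inf/NaN need an all-ones exponent
  bool neverZeroOrSubnormal; // zero/subnormal need an all-zero exponent
};

FPFacts inferFromKnownBits(KnownBits32 k) {
  const uint32_t signMask = 0x80000000u; // bit 31
  const uint32_t expMask = 0x7F800000u;  // bits 30..23
  return {(k.zero & signMask) != 0, (k.zero & expMask) != 0,
          (k.one & expMask) != 0};
}
```

Since the rules depend only on the bit layout, they can live in a
library that both IR-level and DAG-level analyses link against.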
DeltaFile
+53-0llvm/lib/Support/KnownFPClass.cpp
+1-44llvm/lib/Analysis/ValueTracking.cpp
+4-0llvm/include/llvm/Support/KnownFPClass.h
+58-443 files

LLVM/project 03cc2a3. .mailmap

[mailmap] Add mailmap entry for myself (#189447)
DeltaFile
+1-0.mailmap
+1-01 files

LLVM/project db80420llvm/lib/Target/PowerPC PPCISelLowering.cpp

[PowerPC] Respect chain operand for llvm.ppc.disassemble.dmr lowering (#188334)

Fix ignoring the input chain when turning llvm.ppc.disassemble.dmr into
a store.
DeltaFile
+3-1llvm/lib/Target/PowerPC/PPCISelLowering.cpp
+3-11 files