LLVM/project 8a11fe9llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis Propagating.ll Banerjee.ll

[DA] Require `nsw` for AddRecs involved in GCD test (#186892)

Similar to other tests, this adds a check that the AddRecs used in the
GCD test are `nsw`. In this case, all recursively identified `AddRec`s are
also checked. Note that there is already a similar check in
`getConstantCoefficient` for expressions processed in that function.
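
For context, the classic GCD test can be sketched as follows. This is only an illustration of the underlying arithmetic, not the code in `DependenceAnalysis.cpp`; the helper name `gcdTestMayDepend` and its parameters are hypothetical. The reasoning assumes the subscript expressions never wrap, which is the property the `nsw` requirement is meant to guarantee.

```cpp
#include <cstdint>
#include <numeric>

// Hypothetical helper, not LLVM's implementation.
// Two accesses with subscripts a*i + c1 and b*j + c2 can touch the same
// element only if a*i - b*j = c2 - c1 has an integer solution, which
// requires gcd(a, b) to divide (c2 - c1). The argument is only sound if
// a*i + c1 and b*j + c2 do not wrap.
bool gcdTestMayDepend(int64_t a, int64_t b, int64_t c1, int64_t c2) {
  int64_t g = std::gcd(a, b); // non-negative; 0 only if a == b == 0
  int64_t delta = c2 - c1;    // assumed not to overflow in this sketch
  if (g == 0)
    return delta == 0;        // both subscripts are constants
  return delta % g == 0;      // false means independence is proven
}
```

For example, `A[2*i]` and `A[2*j + 1]` give `gcdTestMayDepend(2, 2, 0, 1) == false`, so the accesses are independent as long as neither subscript wraps.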
DeltaFile
+3-3llvm/test/Analysis/DependenceAnalysis/Propagating.ll
+2-2llvm/test/Analysis/DependenceAnalysis/Banerjee.ll
+2-0llvm/lib/Analysis/DependenceAnalysis.cpp
+1-1llvm/test/Analysis/DependenceAnalysis/exact-rdiv-addrec-wrap.ll
+8-64 files

LLVM/project 62ce560lldb/source/Plugins/Disassembler/LLVMC DisassemblerLLVMC.cpp

[lldb] Remove some unreachable code (NFC) (#190529)

The `isRISCV()` check always returns false because we only get here if
`min_op_byte_size` and `max_op_byte_size` are equal, which is not true
for RISC-V.
Also, replace the `if (!got_op)` check with an `else`. The check is
equivalent to
`if (min_op_byte_size != max_op_byte_size)`, and the `if` above checks
for the opposite condition.
DeltaFile
+3-15lldb/source/Plugins/Disassembler/LLVMC/DisassemblerLLVMC.cpp
+3-151 files

LLVM/project 020b3b2clang/test/Analysis/Scalable/ssaf-format list.test

Update clang/test/Analysis/Scalable/ssaf-format/list.test

Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
DeltaFile
+2-2clang/test/Analysis/Scalable/ssaf-format/list.test
+2-21 files

LLVM/project ef71584llvm/lib/Target/AMDGPU SIMemoryLegalizer.cpp

[NFC][AMDGPU] Add some debug prints to SIMemoryLegalizer (#190658)
DeltaFile
+69-0llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
+69-01 files

LLVM/project 7087ecemlir/lib/ExecutionEngine CudaRuntimeWrappers.cpp, mlir/test/Integration/GPU/CUDA async.mlir

[MLIR][ExecutionEngine] Tolerate CUDA_ERROR_DEINITIALIZED in mgpuModuleUnload (#190563)

`mgpuModuleUnload` may be called from a global destructor (registered by
`SelectObjectAttr`'s `appendToGlobalDtors`) after the CUDA primary
context has already been destroyed during program shutdown. In this
case, `cuModuleUnload` returns `CUDA_ERROR_DEINITIALIZED`, which is
benign since the module's resources are already freed with the context.
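
The tolerance described above can be sketched as a small predicate. This is a hedged illustration only: `UnloadResult` is a hypothetical stand-in for the CUDA driver's `CUresult` so that the sketch compiles without CUDA headers, and `shouldReportUnloadError` is not a function in `CudaRuntimeWrappers.cpp`.

```cpp
// Hypothetical stand-in for CUresult; real code would use <cuda.h>.
enum class UnloadResult {
  Success,       // corresponds to CUDA_SUCCESS
  Deinitialized, // corresponds to CUDA_ERROR_DEINITIALIZED
  OtherError     // any other failure
};

// Returns true if the result of unloading a module should be reported.
// A deinitialized driver means the primary context is already gone, so
// the module's resources were freed with it and nothing needs reporting.
bool shouldReportUnloadError(UnloadResult Result) {
  switch (Result) {
  case UnloadResult::Success:
  case UnloadResult::Deinitialized:
    return false;
  case UnloadResult::OtherError:
    return true;
  }
  return true; // unreachable for valid enum values
}
```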

## Reproduction

Any program that uses `gpu.launch_func` and is AOT-compiled (via
`mlir-translate --mlir-to-llvmir | llc | cc -lmlir_cuda_runtime`) will
print `'cuModuleUnload(module)' failed with '<unknown>'` on exit. This
is because `SelectObjectAttr` registers the module unload as a global
destructor, which runs after the CUDA primary context is released.

This script reproduces the error message from `mgpuModuleUnload` on my
system:


    [40 lines not shown]
DeltaFile
+29-5mlir/lib/ExecutionEngine/CudaRuntimeWrappers.cpp
+2-5mlir/test/Integration/GPU/CUDA/async.mlir
+31-102 files

LLVM/project af95b0allvm/test/CodeGen/AMDGPU llvm.exp10.f64.ll llvm.exp2.f64.ll, llvm/test/CodeGen/AMDGPU/GlobalISel insertelement.i16.ll atomicrmw_uinc_wrap.ll

[AMDGPU] Remove implicit super-reg defs on mov64 pseudos (#190379)

The mov64 pseudo is split into two 32-bit movs, but those 32-bit movs
still had the full 64-bit register implicitly defined. Removing these
implicit defs affects VOPD formation, so more VOPD instructions can be
emitted.
DeltaFile
+279-279llvm/test/CodeGen/AMDGPU/llvm.exp10.f64.ll
+254-254llvm/test/CodeGen/AMDGPU/llvm.exp2.f64.ll
+166-278llvm/test/CodeGen/AMDGPU/fcanonicalize.ll
+184-184llvm/test/CodeGen/AMDGPU/llvm.exp.f64.ll
+147-174llvm/test/CodeGen/AMDGPU/GlobalISel/insertelement.i16.ll
+121-135llvm/test/CodeGen/AMDGPU/GlobalISel/atomicrmw_uinc_wrap.ll
+1,151-1,30490 files not shown
+2,283-2,60696 files

LLVM/project 9bddf47mlir/lib/Dialect/XeGPU/Transforms XeGPUWgToSgDistribute.cpp, mlir/test/Dialect/XeGPU xegpu-wg-to-sg-unify-ops.mlir xegpu-wg-to-sg-unify-ops-rr.mlir

[MLIR][XeGPU] Extend Wg-to-Sg Distribution of Multi-Reduction Op for round-robin layout (#189988)

This PR enhances the multi-reduction op pattern of the wg-to-sg
distribution pass:
1. Allows each sg to have multiple distributions of sg_data tiles.
2. Expands the SLM buffer size.
3. Constructs the layout based on the partially reduced vector and uses
layout.computeDistributedCoords() to compute coordinates. The layout is
constructed so that the store is cooperative and the load overlaps with
neighbouring threads.
4. Performs the save and load.
DeltaFile
+63-76mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp
+8-68mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-unify-ops.mlir
+52-0mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-unify-ops-rr.mlir
+123-1443 files

LLVM/project 9ab2b6dllvm/lib/CAS DatabaseFile.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+1-1llvm/lib/CAS/DatabaseFile.cpp
+1-11 files

LLVM/project 97d50c1llvm/lib/Target/AArch64 AArch64.h AArch64PassRegistry.def, llvm/lib/Target/AArch64/GISel AArch64PreLegalizerCombiner.cpp

[NewPM] Adds a port for AArch64PreLegalizerCombiner (#190567)

Standard porting (note that TargetPassConfig dependency was [removed
earlier](https://github.com/llvm/llvm-project/commit/e27e7e433974b24c90fed9f0b646bed84e47681e)).

---------

Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
DeltaFile
+92-30llvm/lib/Target/AArch64/GISel/AArch64PreLegalizerCombiner.cpp
+15-1llvm/lib/Target/AArch64/AArch64.h
+2-0llvm/lib/Target/AArch64/AArch64PassRegistry.def
+1-1llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
+1-0llvm/test/CodeGen/AArch64/GlobalISel/combine-2-icmps-of-0-and-or.mir
+111-325 files

LLVM/project ee51de9llvm/include/llvm/ProfileData/Coverage CoverageMapping.h, llvm/lib/ProfileData/Coverage CoverageMapping.cpp

[llvm-cov] add ability to show non executed test vectors for mc/dc coverage (#187517)

- Added the `-show-mcdc-non-executed-vectors` option
- Non-executed test vectors are now tracked
- When the option is present, they are written to the UI
DeltaFile
+145-2llvm/test/tools/llvm-cov/mcdc-const.test
+132-4llvm/test/tools/llvm-cov/mcdc-general.test
+133-3llvm/test/tools/llvm-cov/mcdc-macro.test
+122-2llvm/test/tools/llvm-cov/mcdc-general-none.test
+64-35llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
+37-23llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+633-696 files not shown
+774-10412 files

LLVM/project d917027llvm/lib/ProfileData/Coverage CoverageMapping.cpp, llvm/test/tools/llvm-cov main-view-fileid-regression.test

[llvm-cov] Guard against empty CountedRegions in findMainViewFileID (#189270)

When processing coverage generated from branch coverage mode, some
functions can reach findMainViewFileID with an empty CountedRegions
list. In that case the current logic still proceeds to infer the main
view file, even though there is no regular counted region available to
do so.

Return std::nullopt early when CountedRegions is empty.
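
The shape of such a guard can be sketched as below. This is a simplified illustration, not the code in `CoverageMapping.cpp`: `RegionSketch` and `findMainViewFileIDSketch` are hypothetical names, and the selection after the guard is a placeholder for the real inference logic.

```cpp
#include <optional>
#include <vector>

// Hypothetical, heavily simplified stand-in for a counted region.
struct RegionSketch {
  unsigned FileID;
};

std::optional<unsigned>
findMainViewFileIDSketch(const std::vector<RegionSketch> &CountedRegions) {
  // Defensive guard: with no regular counted regions there is nothing to
  // infer the main view file from, so report "no answer" to the caller.
  if (CountedRegions.empty())
    return std::nullopt;
  // Placeholder selection; the real function is more involved.
  return CountedRegions.front().FileID;
}
```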

This was observed when reproducing issue #189169 with:
  cargo llvm-cov --lib --branch

The issue appears related to branch-only coverage information being
recorded separately in CountedBranchRegions, while
findMainViewFileID currently only consults CountedRegions.
This patch is a defensive fix for the empty-region case; further
investigation may still be needed to determine whether branch regions
should participate in main view file selection.

Co-authored-by: Zile Xiong <xiongzile99 at gmail.com>
DeltaFile
+32-0llvm/test/tools/llvm-cov/Inputs/main-view-fileid-regression.proftext
+16-0llvm/test/tools/llvm-cov/main-view-fileid-regression.test
+2-0llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+0-0llvm/test/tools/llvm-cov/Inputs/main-view-fileid-regression.covmapping
+50-04 files

LLVM/project 9033e87llvm/lib/Target/AMDGPU AMDGPURegBankLegalizeRules.cpp, llvm/test/CodeGen/AMDGPU/GlobalISel llvm.amdgcn.update.dpp.ll regbankselect-amdgcn.update.dpp.mir

[AMDGPU][GISel] RegBankLegalize rules for update_dpp (#190662)
DeltaFile
+3-3llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.update.dpp.ll
+4-0llvm/lib/Target/AMDGPU/AMDGPURegBankLegalizeRules.cpp
+1-1llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.update.dpp.mir
+8-43 files

LLVM/project 9f30927clang/test/Headers __clang_hip_math.hip, llvm/test/CodeGen/AMDGPU clmul.ll integer-mad-patterns.ll

rebase

Created using spr 1.3.4
DeltaFile
+3,666-5,073llvm/test/CodeGen/RISCV/rvv/expandload.ll
+4,371-0llvm/test/CodeGen/AMDGPU/clmul.ll
+1,103-1,014clang/test/Headers/__clang_hip_math.hip
+1,318-117llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+835-387llvm/test/CodeGen/AMDGPU/fcanonicalize.bf16.ll
+440-640llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
+11,733-7,2311,387 files not shown
+49,773-26,5641,393 files

LLVM/project 260a784clang/test/Headers __clang_hip_math.hip, llvm/test/CodeGen/AMDGPU clmul.ll integer-mad-patterns.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.4

[skip ci]
DeltaFile
+3,666-5,073llvm/test/CodeGen/RISCV/rvv/expandload.ll
+4,371-0llvm/test/CodeGen/AMDGPU/clmul.ll
+1,103-1,014clang/test/Headers/__clang_hip_math.hip
+1,318-117llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+835-387llvm/test/CodeGen/AMDGPU/fcanonicalize.bf16.ll
+440-640llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
+11,733-7,2311,387 files not shown
+49,773-26,5641,393 files

LLVM/project 8966581llvm/include/llvm/Analysis BlockFrequencyInfoImpl.h

[Analysis][NFC] Use block numbers in BlockFrequencyInfo (#190669)

Block pointers are only stored while constructing the analysis, so the
value handle to catch erased blocks is no longer needed when using
stable block numbers.
DeltaFile
+59-109llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
+59-1091 files

LLVM/project 92b595bflang/lib/Semantics check-cuda.cpp, flang/test/Semantics cuf25.cuf

[flang][cuda] Take associate into account for host array diagnostic (#190673)
DeltaFile
+22-0flang/test/Semantics/cuf25.cuf
+4-3flang/lib/Semantics/check-cuda.cpp
+26-32 files

LLVM/project f5c3fa2llvm/test/CodeGen/AMDGPU memory-legalizer-private-wavefront.ll memory-legalizer-private-workgroup.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.4
DeltaFile
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-wavefront.ll
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+8,449-1,355llvm/test/CodeGen/AMDGPU/memory-legalizer-private-cluster.ll
+8,449-1,355llvm/test/CodeGen/AMDGPU/memory-legalizer-private-agent.ll
+8,069-1,315llvm/test/CodeGen/AMDGPU/memory-legalizer-private-system.ll
+50,599-8,1234,113 files not shown
+370,875-117,1544,119 files

LLVM/project d804375llvm/test/CodeGen/AMDGPU memory-legalizer-private-workgroup.ll memory-legalizer-private-wavefront.ll

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.4

[skip ci]
DeltaFile
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-wavefront.ll
+8,544-1,366llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+8,449-1,355llvm/test/CodeGen/AMDGPU/memory-legalizer-private-cluster.ll
+8,449-1,355llvm/test/CodeGen/AMDGPU/memory-legalizer-private-agent.ll
+8,069-1,315llvm/test/CodeGen/AMDGPU/memory-legalizer-private-system.ll
+50,599-8,1234,110 files not shown
+370,857-117,1524,116 files

LLVM/project fbe6d79llvm/include/llvm/Transforms/Utils CodeMoverUtils.h, llvm/lib/Transforms/Scalar LoopFuse.cpp

[LoopFusion] Fix out-of-date LoopInfo being used during fusion (#189452)

This is a fix for
[187902](https://github.com/llvm/llvm-project/issues/187902), where
`LoopInfo` is not in a valid state at the beginning of `ScalarEvolution::createSCEVIter`.

The bug occurs because `mergeLatch()` is called at a point
where the control flow and dominator trees have been updated but `LoopInfo`
has not yet completed its update. `mergeLatch()` calls into
`ScalarEvolution`, which uses `LoopInfo`, and an out-of-date `LoopInfo` can
result in a crash or unpredictable results.

This patch moves `mergeLatch()` to the place where `LoopInfo` has
completed its update and hence is in a valid state.
DeltaFile
+22-26llvm/lib/Transforms/Scalar/LoopFuse.cpp
+11-4llvm/lib/Transforms/Utils/CodeMoverUtils.cpp
+6-7llvm/include/llvm/Transforms/Utils/CodeMoverUtils.h
+39-373 files

LLVM/project 1a0ca10llvm/include/llvm/CAS MappedFileRegionArena.h, llvm/lib/CAS OnDiskTrieRawHashMap.cpp MappedFileRegionArena.cpp

[CAS] Harden validate() against on-disk corruption (#190634)

Fixes found by fuzzer:

OnDiskTrieRawHashMap:
- Bounds-check data slot offsets in TrieVerifier::visitSlot() before
  calling getRecord(), preventing asData() assertion on out-of-bounds
  trie entries.
- Validate subtrie headers (NumBits, bounds) before constructing
  SubtrieHandle, preventing SEGV in getSlots() from corrupt NumBits.
- Validate arena bump pointer alignment, catching misaligned BumpPtr
  that would crash store() with an alignment assertion.
- Fix comma operator bug in getOrCreateRoot() where the
  compare_exchange_strong result was discarded, causing asSubtrie()
  assertion when RootTrieOffset was corrupted to zero.

OnDiskGraphDB:
- Reject invalid (zero) ref offsets in validate callback, preventing
  asData() assertion when corrupt data pool refs are resolved via

    [12 lines not shown]
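
The comma operator bug listed for `getOrCreateRoot()` follows a general C++ pattern that is easy to miss in review. The sketch below shows that pattern under stated assumptions: the function names and the exact expression are hypothetical and are not taken from `OnDiskTrieRawHashMap.cpp`.

```cpp
#include <atomic>
#include <cstdint>

// Buggy shape: the comma operator evaluates compare_exchange_strong,
// throws away its bool result, and yields the right-hand operand.
// The caller is told the root was installed even when it was not.
bool installRootBuggy(std::atomic<int64_t> &Root, int64_t NewOffset) {
  int64_t Expected = 0;
  return (Root.compare_exchange_strong(Expected, NewOffset), true);
}

// Fixed shape: the result of the exchange is what the caller sees.
bool installRootFixed(std::atomic<int64_t> &Root, int64_t NewOffset) {
  int64_t Expected = 0;
  return Root.compare_exchange_strong(Expected, NewOffset);
}
```

With a root that already holds a non-zero offset, `installRootBuggy` returns `true` while leaving the root unchanged, whereas `installRootFixed` returns `false`.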
DeltaFile
+49-2llvm/lib/CAS/OnDiskTrieRawHashMap.cpp
+23-9llvm/lib/CAS/MappedFileRegionArena.cpp
+23-1llvm/lib/CAS/OnDiskGraphDB.cpp
+6-1llvm/lib/CAS/DatabaseFile.cpp
+1-1llvm/include/llvm/CAS/MappedFileRegionArena.h
+102-145 files

LLVM/project 70d3dcallvm/include/llvm/Transforms/Utils Cloning.h, llvm/lib/Transforms/IPO Inliner.cpp

Revert "[Inliner] Put inline history into IR as !inline_history metadata" (#190666)

Reverts llvm/llvm-project#190092

Crashes reported in
https://github.com/llvm/llvm-project/pull/190092#issuecomment-4194546908
DeltaFile
+0-102llvm/test/Transforms/Inline/inline-history.ll
+28-57llvm/lib/Transforms/Utils/InlineFunction.cpp
+36-25llvm/lib/Transforms/IPO/Inliner.cpp
+0-55llvm/test/Verifier/inline-history-metadata.ll
+26-25llvm/lib/Transforms/Utils/CloneFunction.cpp
+17-19llvm/include/llvm/Transforms/Utils/Cloning.h
+107-28313 files not shown
+213-38619 files

LLVM/project 40d3949llvm/tools/llvm-cas-fuzzer cas-fuzzer.cpp DummyCASFuzzer.cpp

[CAS] Add llvm-cas-fuzzer for ObjectStore::validate() (#190635)

Add a fuzzer that creates an on-disk CAS database, stores objects, then
corrupts the on-disk data files using fuzzer-provided bytes and calls
validate(). The goal is that validate() should either succeed or return
an error, never crash.

The fuzzer supports 6 corruption modes: byte-level mutations, file
truncation, appending garbage, zeroing ranges, standalone file
corruption, and combined mutations with continued CAS operations.
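
Two of these modes, zeroing ranges and truncation, can be sketched on an in-memory buffer. This is an assumption-laden illustration rather than the code in `cas-fuzzer.cpp`: the helper names are hypothetical, and the real fuzzer operates on on-disk files.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: zero up to Length bytes starting at Offset.
// Offset is wrapped into range so any fuzzer-provided value is usable.
void zeroRange(std::vector<uint8_t> &Data, size_t Offset, size_t Length) {
  if (Data.empty())
    return;
  Offset %= Data.size();
  size_t End = Offset + std::min(Length, Data.size() - Offset);
  std::fill(Data.begin() + Offset, Data.begin() + End, uint8_t{0});
}

// Hypothetical helper: shrink the buffer, never grow it.
void truncateTo(std::vector<uint8_t> &Data, size_t NewSize) {
  Data.resize(std::min(NewSize, Data.size()));
}
```

Clamping every fuzzer-provided offset and length keeps the mutation itself well-defined, so any crash that follows can be attributed to the code under test.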

Assisted-By: Claude
DeltaFile
+387-0llvm/tools/llvm-cas-fuzzer/cas-fuzzer.cpp
+14-0llvm/tools/llvm-cas-fuzzer/DummyCASFuzzer.cpp
+10-0llvm/tools/llvm-cas-fuzzer/CMakeLists.txt
+411-03 files

LLVM/project 950f1delldb/include/lldb/Utility UUID.h

[lldb] Fix UUID tombstone Key (#190551)

This changes `DenseMapInfo<UUID>::getTombstoneKey()` to return a 1-byte
`{0xFF}` sentinel instead of the empty, default-constructed `UUID()`.
Returning the same key for the empty and tombstone values violates the
`DenseMap` invariant that the two sentinel keys are distinct.
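
The distinction can be sketched without LLVM headers. The struct below only mirrors the shape of a `DenseMapInfo` specialization; `Bytes` and `KeyInfoSketch` are hypothetical stand-ins, not the lldb `UUID` class or its actual specialization.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the bytes held by a UUID.
using Bytes = std::vector<uint8_t>;

// Mirrors the shape of a DenseMapInfo specialization. The empty key marks
// a never-used bucket and the tombstone key marks an erased bucket, so the
// two must compare unequal.
struct KeyInfoSketch {
  static Bytes getEmptyKey() { return {}; }
  static Bytes getTombstoneKey() { return {0xFF}; }
  static bool isEqual(const Bytes &A, const Bytes &B) { return A == B; }
};
```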
DeltaFile
+4-1lldb/include/lldb/Utility/UUID.h
+4-11 files

LLVM/project 9d0544dllvm/include/llvm/Analysis BlockFrequencyInfoImpl.h

[spr] initial version

Created using spr 1.3.8-wip
DeltaFile
+59-109llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
+59-1091 files

LLVM/project 9f6ac5allvm/include/llvm/Transforms/Utils Cloning.h, llvm/lib/Transforms/IPO Inliner.cpp

Revert "[Inliner] Put inline history into IR as !inline_history metadata (#19…"

This reverts commit 72d4ce9889a0bae9645de1a07cb051d0205cb964.
DeltaFile
+0-102llvm/test/Transforms/Inline/inline-history.ll
+28-57llvm/lib/Transforms/Utils/InlineFunction.cpp
+36-25llvm/lib/Transforms/IPO/Inliner.cpp
+0-55llvm/test/Verifier/inline-history-metadata.ll
+26-25llvm/lib/Transforms/Utils/CloneFunction.cpp
+17-19llvm/include/llvm/Transforms/Utils/Cloning.h
+107-28313 files not shown
+213-38619 files

LLVM/project f1cdb8cllvm/lib/Target/AMDGPU SIMemoryLegalizer.cpp

[NFC][AMDGPU] Add some debug prints to SIMemoryLegalizer
DeltaFile
+69-0llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
+69-01 files

LLVM/project 2aa4100compiler-rt/cmake/Modules AllSupportedArchDefs.cmake

[compiler-rt] Add hexagon to libFuzzer supported architectures (#190297)

LibFuzzer builds successfully for Hexagon Linux.
DeltaFile
+1-1compiler-rt/cmake/Modules/AllSupportedArchDefs.cmake
+1-11 files

LLVM/project 40d5a7dllvm/lib/Target/AMDGPU AMDGPUSearchableTables.td, llvm/test/Analysis/UniformityAnalysis/AMDGPU intrinsics.ll

[AMDGPU][UniformityAnalysis] Mark set_inactive and set_inactive_chain_arg as SourceOfDivergence (#190640)

`set_inactive` produces a result that varies per-lane based on the EXEC mask, even when both inputs are uniform.
DeltaFile
+8-6llvm/test/CodeGen/AMDGPU/fix-wwm-vgpr-copy.ll
+14-0llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
+2-0llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
+24-63 files

LLVM/project ba91009llvm/lib/Transforms/InstCombine InstCombineShifts.cpp, llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes LoadStoreVec.cpp

Address review feedback

Created using spr 1.3.7
DeltaFile
+440-640llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
+396-336llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-extract-vector-elt.mir
+0-311llvm/test/Transforms/InstCombine/icmp-shl-add-to-add.ll
+294-0llvm/test/MC/AMDGPU/vop3-literal-gfx1250.s
+41-111llvm/lib/Transforms/InstCombine/InstCombineShifts.cpp
+70-38llvm/lib/Transforms/Vectorize/SandboxVectorizer/Passes/LoadStoreVec.cpp
+1,241-1,436107 files not shown
+2,973-2,044113 files

LLVM/project 326593bclang/include/clang/Serialization ModuleCache.h, clang/lib/DependencyScanning InProcessModuleCache.cpp

[Support][Modules] Removed prepareForGetLock and its usages. Ensured parent directory exists when creating lock file. (#189888)

Following #187372
DeltaFile
+26-6llvm/lib/Support/LockFileManager.cpp
+28-0llvm/unittests/Support/LockFileManagerTest.cpp
+0-10clang/lib/Serialization/ModuleCache.cpp
+0-4clang/include/clang/Serialization/ModuleCache.h
+0-2clang/lib/DependencyScanning/InProcessModuleCache.cpp
+0-1clang/lib/Frontend/CompilerInstance.cpp
+54-231 files not shown
+54-247 files