LLVM/project 5eb66f4openmp/runtime/unittests/String TestKmpStr.cpp

enhance TestKmpStr.cpp
DeltaFile
+92-0openmp/runtime/unittests/String/TestKmpStr.cpp
+92-01 files

LLVM/project 9cae8ecmlir/lib/Dialect/SparseTensor/Transforms/Utils SparseTensorDescriptor.cpp

[MLIR] Apply clang-tidy fixes for bugprone-argument-comment in SparseTensorDescriptor.cpp (NFC)
DeltaFile
+2-2mlir/lib/Dialect/SparseTensor/Transforms/Utils/SparseTensorDescriptor.cpp
+2-21 files

LLVM/project 8cc9c69mlir/lib/Conversion/AMDGPUToROCDL AMDGPUToROCDL.cpp

[MLIR] Fix clang-tidy fixes for llvm-prefer-isa-or-dyn-cast-in-conditionals in AMDGPUToROCDL.cpp (NFC)

The cast can't fail, the `if` checks are spurious.
DeltaFile
+20-20mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+20-201 files

LLVM/project f719e11mlir/lib/Dialect/SparseTensor/IR SparseTensorDialect.cpp

[MLIR] Apply clang-tidy fixes for misc-use-internal-linkage in SparseTensorDialect.cpp (NFC)
DeltaFile
+1-1mlir/lib/Dialect/SparseTensor/IR/SparseTensorDialect.cpp
+1-11 files

LLVM/project 96ec2d3llvm/test/Transforms/JumpThreading select.ll

update llvm/test/Transforms/JumpThreading/select.ll
DeltaFile
+1-1llvm/test/Transforms/JumpThreading/select.ll
+1-11 files

LLVM/project c386d6dclang/test/Sema warn-thread-safety-analysis.c, clang/test/SemaCXX warn-thread-safety-analysis.cpp

Thread Safety Analysis: Add more cast pointer-alias tests (#172638)

Add 2 tests for cast pointer aliases. The cleanup pattern is a real
pattern that is being considered for use in the Linux kernel:
https://lore.kernel.org/all/aUGBff8Oko5O8EsP@elver.google.com/

This works today, but let's test it to make sure there are no
regressions.

NFC.
DeltaFile
+9-0clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+8-0clang/test/Sema/warn-thread-safety-analysis.c
+17-02 files

LLVM/project e4c4498lldb/include/lldb/DataFormatters FormattersContainer.h, lldb/include/lldb/Utility StringLexer.h

[lldb][ObjC][NFCI] Replace StringLexer with llvm::StringRef (#172466)

We had a dedicated `StringLexer` class that pretty much just replicates
the `llvm::StringRef` interface. This patch removes the `StringLexer` in
favour of just using `llvm::StringRef`. Much of the string parsing can
be cleaned up, but I tried to keep the changes as small as possible, so
I just kept the logic and replaced the API calls. The only awkward
side-effect of this is that we have to pass a `llvm::StringRef &`
around.

There were some gaps in the API, so added two helper methods to
consume/pop off characters from the front of the StringRef (maybe those
can be added to `llvm::StringRef` in the future).

The `StringLexer` also had a roll-back mechanism used in two places.
That can be handled by just saving a copy of the `StringRef`.
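
As a rough illustration of the two ideas above (front-consuming helpers and rollback via a saved copy), here is a minimal sketch. It uses `std::string_view` as a stand-in for `llvm::StringRef` so that it builds without LLVM, and `consumeChar`, `popChar` and `parseQuoted` are hypothetical names chosen for this example, not the helpers added by the patch.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper: drop the first character if it equals `c`.
inline bool consumeChar(std::string_view &s, char c) {
  if (s.empty() || s.front() != c)
    return false;
  s.remove_prefix(1);
  return true;
}

// Hypothetical helper: remove and return the first character.
inline char popChar(std::string_view &s) {
  assert(!s.empty() && "cannot pop from an empty view");
  char c = s.front();
  s.remove_prefix(1);
  return c;
}

// Parses a "quoted" name from the front of `s`.
// On failure, `s` is restored to its value on entry.
inline std::optional<std::string> parseQuoted(std::string_view &s) {
  const std::string_view saved = s; // a cheap copy is the rollback point
  if (!consumeChar(s, '"'))
    return std::nullopt; // nothing consumed, nothing to roll back
  std::string name;
  while (!s.empty()) {
    char c = popChar(s);
    if (c == '"')
      return name;
    name.push_back(c);
  }
  s = saved; // unterminated quote: roll back
  return std::nullopt;
}
```

Because the view is only a pointer and a length, saving and restoring it is cheap, which is why a dedicated rollback mechanism is not needed. The parser has to take the view by reference so that callers see how much input was consumed.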
DeltaFile
+0-140lldb/unittests/Utility/StringLexerTest.cpp
+67-42lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCTypeEncodingParser.cpp
+0-85lldb/source/Utility/StringLexer.cpp
+0-56lldb/include/lldb/Utility/StringLexer.h
+12-13lldb/source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCTypeEncodingParser.h
+7-11lldb/include/lldb/DataFormatters/FormattersContainer.h
+86-3472 files not shown
+86-3498 files

LLVM/project 886d2a0libc/include/llvm-libc-macros netinet-in-macros.h, libc/test/include netinet_in_test.cpp

[libc] Add `IN6_IS_ADDR_V4COMPAT`
DeltaFile
+11-0libc/test/include/netinet_in_test.cpp
+6-0libc/include/llvm-libc-macros/netinet-in-macros.h
+17-02 files

LLVM/project 0a03867libc/include/llvm-libc-macros netinet-in-macros.h, libc/test/include netinet_in_test.cpp

[libc] Add `IN6_IS_ADDR_V4MAPPED`
DeltaFile
+8-0libc/include/llvm-libc-macros/netinet-in-macros.h
+7-0libc/test/include/netinet_in_test.cpp
+15-02 files

LLVM/project 27557d0libc/include/llvm-libc-macros netinet-in-macros.h, libc/test/include netinet_in_test.cpp

[libc] Add `IN6_IS_ADDR_MC*`
DeltaFile
+20-0libc/include/llvm-libc-macros/netinet-in-macros.h
+14-0libc/test/include/netinet_in_test.cpp
+34-02 files

LLVM/project 28d4e33llvm/test/CodeGen/AMDGPU global-atomicrmw-fadd.ll waitcnt-func-global-inv.mir

[AMDGPU][SIInsertWaitCnt] Optimize loadcnt insertion at function boundaries (#169647)

On GFX12+, GLOBAL_INV increments the loadcnt counter but does not write
results to any VGPRs. Previously, we unconditionally inserted
s_wait_loadcnt 0 at function returns even when the only pending loadcnt
was from GLOBAL_INV instructions.

This patch optimizes waitcnt insertion by skipping the loadcnt wait at
function boundaries when no VGPRs have pending loads. This is determined
by checking if any VGPR has a score greater than the lower bound for
LOAD_CNT - if not, the pending loadcnt must be from non-VGPR-writing
instructions like GLOBAL_INV.

The optimization is limited to GFX12+ targets where GLOBAL_INV exists
and uses the extended wait count instructions.

This is a follow-up optimization to PR #135340 which added tracking for
GLOBAL_INV in the waitcnt pass.
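
The decision described above can be sketched with a toy scoreboard model. This is not the code in SIInsertWaitcnts: `LoadCntModel`, its fields, and the function names are assumptions made for illustration, and the real pass tracks considerably more state.

```cpp
#include <algorithm>
#include <vector>

// Toy model of the LOAD_CNT scoreboard (hypothetical, simplified).
struct LoadCntModel {
  unsigned LowerBound = 0;         // score of the last event known to be complete
  unsigned UpperBound = 0;         // score of the most recently issued event
  std::vector<unsigned> VgprScore; // per VGPR: score of its latest pending write
};

// True if any LOAD_CNT event is still outstanding.
inline bool hasPendingEvent(const LoadCntModel &M) {
  return M.UpperBound > M.LowerBound;
}

// True if at least one VGPR is still waiting for a load result.
inline bool anyVgprPending(const LoadCntModel &M) {
  return std::any_of(M.VgprScore.begin(), M.VgprScore.end(),
                     [&](unsigned S) { return S > M.LowerBound; });
}

// Decide whether a loadcnt wait is needed at a function boundary.
inline bool needsLoadCntWaitAtReturn(const LoadCntModel &M, bool IsGfx12Plus) {
  if (!hasPendingEvent(M))
    return false;
  if (!IsGfx12Plus)
    return true; // keep the conservative wait on older targets
  // On GFX12+, events that write no VGPR (such as GLOBAL_INV) need no wait.
  return anyVgprPending(M);
}
```

In this model, a function whose only outstanding event came from a GLOBAL_INV has `UpperBound > LowerBound` but no VGPR score above the lower bound, so the wait is skipped.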
DeltaFile
+0-224llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
+115-0llvm/test/CodeGen/AMDGPU/waitcnt-func-global-inv.mir
+0-100llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fadd.ll
+0-90llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
+0-90llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
+0-90llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
+115-59430 files not shown
+157-1,40536 files

LLVM/project 06e4728polly/lib/Exchange JSONExporter.cpp, polly/test/CodeGen/OpenMP new_multidim_access.ll

[Polly] Recalculate dependencies after import-jscop (#172640)

The new access functions may have different dependencies than the
original ones. Invalidate the dependency analysis after a jscop-import.
DeltaFile
+8-3polly/test/CodeGen/OpenMP/new_multidim_access.ll
+3-0polly/lib/Exchange/JSONExporter.cpp
+11-32 files

LLVM/project a2715f0lld/test/MachO arm64-32-stubs.s arm64-stubs.s, llvm/lib/Target/AArch64/Disassembler AArch64ExternalSymbolizer.cpp

[AArch64][llvm-objdump] Fix arm64_32 symbolization (#171164)

llvm-objdump was missing "literal pool symbol address" comments for
arm64_32 stub disassembly. Fixed by adding 32-bit instruction support
(LDRWui, ADDWri, LDRWl) to AArch64ExternalSymbolizer and aarch64_32
architecture checks to MachODump.cpp symbolization code.

Fixes #49288
DeltaFile
+0-60lld/test/MachO/arm64-32-stubs.s
+25-6llvm/lib/Target/AArch64/Disassembler/AArch64ExternalSymbolizer.cpp
+23-0llvm/test/tools/llvm-objdump/MachO/AArch64/macho-symbolized-disassembly-arm64_32.test
+12-7llvm/tools/llvm-objdump/MachODump.cpp
+14-4lld/test/MachO/arm64-stubs.s
+0-0llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/symbolized-stubs.exe.macho-arm64_32
+74-776 files

LLVM/project 04751b4llvm/test/tools/llvm-objdump/MachO/AArch64 arm64_32-reloc-addend.test arm64_32-bad-opcodes.test, llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs reloc-addend.obj.macho-arm64_32 bad-opcodes.obj.macho-arm64_32

[AArch64][llvm-objdump] Add missing arm64_32 architecture checks (#171638)

Use the same code paths as arm64 since arm64_32 has the same instruction
encoding and register usage.
DeltaFile
+6-6llvm/tools/llvm-objdump/MachODump.cpp
+7-0llvm/test/tools/llvm-objdump/MachO/AArch64/arm64_32-reloc-addend.test
+7-0llvm/test/tools/llvm-objdump/MachO/AArch64/arm64_32-bad-opcodes.test
+0-0llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/reloc-addend.obj.macho-arm64_32
+0-0llvm/test/tools/llvm-objdump/MachO/AArch64/Inputs/bad-opcodes.obj.macho-arm64_32
+20-65 files

LLVM/project ce553abllvm/include/llvm/Support AMDGPUWaitcnt.h, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.h

Revert "[mlir][amdgpu] Expose waitcnt bitpacking infra (#172313)" (#172636)

This reverts commit 93013817afabe23a07073528481856b3507b6faf.

Revert https://github.com/llvm/llvm-project/pull/172313

Missing libraries, again
DeltaFile
+0-207llvm/include/llvm/Support/AMDGPUWaitcnt.h
+180-1llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+42-11mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+222-2193 files

LLVM/project 6390373llvm/test/CodeGen/X86 combine-fmax.ll combine-fmin.ll

[X86] Add tests showing failure to concat matching fmin/fmax vector ops (#172635)

DeltaFile
+47-0llvm/test/CodeGen/X86/combine-fmax.ll
+47-0llvm/test/CodeGen/X86/combine-fmin.ll
+94-02 files

LLVM/project 44a52eallvm/lib/Transforms/InstCombine InstCombineLoadStoreAlloca.cpp, llvm/test/Transforms/InstCombine alloca-phi-non-inst.ll

[InstCombine] Fix unsafe PHINode cast and simplify logic in PointerReplacer (#172332)

Fixes #171883.

Basically, if the operand of the phi is an Instruction but it's not
available, the
[condition](https://github.com/llvm/llvm-project/blame/1847a4efae6b0c0c985cecead1f3ef381a1962de/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L300)
would just break, and when we reach the
[deferral check](https://github.com/llvm/llvm-project/blame/1847a4efae6b0c0c985cecead1f3ef381a1962de/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L313),
execution would continue even though there is a non-Instruction operand,
leading to a crash in the
[subsequent processing loop](https://github.com/llvm/llvm-project/blame/1847a4efae6b0c0c985cecead1f3ef381a1962de/llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp#L320).
DeltaFile
+40-0llvm/test/Transforms/InstCombine/alloca-phi-non-inst.ll
+3-4llvm/lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
+43-42 files

LLVM/project 921a647clang/lib/CodeGen CGObjCGNU.cpp

[CGObjCGNU] Use getSigned() for instanceSize

For non-fragile this is a negative value.
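
To see why the signed path matters for a negative value, consider the kind of range check a fixed-width constant has to pass. The sketch below is self-contained and does not use LLVM. `fitsUnsigned` and `fitsSigned` are hypothetical helpers written for this example, and `Bits` is assumed to be between 1 and 64.

```cpp
#include <cstdint>

// Hypothetical check: does V fit in `Bits` bits as an unsigned value?
constexpr bool fitsUnsigned(unsigned Bits, std::uint64_t V) {
  return Bits >= 64 || (V >> Bits) == 0;
}

// Hypothetical check: does V fit in `Bits` bits as a two's-complement value?
// Assumes 1 <= Bits <= 64.
constexpr bool fitsSigned(unsigned Bits, std::int64_t V) {
  if (Bits >= 64)
    return true;
  const std::int64_t Min = -(std::int64_t{1} << (Bits - 1));
  const std::int64_t Max = (std::int64_t{1} << (Bits - 1)) - 1;
  return V >= Min && V <= Max;
}

// A negative size such as -8 fits a 32-bit field when treated as signed...
static_assert(fitsSigned(32, -8), "signed view fits");
// ...but the same bits viewed as an unsigned 64-bit number do not.
static_assert(!fitsUnsigned(32, static_cast<std::uint64_t>(std::int64_t{-8})),
              "unsigned view needs truncation");
```

Passing the value through a signed constructor states the intent directly, so no implicit truncation of the sign-extended bits is needed.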
DeltaFile
+1-1clang/lib/CodeGen/CGObjCGNU.cpp
+1-11 files

LLVM/project dea9ec8llvm/lib/Transforms/Scalar StraightLineStrengthReduce.cpp

[SLSR] Allow implicit truncation for element size

Ideally we'd reject too large types in the IR verifier, but for now
we should follow the usual sext-or-trunc GEP semantics here.
DeltaFile
+3-1llvm/lib/Transforms/Scalar/StraightLineStrengthReduce.cpp
+3-11 files

LLVM/project 9fe88e6llvm/lib/IR Instructions.cpp, llvm/test/CodeGen/WinEH wineh-no-demotion.ll

[IR] Update `PHINode::removeIncomingValueIf()` to use the swap strategy like `PHINode::removeIncomingValue()`

As suggested in https://github.com/llvm/llvm-project/pull/171963, update
`PHINode::removeIncomingValueIf()` to use the swap strategy too.
DeltaFile
+13-17llvm/lib/IR/Instructions.cpp
+12-12llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
+6-6llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
+1-1llvm/test/CodeGen/WinEH/wineh-no-demotion.ll
+1-1llvm/test/Transforms/PhaseOrdering/AArch64/hoist-load-from-vector-loop.ll
+33-375 files

LLVM/project 3186ca2clang/lib/CodeGen/TargetBuiltins ARM.cpp

[ARM] Use getSigned() for signed value
DeltaFile
+1-1clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-11 files

LLVM/project 30ce1e9clang/lib/CodeGen CGExprScalar.cpp

[CGExprScalar] Allow implicit truncation for CharacterLiteral

The value is always stored as an unsigned number, even if the
char type is signed, so we have to allow truncation here.
DeltaFile
+4-1clang/lib/CodeGen/CGExprScalar.cpp
+4-11 files

LLVM/project 744552dclang/lib/CodeGen ABIInfoImpl.cpp MicrosoftCXXABI.cpp, clang/lib/CodeGen/Targets AArch64.cpp

[CodeGen] Use getSigned() for negative values
DeltaFile
+2-1clang/lib/CodeGen/ABIInfoImpl.cpp
+1-1clang/lib/CodeGen/MicrosoftCXXABI.cpp
+1-1clang/lib/CodeGen/ItaniumCXXABI.cpp
+1-1clang/lib/CodeGen/Targets/AArch64.cpp
+5-44 files

LLVM/project 857748dclang/lib/CodeGen PatternInit.cpp

[PatternInit] Explicitly allow implicit truncation

It's okay if the pattern value gets truncated here, it's a splat.
DeltaFile
+4-2clang/lib/CodeGen/PatternInit.cpp
+4-21 files

LLVM/project 159f1c0clang/test/Headers __clang_hip_math.hip, llvm/test/Transforms/DFAJumpThreading dfa-unfold-select.ll

[IR] Optimize PHINode::removeIncomingValue() by swapping removed incoming value with the last incoming value. (#171963)

Current implementation uses `std::copy` to shift all incoming values
after the removed index. This patch optimizes
`PHINode::removeIncomingValue()` by replacing the linear shift of
incoming values with a swap-with-last strategy.

After this change, the relative order of incoming values after removal
is not preserved.

This improves compile-time for PHI nodes with many predecessors.
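
The difference between the two removal strategies can be sketched on a plain `std::vector`. This is only an illustration of the technique, not the `PHINode` code, which also has to keep the incoming blocks in sync with the incoming values. `removeByShift` and `removeBySwap` are names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Linear-time removal: shifts every later element down by one slot.
// Preserves the relative order of the remaining elements.
template <typename T>
void removeByShift(std::vector<T> &V, std::size_t Idx) {
  assert(Idx < V.size() && "index out of range");
  V.erase(V.begin() + static_cast<std::ptrdiff_t>(Idx));
}

// Constant-time removal: overwrites the removed slot with the last
// element, then drops the last slot. Does not preserve order.
template <typename T>
void removeBySwap(std::vector<T> &V, std::size_t Idx) {
  assert(Idx < V.size() && "index out of range");
  if (Idx + 1 != V.size())
    V[Idx] = std::move(V.back());
  V.pop_back();
}
```

Removing index 1 from `{10, 20, 30, 40}` gives `{10, 30, 40}` with the shift and `{10, 40, 30}` with the swap. That reordering is the reason so many test expectations change in this commit.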

Depends:
https://github.com/llvm/llvm-project/pull/171955
https://github.com/llvm/llvm-project/pull/171956
https://github.com/llvm/llvm-project/pull/171960
https://github.com/llvm/llvm-project/pull/171962
DeltaFile
+77-84llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
+22-22clang/test/Headers/__clang_hip_math.hip
+18-18llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll
+12-12llvm/test/Transforms/SimplifyCFG/UnreachableEliminate.ll
+11-11llvm/test/Transforms/PGOProfile/chr.ll
+10-10llvm/test/Transforms/SimplifyCFG/avoid-complex-phi.ll
+150-15767 files not shown
+295-31773 files

LLVM/project 80f3c0dllvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis infer_affine_domain_ovlf.ll

[DA] Introduce OverflowSafeSignedAPInt to prevent potential overflow (#171991)

In Exact SIV and Exact RDIV tests, there are multiple `APInt` arithmetic
operations which can overflow. These overflows are currently not
checked, which may lead to incorrect analysis results. However, adding
overflow checks for each operation can clutter the code and make it
harder to read.
This patch introduces a new wrapper class `OverflowSafeSignedAPInt` that
encapsulates a `std::optional<APInt>` and provides arithmetic operations
with built-in overflow checking. If an arithmetic operation overflows,
the internal `std::optional<APInt>` is set to `std::nullopt`, indicating
an invalid result. Also, if any operand of an arithmetic operation is
invalid, the result will also be invalid. By using this wrapper class in
the Exact SIV and Exact RDIV tests, now overflows are handled properly
while keeping the readability of the code.
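
The wrapper idea can be sketched without LLVM by substituting `std::int64_t` for `APInt`. The class below, `OverflowSafeInt64`, is a hypothetical analogue written for this example and is not the class added by the patch. It relies on the GCC/Clang `__builtin_*_overflow` builtins.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical analogue of the wrapper: an empty optional marks "invalid".
class OverflowSafeInt64 {
  std::optional<std::int64_t> Value;

public:
  OverflowSafeInt64() = default; // starts out invalid
  OverflowSafeInt64(std::int64_t V) : Value(V) {}

  bool isValid() const { return Value.has_value(); }
  std::optional<std::int64_t> get() const { return Value; }

  friend OverflowSafeInt64 operator+(const OverflowSafeInt64 &L,
                                     const OverflowSafeInt64 &R) {
    std::int64_t Out;
    // An invalid operand or an overflow makes the result invalid.
    if (!L.Value || !R.Value ||
        __builtin_add_overflow(*L.Value, *R.Value, &Out))
      return OverflowSafeInt64();
    return OverflowSafeInt64(Out);
  }

  friend OverflowSafeInt64 operator-(const OverflowSafeInt64 &L,
                                     const OverflowSafeInt64 &R) {
    std::int64_t Out;
    if (!L.Value || !R.Value ||
        __builtin_sub_overflow(*L.Value, *R.Value, &Out))
      return OverflowSafeInt64();
    return OverflowSafeInt64(Out);
  }

  friend OverflowSafeInt64 operator*(const OverflowSafeInt64 &L,
                                     const OverflowSafeInt64 &R) {
    std::int64_t Out;
    if (!L.Value || !R.Value ||
        __builtin_mul_overflow(*L.Value, *R.Value, &Out))
      return OverflowSafeInt64();
    return OverflowSafeInt64(Out);
  }
};
```

A chain such as `(A * B) + C - D` can then be written without intermediate checks, and the caller tests `isValid()` once at the end, since invalidity propagates through every later operation.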

Fixes the test added in #171990.
DeltaFile
+146-41llvm/lib/Analysis/DependenceAnalysis.cpp
+1-1llvm/test/Analysis/DependenceAnalysis/infer_affine_domain_ovlf.ll
+147-422 files

LLVM/project 9301381llvm/include/llvm/Support AMDGPUWaitcnt.h, llvm/lib/Target/AMDGPU/Utils AMDGPUBaseInfo.h

[mlir][amdgpu] Expose waitcnt bitpacking infra (#172313)

So we can get rid of our copy in `AMDGPUToROCDL`.
DeltaFile
+207-0llvm/include/llvm/Support/AMDGPUWaitcnt.h
+1-180llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+11-42mlir/lib/Conversion/AMDGPUToROCDL/AMDGPUToROCDL.cpp
+219-2223 files

LLVM/project dfe84fbmlir/include/mlir/Dialect/Linalg Passes.td

[MLIR][NFC] Fix the pass description to describe what it actually does. (#172306)

DeltaFile
+8-1mlir/include/mlir/Dialect/Linalg/Passes.td
+8-11 files

LLVM/project fd31ab9llvm/test/Examples/OrcV2Examples lljit-with-remote-debugging.test

[llvm][examples] Run LLJITWithRemoteDebugging test only on native Linux 64-bit Intel (#172518)

DeltaFile
+1-2llvm/test/Examples/OrcV2Examples/lljit-with-remote-debugging.test
+1-21 files

LLVM/project 5914b43clang/test/Sema warn-lifetime-safety-dataflow.cpp, llvm/include/llvm/Passes CodeGenPassBuilder.h

Merge branch 'main' into users/kasuga-fj/add-overflow-safe-signed-apint
DeltaFile
+6,871-0llvm/test/CodeGen/RISCV/short-forward-branch-opt-load.ll
+3,834-0llvm/test/CodeGen/RISCV/short-forward-branch-opt-load-atomic-acquire-seq_cst.ll
+417-417llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop1_dpp16.txt
+394-394llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop1_dpp16.txt
+310-348llvm/include/llvm/Passes/CodeGenPassBuilder.h
+153-371clang/test/Sema/warn-lifetime-safety-dataflow.cpp
+11,979-1,530402 files not shown
+22,112-4,751408 files