LLVM/project ccef34dllvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanTransforms.h, llvm/test/Transforms/LoopVectorize runtime-check-known-true.ll

[VPlan] Simplify reverse(reverse(x)) -> x (#199057)

This is a version of #196900 that performs the simplification as a
separate transform.

We need to add an additional `vp.splice.right(vp.splice.left(poison, x,
evl), poison, evl) -> x` simplification to avoid leftover splices
whenever reverses are removed in an EVL tail-folded loop.

Co-authored-by: Madhur Amilkanthwar <madhura at nvidia.com>
DeltaFile
+59-0llvm/test/Transforms/LoopVectorize/VPlan/simplify-reverse-reverse.ll
+26-6llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+4-12llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reverse-load-store.ll
+3-5llvm/test/Transforms/LoopVectorize/runtime-check-known-true.ll
+3-0llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+1-0llvm/test/Transforms/LoopVectorize/VPlan/vplan-print-after-all.ll
+96-236 files

LLVM/project 0a04c14llvm/tools/llvm-objdump llvm-objdump.cpp, llvm/tools/llvm-profdata llvm-profdata.cpp

[llvm] Replace unordered_{map,set} with Dense{Map,Set} in llvm tools (#204058)

std::unordered_map is slow. Switch the remaining local maps and sets in
the command-line tools (llvm-profgen, llvm-profdata, llvm-objdump,
llvm-exegesis, llvm-xray, llvm-remarkutil) to DenseMap/DenseSet.
DeltaFile
+27-28llvm/tools/llvm-profgen/PerfReader.h
+9-19llvm/tools/llvm-profgen/MissingFrameInferrer.h
+12-15llvm/tools/llvm-objdump/llvm-objdump.cpp
+14-10llvm/tools/llvm-profgen/ProfiledBinary.h
+7-10llvm/tools/llvm-profgen/PerfReader.cpp
+7-9llvm/tools/llvm-profdata/llvm-profdata.cpp
+76-919 files not shown
+101-11715 files

LLVM/project 639b1d9lld/ELF/Arch LoongArch.cpp, lld/test/ELF loongarch-pcadd-hi20.s

[lld][LoongArch] Fix range checking of R_LARCH_*_PCADD_HI20 relocations on 64-bit (#183233)

According to the la-abi-specs, the `R_LARCH_*_PCADD_HI20` relocations
are also used on 64-bit LoongArch. Fix the range checking accordingly.
DeltaFile
+32-0lld/test/ELF/loongarch-pcadd-hi20.s
+1-1lld/ELF/Arch/LoongArch.cpp
+33-12 files

LLVM/project 4f8a1e9lld/test/ELF loongarch-pcadd-hi20.s

Add test case
DeltaFile
+32-0lld/test/ELF/loongarch-pcadd-hi20.s
+32-01 files

LLVM/project d1539edlld/ELF/Arch LoongArch.cpp

[lld][LoongArch] Fix range checking of R_LARCH_*_PCADD_HI20 relocations on 64-bit

According to the la-abi-specs, the R_LARCH_*_PCADD_HI20 relocations are
also used on 64-bit LoongArch. Fix the range checking accordingly.
DeltaFile
+1-1lld/ELF/Arch/LoongArch.cpp
+1-11 files

LLVM/project ec802a7libc/hdr/types struct_winsize.h CMakeLists.txt, libc/include/llvm-libc-types struct_winsize.h

[libc] Add TIOCGWINSZ and struct winsize support (#203919)

Added support for the TIOCGWINSZ ioctl command.

* Defined struct winsize in llvm-libc-types.
* Defined TIOCGWINSZ in linux/sys-ioctl-macros.h.
* Exposed struct_winsize and TIOCGWINSZ in sys/ioctl.yaml.
* Added struct_winsize proxy header in hdr/types/struct_winsize.h.
* Added a unit test in test/src/sys/ioctl/linux/ioctl_test.cpp.
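
A minimal sketch of how a caller might use the `TIOCGWINSZ` macro and `struct winsize` listed above. It is written against the usual Linux `<sys/ioctl.h>` interface and is not taken from the patch or its unit test; `queryWindowSize` is a hypothetical helper name.

```cpp
#include <sys/ioctl.h> // ioctl, TIOCGWINSZ, struct winsize

// Fills `out` with the window size of the terminal behind `fd`.
// Returns false when the ioctl fails, for example when `fd` is invalid
// or does not refer to a terminal.
bool queryWindowSize(int fd, winsize &out) {
  return ioctl(fd, TIOCGWINSZ, &out) == 0;
}
```

On success, `out.ws_row` and `out.ws_col` hold the terminal's rows and columns. Passing a descriptor that is not a terminal makes the call fail, so callers should always check the return value.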

Assisted-by: Automated tooling, human reviewed.
DeltaFile
+24-0libc/include/llvm-libc-types/struct_winsize.h
+23-0libc/hdr/types/struct_winsize.h
+19-0libc/test/src/sys/ioctl/linux/ioctl_test.cpp
+8-0libc/hdr/types/CMakeLists.txt
+4-1libc/include/sys/ioctl.yaml
+1-0libc/test/src/sys/ioctl/linux/CMakeLists.txt
+79-13 files not shown
+82-19 files

LLVM/project 18cc9aamlir/include/mlir/Conversion/EmitCCommon TypeConverter.h, mlir/lib/Conversion/EmitCCommon TypeConverter.cpp CMakeLists.txt

[mlir][emitc] Add a common type converter (#203763)

MemRef type conversion is currently implemented as part of the memref
dialect lowering pass, which means e.g. that func-to-emitc cannot lower
functions taking MemRef types as arguments.

This patch refactors the existing type conversions in EmitC's lowering
passes into a structure similar to the LLVM dialect by adding a common
EmitC type converter and using it across dialect-specific EmitC lowering
passes and the generic convert-to-emitc pass.

Assisted-by: Copilot
DeltaFile
+58-0mlir/lib/Conversion/EmitCCommon/TypeConverter.cpp
+31-0mlir/include/mlir/Conversion/EmitCCommon/TypeConverter.h
+0-29mlir/lib/Conversion/MemRefToEmitC/MemRefToEmitC.cpp
+14-0mlir/test/Conversion/FuncToEmitC/func-to-emitc.mlir
+11-0mlir/lib/Conversion/EmitCCommon/CMakeLists.txt
+2-9mlir/lib/Conversion/MemRefToEmitC/MemRefToEmitCPass.cpp
+116-3812 files not shown
+133-6918 files

LLVM/project 089d063clang/lib/Format CMakeLists.txt

[clang-format][NFC] Don't always rebuild clang-format-check-format (#203828)

Instead, check the format of the clang-format source only if the built
clang-format binary or one of the source files is newer.
DeltaFile
+4-1clang/lib/Format/CMakeLists.txt
+4-11 files

LLVM/project 08834adclang/lib/Format UnwrappedLineFormatter.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Fix a bug in merging short inline functions (#203754)

Fixes #203209
DeltaFile
+13-13clang/lib/Format/UnwrappedLineFormatter.cpp
+9-0clang/unittests/Format/FormatTest.cpp
+22-132 files

LLVM/project 8cd2329llvm/lib/Target/RISCV/AsmParser RISCVAsmParser.cpp

drop unnecessary function

Created using spr 1.3.8-beta.1
DeltaFile
+0-1llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+0-11 files

LLVM/project 6548844llvm/lib/IR PrintPasses.cpp, llvm/lib/Transforms/IPO SampleProfileProbe.cpp

[llvm] Replace unordered_set<std::string> with StringSet (#204048)

std::unordered_set<std::string> without a pointer-stability requirement
can use StringSet: it avoids per-TU hashtable instantiations and the
std::string temporary at StringRef lookup sites (~3-4 KiB smaller .text
for llc/opt).
DeltaFile
+5-8llvm/lib/IR/PrintPasses.cpp
+3-5llvm/tools/llvm-profgen/ProfiledBinary.cpp
+4-4llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+2-3llvm/tools/llvm-readobj/ObjDumper.h
+2-2llvm/tools/llvm-config/llvm-config.cpp
+2-2llvm/tools/llvm-profgen/PerfReader.cpp
+18-244 files not shown
+25-3010 files

LLVM/project fa5d8f8lld/ELF Writer.cpp, lld/docs ReleaseNotes.rst

[ELF] Support multiple PT_GNU_RELRO when SECTIONS is used without PHDRS (#203675)

When a SECTIONS command interleaves relro and non-relro sections, the
relro region is split into discontiguous runs. lld emits an error since
https://reviews.llvm.org/D40359

    error: section: <name> is not contiguous with other relro sections

This is overly strict: while glibc only honors the first PT_GNU_RELRO,
other loaders (e.g. Bionic and FreeBSD rtld-elf) protect every
PT_GNU_RELRO segment.

Emit one PT_GNU_RELRO segment for each contiguous run of relro sections.
Track the boundary section so that `createPhdrs` starts a fresh PT_LOAD
at each relro->non-relro transition, as before.

Consumers that don't expect multiple PT_GNU_RELRO should check the
output themselves.
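
The "one segment per contiguous run" rule can be sketched as follows. This is a toy model in which each output section is reduced to a single relro flag; `relroRuns` is a hypothetical helper and is not the code in `Writer.cpp`.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Returns half-open [begin, end) index ranges, one per maximal run of
// consecutive relro sections. In the scheme described above, each run
// would be covered by its own PT_GNU_RELRO segment.
std::vector<std::pair<std::size_t, std::size_t>>
relroRuns(const std::vector<bool> &isRelro) {
  std::vector<std::pair<std::size_t, std::size_t>> runs;
  const std::size_t n = isRelro.size();
  std::size_t i = 0;
  while (i < n) {
    if (!isRelro[i]) {
      ++i;
      continue;
    }
    const std::size_t begin = i;
    while (i < n && isRelro[i])
      ++i;
    runs.emplace_back(begin, i); // i is now one past the end of the run
  }
  return runs;
}
```

For a layout of relro, relro, non-relro, relro, non-relro, the sketch yields the two ranges [0, 2) and [3, 4), so two segments would be emitted.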
DeltaFile
+93-23lld/test/ELF/relro-non-contiguous.s
+34-20lld/test/ELF/relro-non-contiguous-script-data.s
+21-20lld/ELF/Writer.cpp
+12-2lld/test/ELF/keep-data-section-prefix.s
+10-2lld/test/ELF/linkerscript/data-segment-relro.test
+4-0lld/docs/ReleaseNotes.rst
+174-676 files

LLVM/project 96e45c5llvm/include/llvm/ADT SCCIterator.h, llvm/include/llvm/DebugInfo/LogicalView/Readers LVDWARFReader.h

[llvm] Replace unordered_set<T *> with SmallPtrSet<T *, 0> (#204051)

std::unordered_set is slow. For pointer sets without a pointer-stability
or iterator-stability requirement, use SmallPtrSet<T *, 0> for a smaller
code size.
DeltaFile
+6-4llvm/tools/llvm-profgen/ProfiledBinary.h
+5-5llvm/lib/Passes/StandardInstrumentations.cpp
+5-4llvm/tools/llvm-profgen/ProfileGenerator.h
+4-4llvm/tools/llvm-profgen/ProfileGenerator.cpp
+2-2llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
+2-2llvm/include/llvm/ADT/SCCIterator.h
+24-212 files not shown
+28-258 files

LLVM/project 2cb8b61llvm/include/llvm/Analysis TargetTransformInfoImpl.h

[TTI] Add missing no-cost coroutine intrinsics (#203816)

These intrinsics are lowered in the CoroCleanup pass and don't represent
actual code. This patch adds them to the no-cost list so they do not
contribute to the cost of inlining and optimization.
DeltaFile
+6-0llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+6-01 files

LLVM/project 1f3d3d6flang/include/flang/Semantics symbol.h, flang/lib/Lower/OpenMP OpenMP.cpp

[flang][mlir] Add flang to mlir lowering for groupprivate (#180934)

This PR implements the Flang frontend lowering for the OpenMP
`groupprivate` directive.

Changes:
- Update genOMP handler for OpenMPGroupprivate in OpenMP.cpp to generate
`omp.groupprivate` MLIR operation.
- Add clause processing for groupprivate variables
- Add test cases for groupprivate lowering

Co-Authored-By: Claude <noreply at anthropic.com>
DeltaFile
+276-0flang/test/Lower/OpenMP/groupprivate.f90
+146-1flang/lib/Lower/OpenMP/OpenMP.cpp
+36-0flang/test/Lower/OpenMP/groupprivate-modfile.f90
+32-1flang/lib/Semantics/resolve-directives.cpp
+23-4flang/include/flang/Semantics/symbol.h
+18-5flang/lib/Semantics/symbol.cpp
+531-113 files not shown
+551-229 files

LLVM/project dab3476llvm/test/CodeGen/RISCV/rvv fixed-vectors-masked-gather.ll fixed-vectors-vpgather.ll

[RISCV] Consider known leading zeros in narrowIndex for gather/scatter. (#203970)

If there are enough leading zeros for the shift amount, then
we can do the shift in the narrow type.
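
The arithmetic behind this condition can be sketched with a concrete 8-bit value standing in for the known-bits information the backend would use. Both helpers are hypothetical and are not taken from `narrowIndex`.

```cpp
#include <cstdint>

// Counts the leading zero bits of an 8-bit value (8 when the value is 0).
unsigned leadingZeros8(std::uint8_t v) {
  unsigned n = 0;
  for (int bit = 7; bit >= 0 && !((v >> bit) & 1u); --bit)
    ++n;
  return n;
}

// True when `v << shamt` shifts no set bit out of the 8-bit type, so the
// narrow shift gives the same value as zero-extending first and shifting
// in a wider type.
bool shiftFitsInNarrow(std::uint8_t v, unsigned shamt) {
  return leadingZeros8(v) >= shamt;
}
```

For example, `0x1F` has three leading zeros in 8 bits, so a shift by 3 gives `0xF8` and still fits, while a shift by 4 would need a ninth bit.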
DeltaFile
+143-0llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
+26-0llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vpgather.ll
+24-0llvm/test/CodeGen/RISCV/rvv/vpgather-sdnode.ll
+22-0llvm/test/CodeGen/RISCV/rvv/vpscatter-sdnode.ll
+15-0llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
+14-0llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
+244-01 files not shown
+250-07 files

LLVM/project d22a0ecllvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp GCNHazardRecognizer.h, llvm/test/CodeGen/AMDGPU misched-into-wmma-hazard-shadow.mir

[AMDGPU] Track VALU instructions separately for WMMA coexecution hazards

WMMA coexecution hazards can only be resolved by VALU instructions, not
S_NOPs. Track VALU/WMMA instructions separately so the scheduler can
accurately determine stall cycles.
DeltaFile
+59-10llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+36-0llvm/test/CodeGen/AMDGPU/misched-into-wmma-hazard-shadow.mir
+16-0llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+111-103 files

LLVM/project 1503720llvm/lib/Target/AMDGPU SIPreEmitPeephole.cpp, llvm/test/CodeGen/AMDGPU wmma-set-reuse-bits.mir

[AMDGPU] Set WMMA source-operand reuse bits in SIPreEmitPeephole

gfx1250 WMMA instructions can set matrix_a_reuse / matrix_b_reuse bits
that keep the A or B source operand in a high-temporality state in the
VALU source-operand cache, so a later WMMA reusing the same registers
hits in the cache instead of re-reading the register file.

Add a late, post-RA peephole in the existing pre-emit peephole pass that
scans each basic block and, for every WMMA, sets the A/B reuse bit when
one of the next few WMMAs reuses the same physical registers as its A or B
operand and those registers are not redefined in between.

Stale sticky entries in the cache are cleared when a register is used in
an instruction without a reuse bit being set. Therefore, the final WMMA
use of the same source should not set the bit.
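
The scan described above might look roughly like this on a toy instruction list. `Inst`, `setReuseBits`, and the `window` parameter are hypothetical; the sketch tracks only the A operand with plain integer register numbers and does not use `MachineInstr` or the real operand encoding.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction, not a MachineInstr. Registers are ints; -1 means none.
struct Inst {
  bool isWMMA = false;
  int srcA = -1;       // register read as the A operand (WMMA only)
  int def = -1;        // register written by this instruction
  bool reuseA = false; // the bit this sketch computes
};

// Sets reuseA on a WMMA when one of the next `window` WMMAs reads the same
// A register and no instruction in between redefines that register.
void setReuseBits(std::vector<Inst> &block, unsigned window) {
  for (std::size_t i = 0; i < block.size(); ++i) {
    Inst &cur = block[i];
    if (!cur.isWMMA || cur.srcA < 0)
      continue;
    unsigned seen = 0; // number of later WMMAs inspected so far
    for (std::size_t j = i + 1; j < block.size() && seen < window; ++j) {
      const Inst &next = block[j];
      if (next.isWMMA) {
        ++seen;
        if (next.srcA == cur.srcA) {
          cur.reuseA = true; // a later WMMA reads the same register
          break;
        }
      }
      if (next.def == cur.srcA)
        break; // register redefined before any reuse was found
    }
  }
}
```

Because the bit is set only when a later reuse is found, the last WMMA that reads a given register leaves it clear, which matches the note about not keeping stale entries in the cache.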
DeltaFile
+105-0llvm/test/CodeGen/AMDGPU/wmma-set-reuse-bits.mir
+95-0llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
+200-02 files

LLVM/project 04f075fllvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/CodeGen/AMDGPU/NextUseAnalysis spill-vreg-many-lanes.mir acyclic-770bb.mir

Merge remote-tracking branch 'origin/main' into users/hev/fix-lld-pcadd-check
DeltaFile
+275,101-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/spill-vreg-many-lanes.mir
+92,890-85,927llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+144,679-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/acyclic-770bb.mir
+44,396-53,126llvm/test/CodeGen/RISCV/rvv/clmulh-sdnode.ll
+57,682-0llvm/test/CodeGen/AMDGPU/NextUseAnalysis/double-nested-loops-complex-cfg.mir
+28,845-27,920llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+643,593-166,97337,207 files not shown
+4,536,912-1,842,23337,213 files

LLVM/project 17db41cllvm/lib/Target/RISCV RISCVInstrInfoZvdot4a8i.td

[RISCV] Add scheduling data to vdot4aus.vx (#204038)

Remove unnecessary mayLoad, mayStore, hasSideEffects
DeltaFile
+12-9llvm/lib/Target/RISCV/RISCVInstrInfoZvdot4a8i.td
+12-91 files

LLVM/project 7a0e8d6llvm/include/llvm/Transforms/Coroutines CoroInstr.h, llvm/lib/Transforms/IPO ThinLTOBitcodeWriter.cpp WholeProgramDevirt.cpp

cfi creates alias
DeltaFile
+20-33llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
+20-5llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+10-7llvm/test/Transforms/ThinLTOBitcodeWriter/comdat.ll
+16-0llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+13-0llvm/test/Transforms/Coroutines/coro-id-alias.ll
+8-4llvm/include/llvm/Transforms/Coroutines/CoroInstr.h
+87-498 files not shown
+116-6914 files

LLVM/project 6dc97cfllvm/include/llvm/IR GlobalValue.h, llvm/include/llvm/Transforms/Utils AssignGUID.h

Reland #184065
DeltaFile
+61-17llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+45-30llvm/lib/LTO/LTO.cpp
+64-2llvm/lib/IR/Globals.cpp
+49-3llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
+45-5llvm/include/llvm/IR/GlobalValue.h
+49-0llvm/include/llvm/Transforms/Utils/AssignGUID.h
+313-57116 files not shown
+848-401122 files

LLVM/project 4281db2llvm/lib/Target/SPIRV SPIRVEmitIntrinsics.cpp SPIRVModuleAnalysis.cpp, llvm/lib/Target/SPIRV/Analysis SPIRVConvergenceRegionAnalysis.cpp

[SPIRV] Replace unordered_{map,set} with DenseMap/SmallPtrSet (#204046)

Extracted from #202222
DeltaFile
+39-36llvm/lib/Target/SPIRV/SPIRVEmitIntrinsics.cpp
+18-8llvm/lib/Target/SPIRV/SPIRVModuleAnalysis.cpp
+7-7llvm/lib/Target/SPIRV/SPIRVUtils.h
+5-6llvm/lib/Target/SPIRV/SPIRVStructurizer.cpp
+4-4llvm/lib/Target/SPIRV/SPIRVUtils.cpp
+3-3llvm/lib/Target/SPIRV/Analysis/SPIRVConvergenceRegionAnalysis.cpp
+76-643 files not shown
+79-679 files

LLVM/project 1eae7efllvm/test/CodeGen/AMDGPU wmma-hazards-gfx1250-w32.mir wmma-coexecution-valu-hazards.mir

[AMDGPU] Update f8f6f4-wmma hazard tests regarding matrix format, NFC (#204037)

Need to map the matrix format suffix to the register size correctly in
the MIR tests. For example, 'F4' format needs v8i32 register class.
DeltaFile
+185-184llvm/test/CodeGen/AMDGPU/wmma-hazards-gfx1250-w32.mir
+78-78llvm/test/CodeGen/AMDGPU/wmma-coexecution-valu-hazards.mir
+263-2622 files

LLVM/project 78d0ee8llvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanConstruction.cpp

[VPlan] Add VPlan::getPoison helper. NFC (#203937)

We do this in a few places, so this adds a helper similar to
getTrue/getConstantInt and friends.
DeltaFile
+3-6llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+2-3llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+5-0llvm/lib/Transforms/Vectorize/VPlan.h
+2-2llvm/lib/Transforms/Vectorize/VPlanUnroll.cpp
+1-2llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+13-135 files

LLVM/project 461b1b6clang/test/Sema warn-lifetime-safety.cpp, clang/test/Sema/LifetimeSafety safety.cpp

review feedback

Created using spr 1.3.8-beta.1
DeltaFile
+3,204-3,450llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+6,583-0llvm/lib/Target/AArch64/AArch64PerfectShuffle.cpp
+3-6,571llvm/lib/Target/AArch64/AArch64PerfectShuffle.h
+1,905-2,037llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+3,721-0clang/test/Sema/LifetimeSafety/safety.cpp
+0-3,653clang/test/Sema/warn-lifetime-safety.cpp
+15,416-15,7111,615 files not shown
+106,577-49,1851,621 files

LLVM/project 10cb7d3clang/test/Sema warn-lifetime-safety.cpp, clang/test/Sema/LifetimeSafety safety.cpp

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.8-beta.1

[skip ci]
DeltaFile
+3,204-3,450llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-7.ll
+6,583-0llvm/lib/Target/AArch64/AArch64PerfectShuffle.cpp
+3-6,571llvm/lib/Target/AArch64/AArch64PerfectShuffle.h
+1,905-2,037llvm/test/CodeGen/X86/vector-interleaved-store-i16-stride-6.ll
+3,721-0clang/test/Sema/LifetimeSafety/safety.cpp
+0-3,653clang/test/Sema/warn-lifetime-safety.cpp
+15,416-15,7111,612 files not shown
+106,527-49,1341,618 files

LLVM/project 6440862llvm/lib/Target/AMDGPU SIInstrInfo.h

Fix comment

Change-Id: I6d08b2c5189cb760ad7eda7f1b4a0ca9467525fb
DeltaFile
+3-4llvm/lib/Target/AMDGPU/SIInstrInfo.h
+3-41 files

LLVM/project 8446b04llvm/lib/Target/AMDGPU SIInstrInfo.h

Document the new parameter

Change-Id: Iff72a66f46b00f838c86931e7bfc3a026d985da0
DeltaFile
+5-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+5-01 files

LLVM/project 8d61da2lldb/test/API/macosx/riscv32-corefile TestRV32MachOCorefile.py

[lldb] Only run rv32 corefile test if rv32 llvm target enabled (#204040)
DeltaFile
+1-1lldb/test/API/macosx/riscv32-corefile/TestRV32MachOCorefile.py
+1-11 files