LLVM/project e675b9fllvm/lib/Analysis InstructionSimplify.cpp, llvm/test/Transforms/InstSimplify compare.ll

[InstSimplify] Consider `dereferenceable(N)` when simplifying pointer equalities (#203867)

Extend `computePointerICmp` to leverage the `dereferenceable(N)` attribute
when simplifying pointer equality comparisons. Per attribute semantics,
an argument pointer marked as such cannot be a one-past-the-end pointer
to some object, thus it cannot equal the start of an adjacent object.
This lets us prove inequality between a `dereferenceable` argument and
storage allocated within the function.
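
As a rough source-level illustration (an assumption about how this might surface in C++, not something taken from the patch): Clang typically lowers a reference parameter to a pointer argument carrying `dereferenceable(N)`. The hypothetical `aliasesLocal` below therefore shows the kind of pointer equality this reasoning applies to. Whether a particular compiler build actually folds the comparison is not claimed here.

```cpp
#include <cassert>

// Hypothetical helper: compares a reference argument against a local.
// The reference must refer to a live int, so it cannot be a
// one-past-the-end pointer that happens to sit next to `Local`.
bool aliasesLocal(const int &Arg) {
  int Local = 0;
  return &Arg == &Local;
}

// Usage sketch: a caller-owned object never compares equal to the
// callee's own stack storage.
void demoAliasesLocal() {
  int X = 42;
  assert(!aliasesLocal(X));
}
```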

Fixes: https://github.com/llvm/llvm-project/issues/200511.
DeltaFile
+182-0llvm/test/Transforms/InstSimplify/compare.ll
+47-28llvm/lib/Analysis/InstructionSimplify.cpp
+229-282 files

LLVM/project 2487cb0clang/lib/AST/ByteCode EvaluationResult.cpp EvalEmitter.cpp

[clang][bytecode] Rename checkReturnValue to checkDynamicAllocations (#204064)

This is part of https://github.com/llvm/llvm-project/pull/186045, but
makes sense independently.
DeltaFile
+4-3clang/lib/AST/ByteCode/EvaluationResult.cpp
+2-2clang/lib/AST/ByteCode/EvalEmitter.cpp
+2-2clang/lib/AST/ByteCode/EvaluationResult.h
+8-73 files

LLVM/project fe51b83llvm/include/llvm/Analysis AssumeBundleQueries.h, llvm/lib/Analysis AssumeBundleQueries.cpp

[Test] Remove test creating invalid assume operand bundles (#203945)

This was creating random assume operand bundles, using unsupported
attributes, and using invalid arguments for supported ones.

Rather than trying to salvage this test, delete it and the API it tests.
DeltaFile
+0-98llvm/unittests/Analysis/AssumeBundleQueriesTest.cpp
+0-12llvm/include/llvm/Analysis/AssumeBundleQueries.h
+0-6llvm/lib/Analysis/AssumeBundleQueries.cpp
+0-1163 files

LLVM/project 4ae4a15clang-tools-extra/test/clang-tidy/infrastructure cli-argument-errors.cpp config-option-errors.cpp

Revert "[clang-tidy][NFC] Add more test coverage for tidy errors" (#204073)

Reverts llvm/llvm-project#203987
DeltaFile
+0-13clang-tools-extra/test/clang-tidy/infrastructure/cli-argument-errors.cpp
+0-13clang-tools-extra/test/clang-tidy/infrastructure/config-option-errors.cpp
+0-11clang-tools-extra/test/clang-tidy/infrastructure/vfsoverlay-errors.cpp
+0-10clang-tools-extra/test/clang-tidy/infrastructure/config-file-parse-errors.cpp
+0-8clang-tools-extra/test/clang-tidy/infrastructure/export-fixes-errors.cpp
+0-7clang-tools-extra/test/clang-tidy/infrastructure/list-checks-no-checks.cpp
+0-626 files

LLVM/project 95cc633mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-taskloop-reduction.mlir openmp-todo.mlir

[mlir][OpenMP] Translate reductions on taskloop

Add LLVM IR translation for reduction and in_reduction clauses on omp.taskloop.context.

For taskloop reduction, emit the implicit taskgroup reduction setup and map each generated task to runtime-provided private reduction storage through __kmpc_task_reduction_get_th_data. For in_reduction, use the same runtime lookup path with a null descriptor to join an enclosing task reduction context.

Unsupported byref, cleanup, and two-argument initializer forms remain diagnosed.

Add MLIR translation tests for the supported taskloop reduction and in_reduction cases.
DeltaFile
+373-0mlir/test/Target/LLVMIR/openmp-taskloop-reduction.mlir
+238-27mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+92-10mlir/test/Target/LLVMIR/openmp-todo.mlir
+703-373 files

LLVM/project 1338c5cllvm/test/CodeGen/AArch64/GlobalISel irtranslator-memset-inline.ll inline-memset-forced.mir, llvm/test/CodeGen/AMDGPU/GlobalISel legalize-memsetinline.mir

[GlobalISel] Implement `llvm.memset.inline` (#203198)
DeltaFile
+142-0llvm/test/CodeGen/AArch64/GlobalISel/irtranslator-memset-inline.ll
+77-0llvm/test/CodeGen/AArch64/GlobalISel/inline-memset-forced.mir
+72-0llvm/test/CodeGen/AArch64/GlobalISel/inline-small-memset.mir
+69-0llvm/test/CodeGen/RISCV/GlobalISel/memset-inline.ll
+59-0llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-memsetinline.mir
+57-0llvm/test/CodeGen/Mips/GlobalISel/mips-prelegalizer-combiner/inline-memset.mir
+476-021 files not shown
+614-2727 files

LLVM/project 161d8a7clang-tools-extra/clangd HeaderSourceSwitch.cpp ClangdLSPServer.cpp, clang-tools-extra/clangd/refactor/tweaks ExtractVariable.cpp

[clangd][nfc] Avoid type erasure for local recursive callbacks (#203042)

Four local clangd callbacks use std::function only to call themselves.
Switch to local structs and static functions to avoid std::function
type-erasure and copy-support machinery.
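
The general shape of such a change can be sketched as follows. This is a minimal illustration with hypothetical names (`Node`, `sumWithStdFunction`, `sumWithLocalStruct`), not code from clangd: the first version needs `std::function` so the lambda can name itself, while the second recurses through a static member function of a local struct.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical tree type used only for this sketch.
struct Node {
  int Value;
  std::vector<Node> Children;
};

// Before: the lambda goes through std::function in order to call itself.
int sumWithStdFunction(const Node &Root) {
  std::function<int(const Node &)> Visit = [&](const Node &N) {
    int Total = N.Value;
    for (const Node &C : N.Children)
      Total += Visit(C);
    return Total;
  };
  return Visit(Root);
}

// After: a local struct with a static member function recurses directly,
// with no type-erased wrapper involved.
int sumWithLocalStruct(const Node &Root) {
  struct Visitor {
    static int visit(const Node &N) {
      int Total = N.Value;
      for (const Node &C : N.Children)
        Total += visit(C);
      return Total;
    }
  };
  return Visitor::visit(Root);
}

// Both forms compute the same result; only the call mechanism differs.
void checkEquivalent(const Node &Root) {
  assert(sumWithStdFunction(Root) == sumWithLocalStruct(Root));
}
```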

In matched Release AArch64 builds, the four object files shrink by 8,152
bytes and 131 relocations; linked clangd shrinks by 3,872 bytes
unstripped and 16 bytes stripped, with __text down 360 bytes,
__DATA_CONST,__const down 208 bytes, unwind data down 32 bytes, and 21
fewer dyld fixups.

Work towards #202616

AI tool disclosure: Co-authored with OpenAI Codex.
DeltaFile
+22-17clang-tools-extra/clangd/HeaderSourceSwitch.cpp
+22-17clang-tools-extra/clangd/ClangdLSPServer.cpp
+13-11clang-tools-extra/clangd/refactor/tweaks/ExtractVariable.cpp
+11-10clang-tools-extra/clangd/Protocol.cpp
+68-554 files

LLVM/project e13bb91llvm/lib/Target/AArch64 AArch64RegisterInfo.cpp, llvm/unittests/Target/AArch64 AArch64RegisterInfoTest.cpp

[AArch64] Reserve `W30_HI` and `[BHSDQ]31_HI` (#202929)
DeltaFile
+38-0llvm/unittests/Target/AArch64/AArch64RegisterInfoTest.cpp
+6-6llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+44-62 files

LLVM/project 9d7ca44llvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp GCNHazardRecognizer.h, llvm/test/CodeGen/AMDGPU misched-into-wmma-hazard-shadow.mir

[AMDGPU] Track VALU instructions separately for WMMA coexecution hazards (#202523)

WMMA coexecution hazards can only be resolved by VALU instructions, not
S_NOPs. Track VALU/WMMA instructions separately so the scheduler can
accurately determine stall cycles.
DeltaFile
+59-10llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+36-0llvm/test/CodeGen/AMDGPU/misched-into-wmma-hazard-shadow.mir
+16-0llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+111-103 files

LLVM/project 2e36f06clang-tools-extra/test/clang-tidy/infrastructure cli-argument-errors.cpp config-option-errors.cpp

[clang-tidy][NFC] Add more test coverage for tidy errors (#203987)
DeltaFile
+13-0clang-tools-extra/test/clang-tidy/infrastructure/cli-argument-errors.cpp
+13-0clang-tools-extra/test/clang-tidy/infrastructure/config-option-errors.cpp
+11-0clang-tools-extra/test/clang-tidy/infrastructure/vfsoverlay-errors.cpp
+10-0clang-tools-extra/test/clang-tidy/infrastructure/config-file-parse-errors.cpp
+8-0clang-tools-extra/test/clang-tidy/infrastructure/export-fixes-errors.cpp
+7-0clang-tools-extra/test/clang-tidy/infrastructure/list-checks-no-checks.cpp
+62-06 files

LLVM/project f88e9delibc/lib CMakeLists.txt

[libc] Generate a stub for libpthread.a (#200908)

Several build systems / existing scripts assume that pthread functions
are exposed through a separate library (`libpthread.so` / `libpthread.a`)
and thus use the `-lpthread` flag explicitly. Since llvm-libc puts all the
pthread functions into the regular `libc`, teach the CMake build rules
to produce an empty static archive `libpthread.a` for compatibility
purposes.
DeltaFile
+25-0libc/lib/CMakeLists.txt
+25-01 files

LLVM/project ccef34dllvm/lib/Transforms/Vectorize VPlanTransforms.cpp VPlanTransforms.h, llvm/test/Transforms/LoopVectorize runtime-check-known-true.ll

[VPlan] Simplify reverse(reverse(x)) -> x (#199057)

This is a version of #196900 that performs the simplification as a
separate transform.

We need to add an additional `vp.splice.right(vp.splice.left(poison, x,
evl), poison, evl) -> x` simplification to avoid leftover splices
whenever reverses are removed in an EVL tail-folded loop.
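
The core `reverse(reverse(x)) -> x` rewrite can be sketched on a toy expression tree. Everything below (`Expr`, `makeLeaf`, `makeReverse`, `simplify`) is hypothetical and unrelated to the real VPlan classes; the splice simplification is not modelled.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical toy expression: either a named leaf or reverse(operand).
struct Expr {
  bool IsReverse;
  std::string Name;              // meaningful when !IsReverse
  std::shared_ptr<Expr> Operand; // meaningful when IsReverse
};

using ExprPtr = std::shared_ptr<Expr>;

ExprPtr makeLeaf(const std::string &Name) {
  return std::make_shared<Expr>(Expr{false, Name, nullptr});
}

ExprPtr makeReverse(ExprPtr Op) {
  return std::make_shared<Expr>(Expr{true, "", std::move(Op)});
}

// Peel matching pairs: reverse(reverse(x)) -> x, repeated until the
// outermost two nodes are no longer both reverses.
ExprPtr simplify(ExprPtr E) {
  assert(E && "expected a non-null expression");
  while (E->IsReverse && E->Operand->IsReverse)
    E = E->Operand->Operand;
  return E;
}
```

An odd number of nested reverses leaves exactly one reverse behind, which is why the loop checks for a pair rather than stripping every reverse it sees.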

Co-authored-by: Madhur Amilkanthwar <madhura at nvidia.com>
DeltaFile
+59-0llvm/test/Transforms/LoopVectorize/VPlan/simplify-reverse-reverse.ll
+26-6llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+4-12llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-reverse-load-store.ll
+3-5llvm/test/Transforms/LoopVectorize/runtime-check-known-true.ll
+3-0llvm/lib/Transforms/Vectorize/VPlanTransforms.h
+1-0llvm/test/Transforms/LoopVectorize/VPlan/vplan-print-after-all.ll
+96-236 files

LLVM/project 0a04c14llvm/tools/llvm-objdump llvm-objdump.cpp, llvm/tools/llvm-profdata llvm-profdata.cpp

[llvm] Replace unordered_{map,set} with Dense{Map,Set} in llvm tools (#204058)

std::unordered_map is slow. Switch the remaining local maps and sets in
the command-line tools (llvm-profgen, llvm-profdata, llvm-objdump,
llvm-exegesis, llvm-xray, llvm-remarkutil) to DenseMap/DenseSet.
DeltaFile
+27-28llvm/tools/llvm-profgen/PerfReader.h
+9-19llvm/tools/llvm-profgen/MissingFrameInferrer.h
+12-15llvm/tools/llvm-objdump/llvm-objdump.cpp
+14-10llvm/tools/llvm-profgen/ProfiledBinary.h
+7-10llvm/tools/llvm-profgen/PerfReader.cpp
+7-9llvm/tools/llvm-profdata/llvm-profdata.cpp
+76-919 files not shown
+101-11715 files

LLVM/project 639b1d9lld/ELF/Arch LoongArch.cpp, lld/test/ELF loongarch-pcadd-hi20.s

[lld][LoongArch] Fix range checking of R_LARCH_*_PCADD_HI20 relocations on 64-bit (#183233)

According to the la-abi-specs, the `R_LARCH_*_PCADD_HI20` relocations
are also used on 64-bit LoongArch. Fix the range checking accordingly.
DeltaFile
+32-0lld/test/ELF/loongarch-pcadd-hi20.s
+1-1lld/ELF/Arch/LoongArch.cpp
+33-12 files

LLVM/project 4f8a1e9lld/test/ELF loongarch-pcadd-hi20.s

Add test case
DeltaFile
+32-0lld/test/ELF/loongarch-pcadd-hi20.s
+32-01 files

LLVM/project d1539edlld/ELF/Arch LoongArch.cpp

[lld][LoongArch] Fix range checking of R_LARCH_*_PCADD_HI20 relocations on 64-bit

According to the la-abi-specs, the R_LARCH_*_PCADD_HI20 relocations are
also used on 64-bit LoongArch. Fix the range checking accordingly.
DeltaFile
+1-1lld/ELF/Arch/LoongArch.cpp
+1-11 files

LLVM/project e73e459mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp, mlir/test/Target/LLVMIR openmp-taskgroup-task-reduction.mlir openmp-todo.mlir

[mlir][OpenMP] Translate task_reduction on taskgroup

Add LLVM IR translation support for the task_reduction clause on
omp.taskgroup.

The translation builds task-reduction descriptors for the listed reduction
variables and emits the runtime initialization before the taskgroup body.
The reducer init and combiner callbacks are generated from the corresponding
omp.declare_reduction regions.

This patch keeps taskloop reduction and in_reduction translation unsupported;
those remain follow-up work. Unsupported task_reduction forms are diagnosed
instead of being lowered incorrectly.

Add MLIR translation tests for taskgroup task_reduction, multiple reducers,
plain taskgroup translation, and remaining unsupported cases.
DeltaFile
+255-7mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+238-0mlir/test/Target/LLVMIR/openmp-taskgroup-task-reduction.mlir
+55-3mlir/test/Target/LLVMIR/openmp-todo.mlir
+548-103 files

LLVM/project ec802a7libc/hdr/types struct_winsize.h CMakeLists.txt, libc/include/llvm-libc-types struct_winsize.h

[libc] Add TIOCGWINSZ and struct winsize support (#203919)

Added support for the TIOCGWINSZ ioctl command.

* Defined struct winsize in llvm-libc-types.
* Defined TIOCGWINSZ in linux/sys-ioctl-macros.h.
* Exposed struct_winsize and TIOCGWINSZ in sys/ioctl.yaml.
* Added struct_winsize proxy header in hdr/types/struct_winsize.h.
* Added a unit test in test/src/sys/ioctl/linux/ioctl_test.cpp.
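
For context, a typical caller of this interface looks roughly like the sketch below. The helper name `getWindowSize` is hypothetical, and the snippet assumes a POSIX-like `<sys/ioctl.h>` that provides `struct winsize` and `TIOCGWINSZ`; it is not taken from the llvm-libc unit test.

```cpp
#include <cassert>
#include <sys/ioctl.h>

// Hypothetical helper: query the terminal size for a file descriptor.
// Returns false if the ioctl fails (for example, Fd is not a terminal),
// in which case Rows and Cols are left untouched.
bool getWindowSize(int Fd, unsigned short &Rows, unsigned short &Cols) {
  struct winsize WS;
  if (ioctl(Fd, TIOCGWINSZ, &WS) != 0)
    return false;
  Rows = WS.ws_row;
  Cols = WS.ws_col;
  return true;
}

// Usage sketch: an invalid descriptor makes the call fail cleanly.
void demoInvalidFd() {
  unsigned short Rows = 0, Cols = 0;
  assert(!getWindowSize(-1, Rows, Cols));
}
```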

Assisted-by: Automated tooling, human reviewed.
DeltaFile
+24-0libc/include/llvm-libc-types/struct_winsize.h
+23-0libc/hdr/types/struct_winsize.h
+19-0libc/test/src/sys/ioctl/linux/ioctl_test.cpp
+8-0libc/hdr/types/CMakeLists.txt
+4-1libc/include/sys/ioctl.yaml
+1-0libc/test/src/sys/ioctl/linux/CMakeLists.txt
+79-13 files not shown
+82-19 files

LLVM/project 18cc9aamlir/include/mlir/Conversion/EmitCCommon TypeConverter.h, mlir/lib/Conversion/EmitCCommon TypeConverter.cpp CMakeLists.txt

[mlir][emitc] Add a common type converter (#203763)

MemRef type conversion is currently implemented as part of the memref
dialect lowering pass, which means e.g. that func-to-emitc cannot lower
functions taking MemRef types as arguments.

This patch refactors the existing type conversions in EmitC's lowering
passes into a structure similar to the LLVM dialect by adding a common
EmitC type converter and using it across dialect-specific EmitC lowering
passes and the generic convert-to-emitc pass.

Assisted-by: Copilot
DeltaFile
+58-0mlir/lib/Conversion/EmitCCommon/TypeConverter.cpp
+31-0mlir/include/mlir/Conversion/EmitCCommon/TypeConverter.h
+0-29mlir/lib/Conversion/MemRefToEmitC/MemRefToEmitC.cpp
+14-0mlir/test/Conversion/FuncToEmitC/func-to-emitc.mlir
+11-0mlir/lib/Conversion/EmitCCommon/CMakeLists.txt
+2-9mlir/lib/Conversion/MemRefToEmitC/MemRefToEmitCPass.cpp
+116-3812 files not shown
+133-6918 files

LLVM/project 089d063clang/lib/Format CMakeLists.txt

[clang-format][NFC] Don't always rebuild clang-format-check-format (#203828)

Instead, check the format of clang-format source only if the built
clang-format binary or one of the source files is newer.
DeltaFile
+4-1clang/lib/Format/CMakeLists.txt
+4-11 files

LLVM/project 08834adclang/lib/Format UnwrappedLineFormatter.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Fix a bug in merging short inline functions (#203754)

Fixes #203209
DeltaFile
+13-13clang/lib/Format/UnwrappedLineFormatter.cpp
+9-0clang/unittests/Format/FormatTest.cpp
+22-132 files

LLVM/project 8cd2329llvm/lib/Target/RISCV/AsmParser RISCVAsmParser.cpp

drop unnecessary function

Created using spr 1.3.8-beta.1
DeltaFile
+0-1llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+0-11 files

LLVM/project 6548844llvm/lib/IR PrintPasses.cpp, llvm/lib/Transforms/IPO SampleProfileProbe.cpp

[llvm] Replace unordered_set<std::string> with StringSet (#204048)

std::unordered_set<std::string> without a pointer-stability requirement
can use StringSet: it avoids per-TU hashtable instantiations and the
std::string temporary at StringRef lookup sites (~3-4 KiB smaller .text
for llc/opt).
DeltaFile
+5-8llvm/lib/IR/PrintPasses.cpp
+3-5llvm/tools/llvm-profgen/ProfiledBinary.cpp
+4-4llvm/lib/Transforms/IPO/SampleProfileProbe.cpp
+2-3llvm/tools/llvm-readobj/ObjDumper.h
+2-2llvm/tools/llvm-config/llvm-config.cpp
+2-2llvm/tools/llvm-profgen/PerfReader.cpp
+18-244 files not shown
+25-3010 files

LLVM/project fa5d8f8lld/ELF Writer.cpp, lld/docs ReleaseNotes.rst

[ELF] Support multiple PT_GNU_RELRO when SECTIONS is used without PHDRS (#203675)

When a SECTIONS command interleaves relro and non-relro sections, the
relro region is split into discontiguous runs. lld has emitted an error
since https://reviews.llvm.org/D40359:

    error: section: <name> is not contiguous with other relro sections

This is overly strict: while glibc only honors the first PT_GNU_RELRO,
other loaders (e.g. Bionic and FreeBSD rtld-elf) protect every
PT_GNU_RELRO segment.

Emit one PT_GNU_RELRO segment for each contiguous run of relro sections.
Track the boundary section so that `createPhdrs` starts a fresh PT_LOAD
at each relro->non-relro transition, as before.

Consumers that don't expect multiple PT_GNU_RELRO should check the
output themselves.
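
The "one segment per contiguous run" idea can be sketched independently of lld. In this hypothetical model each output section is reduced to a single relro flag, and `findRelroRuns` (an illustrative name, not an lld function) returns the index ranges that would each get their own PT_GNU_RELRO.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model: IsRelro[i] says whether output section i is relro.
// Returns half-open [Begin, End) index ranges, one per contiguous run.
std::vector<std::pair<std::size_t, std::size_t>>
findRelroRuns(const std::vector<bool> &IsRelro) {
  std::vector<std::pair<std::size_t, std::size_t>> Runs;
  std::size_t I = 0, N = IsRelro.size();
  while (I < N) {
    if (!IsRelro[I]) {
      ++I; // skip non-relro sections between runs
      continue;
    }
    std::size_t Begin = I;
    while (I < N && IsRelro[I])
      ++I; // extend the current run
    assert(Begin < I && "a run contains at least one section");
    Runs.emplace_back(Begin, I);
  }
  return Runs;
}
```

With flags `{true, true, false, true, false}` this yields two runs, `[0, 2)` and `[3, 4)`, matching the case where a non-relro section splits the relro region.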
DeltaFile
+93-23lld/test/ELF/relro-non-contiguous.s
+34-20lld/test/ELF/relro-non-contiguous-script-data.s
+21-20lld/ELF/Writer.cpp
+12-2lld/test/ELF/keep-data-section-prefix.s
+10-2lld/test/ELF/linkerscript/data-segment-relro.test
+4-0lld/docs/ReleaseNotes.rst
+174-676 files

LLVM/project 96e45c5llvm/include/llvm/ADT SCCIterator.h, llvm/include/llvm/DebugInfo/LogicalView/Readers LVDWARFReader.h

[llvm] Replace unordered_set<T *> with SmallPtrSet<T *, 0> (#204051)

std::unordered_set is slow. For pointer sets without a pointer-stability
or iterator-stability requirement, use SmallPtrSet<T *, 0> for a smaller
code size.
DeltaFile
+6-4llvm/tools/llvm-profgen/ProfiledBinary.h
+5-5llvm/lib/Passes/StandardInstrumentations.cpp
+5-4llvm/tools/llvm-profgen/ProfileGenerator.h
+4-4llvm/tools/llvm-profgen/ProfileGenerator.cpp
+2-2llvm/include/llvm/DebugInfo/LogicalView/Readers/LVDWARFReader.h
+2-2llvm/include/llvm/ADT/SCCIterator.h
+24-212 files not shown
+28-258 files

LLVM/project 2cb8b61llvm/include/llvm/Analysis TargetTransformInfoImpl.h

[TTI] Add missing no-cost coroutine intrinsics (#203816)

These intrinsics are lowered in the CoroCleanup pass and don't represent
actual code. This patch adds them to the no-cost list so they do not
contribute to the cost of inlining and optimization.
DeltaFile
+6-0llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+6-01 files

LLVM/project 1f3d3d6flang/include/flang/Semantics symbol.h, flang/lib/Lower/OpenMP OpenMP.cpp

[flang][mlir] Add flang to mlir lowering for groupprivate (#180934)

This PR implements the Flang frontend lowering for the OpenMP
`groupprivate` directive.

Changes:
- Update the genOMP handler for OpenMPGroupprivate in OpenMP.cpp to generate
the `omp.groupprivate` MLIR operation.
- Add clause processing for groupprivate variables.
- Add test cases for groupprivate lowering.

Co-Authored-By: Claude <noreply at anthropic.com>
DeltaFile
+276-0flang/test/Lower/OpenMP/groupprivate.f90
+146-1flang/lib/Lower/OpenMP/OpenMP.cpp
+36-0flang/test/Lower/OpenMP/groupprivate-modfile.f90
+32-1flang/lib/Semantics/resolve-directives.cpp
+23-4flang/include/flang/Semantics/symbol.h
+18-5flang/lib/Semantics/symbol.cpp
+531-113 files not shown
+551-229 files

LLVM/project dab3476llvm/test/CodeGen/RISCV/rvv fixed-vectors-masked-gather.ll fixed-vectors-vpgather.ll

[RISCV] Consider known leading zeros in narrowIndex for gather/scatter. (#203970)

If there are enough leading zeros for the shift amount, then
we can do the shift in the narrow type.
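
The arithmetic behind this can be sketched with plain integers. The code below is an illustration only: it uses run-time values and an 8-bit "narrow" type, whereas the patch reasons about known bits at compile time, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Count leading zero bits of an 8-bit value (8 for zero).
unsigned countLeadingZeros8(std::uint8_t V) {
  unsigned N = 0;
  for (std::uint8_t Mask = 0x80; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// True if shifting V left by Shift within 8 bits loses no set bits.
bool canShiftInNarrowType(std::uint8_t V, unsigned Shift) {
  return Shift < 8 && countLeadingZeros8(V) >= Shift;
}

// Reference result: zero-extend first, then shift in the wide type.
std::uint64_t shiftWide(std::uint8_t V, unsigned Shift) {
  return static_cast<std::uint64_t>(V) << Shift;
}

// Narrow form: shift in 8 bits, then zero-extend. Only valid when the
// leading-zero condition holds.
std::uint64_t shiftNarrowThenExtend(std::uint8_t V, unsigned Shift) {
  assert(canShiftInNarrowType(V, Shift));
  return static_cast<std::uint8_t>(V << Shift);
}
```

For example, `0x1F` has three leading zeros in 8 bits, so a shift by 3 gives the same result either way, while a shift by 4 would drop the top set bit in the narrow type.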
DeltaFile
+143-0llvm/test/CodeGen/RISCV/rvv/fixed-vectors-masked-gather.ll
+26-0llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vpgather.ll
+24-0llvm/test/CodeGen/RISCV/rvv/vpgather-sdnode.ll
+22-0llvm/test/CodeGen/RISCV/rvv/vpscatter-sdnode.ll
+15-0llvm/test/CodeGen/RISCV/rvv/mscatter-sdnode.ll
+14-0llvm/test/CodeGen/RISCV/rvv/mgather-sdnode.ll
+244-01 files not shown
+250-07 files

LLVM/project d22a0ecllvm/lib/Target/AMDGPU GCNHazardRecognizer.cpp GCNHazardRecognizer.h, llvm/test/CodeGen/AMDGPU misched-into-wmma-hazard-shadow.mir

[AMDGPU] Track VALU instructions separately for WMMA coexecution hazards

WMMA coexecution hazards can only be resolved by VALU instructions, not
S_NOPs. Track VALU/WMMA instructions separately so the scheduler can
accurately determine stall cycles.
DeltaFile
+59-10llvm/lib/Target/AMDGPU/GCNHazardRecognizer.cpp
+36-0llvm/test/CodeGen/AMDGPU/misched-into-wmma-hazard-shadow.mir
+16-0llvm/lib/Target/AMDGPU/GCNHazardRecognizer.h
+111-103 files

LLVM/project 1503720llvm/lib/Target/AMDGPU SIPreEmitPeephole.cpp, llvm/test/CodeGen/AMDGPU wmma-set-reuse-bits.mir

[AMDGPU] Set WMMA source-operand reuse bits in SIPreEmitPeephole

gfx1250 WMMA instructions can set matrix_a_reuse / matrix_b_reuse bits
that keep the A or B source operand in a high-temporality state in the
VALU source-operand cache, so a later WMMA reusing the same registers
hits in the cache instead of re-reading the register file.

Add a late, post-RA peephole in the existing pre-emit peephole pass that
scans each basic block and, for every WMMA, sets the A/B reuse bit when
one of the next few WMMAs reuses the same physical registers as its A or B
operand and those registers are not redefined in between.

Stale sticky entries in the cache are cleared when a register is used in
an instruction without a reuse bit being set. Therefore, the final WMMA
use of the same source should not set the bit.
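
The scan described above can be sketched on a toy instruction model. This is not the SIPreEmitPeephole code: `Inst`, `reusedLater`, `setReuseBits`, the single register id per operand, the same-operand-slot matching, and the look-ahead window of 2 are all simplifying assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified instruction: one register id per operand.
struct Inst {
  bool IsWMMA = false;
  int SrcA = -1; // register holding matrix A (WMMA only)
  int SrcB = -1; // register holding matrix B (WMMA only)
  int Def = -1;  // register written by this instruction, or -1
  bool ReuseA = false;
  bool ReuseB = false;
};

Inst makeWMMA(int A, int B, int Def) {
  Inst I;
  I.IsWMMA = true;
  I.SrcA = A;
  I.SrcB = B;
  I.Def = Def;
  return I;
}

Inst makeOther(int Def) {
  Inst I;
  I.Def = Def;
  return I;
}

// True if one of the next `Window` WMMAs after index I reads Reg in the
// same operand slot before any instruction redefines Reg.
static bool reusedLater(const std::vector<Inst> &Block, std::size_t I,
                        int Reg, bool IsA, unsigned Window) {
  assert(I < Block.size());
  if (Block[I].Def == Reg)
    return false; // the WMMA itself overwrites the register
  unsigned Seen = 0;
  for (std::size_t J = I + 1; J < Block.size() && Seen < Window; ++J) {
    const Inst &Next = Block[J];
    if (Next.IsWMMA) {
      ++Seen;
      if ((IsA ? Next.SrcA : Next.SrcB) == Reg)
        return true; // read happens before Next's own definition
    }
    if (Next.Def == Reg)
      return false; // redefined in between
  }
  return false;
}

// Set reuse bits for every WMMA in the block. The last use of a source
// finds no later reuse, so its bit stays clear.
void setReuseBits(std::vector<Inst> &Block, unsigned Window = 2) {
  for (std::size_t I = 0; I < Block.size(); ++I) {
    Inst &MI = Block[I];
    if (!MI.IsWMMA)
      continue;
    MI.ReuseA = reusedLater(Block, I, MI.SrcA, /*IsA=*/true, Window);
    MI.ReuseB = reusedLater(Block, I, MI.SrcB, /*IsA=*/false, Window);
  }
}
```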
DeltaFile
+105-0llvm/test/CodeGen/AMDGPU/wmma-set-reuse-bits.mir
+95-0llvm/lib/Target/AMDGPU/SIPreEmitPeephole.cpp
+200-02 files