LLVM/project fcd0e2c lld/ELF BPSectionOrderer.cpp

[ELF] Remove redundant sec->repl != sec check in BPSectionOrderer. NFC (#189214)

ICF's InputSection::replace() calls markDead() on folded sections, so
`!sec->isLive()` already filters them.
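
A toy model of why the extra check is redundant. The `Section` struct and `shouldOrder` below are hypothetical stand-ins, not lld's actual classes; they only mirror the behaviour the message describes, where folding a section also marks it dead.

```cpp
// Hypothetical, simplified stand-in for an input section (not lld's type).
struct Section {
  Section *repl = this; // canonical representative; points to itself until folded
  bool live = true;

  bool isLive() const { return live; }
  void markDead() { live = false; }

  // Fold `other` into this section: redirect it and mark it dead,
  // mirroring what the commit message says replace() does.
  void replace(Section *other) {
    other->repl = repl;
    other->markDead();
  }
};

// A liveness test alone already excludes folded sections, so an
// additional `sec.repl != &sec` test would never change the result.
inline bool shouldOrder(const Section &sec) { return sec.isLive(); }
```

In this model every section with `repl != this` is also dead, so filtering on `isLive()` covers both conditions.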
Delta   File
+3 -4   lld/ELF/BPSectionOrderer.cpp
+3 -4   1 file

LLVM/project 71263dc clang-tools-extra/clang-tidy/bugprone IncDecInConditionsCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Fix bugprone-inc-dec-in-conditions FP with lambda condition (#189145)

Fixes https://github.com/llvm/llvm-project/issues/163913.
Delta    File
+44 -0   clang-tools-extra/test/clang-tidy/checkers/bugprone/inc-dec-in-conditions.cpp
+5 -0    clang-tools-extra/docs/ReleaseNotes.rst
+4 -1    clang-tools-extra/clang-tidy/bugprone/IncDecInConditionsCheck.cpp
+53 -1   3 files

LLVM/project cf3a0f2 lld Maintainers.md

[lld] update maintainers (#183803)

For a new contributor, it helps to see the correct maintainers listed.
Delta    File
+15 -3   lld/Maintainers.md
+15 -3   1 file

LLVM/project fed86ed llvm/lib/Analysis DependenceAnalysis.cpp

Remove redundant logic

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta   File
+0 -6   llvm/lib/Analysis/DependenceAnalysis.cpp
+0 -6   1 file

LLVM/project 7bf3681 llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Fix overflow of calculation in weakCrossingSIVtest

This patch fixes a correctness issue where integer overflow in the
upper bound calculation of weakCrossingSIVtest caused the pass to
incorrectly prove independence.

The previous logic used `SCEV::getMulExpr` to calculate
`2 * ConstCoeff * UpperBound` and compared it to `Delta` using
`isKnownPredicate`. In the presence of overflow, this could yield
unsafe results.

This change replaces the SCEV arithmetic with `ConstantRange` and
its `smul_fast` operation. If the calculation overflows,
`intersectWith(MLRange).isEmptySet()` is false, which ensures that we
conservatively assume a dependence when the bounds cannot be proven
safe.
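
The failure mode can be illustrated outside LLVM. The sketch below is not the patch's code: it swaps in plain 64-bit integers and the GCC/Clang builtin `__builtin_mul_overflow` (an assumption about the compiler) in place of `ConstantRange`, and the function names are hypothetical. It contrasts a wrapping comparison of `Delta` against `2 * Coeff * UpperBound` with a checked one that gives up when the product overflows.

```cpp
#include <cstdint>

// Wrapping version: the product is computed modulo 2^64, so a large
// bound can wrap to a small or negative value and "prove" independence.
// (The cast back to a signed type is modular on mainstream compilers
// and guaranteed to be so since C++20.)
inline bool wrappingProvesIndependent(std::int64_t coeff, std::int64_t upper,
                                      std::int64_t delta) {
  std::uint64_t wrapped = 2u * static_cast<std::uint64_t>(coeff) *
                          static_cast<std::uint64_t>(upper);
  return delta > static_cast<std::int64_t>(wrapped);
}

// Checked version: if either multiplication overflows, the bound is
// unknown, so independence is not claimed (a dependence is assumed).
inline bool checkedProvesIndependent(std::int64_t coeff, std::int64_t upper,
                                     std::int64_t delta) {
  std::int64_t twice = 0;
  std::int64_t bound = 0;
  if (__builtin_mul_overflow(coeff, std::int64_t{2}, &twice))
    return false;
  if (__builtin_mul_overflow(twice, upper, &bound))
    return false;
  return delta > bound;
}
```

With `coeff = 1`, `upper = INT64_MAX`, and `delta = 0`, the wrapping version reports independence because the product wraps to `-2`, while the checked version declines to.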

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta    File
+17 -5   llvm/lib/Analysis/DependenceAnalysis.cpp
+17 -5   1 file

LLVM/project daf6df9 llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-crossing-siv-overflow.ll

update

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta     File
+59 -0    llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-overflow.ll
+16 -15   llvm/lib/Analysis/DependenceAnalysis.cpp
+75 -15   2 files

LLVM/project dda64e6 llvm/lib/Analysis DependenceAnalysis.cpp

Remove redundant logic

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta   File
+0 -6   llvm/lib/Analysis/DependenceAnalysis.cpp
+0 -6   1 file

LLVM/project 670de1f compiler-rt/lib/msan msan_linux.cpp

[compiler-rt][msan] Fix 32-bit overflow in CheckMemoryLayoutSanity (#189199)

Use start + (end - start) / 2 instead of (start + end) / 2 to compute
the midpoint address. The original expression overflows when start + end
exceeds UPTR_MAX, which happens on 32-bit targets whose memory layout
includes regions above 0x80000000.
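
A minimal sketch of the two midpoint formulas, using `std::uint32_t` to model a 32-bit `uptr`. The function names are illustrative and do not come from `msan_linux.cpp`.

```cpp
#include <cstdint>

// Naive midpoint: start + end is computed modulo 2^32, so the sum
// wraps whenever it exceeds the 32-bit maximum.
constexpr std::uint32_t midpointNaive(std::uint32_t start, std::uint32_t end) {
  return (start + end) / 2;
}

// Overflow-free midpoint: end - start cannot wrap when start <= end,
// and adding half of it to start stays within [start, end].
constexpr std::uint32_t midpointSafe(std::uint32_t start, std::uint32_t end) {
  return start + (end - start) / 2;
}
```

For a region from `0x80000000` to `0xC0000000`, the naive form yields `0x20000000`, which lies outside the region, while the safe form yields `0xA0000000`.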
Delta   File
+3 -2   compiler-rt/lib/msan/msan_linux.cpp
+3 -2   1 file

LLVM/project 89d57d0 compiler-rt/lib/sanitizer_common sanitizer_platform_limits_posix.cpp

[compiler-rt][sanitizer] Add struct_rlimit64_sz for musl (#189197)

On musl, rlimit64 is an alias for rlimit rather than the distinct type
that glibc provides. Add a SANITIZER_MUSL elif branch so that
struct_rlimit64_sz is defined for musl-based Linux targets.
Delta   File
+5 -2   compiler-rt/lib/sanitizer_common/sanitizer_platform_limits_posix.cpp
+5 -2   1 file

LLVM/project 6ead686 clang-tools-extra/clang-tidy/misc IncludeCleanerCheck.cpp, clang-tools-extra/clang-tidy/readability IdentifierNamingCheck.cpp ImplicitBoolConversionCheck.cpp

[clang-tidy][NFC] Run `performance-faster-string-find` over the codebase (#189202)
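
For context, `performance-faster-string-find` flags `find`-style calls that pass a one-character string literal and suggests a character literal instead. The two helper functions below are made up for illustration and are not taken from the touched files; they show the before and after forms, which return the same position.

```cpp
#include <cstddef>
#include <string>

// Before: searches for a one-character string literal.
inline std::size_t findWithStringLiteral(const std::string &s) {
  return s.find("=");
}

// After: the same search using the character overload, which is the
// form the check suggests.
inline std::size_t findWithCharLiteral(const std::string &s) {
  return s.find('=');
}
```

Both forms return the same index, or `std::string::npos` when the character is absent, which is consistent with the NFC tag.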
Delta     File
+7 -7     clang-tools-extra/clang-tidy/readability/IdentifierNamingCheck.cpp
+4 -4     clang-tools-extra/clang-tidy/readability/ImplicitBoolConversionCheck.cpp
+3 -3     clang-tools-extra/clang-tidy/misc/IncludeCleanerCheck.cpp
+2 -2     clang-tools-extra/clang-tidy/utils/Matchers.h
+1 -1     clang-tools-extra/clang-tidy/readability/UseStdMinMaxCheck.cpp
+1 -1     clang-tools-extra/clang-tidy/readability/IsolateDeclarationCheck.cpp
+18 -18   5 files not shown
+23 -23   11 files

LLVM/project 458f1aa polly/include/polly ScopInliner.h, polly/lib/Support RegisterPasses.cpp

[Polly] Forward VFS from PassBuilder for IO sandboxing (#188657)

#184545 default-enables the IO sandbox in assert-builds. This causes
Clang using Polly to crash (#188568).

The issue is that `PassBuilder` uses `vfs::getRealFileSystem()` by
default, which is considered an IO sandbox violation in the Clang process.
This PR stores the VFS of the `PassBuilder` passed to the original
`registerPollyPasses` call and reuses it when creating other
`PassBuilder` instances.

This PR also adds infrastructure for running Polly in `clang` (in
addition to `opt`). `opt` does not enable the sandbox, so we need
separate tests using Clang.

Closes: #188568
Delta     File
+26 -11   polly/lib/Support/RegisterPasses.cpp
+0 -23    polly/test/CodeGen/RuntimeDebugBuilder/combine_different_values.c
+21 -0    polly/test/lit.cfg
+11 -4    polly/lib/Transform/ScopInliner.cpp
+10 -1    polly/include/polly/ScopInliner.h
+9 -0     polly/test/polly.c
+77 -39   6 files not shown
+89 -42   12 files

LLVM/project 191a9a9 llvm/utils/gn/secondary/clang/unittests/Format BUILD.gn

[gn build] Port c0bd2f9084d7
Delta   File
+1 -0   llvm/utils/gn/secondary/clang/unittests/Format/BUILD.gn
+1 -0   1 file

LLVM/project c0bd2f9 clang/unittests/Format AlignmentTest.cpp FormatTest.cpp

[clang-format][NFC] Extract some alignment tests

FormatTest.cpp is far too large; extract some tests to mitigate this a bit.
Delta           File
+3,566 -0       clang/unittests/Format/AlignmentTest.cpp
+0 -3,543       clang/unittests/Format/FormatTest.cpp
+1 -0           clang/unittests/Format/CMakeLists.txt
+3,567 -3,543   3 files

LLVM/project 617ec39 llvm/test/Transforms/LoopVectorize/VPlan dissolve-replicate-regions.ll

[VPlan] Add printing test for dissolving replicate regions. (#189192)

Add VPlan printing test for
 https://github.com/llvm/llvm-project/pull/186252
 https://github.com/llvm/llvm-project/pull/189022
Delta     File
+202 -0   llvm/test/Transforms/LoopVectorize/VPlan/dissolve-replicate-regions.ll
+202 -0   1 file

LLVM/project 4450891 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 non-reduced-select-of-bits.ll

[SLP] Check if potential bitcast/bswap candidate is a root of reduction

Need to check if the potential bitcast/bswap-like construct is a root of
the reduction, otherwise it cannot represent a bitcast/bswap construct.

Fixes #189184
Delta    File
+55 -0   llvm/test/Transforms/SLPVectorizer/X86/non-reduced-select-of-bits.ll
+1 -1    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+56 -1   2 files

LLVM/project 8871145 llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DependenceAnalysis weak-crossing-siv-overflow.ll

update

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta     File
+59 -0    llvm/test/Analysis/DependenceAnalysis/weak-crossing-siv-overflow.ll
+16 -15   llvm/lib/Analysis/DependenceAnalysis.cpp
+75 -15   2 files

LLVM/project a0a5938 llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Fix overflow of calculation in weakCrossingSIVtest

This patch fixes a correctness issue where integer overflow in the
upper bound calculation of weakCrossingSIVtest caused the pass to
incorrectly prove independence.

The previous logic used `SCEV::getMulExpr` to calculate
`2 * ConstCoeff * UpperBound` and compared it to `Delta` using
`isKnownPredicate`. In the presence of overflow, this could yield
unsafe results.

This change replaces the SCEV arithmetic with `ConstantRange` and
its `smul_fast` operation. If the calculation overflows,
`intersectWith(MLRange).isEmptySet()` is false, which ensures that we
conservatively assume a dependence when the bounds cannot be proven
safe.

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta    File
+17 -5   llvm/lib/Analysis/DependenceAnalysis.cpp
+17 -5   1 file

LLVM/project ba6041b llvm/lib/Transforms/Scalar LowerMatrixIntrinsics.cpp, llvm/test/Transforms/LowerMatrixIntrinsics multiply-fused-differing-addr-spaces.ll

[Matrix] Handle load/store with different AS in getNonAliasingPointer. (#188721)

If a load and a store have different address spaces, we cannot create a
runtime check. Instead, always copy the data to an alloca matching the
store address space.

Fixes https://github.com/llvm/llvm-project/issues/185236.

PR: https://github.com/llvm/llvm-project/pull/188721
Delta     File
+369 -0   llvm/test/Transforms/LowerMatrixIntrinsics/multiply-fused-differing-addr-spaces.ll
+15 -0    llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp
+384 -0   2 files

LLVM/project 807191a llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Hoist division check for early exit in weakCrossingSIVtest (NFC)

This patch moves the check that `Coeff` divides `Delta` earlier in the
function to enable an early exit, which may improve performance.

Signed-off-by: Ruoyu Qiu <cabbaken at outlook.com>
Delta     File
+21 -21   llvm/lib/Analysis/DependenceAnalysis.cpp
+21 -21   1 file

LLVM/project 73c8ed0 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Rebase, address comments

Created using spr 1.3.7
Delta     File
+60 -38   llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+60 -38   1 file

LLVM/project 28674dd llvm/lib/Target/AMDGPU AMDGPUInstCombineIntrinsic.cpp, llvm/test/Transforms/InstCombine/AMDGPU mbcnt.ll llvm.amdgcn.wave.shuffle.ll

AMDGPU: Add range attribute to mbcnt intrinsic callsites

It seems the known bits handling added in 686987a540bc176bceaad43ffe530cb3e88796d5
is insufficient to perform many range based optimizations. For some reason
computeConstantRange doesn't fall back on KnownBits, and has a separate,
less used form which tries to use computeKnownBits.
Delta      File
+236 -15   llvm/test/Transforms/InstCombine/AMDGPU/mbcnt.ll
+22 -22    llvm/test/Transforms/InstCombine/AMDGPU/llvm.amdgcn.wave.shuffle.ll
+22 -2     llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
+1 -1      llvm/test/Transforms/InstCombine/AMDGPU/canonicalize-add-to-gep.ll
+281 -40   4 files

LLVM/project 9015a18 llvm/lib/Target/AMDGPU AMDGPULegalizerInfo.cpp SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU rsq.f64.ll fdiv.f64.ll

AMDGPU: Skip last corrections in afn f64 reciprocal

Device libs has a fast reciprocal macro that is close
to the fast division expansion, but skips the last terms
compared to the full division.

The basic reciprocal handling has identical output to this
macro. The negative reciprocal case has different fneg placement
and smaller code size, but I believe should be the same.
Delta       File
+32 -116    llvm/test/CodeGen/AMDGPU/rsq.f64.ll
+37 -7      llvm/test/CodeGen/AMDGPU/GlobalISel/fdiv.f64.ll
+17 -1      llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+16 -1      llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+12 -2      llvm/test/CodeGen/AMDGPU/fdiv.f64.ll
+0 -4       llvm/test/CodeGen/AMDGPU/fneg-combines.new.ll
+114 -131   1 file not shown
+114 -133   7 files

LLVM/project 805a814 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 copyable_reorder.ll bottom-to-top-reorder.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta     File
+21 -44   llvm/test/Transforms/SLPVectorizer/X86/copyable_reorder.ll
+55 -0    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+1 -1     llvm/test/Transforms/SLPVectorizer/X86/bottom-to-top-reorder.ll
+1 -1     llvm/test/Transforms/SLPVectorizer/X86/reused-last-instruction-in-split-node.ll
+78 -46   4 files

LLVM/project 9f3a9ea mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToWiDistributeExperimental.cpp, mlir/test/Dialect/XeGPU sg-to-wi-experimental-unit.mlir

[MLIR][XeGPU] Add distribution patterns for vector step, shape_cast & broadcast from sg-to-wi (#185960)

This PR adds distribution patterns for vector.step, vector.shape_cast, and
vector.broadcast in the new sg-to-wi pass.
Delta     File
+220 -1   mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToWiDistributeExperimental.cpp
+162 -0   mlir/test/Dialect/XeGPU/sg-to-wi-experimental-unit.mlir
+382 -1   2 files

LLVM/project 9d6b92e llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/test/CodeGen/AArch64 knownpow2-trunc-orzero.ll

[DAG] SelectionDAG::isKnownToBeAPowerOfTwo - add ISD::TRUNCATE handling and tests (#184365)

Closes #181654
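
The arithmetic fact that makes truncation relevant here can be checked in isolation. The sketch assumes, based only on the test name `knownpow2-trunc-orzero.ll`, that the new handling concerns the "power of two or zero" case; the helper names are hypothetical and this is not the SelectionDAG code.

```cpp
#include <cstdint>

// True when v has at most one bit set, i.e. v is a power of two or zero.
constexpr bool isPow2OrZero(std::uint64_t v) { return (v & (v - 1)) == 0; }

// Models a 64-bit to 32-bit truncation by discarding the high bits.
constexpr std::uint32_t trunc32(std::uint64_t v) {
  return static_cast<std::uint32_t>(v);
}
```

Truncating a power of two keeps its single set bit when that bit lies in the low half and drops it otherwise, so the result is a power of two or zero but is not guaranteed to be nonzero.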
Delta     File
+34 -0    llvm/test/CodeGen/AArch64/knownpow2-trunc-orzero.ll
+9 -11    llvm/test/CodeGen/X86/known-pow2.ll
+5 -0     llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+48 -11   3 files

LLVM/project 3f42ec6 clang/lib/Format TokenAnnotator.cpp, clang/unittests/Format TokenAnnotatorTest.cpp

[clang-format] Fix annotation of references in function pointer typedefs (#188860)

Fixes #188695
Delta   File
+6 -0   clang/unittests/Format/TokenAnnotatorTest.cpp
+1 -1   clang/lib/Format/TokenAnnotator.cpp
+7 -1   2 files

LLVM/project 0ac35ec clang/lib/Format UnwrappedLineParser.cpp Format.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] Fix breaking enum braces when combined with export (#189128)

This fixes #186684.

Also fix (not) breaking variables declared on the same line as the
closing brace.

Also adapt the Whitesmiths style to these changes.
Delta    File
+17 -3   clang/lib/Format/UnwrappedLineParser.cpp
+11 -1   clang/unittests/Format/FormatTest.cpp
+6 -0    clang/lib/Format/Format.cpp
+2 -1    clang/lib/Format/TokenAnnotator.cpp
+36 -5   4 files

LLVM/project f0ce26d libc/src/__support/math log2p1f16.h CMakeLists.txt, libc/src/math log2p1f16.h

[libc][math][c23] Add log2p1f16 C23 math function (#186754)

Signed-off-by: Shikhar Soni <shikharish05 at gmail.com>
Delta     File
+207 -0   libc/src/__support/math/log2p1f16.h
+49 -0    libc/test/src/math/log2p1f16_test.cpp
+48 -0    libc/test/src/math/smoke/log2p1f16_test.cpp
+21 -0    libc/src/math/log2p1f16.h
+18 -0    libc/src/__support/math/CMakeLists.txt
+18 -0    libc/src/math/generic/log2p1f16.cpp
+361 -0   18 files not shown
+433 -1   24 files

LLVM/project 15940b1 libcxx/cmake/caches AMDGPU.cmake NVPTX.cmake

[libcxx] Update GPU cache files to use the proper loader

Summary:
These were renamed and the aliases removed; this fixes running the tests.
Delta   File
+1 -1   libcxx/cmake/caches/AMDGPU.cmake
+1 -1   libcxx/cmake/caches/NVPTX.cmake
+2 -2   2 files

LLVM/project e568136 mlir/python/mlir/dialects ext.py, mlir/test/python/dialects ext.py

[MLIR][Python] Add more field specifiers to Python-defined operations (#188064)

This PR adds two new field specifiers (`operand` and `attribute`) and
extends the existing one (`result`):
- `default_factory` parameter is added for `result` and `attribute` to
specify default value via a lambda/function
- `kw_only` parameter is added for all these three specifiers, to make a
field a keyword-only parameter (without giving a default value).

```python
def result(
    *,
    infer_type: bool = False,
    default_factory: Optional[Callable[[], Any]] = None,
    kw_only: bool = False,
) -> Any: ...


def operand(
```

[43 lines not shown]
Delta      File
+131 -29   mlir/python/mlir/dialects/ext.py
+99 -1     mlir/test/python/dialects/ext.py
+230 -30   2 files