LLVM/project a9df7c7 llvm/lib/Target/AMDGPU SIInstructions.td, llvm/test/CodeGen/AMDGPU bf16-math.ll

[AMDGPU] True16 support for bf16 clamp pattern on gfx1250 (#190036)
Delta  File
+174 -55  llvm/test/CodeGen/AMDGPU/bf16-math.ll
+9 -1  llvm/lib/Target/AMDGPU/SIInstructions.td
+183 -56  2 files

LLVM/project c6669c4 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/AArch64 fma-conversion-multi-use-guard.ll

[SLP] Guard FMulAdd conversion to require single-use/non-reordered FMul operands

The FMulAdd (CombinedVectorize) transformation in transformNodes() marks
an FMul child entry with zero cost, assuming it is fully absorbed into
the fmuladd intrinsic. However, when any FMul scalar has multiple uses
(e.g., also stored separately), the FMul must survive as a separate
node.
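
A minimal sketch of the guard described above, using a toy model rather than the SLP vectorizer's real data structures. `ToyFMul` and `canAbsorbFMulIntoFMulAdd` are hypothetical names invented for illustration; the actual check in `transformNodes()` may look quite different.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for one scalar FMul in a vectorizable bundle.
struct ToyFMul {
  unsigned NumUses; // how many instructions consume this FMul's result
};

// Returns true only when every FMul feeds nothing but the FAdd and the
// bundle was not reordered. Only then can the FMul node be treated as
// fully absorbed into the fmuladd (and therefore as zero cost).
bool canAbsorbFMulIntoFMulAdd(const std::vector<ToyFMul> &FMuls,
                              bool IsReordered) {
  if (IsReordered)
    return false;
  return std::all_of(FMuls.begin(), FMuls.end(),
                     [](const ToyFMul &M) { return M.NumUses == 1; });
}
```

In this model, a bundle where one FMul is also stored separately (two uses) fails the check, so the FMul keeps its own cost.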

Reviewers: hiraditya, RKSimon, bababuck

Pull Request: https://github.com/llvm/llvm-project/pull/189692
Delta  File
+6 -14  llvm/test/Transforms/SLPVectorizer/AArch64/fma-conversion-multi-use-guard.ll
+16 -0  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+22 -14  2 files

LLVM/project 6c92374 llvm/lib/Analysis ValueTracking.cpp, llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-fract.ll

ValueTracking: llvm.amdgcn.fract cannot introduce overflow (#189002)

This returns a value with an absolute value less than 1.
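
A rough scalar model of why a fractional-part operation cannot overflow, assuming the intrinsic behaves like `x - floor(x)` clamped just below 1 for finite inputs. That semantics is an assumption made for illustration, and `toyFract` is a hypothetical name, not an LLVM API.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Toy model of a fract operation for finite inputs (assumed semantics).
// The clamp matters: for a tiny negative input, x - floor(x) rounds up
// to exactly 1.0f in float arithmetic.
float toyFract(float X) {
  const float LargestBelowOne = std::nextafter(1.0f, 0.0f);
  return std::min(X - std::floor(X), LargestBelowOne);
}
```

Because the result stays in the range [0, 1) for any finite input, no finite input can produce an infinity.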
Delta  File
+26 -0  llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-fract.ll
+2 -1  llvm/lib/Analysis/ValueTracking.cpp
+28 -1  2 files

LLVM/project 478a6ab lldb/packages/Python/lldbsuite/test/make Makefile.rules

[lldb/test] Codesign executables built with custom Makefile rules (#189902)

Tests with custom a.out targets in their Makefile (e.g.
`TestBSDArchives.py`) bypass the standard Makefile.rules linking step
where `CODESIGN` is applied. This leaves the binary unsigned, causing
the process to be killed on remote Darwin devices.

This adds a codesigning step to the `all` target in Makefile.rules that
signs both $(EXE) and a.out if they exist. This ensures all test
binaries are signed regardless of how they were built.

rdar://173840592

Signed-off-by: Med Ismail Bennani <ismail at bennani.ma>
Delta  File
+7 -0  lldb/packages/Python/lldbsuite/test/make/Makefile.rules
+7 -0  1 file

LLVM/project b75bf1e clang/docs ReleaseNotes.rst, clang/lib/Analysis ThreadSafety.cpp

Revert "Thread Safety Analysis: Drop call-based alias invalidation (#187691)" (#190041)

This reverts commit 873d6bc3b415f1c2d942bbf4e4219c4bdcd4f2f8.

This causes the Linux kernel build to fail because it relied on
alias invalidation in kernel/core/sched.c.
Delta  File
+52 -0  clang/lib/Analysis/ThreadSafety.cpp
+11 -26  clang/test/SemaCXX/warn-thread-safety-analysis.cpp
+0 -5  clang/docs/ReleaseNotes.rst
+63 -31  3 files

LLVM/project 9f50004 mlir/lib/Dialect/XeGPU/Transforms XeGPUPeepHoleOptimizer.cpp, mlir/test/Dialect/XeGPU peephole-optimize.mlir

[MLIR][XeGPU] Enhance the peephole optimization to remove the convert_layout after multi-reduction rewrite (#188849)
Delta  File
+57 -28  mlir/test/Dialect/XeGPU/peephole-optimize.mlir
+25 -0  mlir/lib/Dialect/XeGPU/Transforms/XeGPUPeepHoleOptimizer.cpp
+82 -28  2 files

LLVM/project 09264ae offload CMakeLists.txt

Merge commit '61a43720f3e31357ff3842a02d5460e71e4062a6' into HEAD
Delta  File
+0 -114  offload/CMakeLists.txt
+0 -114  1 file

LLVM/project 61a4372 offload CMakeLists.txt

Merge commit '1e19b4364dd3f827e4110b0bc14ec31bf5bbaf59' into HEAD
Delta  File
+0 -115  offload/CMakeLists.txt
+0 -115  1 file

LLVM/project 1e19b43 offload CMakeLists.txt

Fix incomplete merge
Delta  File
+0 -115  offload/CMakeLists.txt
+0 -115  1 file

LLVM/project d6d0876 llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp

[NFC][SelectionDAG] Refactor out common default `DemandedElts` calculation (#190031)

Deduplicating the repeated pattern
```cpp
APInt DemandedElts = VT.isFixedLengthVector()
                         ? APInt::getAllOnes(VT.getVectorNumElements())
                         : APInt(1, 1);
```
in SelectionDAG.
Delta  File
+26 -85  llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+26 -85  1 file

LLVM/project d0fdb9c llvm/test/Transforms/Attributor/AMDGPU nofpclass-amdgcn-fract.ll

Update llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-fract.ll

Co-authored-by: Yingwei Zheng <dtcxzyw2333 at gmail.com>
Delta  File
+1 -1  llvm/test/Transforms/Attributor/AMDGPU/nofpclass-amdgcn-fract.ll
+1 -1  1 file

LLVM/project 0045d79 llvm/lib/Target/AMDGPU SIInstructions.td, llvm/test/CodeGen/AMDGPU bf16-math.ll

[AMDGPU] True16 support for bf16 clamp pattern on gfx1250
Delta  File
+174 -55  llvm/test/CodeGen/AMDGPU/bf16-math.ll
+9 -1  llvm/lib/Target/AMDGPU/SIInstructions.td
+183 -56  2 files

LLVM/project 2b87d02 clang-tools-extra/clang-tidy/tool run-clang-tidy.py

[clang-tidy] Properly escape printed clang-tidy command in `run-clang-tidy.py` (#189974)

The `run-clang-tidy.py` script now uses `shlex.join()` to construct the
command string for printing.

This ensures that arguments containing shell metacharacters, such as the
asterisk in `--warnings-as-errors=*`, are correctly quoted. This allows
the command to be safely copied and pasted into any shell for manual
execution, fixing errors previously seen with shells like `fish` that
are strict about wildcard expansion.

Before:
```
[ 1/15][0.2s] /usr/bin/clang-tidy -p=/home/user/work/project/build --warnings-as-errors=* /home/user/work/project/src/main.cpp
```

Note: When running this command in fish shell you get some error like
`fish: No matches for wildcard '--warnings-as-errors=*'. See `help
wildcards-globbing``

    [4 lines not shown]
Delta  File
+2 -1  clang-tools-extra/clang-tidy/tool/run-clang-tidy.py
+2 -1  1 file

LLVM/project 7757006 clang-tools-extra/clang-doc Representation.cpp Representation.h, clang-tools-extra/clang-doc/tool ClangDocMain.cpp

[clang-doc] Merge data into persistent memory

We have a need for persistent memory for the final info. Since each
group processes a single USR at a time, every USR is only ever processed by
a single thread from the thread pool. This means that we can keep per
thread persistent storage for all the info. There is significant
duplicated data between all the serialized records, so we can just merge
the final/unique items into the persistent arena, and clear out the
scratch/transient arena as we process each record in the bitcode.

The patch adds some APIs to help with managing the data, merging, and
allocation of data in the correct arena. It also safely merges and deep
copies data from the transient arenas into persistent storage that is
never reset until the program completes.

This patch reduces memory by another % over the previous patches,
bringing the total savings over the baseline to 57%. Runtime performance
and benchmarks stay mostly flat with modest improvements.


    [31 lines not shown]
Delta  File
+134 -10  clang-tools-extra/clang-doc/Representation.cpp
+25 -25  clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
+7 -0  clang-tools-extra/clang-doc/Representation.h
+166 -35  3 files

LLVM/project 93bdfe2 clang-tools-extra/clang-doc Representation.cpp Representation.h

[clang-doc] Support deep copy between arenas for merging

Upcoming changes to the merge step will require that we clear the
transient arenas and merge new items into the persistent arena. However,
there are some challenges with that, as the existing types typically
don't want to be copied. We introduce some new APIs to simplify that
task and ensure we don't accidentally leak memory.
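
As an illustration of the idea only, here is a small sketch of deep-copying a string from a transient arena into a persistent one before the transient arena is cleared. `ToyArena` and `deepCopy` are hypothetical names; clang-doc's actual arena types and copy APIs are not shown in this message and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>
#include <vector>

// Toy arena: hands out raw byte blocks and frees all of them on reset().
// No destructors run, so only trivially destructible data belongs here.
class ToyArena {
  std::vector<std::unique_ptr<char[]>> Blocks;

public:
  char *allocate(std::size_t N) {
    Blocks.push_back(std::make_unique<char[]>(N));
    return Blocks.back().get();
  }
  void reset() { Blocks.clear(); }
};

// Copies the characters themselves into Dst, so the returned view stays
// valid after the source arena has been reset.
std::string_view deepCopy(std::string_view S, ToyArena &Dst) {
  if (S.empty())
    return {};
  char *Mem = Dst.allocate(S.size());
  std::memcpy(Mem, S.data(), S.size());
  return std::string_view(Mem, S.size());
}
```

A shallow copy of the `std::string_view` would keep pointing into the transient arena and dangle after `reset()`, which is the hazard a deep copy avoids.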

On the performance front, we reclaim about 2% of the overhead, bringing
the cumulative overhead from the series of patches down to about 7% over
the baseline.

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 1014.5s | 991.5s | +7.7% | -2.3% |
| Memory | 86.0G | 39.9G | 40.0G | -53.4% | +0.3% |

| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |

    [28 lines not shown]
Delta  File
+140 -21  clang-tools-extra/clang-doc/Representation.cpp
+30 -0  clang-tools-extra/clang-doc/Representation.h
+170 -21  2 files

LLVM/project 707229c clang-tools-extra/clang-doc BitcodeReader.cpp Serialize.cpp, clang-tools-extra/unittests/clang-doc SerializeTest.cpp MergeTest.cpp

[clang-doc] Move Info types into arenas

Info types used to own significant chunks of data. As we move these into
local arenas, these types must be trivially destructible, to avoid
leaking resources when the arena is reset. Unfortunately, there isn't a
good way to transition all the data types one at a time, since most of
them are tied together in some way. Further, as they're now allocated in
the arenas, they often cannot be treated the same way, and even the
aliases and interfaces put in place to simplify the transition cannot
cover the full range of changes required.

We also use some SFINAE tricks to avoid adding boilerplate for helper
APIs we'd otherwise have to support.

Though it introduces some additional churn, we also try to keep tests
from using arena allocation as much as possible, since this is not
required to test the implementation of the library. As much of the test
code needed to be rewritten anyway, we take the opportunity to
transition now.

    [41 lines not shown]
Delta  File
+419 -187  clang-tools-extra/clang-doc/BitcodeReader.cpp
+246 -189  clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
+196 -129  clang-tools-extra/unittests/clang-doc/MergeTest.cpp
+176 -80  clang-tools-extra/unittests/clang-doc/ClangDocTest.cpp
+137 -75  clang-tools-extra/clang-doc/Serialize.cpp
+71 -41  clang-tools-extra/unittests/clang-doc/YAMLGeneratorTest.cpp
+1,245 -701  14 files not shown
+1,661 -952  20 files

LLVM/project 759fba7 clang-tools-extra/clang-doc BitcodeReader.cpp BitcodeReader.h

[clang-doc] Simplify parsing and reading bitcode blocks

Much of the logic in the readBlock implementation is boilerplate, and is
repeated for each implementation/specialization. This will become much
worse as we introduce new custom block reading logic as we migrate
towards arena allocation. In preparation for that, we're introducing the
change in logic now, which should make later refactoring much more
straightforward.
Delta  File
+103 -120  clang-tools-extra/clang-doc/BitcodeReader.cpp
+5 -0  clang-tools-extra/clang-doc/BitcodeReader.h
+1 -1  clang-tools-extra/clang-doc/Representation.h
+109 -121  3 files

LLVM/project 4e76a79 clang/lib/Format ContinuationIndenter.cpp, clang/unittests/Format FormatTest.cpp

[clang-format] fix aligning inheritance lists and binary operator operands with UT_AlignWithSpaces (#189218)

fix aligning inheritance lists with UT_AlignWithSpaces
fix aligning binary operator operands

---------

Co-authored-by: Eugene Shalygin <e.shalygin at abberior-instruments.com>
Delta  File
+59 -0  clang/unittests/Format/FormatTest.cpp
+22 -1  clang/lib/Format/ContinuationIndenter.cpp
+81 -1  2 files

LLVM/project a89c778 clang/docs ClangFormatStyleOptions.rst, clang/include/clang/Format Format.h

[clang-format] Add SpaceBeforeEnumUnderlyingTypeColon for enum underlying types (#189011)

Introduce a new formatting option to control spacing before enum
underlying type colons. This preserves existing behavior while
allowing independent control from inheritance colon spacing.

Previously, enum underlying type colons were not configurable.
 
Fixes #188734

---------

Co-authored-by: Tharun V K <Tharun.V.K at ibm.com>
Delta  File
+19 -3  clang/unittests/Format/TokenAnnotatorTest.cpp
+10 -0  clang/unittests/Format/FormatTest.cpp
+10 -0  clang/docs/ClangFormatStyleOptions.rst
+8 -0  clang/include/clang/Format/Format.h
+4 -0  clang/lib/Format/TokenAnnotator.cpp
+3 -0  clang/lib/Format/Format.cpp
+54 -3  3 files not shown
+58 -3  9 files

LLVM/project 2ca1710 clang-tools-extra/clang-doc Generators.h ClangDoc.cpp, clang-tools-extra/clang-doc/benchmarks ClangDocBenchmark.cpp

[clang-doc] Move non-arena allocated types off the OwnedPtr alias

Some types should not be using this alias, which was over-applied to
APIs that won't participate in arena-style allocation. This patch
restores them to their correct spelling.
Delta  File
+7 -7  clang-tools-extra/clang-doc/Generators.h
+4 -4  clang-tools-extra/clang-doc/ClangDoc.cpp
+4 -4  clang-tools-extra/clang-doc/MDMustacheGenerator.cpp
+3 -3  clang-tools-extra/clang-doc/HTMLGenerator.cpp
+2 -2  clang-tools-extra/clang-doc/benchmarks/ClangDocBenchmark.cpp
+1 -1  clang-tools-extra/clang-doc/Generators.cpp
+21 -21  1 file not shown
+22 -22  7 files

LLVM/project 44f72d5 llvm/lib/Transforms/Vectorize SLPVectorizer.cpp

Fix formatting

Created using spr 1.3.7
Delta  File
+1 -2  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+1 -2  1 file

LLVM/project 8324d01 clang-tools-extra/clang-doc Representation.cpp

[clang-doc] Consolidate merging logic

As we migrate things into the arena, this logic may get more complex.
Factoring it out now will give clear extension points to make this
easier to manage.
Delta  File
+10 -9  clang-tools-extra/clang-doc/Representation.cpp
+10 -9  1 file

LLVM/project c5f67a0 llvm/lib/Analysis DependenceAnalysis.cpp

[DA] Perform `nsw` check for SIV AddRecs earlier (NFCI) (#189740)

The repetitive code at the start of each SIV test can be factored out
into the `testSIV` function.
Delta  File
+7 -16  llvm/lib/Analysis/DependenceAnalysis.cpp
+7 -16  1 file

LLVM/project ad0dbac llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/RISCV complex-loads.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
Delta  File
+6 -22  llvm/test/Transforms/SLPVectorizer/X86/pr47629.ll
+6 -22  llvm/test/Transforms/SLPVectorizer/X86/pr47629-inseltpoison.ll
+2 -4  llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+1 -0  llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+15 -48  4 files

LLVM/project 76597d0 clang-tools-extra/clang-doc BitcodeReader.cpp Serialize.cpp, clang-tools-extra/unittests/clang-doc MDGeneratorTest.cpp BitcodeTest.cpp

[clang-doc] Make CommentInfo arena allocated

This patch moves the CommentInfo type into the arena. It updates block
handling to collect child info types and serialize the array in one
shot.

We also clean up the test code to avoid using the arenas in the tests.
This has the upside of making the test more hermetic, and avoids churn
in the related code as the allocation API interfaces evolve.

Performance and memory usage regress slightly. This is somewhat expected
as we do not yet aggressively release short term memory during merge
operations. Future patches will reclaim this overhead.

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 998.5s | 1010.5s | +9.8% | +1.2% |
| Memory | 86.0G | 43.8G | 47.8G | -44.4% | +9.2% |


    [26 lines not shown]
Delta  File
+124 -94  clang-tools-extra/unittests/clang-doc/MDGeneratorTest.cpp
+70 -111  clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
+66 -103  clang-tools-extra/unittests/clang-doc/YAMLGeneratorTest.cpp
+15 -30  clang-tools-extra/unittests/clang-doc/MergeTest.cpp
+17 -9  clang-tools-extra/clang-doc/BitcodeReader.cpp
+15 -5  clang-tools-extra/clang-doc/Serialize.cpp
+307 -352  6 files not shown
+345 -370  12 files

LLVM/project dd5ef51 clang-tools-extra/clang-doc Representation.h

[clang-doc] Enforce arena allocated types are trivially destructible

We can enforce at compile-time that the types we want to place in the
arenas are always safe to allocate there.
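
One common way to get this kind of compile-time guarantee is a `static_assert` on `std::is_trivially_destructible_v`. The sketch below is illustrative only: `IsArenaSafe`, `ToyCommentInfo`, and `ToyOwningInfo` are hypothetical names, and the patch itself may enforce the property differently.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical compile-time gate for anything placed in an arena whose
// reset never runs destructors.
template <typename T>
inline constexpr bool IsArenaSafe = std::is_trivially_destructible_v<T>;

// Hypothetical arena-friendly record: raw pointers and scalars only.
struct ToyCommentInfo {
  const char *Text;
  unsigned Line;
};

// Owns heap memory through std::string, so skipping its destructor
// would leak.
struct ToyOwningInfo {
  std::string Text;
};

static_assert(IsArenaSafe<ToyCommentInfo>,
              "arena-allocated types must be trivially destructible");
static_assert(!IsArenaSafe<ToyOwningInfo>,
              "std::string members make a type unsafe for the arena");
```

With a check like this next to each type definition, adding an owning member such as `std::string` to an arena type becomes a build error instead of a silent leak.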
Delta  File
+26 -0  clang-tools-extra/clang-doc/Representation.h
+26 -0  1 file

LLVM/project 791e68a llvm/test/CodeGen/MLRegAlloc dev-mode-logging.ll dev-mode-prio-logging.ll, llvm/test/CodeGen/MLRegAlloc/Inputs reference-log-noml.txt reference-prio-log-noml.txt

[MLGO] Update regalloc tests after c245d764b8bd70ff78044f56b2dea619b0… (#190025)

…d428dc

That commit introduced codegen changes that led to different calculations
in places. Update the tests to adapt to the changes.
Delta  File
+37 -37  llvm/test/CodeGen/MLRegAlloc/Inputs/reference-log-noml.txt
+11 -11  llvm/test/CodeGen/MLRegAlloc/Inputs/reference-prio-log-noml.txt
+4 -4  llvm/test/CodeGen/MLRegAlloc/dev-mode-logging.ll
+1 -1  llvm/test/CodeGen/MLRegAlloc/dev-mode-prio-logging.ll
+53 -53  4 files

LLVM/project 84f23eb libc/config config.json, libc/src/__support CMakeLists.txt

Revert "[libc] Finetune libc.src.__support.OSUtil.osutil dependency." (#190033)

Reverts llvm/llvm-project#189501

Buildbot failure on libc for GPU buildbots
Delta  File
+9 -17  libc/src/__support/CMakeLists.txt
+5 -19  libc/test/UnitTest/CMakeLists.txt
+2 -21  libc/test/UnitTest/TestLogger.cpp
+0 -6  libc/src/time/linux/CMakeLists.txt
+0 -6  libc/src/unistd/CMakeLists.txt
+0 -6  libc/config/config.json
+16 -75  3 files not shown
+17 -87  9 files

LLVM/project 401ba6d mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td, mlir/include/mlir/Dialect/XeGPU/Transforms XeGPULayoutImpl.h

[MLIR][XeGPU] Add Layout Propagation support for multi-reduction/reduction op with scalar result (#189133)

This PR adds layout propagation support for multi-reduction/reduction ops
with a scalar result:
1) Enhance setupMultiReductionResultLayout() and
LayoutInfoPropagation::visitVectorMultiReductionOp() to support a scalar
result.
2) Add propagation support for the vector.reduction op at the lane level,
since the op is only introduced at the lane level.
Delta  File
+67 -28  mlir/lib/Dialect/XeGPU/Transforms/XeGPULayoutImpl.cpp
+72 -6  mlir/lib/Dialect/XeGPU/Transforms/XeGPUPropagateLayout.cpp
+48 -0  mlir/test/Dialect/XeGPU/propagate-layout.mlir
+15 -5  mlir/include/mlir/Dialect/XeGPU/Transforms/XeGPULayoutImpl.h
+16 -3  mlir/test/Dialect/XeGPU/propagate-layout-subgroup.mlir
+6 -6  mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+224 -48  1 file not shown
+227 -53  7 files

LLVM/project 6a27378 clang-tools-extra/clang-doc Representation.cpp Representation.h, clang-tools-extra/unittests/clang-doc MergeTest.cpp ClangDocTest.cpp

[clang-doc] Migrate Namespaces to arena allocation

This patch allocates the NamespaceInfo types in the local arenas, and
adapts the merging logic for the new list type and its children.
Memory use and performance improve slightly. Micro-benchmarks show a
regression in merge operations due to the more complex list operations.

 ## Build Clang-Doc Documentation
| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 1009.2s | 1002.4s | +8.9% | -0.7% |
| Memory | 86.0G | 43.2G | 43.9G | -49.0% | +1.6% |

 ## Microbenchmarks (Filtered for >1% Delta)
| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| BM_BitcodeReader_Scale/10 | 67.9us | 69.7us | 69.3us | +1.9% | -0.7% |
| BM_BitcodeReader_Scale/10000 | 70.5ms | 22.3ms | 24.8ms | -64.8% | +11.4% |
| BM_BitcodeReader_Scale/4096 | 23.2ms | 4.7ms | 4.4ms | -80.9% | -5.7% |

    [22 lines not shown]
Delta  File
+26 -1  clang-tools-extra/clang-doc/Representation.cpp
+8 -8  clang-tools-extra/unittests/clang-doc/MergeTest.cpp
+8 -2  clang-tools-extra/clang-doc/Representation.h
+7 -3  clang-tools-extra/unittests/clang-doc/ClangDocTest.cpp
+6 -3  clang-tools-extra/clang-doc/JSONGenerator.cpp
+4 -4  clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
+59 -21  7 files not shown
+83 -37  13 files