LLVM/project 4818924 llvm/lib/Target/AMDGPU AMDGPU.td

[AMDGPU] Enable real true16 on gfx1250
Delta File
+1 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+1 -0 1 files

LLVM/project 5a50b22 llvm/test/MC/AMDGPU vop3-literal-gfx1250.s vop3-literal.s

[AMDGPU] Update vop3-literal.s to use fake16 on gfx1250. NFC

The 16-bit instructions there are in fake16 mode and should also be
compatible with older targets. The purpose of the test is to
check literals, so whether fake16 or real16 is used does not matter.
Delta File
+296 -0 llvm/test/MC/AMDGPU/vop3-literal-gfx1250.s
+3 -3 llvm/test/MC/AMDGPU/vop3-literal.s
+299 -3 2 files

LLVM/project da0aec2 clang/test/Driver solaris-ld-sld.c, clang/test/Driver/Inputs/fake_ld ld

[clang][test] Fix solaris ld driver test to not assume gnu ld location (#186250)
Delta File
+5 -0 clang/test/Driver/Inputs/fake_ld/ld
+2 -2 clang/test/Driver/solaris-ld-sld.c
+7 -2 2 files

LLVM/project 093c639 llvm/test/Transforms/LoopVectorize optimal-epilog-vectorization-liveout.ll

[LV] Add additional tests with IV live-outs. (NFC) (#190395)

Add additional tests with IV live-out users, for which epilogue
vectorization is not enabled yet.

Also modernize check lines.
Delta File
+357 -73 llvm/test/Transforms/LoopVectorize/optimal-epilog-vectorization-liveout.ll
+357 -73 1 files

LLVM/project e0908cd llvm/test/CodeGen/AMDGPU integer-mad-patterns.ll fcanonicalize.bf16.ll

[AMDGPU] Specialize gfx1250 codegen tests for fake and real t16. NFC. (#190390)

This is in preparation for turning on real true16, so we can easily
apply it or revert.
Delta File
+1,318 -117 llvm/test/CodeGen/AMDGPU/integer-mad-patterns.ll
+835 -387 llvm/test/CodeGen/AMDGPU/fcanonicalize.bf16.ll
+610 -305 llvm/test/CodeGen/AMDGPU/atomics-system-scope.ll
+505 -259 llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2bf16.ll
+460 -214 llvm/test/CodeGen/AMDGPU/load-constant-i1.ll
+384 -170 llvm/test/CodeGen/AMDGPU/flat-saddr-load.ll
+4,112 -1,452 27 files not shown
+6,175 -2,283 33 files

LLVM/project bc3386c llvm/test/tools/dsymutil/AArch64 pseudo-probe.test, llvm/tools/dsymutil MachOUtils.cpp

[dsymutil] Add support for pseudo probes (#186877)

This patch teaches dsymutil to transfer the `__PSEUDO_PROBE` segment and
the `__probes` and `__probe_descs` sections when creating dSYMs. Without
this, both probe sections get an invalid offset in the dSYM bundle.
Delta File
+315 -0 llvm/test/tools/dsymutil/AArch64/pseudo-probe.test
+65 -7 llvm/tools/dsymutil/MachOUtils.cpp
+380 -7 2 files

LLVM/project 38f8945 mlir/lib/Analysis/Presburger Matrix.cpp

[MLIR][Presburger] Fix stale pivot in Smith normal form (#189789)

The pivot used to fix divisibility in Smith normal form is stale. This
will not affect correctness, but can lower efficiency since the outer
loop will be executed more times.

Thanks to @benquike for discovering this.
Delta File
+1 -3 mlir/lib/Analysis/Presburger/Matrix.cpp
+1 -3 1 files

LLVM/project e6e388c llvm/include/llvm/MC TargetRegistry.h MCLFI.h, llvm/lib/MC MCLFI.cpp MCELFStreamer.cpp

[LFI][MC] Call setLFIRewriter during LFIMCStreamer initialization (#188625)

Calls `Streamer.setLFIRewriter` during generic LFIMCStreamer
initialization rather than requiring it to be done during
backend-specific initialization. This better follows the existing
conventions in `create*` functions in `TargetRegistry.h`.

Also re-adds the call to initSections for LFI in `llvm-mc.cpp`
(necessary in order to emit the ABI Note section), along with a test to
make sure ABI note emission with the rewriter is working.
Delta File
+20 -14 llvm/lib/MC/MCLFI.cpp
+15 -0 llvm/test/MC/AArch64/LFI/abi-note.s
+6 -5 llvm/include/llvm/MC/TargetRegistry.h
+4 -0 llvm/lib/MC/MCELFStreamer.cpp
+4 -0 llvm/lib/MC/MCAsmStreamer.cpp
+2 -0 llvm/include/llvm/MC/MCLFI.h
+51 -19 6 files

LLVM/project 3080198 clang/lib/Sema SemaLookup.cpp SemaDeclCXX.cpp, clang/test/SemaCXX using-if-exists.cpp

Revert "[clang] Fix conflicting declaration error with using_if_exists" (#190441)

Reverts llvm/llvm-project#167646
Delta File
+28 -53 clang/test/SemaCXX/using-if-exists.cpp
+4 -29 clang/lib/Sema/SemaLookup.cpp
+12 -4 clang/lib/Sema/SemaDeclCXX.cpp
+44 -86 3 files

LLVM/project 8e79157 clang-tools-extra/clang-doc JSONGenerator.cpp MDMustacheGenerator.cpp, clang-tools-extra/clang-doc/support Utils.h Utils.cpp

[clang-doc] Fix file header style

Since we're fixing headers, we can also improve the file documentation
to follow the LLVM coding standard.
Delta File
+16 -0 clang-tools-extra/clang-doc/JSONGenerator.cpp
+8 -0 clang-tools-extra/clang-doc/support/Utils.h
+6 -0 clang-tools-extra/clang-doc/support/Utils.cpp
+6 -0 clang-tools-extra/clang-doc/support/File.h
+6 -0 clang-tools-extra/clang-doc/support/File.cpp
+4 -1 clang-tools-extra/clang-doc/MDMustacheGenerator.cpp
+46 -1 6 files

LLVM/project 5cd98f9 llvm/lib/Target/RISCV RISCVInstrInfo.td RISCVInstrInfoP.td, llvm/test/CodeGen/RISCV rvp-ext-rv64.ll rvp-ext-rv32.ll

[RISCV] Select add(vec, splat(scalar)) to PADD_*S for P extension (#190303)
Delta File
+73 -6 llvm/test/CodeGen/RISCV/rvp-ext-rv64.ll
+49 -4 llvm/test/CodeGen/RISCV/rvp-ext-rv32.ll
+5 -0 llvm/lib/Target/RISCV/RISCVInstrInfo.td
+3 -0 llvm/lib/Target/RISCV/RISCVInstrInfoP.td
+130 -10 4 files

LLVM/project 6f68502 mlir/test/Dialect/XeGPU sg-to-wi-experimental.mlir

[MLIR][XeGPU] Port tests from the XeGPUSubgroupDistribute to XeGPUSgToWiDistributeExperimental (#189747)

This PR ports tests from subgroup-distribute.mlir (old pass) to
sg-to-wi-experimental.mlir (new pass)
Delta File
+297 -1 mlir/test/Dialect/XeGPU/sg-to-wi-experimental.mlir
+297 -1 1 files

LLVM/project 7453db1 llvm/test/CodeGen/AMDGPU memintrinsic-unroll.ll memory-legalizer-private-workgroup.ll, llvm/test/CodeGen/X86 vector-interleaved-load-i64-stride-7.ll

Address review comments

Created using spr 1.3.6-beta.1
Delta File
+6,835 -6,798 llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+6,432 -6,562 llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll
+8,836 -1,658 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+8,836 -1,658 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+8,836 -1,658 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-wavefront.ll
+8,737 -1,643 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-agent.ll
+48,512 -19,977 8,821 files not shown
+674,703 -253,868 8,827 files

LLVM/project 999ce11 llvm/test/CodeGen/AMDGPU memintrinsic-unroll.ll memory-legalizer-private-singlethread.ll, llvm/test/CodeGen/X86 vector-interleaved-load-i64-stride-7.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.6-beta.1

[skip ci]
Delta File
+6,835 -6,798 llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+6,432 -6,562 llvm/test/CodeGen/X86/vector-interleaved-load-i64-stride-7.ll
+8,836 -1,658 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+8,836 -1,658 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+8,836 -1,658 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-wavefront.ll
+8,737 -1,643 llvm/test/CodeGen/AMDGPU/memory-legalizer-private-cluster.ll
+48,512 -19,977 8,820 files not shown
+674,690 -253,864 8,826 files

LLVM/project b5936d4 mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToWiDistributeExperimental.cpp

[MLIR][XeGPU] Remove verifyLayouts from sg to wi pass (#190360)

The verifyLayouts function walked the IR before distribution and failed
the pass if any XeGPU anchor op or vector-typed result was missing a
layout attribute. This was added as a temporary guard while the pass was
being developed.
Now we add a target check for each op.
Delta File
+0 -36 mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToWiDistributeExperimental.cpp
+0 -36 1 files

LLVM/project 9c0a9bb mlir/lib/Dialect/XeGPU/Transforms XeGPUSgToWiDistributeExperimental.cpp, mlir/test/Dialect/XeGPU sg-to-wi-experimental-unit.mlir

[MLIR][XeGPU] Add support for reducing to scalar in sg to wi pass (#190193)
Delta File
+32 -0 mlir/test/Dialect/XeGPU/sg-to-wi-experimental-unit.mlir
+18 -1 mlir/lib/Dialect/XeGPU/Transforms/XeGPUSgToWiDistributeExperimental.cpp
+50 -1 2 files

LLVM/project 46411f3 lldb/test/Shell/Platform/AutoLoad/Darwin dsym-python-script.test

[lldb] Update dsym-python-script.test for #190407 (#190432)
Delta File
+1 -1 lldb/test/Shell/Platform/AutoLoad/Darwin/dsym-python-script.test
+1 -1 1 files

LLVM/project 00d7134 flang/lib/Semantics check-cuda.cpp, flang/test/Semantics cuf25.cuf

[flang][cuda] Do not flag dummy arg component as host array (#190431)
Delta File
+14 -0 flang/test/Semantics/cuf25.cuf
+4 -1 flang/lib/Semantics/check-cuda.cpp
+18 -1 2 files

LLVM/project f507946 clang-tools-extra/clang-doc Serialize.cpp Representation.cpp, clang-tools-extra/clang-doc/benchmarks ClangDocBenchmark.cpp

[clang-doc] Use distinct APIs for fixed arena allocation sites

Typically, code either always emits data into the TransientArena or the
PersistentArena. Use more explicit APIs to convey the intent directly
instead of relying on parameters or defaults.
Delta File
+18 -17 clang-tools-extra/clang-doc/Serialize.cpp
+7 -8 clang-tools-extra/clang-doc/Representation.cpp
+7 -7 clang-tools-extra/clang-doc/BitcodeReader.cpp
+8 -3 clang-tools-extra/clang-doc/Representation.h
+3 -3 clang-tools-extra/clang-doc/benchmarks/ClangDocBenchmark.cpp
+43 -38 5 files
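
To illustrate what "distinct APIs per arena" can look like, here is a minimal sketch. It does not reproduce clang-doc's actual interface: `Arenas`, `allocateTransient`, and `allocatePersistent` are hypothetical names, and `std::pmr::monotonic_buffer_resource` stands in for whatever arena type the tool really uses.

```cpp
#include <cassert>
#include <memory_resource>
#include <new>
#include <utility>

// Hypothetical holder for the two arenas named in the commit message.
struct Arenas {
  std::pmr::monotonic_buffer_resource Transient;  // scratch, reset often
  std::pmr::monotonic_buffer_resource Persistent; // lives until exit

  // The call site states its intent through the function name instead of
  // passing an arena selector or relying on a default argument.
  template <typename T, typename... Args>
  T *allocateTransient(Args &&...A) {
    return create<T>(Transient, std::forward<Args>(A)...);
  }

  template <typename T, typename... Args>
  T *allocatePersistent(Args &&...A) {
    return create<T>(Persistent, std::forward<Args>(A)...);
  }

private:
  // Shared implementation: raw storage from the arena plus placement new.
  template <typename T, typename... Args>
  static T *create(std::pmr::memory_resource &R, Args &&...A) {
    void *Mem = R.allocate(sizeof(T), alignof(T));
    assert(Mem && "arena allocation returned null");
    return ::new (Mem) T{std::forward<Args>(A)...};
  }
};
```

A caller would write `A.allocatePersistent<int>(9)` and the choice of arena is visible at the call site; releasing `A.Transient` leaves persistent objects untouched.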

LLVM/project 76a8047 clang-tools-extra/clang-doc Representation.h

[clang-doc][nfc] Reformat and revise comment block
Delta File
+9 -12 clang-tools-extra/clang-doc/Representation.h
+9 -12 1 files

LLVM/project 44f5353 clang-tools-extra/clang-doc Representation.h Representation.cpp, clang-tools-extra/clang-doc/benchmarks ClangDocBenchmark.cpp

[clang-doc] Update type aliases

Many of the type aliases we introduced to simplify migration to arena
allocation are no longer relevant after completing the migration. We
can use more relevant names and remove dead aliases.
Delta File
+13 -25 clang-tools-extra/clang-doc/Representation.h
+9 -9 clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
+4 -4 clang-tools-extra/unittests/clang-doc/ClangDocTest.cpp
+4 -4 clang-tools-extra/unittests/clang-doc/MergeTest.cpp
+4 -4 clang-tools-extra/clang-doc/Representation.cpp
+4 -4 clang-tools-extra/clang-doc/benchmarks/ClangDocBenchmark.cpp
+38 -50 5 files not shown
+47 -59 11 files

LLVM/project 5bc2129 clang-tools-extra/clang-doc Serialize.cpp Serialize.h

[clang-doc] Removed OwnedPtr alias

The alias served a purpose during migration, but now conveys the wrong
semantics, as the memory of these pointers is generally interned inside
a local arena.
Delta File
+40 -38 clang-tools-extra/clang-doc/Serialize.cpp
+29 -42 clang-tools-extra/clang-doc/Serialize.h
+17 -18 clang-tools-extra/clang-doc/Representation.cpp
+3 -14 clang-tools-extra/clang-doc/Representation.h
+8 -8 clang-tools-extra/clang-doc/JSONGenerator.cpp
+8 -6 clang-tools-extra/clang-doc/HTMLGenerator.cpp
+105 -126 10 files not shown
+145 -166 16 files

LLVM/project e303eef clang-tools-extra/clang-doc Representation.cpp Representation.h

[clang-doc] Support deep copy between arenas for merging

Upcoming changes to the merge step will necessitate that we clear the
transient arenas and merge new items into the persistent arena. However,
there are some challenges with that, as the existing types typically
don't want to be copied. We introduce some new APIs to simplify that
task and ensure we don't accidentally leak memory.

On the performance front, we reclaim about 2% of the overhead, bringing
the cumulative overhead from the series of patches down to about 7% over
the baseline.

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 1014.5s | 991.5s | +7.7% | -2.3% |
| Memory | 86.0G | 39.9G | 40.0G | -53.4% | +0.3% |

| Benchmark | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |

    [28 lines not shown]
Delta File
+140 -21 clang-tools-extra/clang-doc/Representation.cpp
+30 -0 clang-tools-extra/clang-doc/Representation.h
+170 -21 2 files
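
The deep-copy requirement described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: `Node`, `copyString`, `deepCopyInto`, and `deepCopy` are hypothetical names, and `std::pmr::memory_resource` stands in for the arena type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory_resource>
#include <new>
#include <string_view>

// Hypothetical arena-allocated record: a name plus an array of children.
struct Node {
  std::string_view Name;
  const Node *Children = nullptr;
  std::size_t NumChildren = 0;
};

// Copy the bytes of S into Dst and return a view of the copy.
inline std::string_view copyString(std::string_view S,
                                   std::pmr::memory_resource &Dst) {
  if (S.empty())
    return {};
  char *Buf = static_cast<char *>(Dst.allocate(S.size(), alignof(char)));
  std::memcpy(Buf, S.data(), S.size());
  return {Buf, S.size()};
}

// Fill Out with a copy of Src that refers only to memory owned by Dst.
inline void deepCopyInto(Node &Out, const Node &Src,
                         std::pmr::memory_resource &Dst) {
  Out.Name = copyString(Src.Name, Dst);
  Out.NumChildren = Src.NumChildren;
  Out.Children = nullptr;
  if (Src.NumChildren == 0)
    return;
  assert(Src.Children && "non-empty child list must have storage");
  void *Mem = Dst.allocate(sizeof(Node) * Src.NumChildren, alignof(Node));
  Node *Kids = static_cast<Node *>(Mem);
  for (std::size_t I = 0; I < Src.NumChildren; ++I) {
    ::new (Kids + I) Node();
    deepCopyInto(Kids[I], Src.Children[I], Dst); // recurse into each child
  }
  Out.Children = Kids;
}

// Allocate the root in Dst and copy the whole tree under it.
inline Node *deepCopy(const Node &Src, std::pmr::memory_resource &Dst) {
  Node *Out = ::new (Dst.allocate(sizeof(Node), alignof(Node))) Node();
  deepCopyInto(*Out, Src, Dst);
  return Out;
}
```

After `deepCopy(Root, Persistent)` returns, the source arena can be released without invalidating the copy, because every view and pointer in the result targets the destination arena.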

LLVM/project 73ba5f6 clang-tools-extra/clang-doc BitcodeReader.cpp Serialize.cpp, clang-tools-extra/unittests/clang-doc SerializeTest.cpp MergeTest.cpp

[clang-doc] Move Info types into arenas

Info types used to own significant chunks of data. As we move these into
local arenas, these types must be trivially destructible, to avoid
leaking resources when the arena is reset. Unfortunately, there isn't a
good way to transition all the data types one at a time, since most of
them are tied together in some way. Further, as they're now allocated in
the arenas, they often cannot be treated the same way, and even the
aliases and interfaces put in place to simplify the transition cannot
cover the full range of changes required.

We also use some SFINAE tricks to avoid adding boilerplate for helper
APIs we'd otherwise have to support.

Though it introduces some additional churn, we also try to keep tests
from using arena allocation as much as possible, since this is not
required to test the implementation of the library. As much of the test
code needed to be rewritten anyway, we take the opportunity to
transition now.

    [41 lines not shown]
Delta File
+419 -187 clang-tools-extra/clang-doc/BitcodeReader.cpp
+246 -189 clang-tools-extra/unittests/clang-doc/SerializeTest.cpp
+196 -129 clang-tools-extra/unittests/clang-doc/MergeTest.cpp
+176 -80 clang-tools-extra/unittests/clang-doc/ClangDocTest.cpp
+137 -75 clang-tools-extra/clang-doc/Serialize.cpp
+71 -41 clang-tools-extra/unittests/clang-doc/YAMLGeneratorTest.cpp
+1,245 -701 14 files not shown
+1,649 -943 20 files
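
The trivially-destructible constraint mentioned above can be made concrete with a small sketch. The names `OwningInfo`, `ArenaInfo`, and `createInArena` are hypothetical and chosen only for illustration. The check here is a plain `static_assert`, not the SFINAE helpers the commit refers to.

```cpp
#include <cassert>
#include <memory_resource>
#include <new>
#include <string>
#include <string_view>
#include <type_traits>
#include <utility>

// Hypothetical record that owns heap memory through std::string. Its
// destructor must run, so it is unsafe to place in a reset-only arena.
struct OwningInfo {
  std::string Name;
};

// Hypothetical arena-friendly record: it only views bytes stored elsewhere,
// so dropping it without running a destructor leaks nothing.
struct ArenaInfo {
  std::string_view Name;
};

// Construct a T inside Arena. Resetting the arena frees memory but never
// runs destructors, so non-trivial types are rejected at compile time.
template <typename T, typename... Args>
T *createInArena(std::pmr::memory_resource &Arena, Args &&...A) {
  static_assert(std::is_trivially_destructible_v<T>,
                "arena-allocated types must be trivially destructible");
  void *Mem = Arena.allocate(sizeof(T), alignof(T));
  assert(Mem && "arena allocation returned null");
  return ::new (Mem) T{std::forward<Args>(A)...};
}
```

With this helper, `createInArena<ArenaInfo>(Arena, Name)` compiles, while `createInArena<OwningInfo>(...)` fails the `static_assert`, which is the kind of mistake the migration has to rule out.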

LLVM/project 47b99e5 clang-tools-extra/clang-doc Representation.cpp Representation.h, clang-tools-extra/clang-doc/tool ClangDocMain.cpp

[clang-doc] Merge data into persistent memory

We have a need for persistent memory for the final info. Since each
group processes a single USR at a time, every USR is only ever processed by
a single thread from the thread pool. This means that we can keep per
thread persistent storage for all the info. There is significant
duplicated data between all the serialized records, so we can just merge
the final/unique items into the persistent arena, and clear out the
scratch/transient arena as we process each record in the bitcode.

The patch adds some APIs to help with managing the data, merging, and
allocation of data in the correct arena. It also safely merges and deep
copies data from the transient arenas into persistent storage that is
never reset until the program completes.

This patch reduces memory by another % over the previous patches,
bringing the total savings over the baseline to 57%. Runtime performance
and benchmarks stay mostly flat with modest improvements.


    [31 lines not shown]
Delta File
+134 -10 clang-tools-extra/clang-doc/Representation.cpp
+25 -25 clang-tools-extra/clang-doc/tool/ClangDocMain.cpp
+7 -0 clang-tools-extra/clang-doc/Representation.h
+166 -35 3 files
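
The "merge only the unique items, then clear the scratch arena" idea can be sketched with string interning. This is a simplified model, not the patch's implementation: `PersistentStrings` and `intern` are hypothetical names, and the sketch assumes one such store per worker thread so that no locking is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory_resource>
#include <string_view>
#include <unordered_set>

// Hypothetical persistent store. It is never reset, so the views it hands
// out stay valid for the lifetime of the store.
class PersistentStrings {
  std::pmr::monotonic_buffer_resource Arena;
  std::unordered_set<std::string_view> Unique; // views point into Arena

public:
  // Return a stable view equal to S. Bytes are copied only the first time
  // a given value is seen; duplicates reuse the existing copy.
  std::string_view intern(std::string_view S) {
    auto It = Unique.find(S);
    if (It != Unique.end())
      return *It;
    char *Buf = static_cast<char *>(Arena.allocate(S.size() + 1, alignof(char)));
    assert(Buf && "arena allocation returned null");
    if (!S.empty())
      std::memcpy(Buf, S.data(), S.size());
    Buf[S.size()] = '\0';
    return *Unique.insert(std::string_view(Buf, S.size())).first;
  }

  std::size_t size() const { return Unique.size(); }
};
```

A record read into a transient arena can be interned and the transient arena released right away; a second record carrying the same value then costs no additional persistent memory.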

LLVM/project db0b28e clang-tools-extra/clang-doc BitcodeReader.cpp Serialize.cpp, clang-tools-extra/unittests/clang-doc MDGeneratorTest.cpp BitcodeTest.cpp

[clang-doc] Make CommentInfo arena allocated

This patch moves the CommentInfo type into the arena. It updates block
handling to collect child info types and serialize the array in one
shot.

We also clean up the test code to avoid using the arenas in the tests.
This has the upside of making the test more hermetic, and avoids churn
in the related code as the allocation API interfaces evolve.

Performance and memory usage regress slightly. This is somewhat expected
as we do not yet aggressively release short term memory during merge
operations. Future patches will reclaim this overhead.

| Metric | Baseline | Prev | This | Culm% | Seq% |
| :--- | :--- | :--- | :--- | :--- | :--- |
| Time | 920.5s | 998.5s | 1010.5s | +9.8% | +1.2% |
| Memory | 86.0G | 43.8G | 47.8G | -44.4% | +9.2% |


    [26 lines not shown]
Delta File
+124 -94 clang-tools-extra/unittests/clang-doc/MDGeneratorTest.cpp
+70 -111 clang-tools-extra/unittests/clang-doc/BitcodeTest.cpp
+66 -103 clang-tools-extra/unittests/clang-doc/YAMLGeneratorTest.cpp
+15 -30 clang-tools-extra/unittests/clang-doc/MergeTest.cpp
+17 -9 clang-tools-extra/clang-doc/BitcodeReader.cpp
+15 -5 clang-tools-extra/clang-doc/Serialize.cpp
+307 -352 6 files not shown
+345 -370 12 files
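
For readers checking the Culm% and Seq% columns in this commit's table, the sketch below assumes Culm% is the change of "This" relative to "Baseline" and Seq% is the change of "This" relative to "Prev". That reading is an inference from the numbers, not something the commit message states, and `percentChange` is a hypothetical helper.

```cpp
#include <cassert>
#include <cmath>

// Relative change from Old to New, in percent. Positive means New is larger.
inline double percentChange(double Old, double New) {
  assert(Old != 0.0 && "reference value must be non-zero");
  return (New - Old) / Old * 100.0;
}

// Round to one decimal place, matching the precision used in the table.
inline double roundToTenth(double V) { return std::round(V * 10.0) / 10.0; }
```

Using the Time row, `percentChange(920.5, 1010.5)` is about 9.78 and `percentChange(998.5, 1010.5)` is about 1.20, which match the +9.8% and +1.2% entries. For Memory, `percentChange(86.0, 47.8)` is about -44.42, matching -44.4%. Recomputing Memory Seq% from the rounded values 43.8 and 47.8 gives about +9.1 where the table shows +9.2%; the table was presumably computed from unrounded measurements.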

LLVM/project 1e13648 clang-tools-extra/clang-doc Generators.h ClangDoc.cpp, clang-tools-extra/clang-doc/benchmarks ClangDocBenchmark.cpp

[clang-doc] Move non-arena allocated types off the OwnedPtr alias

Some types should not be using this alias, which was over-applied to
APIs that won't participate in arena-style allocation. This patch
restores them to their correct spelling.
Delta File
+7 -7 clang-tools-extra/clang-doc/Generators.h
+4 -4 clang-tools-extra/clang-doc/ClangDoc.cpp
+4 -4 clang-tools-extra/clang-doc/MDMustacheGenerator.cpp
+3 -3 clang-tools-extra/clang-doc/HTMLGenerator.cpp
+2 -2 clang-tools-extra/clang-doc/benchmarks/ClangDocBenchmark.cpp
+1 -1 clang-tools-extra/clang-doc/ClangDoc.h
+21 -21 1 files not shown
+22 -22 7 files

LLVM/project c22eb34 clang-tools-extra/clang-doc BitcodeReader.cpp BitcodeReader.h

[clang-doc] Simplify parsing and reading bitcode blocks

Much of the logic in the readBlock implementation is boilerplate, and is
repeated for each implementation/specialization. This will become much
worse as we introduce new custom block reading logic as we migrate
towards arena allocation. In preparation for that, we're introducing the
change in logic now, which should make later refactoring much more
straightforward.
Delta File
+103 -120 clang-tools-extra/clang-doc/BitcodeReader.cpp
+5 -0 clang-tools-extra/clang-doc/BitcodeReader.h
+1 -1 clang-tools-extra/clang-doc/Representation.h
+109 -121 3 files

LLVM/project 2463196 clang-tools-extra/clang-doc Representation.cpp

[clang-doc] Consolidate merging logic

As we migrate things into the arena, this logic may get more complex.
Factoring it out now will give clear extension points to make this
easier to manage.
Delta File
+10 -9 clang-tools-extra/clang-doc/Representation.cpp
+10 -9 1 files

LLVM/project d6534f4 libc/src/signal/linux signal_utils.h

[libc][signal] remove wrongly added constexpr (#190424)

Hotfix for CI failure
Delta File
+1 -1 libc/src/signal/linux/signal_utils.h
+1 -1 1 files