LLVM/project 216b61dllvm/include/llvm/Object SFrameParser.h, llvm/lib/BinaryFormat SFrame.cpp

Revert "[Object] Parsing and dumping of SFrame Frame Row Entries (#151301)"

This reverts commit a82ca1b5603a4ed9598b784f703d908f32e970b8.
DeltaFile
+0-284llvm/test/tools/llvm-readobj/ELF/sframe-fre.test
+4-128llvm/lib/Object/SFrameParser.cpp
+0-51llvm/include/llvm/Object/SFrameParser.h
+1-28llvm/tools/llvm-readobj/ELFDumper.cpp
+2-8llvm/test/tools/llvm-readobj/ELF/sframe-fde.test
+0-8llvm/lib/BinaryFormat/SFrame.cpp
+7-5071 files not shown
+7-5087 files

LLVM/project 7074471llvm/test/Transforms/LoopVectorize/RISCV interleaved-accesses.ll reductions.ll

[RISCV] Enable tail folding by default (#151681)

We have been tracking the performance of EVL tail folding in the loop
vectorizer on RISC-V for a while now, and after much hard work from
various contributors we think it should be generally profitable to
enable by default now.

With tail folding there is a 21% improvement on 525.x264_r on SPEC CPU
2017 on the BPI-F3 (-march=rva22u64_v -O3 -flto), as well as a 30%
geomean codesize reduction on SPEC and TSVC, with no significant
regressions detected.

Now that we are early into the LLVM 22.x development cycle it seems like
a good time to enable it to catch any issues. There are still more EVL
related items of work being tracked in #123069, which should continue to
improve performance.
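
As a rough illustration of what EVL (explicit vector length) tail folding changes about loop shape, the Python sketch below simulates an element-wise add in two ways: a fixed-width vector body followed by a scalar remainder loop, and a single loop in which each iteration processes `min(vlmax, remaining)` elements. The function names and the `vlmax` parameter are hypothetical and do not model any LLVM API.

```python
def add_with_scalar_epilogue(a, b, vlmax):
    """Fixed-width vector body followed by a scalar remainder loop."""
    n = len(a)
    out = [0] * n
    i = 0
    vector_iters = 0
    scalar_iters = 0
    # Vector body: only runs while a full vlmax-wide chunk remains.
    while i + vlmax <= n:
        out[i:i + vlmax] = [x + y for x, y in zip(a[i:i + vlmax], b[i:i + vlmax])]
        i += vlmax
        vector_iters += 1
    # Scalar epilogue: handles the leftover elements one at a time.
    while i < n:
        out[i] = a[i] + b[i]
        i += 1
        scalar_iters += 1
    return out, vector_iters, scalar_iters


def add_with_evl_tail_folding(a, b, vlmax):
    """Single loop; each iteration handles min(vlmax, remaining) elements."""
    n = len(a)
    out = [0] * n
    i = 0
    vector_iters = 0
    while i < n:
        evl = min(vlmax, n - i)  # explicit vector length for this iteration
        out[i:i + evl] = [x + y for x, y in zip(a[i:i + evl], b[i:i + evl])]
        i += evl
        vector_iters += 1
    return out, vector_iters
```

With 10 elements and `vlmax = 4`, the first form runs two vector iterations plus two scalar iterations, while the second runs three vector iterations and needs no separate remainder loop.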
DeltaFile
+728-566llvm/test/Transforms/LoopVectorize/RISCV/interleaved-accesses.ll
+414-366llvm/test/Transforms/LoopVectorize/RISCV/reductions.ll
+294-256llvm/test/Transforms/LoopVectorize/RISCV/blocks-with-dead-instructions.ll
+224-226llvm/test/Transforms/LoopVectorize/RISCV/divrem.ll
+222-222llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse.ll
+182-177llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll
+2,064-1,81328 files not shown
+3,406-3,03334 files

LLVM/project a82ca1bllvm/include/llvm/Object SFrameParser.h, llvm/lib/BinaryFormat SFrame.cpp

[Object] Parsing and dumping of SFrame Frame Row Entries (#151301)

The trickiest part here is that the FREs have a variable size, in two
(or three?) dimensions:
- the size of the StartAddress field. This is determined by the FDE they
are in, so it is uniform across all FREs in one FDE.
- the number and sizes of offsets following the FRE. This can be
different for each FRE.

While vending this information through a template API would be possible,
I believe such an approach would be very unwieldy, and it would still
require a sequential scan through the FRE list. This is why I'm
implementing this by reading the data into a common data structure using
the fallible iterator pattern.
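
The idea of scanning variable-size rows sequentially into one common structure can be sketched in Python as follows. This is not the LLVM implementation: the `FrameRow` type, the `iter_rows` function, and in particular the bit layout of the info byte (offset count in bits 1-4, offset size code in bits 5-6) are assumptions made for this sketch, and the specification remains the authoritative reference. A generator that raises `ValueError` on malformed input stands in for the fallible iterator.

```python
import struct
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class FrameRow:
    """Common structure that every row is decoded into (hypothetical)."""
    start_address: int
    info: int
    offsets: List[int]


_UNSIGNED = {1: "B", 2: "H", 4: "I"}
_SIGNED = {1: "b", 2: "h", 4: "i"}


def iter_rows(data: bytes, addr_size: int, count: int) -> Iterator[FrameRow]:
    """Yield `count` rows from `data`; raise ValueError on malformed input."""
    if addr_size not in _UNSIGNED:
        raise ValueError(f"unsupported start address size {addr_size}")
    pos = 0

    def take(fmt_char: str, size: int) -> int:
        nonlocal pos
        if pos + size > len(data):
            raise ValueError(f"truncated row data at offset {pos}")
        (value,) = struct.unpack_from("<" + fmt_char, data, pos)
        pos += size
        return value

    for _ in range(count):
        # The start address size is uniform within one function's rows.
        start = take(_UNSIGNED[addr_size], addr_size)
        info = take("B", 1)
        # Assumed layout: the count and size of offsets vary per row.
        n_offsets = (info >> 1) & 0xF
        size_code = (info >> 5) & 0x3
        if size_code > 2:
            raise ValueError(f"invalid offset size code {size_code}")
        off_size = 1 << size_code
        offsets = [take(_SIGNED[off_size], off_size) for _ in range(n_offsets)]
        yield FrameRow(start, info, offsets)
```

Because the size of each row is only known after its info byte has been read, the position of row N cannot be computed without decoding rows 0 through N-1, which is why a sequential scan is needed regardless of the API shape.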

For more information about the SFrame unwind format, see the
[specification](https://sourceware.org/binutils/wiki/sframe) and the
related
[RFC](https://discourse.llvm.org/t/rfc-adding-sframe-support-to-llvm/86900).
DeltaFile
+284-0llvm/test/tools/llvm-readobj/ELF/sframe-fre.test
+128-4llvm/lib/Object/SFrameParser.cpp
+51-0llvm/include/llvm/Object/SFrameParser.h
+28-1llvm/tools/llvm-readobj/ELFDumper.cpp
+8-2llvm/test/tools/llvm-readobj/ELF/sframe-fde.test
+8-0llvm/lib/BinaryFormat/SFrame.cpp
+507-71 files not shown
+508-77 files

LLVM/project 184b0aeclang/test/OpenMP target_dyn_groupprivate_messages.cpp target_teams_dyn_groupprivate_messages.cpp, offload/test/offloading dyn_groupprivate_strict.cpp

[OpenMP][Offload] Add tests for dyn_groupprivate
DeltaFile
+140-0offload/test/offloading/dyn_groupprivate_strict.cpp
+87-0clang/test/OpenMP/target_dyn_groupprivate_messages.cpp
+87-0clang/test/OpenMP/target_teams_dyn_groupprivate_messages.cpp
+314-03 files

LLVM/project 1a54d76clang/include/clang/AST OpenMPClause.h, clang/include/clang/Sema SemaOpenMP.h

[OpenMP][Offload] Add support for cgroup modifier in dyn_groupprivate
DeltaFile
+129-37clang/include/clang/AST/OpenMPClause.h
+44-23clang/lib/Sema/SemaOpenMP.cpp
+27-25clang/lib/Parse/ParseOpenMP.cpp
+10-8clang/lib/Sema/TreeTransform.h
+10-6clang/lib/AST/OpenMPClause.cpp
+4-5clang/include/clang/Sema/SemaOpenMP.h
+224-1048 files not shown
+244-12014 files

LLVM/project eb44bbaclang/include/clang/AST OpenMPClause.h, clang/lib/Parse ParseOpenMP.cpp

[OpenMP][Offload] Add support for dyn_groupprivate clause
DeltaFile
+59-20offload/plugins-nextgen/common/src/PluginInterface.cpp
+63-0clang/include/clang/AST/OpenMPClause.h
+52-4offload/DeviceRTL/src/State.cpp
+51-0clang/lib/Sema/SemaOpenMP.cpp
+35-2clang/lib/Parse/ParseOpenMP.cpp
+35-0llvm/include/llvm/Frontend/OpenMP/OMP.td
+295-2635 files not shown
+599-8241 files

LLVM/project eccc6e2clang/include/clang/Interpreter RemoteJITUtils.h, clang/lib/Interpreter RemoteJITUtils.cpp

[clang-repl] Enable extending `launchExecutor` (#152562)

This patch introduces the ability to customize the fork process with an external lambda function. This is useful for downstream clients that want to redirect streams.
DeltaFile
+148-0clang/unittests/Interpreter/OutOfProcessInterpreterTests.cpp
+30-1clang/lib/Interpreter/RemoteJITUtils.cpp
+21-1clang/unittests/Interpreter/CMakeLists.txt
+11-1clang/include/clang/Interpreter/RemoteJITUtils.h
+11-0clang/unittests/Interpreter/InterpreterTest.cpp
+221-35 files

LLVM/project 6a425f1llvm/lib/Target/ARM ARMISelLowering.cpp ARMISelLowering.h, llvm/test/CodeGen/ARM scmp.ll ucmp.ll

[ARM] Have custom lowering for ucmp and scmp (#149315)

Limited to non-thumb1 for scmp at the moment, since there is no good way
to do it.
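
For readers unfamiliar with the operations, `scmp` and `ucmp` are three-way comparisons that yield -1, 0, or 1, differing only in whether the operands are read as signed or unsigned. The Python sketch below models those semantics on fixed-width bit patterns. The helper names and the `bits` parameter are illustrative and say nothing about how the ARM lowering is implemented.

```python
def scmp(a: int, b: int, bits: int = 32) -> int:
    """Three-way compare of two `bits`-wide patterns read as signed integers."""
    def to_signed(x: int) -> int:
        x &= (1 << bits) - 1
        return x - (1 << bits) if x >> (bits - 1) else x

    sa, sb = to_signed(a), to_signed(b)
    return (sa > sb) - (sa < sb)


def ucmp(a: int, b: int, bits: int = 32) -> int:
    """Three-way compare of two `bits`-wide patterns read as unsigned integers."""
    mask = (1 << bits) - 1
    ua, ub = a & mask, b & mask
    return (ua > ub) - (ua < ub)
```

The same bit pattern can compare differently under the two: `0xFFFFFFFF` is -1 when read as signed, so it is less than 1 for `scmp`, but it is the largest 32-bit value for `ucmp`.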
DeltaFile
+379-110llvm/test/CodeGen/Thumb/scmp.ll
+335-110llvm/test/CodeGen/Thumb/ucmp.ll
+140-0llvm/lib/Target/ARM/ARMISelLowering.cpp
+24-24llvm/test/CodeGen/ARM/scmp.ll
+12-24llvm/test/CodeGen/ARM/ucmp.ll
+3-0llvm/lib/Target/ARM/ARMISelLowering.h
+893-2686 files

LLVM/project 8776323clang/docs PointerAuthentication.rst

[NFC][Clang][Docs] Update Pointer Authentication documentation

This updates the pointer authentication documentation to include
a complete description of the existing functionality and behaviour,
details of the more complex aspects of the semantics and security
properties, and the Apple arm64e ABI design.

Co-authored-By: Ahmed Bougacha <ahmed at bougacha.org>
Co-authored-By: Akira Hatanaka <ahatanak at gmail.com>
Co-authored-By: John Mccall <rjmccall at apple.com>
DeltaFile
+1,117-30clang/docs/PointerAuthentication.rst
+1,117-301 files

LLVM/project 097b92dclang/docs PointerAuthentication.rst

[NFC][Clang][Docs] Update Pointer Authentication documentation

This updates the pointer authentication documentation to include
a complete description of the existing functionality and behaviour,
details of the more complex aspects of the semantics and security
properties, and the Apple arm64e ABI design.

Co-authored-By: Ahmed Bougacha <ahmed at bougacha.org>
Co-authored-By: Akira Hatanaka <ahatanak at gmail.com>
Co-authored-By: John Mccall <rjmccall at apple.com>
DeltaFile
+1,107-20clang/docs/PointerAuthentication.rst
+1,107-201 files

LLVM/project 0bdd312llvm/test/CodeGen/AMDGPU llvm.amdgcn.kill.ll wqm.mir

[AMDGPU] Generate some WQM/WWM tests (NFC) (#152635)

Update llvm.amdgcn.kill.ll and wqm.mir to be generated.
This is preparatory work for refactoring the WQM/WWM pass.
DeltaFile
+1,440-84llvm/test/CodeGen/AMDGPU/llvm.amdgcn.kill.ll
+223-54llvm/test/CodeGen/AMDGPU/wqm.mir
+1,663-1382 files

LLVM/project 2d4bac8clang-tools-extra/clang-tidy/bugprone NarrowingConversionsCheck.cpp NarrowingConversionsCheck.h, clang-tools-extra/docs ReleaseNotes.rst

Reland "[clang-tidy] fix bugprone-narrowing-conversions false positive for conditional expression" (#151874)

This is another attempt to merge previously
[reverted](https://github.com/llvm/llvm-project/pull/139474#issuecomment-3148339124)
PR #139474. The added tests
`narrowing-conversions-conditional-expressions.c[pp]` failed on
[different (non x86_64)
platforms](https://github.com/llvm/llvm-project/pull/139474#issuecomment-3148334280)
because the expected warning is implementation-defined. That's why the
test must explicitly specify target (the line `// RUN: -- -target
x86_64-unknown-linux`).
DeltaFile
+22-0clang-tools-extra/test/clang-tidy/checkers/bugprone/narrowing-conversions-conditional-expressions.c
+22-0clang-tools-extra/test/clang-tidy/checkers/bugprone/narrowing-conversions-conditional-expressions.cpp
+11-4clang-tools-extra/clang-tidy/bugprone/NarrowingConversionsCheck.cpp
+4-0clang-tools-extra/docs/ReleaseNotes.rst
+2-0clang-tools-extra/clang-tidy/bugprone/NarrowingConversionsCheck.h
+61-45 files

LLVM/project 2422972libc/test/UnitTest FPMatcher.h FEnvSafeTest.cpp, utils/bazel/llvm-project-overlay/libc/test/UnitTest BUILD.bazel

[libc] Migrate FEnvSafeTest and FPTest to ErrnoCheckingTest. (#152633)

This would ensure that errno value is cleared out before test execution
and tests pass even when LIBC_ERRNO_MODE_SYSTEM_INLINE is specified (and
errno may be clobbered before test execution).

A lot of the tests would fail, however, since errno would end up getting
set to EDOM or ERANGE during test execution and never validated before
the end of the test. This should be fixed - and errno should be
explicitly checked or ignored in all of those cases, but for now add a
TODO to address it later (see open issue #135320) and clear out errno in
test fixture to avoid test failures.
DeltaFile
+10-1libc/test/UnitTest/FPMatcher.h
+4-2utils/bazel/llvm-project-overlay/libc/test/UnitTest/BUILD.bazel
+6-0libc/test/UnitTest/FEnvSafeTest.cpp
+2-1libc/test/UnitTest/FEnvSafeTest.h
+1-0libc/test/UnitTest/CMakeLists.txt
+23-45 files

LLVM/project 856a8b5mlir/include/mlir/Dialect/Linalg/TransformOps LinalgTransformOps.td, mlir/lib/Dialect/Linalg/TransformOps LinalgTransformOps.cpp

[mlir][linalg] Add mixed precision folding pattern in vectorize_children_and_apply_patterns TD Op (#148684)

In case of mixed precision inputs, the inputs are generally cast to
match the output type, which introduces arith.extFOp/extIOp instructions.

Folding such a pattern into vector.contract is desirable for HW having
mixed precision ISA support.

This patch adds optional folding of the mixed precision pattern into
vector.contract, which can be enabled using the attribute
`fold_type_extensions_into_contract`.
DeltaFile
+393-272mlir/test/Dialect/Linalg/vectorization/linalg-ops-with-patterns.mlir
+11-1mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
+5-0mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
+409-2733 files

LLVM/project 0720af8llvm/test/Transforms/LoopVectorize/RISCV inloop-reduction.ll bf16.ll

[LV][RISCV] Precommit RUN line changes from #151681. NFC

In preparation for enabling EVL tail folding by default.
DeltaFile
+2-2llvm/test/Transforms/LoopVectorize/RISCV/inloop-reduction.ll
+1-0llvm/test/Transforms/LoopVectorize/RISCV/bf16.ll
+1-0llvm/test/Transforms/LoopVectorize/RISCV/f16.ll
+4-23 files

LLVM/project 92ac1aclldb/source/Core DynamicLoader.cpp

[lldb] Fix incorrect print of UUID and load address (#152560)

The current display is missing a space, for example:
```
no target │ Locating binary: 24906A83-0182-361B-8B4A-90A249B04FD7at 0x0000000c0d108000
```

Co-authored-by: Dominic Chen <daming_chen at apple.com>
DeltaFile
+1-1lldb/source/Core/DynamicLoader.cpp
+1-11 files

LLVM/project d7d0d7aflang/lib/Lower Bridge.cpp, flang/test/Lower do_loop_unstructured.f90

[flang] Skip processing reductions for unstructured `do concurrent` loops (#150188)

Fixes #149563

When emitting unstructured `do concurrent` loops, reduction processing
should be skipped since we are not emitting `fir.do_concurrent` loop in
the first place.
DeltaFile
+19-0flang/test/Lower/do_loop_unstructured.f90
+3-0flang/lib/Lower/Bridge.cpp
+22-02 files

LLVM/project 15a705dlibc/src/math roundbf16.h roundevenbf16.h, libc/src/math/generic CMakeLists.txt

[libc][math][c++23] Add {ceil,floor,round,roundeven,trunc}bf16 math functions (#152352)

This PR implements the following basic math functions for BFloat16 type
along with the tests:
- ceilbf16
- floorbf16
- roundbf16
- roundevenbf16
- truncbf16
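
To make the expected results of these five operations concrete, here is a small Python reference model. It assumes that a bfloat16 value is the upper 16 bits of an IEEE binary32 value and makes no claim about how the libc functions are implemented. All helper names are hypothetical. `to_bf16_bits` truncates, so it is only exact for inputs already representable in bfloat16.

```python
import math
import struct


def to_bf16_bits(x: float) -> int:
    """Truncate a float to bfloat16 bits (upper 16 bits of binary32)."""
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def from_bf16_bits(bits: int) -> float:
    """Expand bfloat16 bits to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))
    return x


_OPS = {
    "ceil": math.ceil,
    "floor": math.floor,
    # Round half away from zero.
    "round": lambda x: math.copysign(math.floor(abs(x) + 0.5), x),
    # Python's built-in round() rounds half to even.
    "roundeven": round,
    "trunc": math.trunc,
}


def bf16_round_op(name: str, bits: int) -> int:
    """Apply one of the five rounding operations to a bfloat16 bit pattern."""
    x = from_bf16_bits(bits)
    if math.isnan(x) or math.isinf(x):
        return bits  # NaN and infinities pass through unchanged in this model
    # copysign keeps the sign of a zero result, e.g. ceil(-0.5) is -0.0.
    result = math.copysign(float(_OPS[name](x)), x)
    return to_bf16_bits(result)
```

For example, 2.5 has the bit pattern `0x4020`. `roundeven` maps it to 2.0 (`0x4000`), while `round` maps it to 3.0 (`0x4040`).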

---------

Signed-off-by: Krishna Pandey <kpandey81930 at gmail.com>
DeltaFile
+80-0libc/src/math/generic/CMakeLists.txt
+65-0libc/test/src/math/smoke/CMakeLists.txt
+22-0libc/src/math/roundbf16.h
+21-0libc/src/math/roundevenbf16.h
+21-0libc/src/math/truncbf16.h
+21-0libc/src/math/floorbf16.h
+230-028 files not shown
+501-834 files

LLVM/project 7d886famlir/include/mlir/Dialect/GPU/IR GPUOps.td, mlir/lib/Dialect/GPU/IR GPUDialect.cpp

[mlir][gpu] Update attribute definitions in `gpu::LaunchOp` (#152106)

`gpu::LaunchOp` is updated in the following way:
- Change the attribute type of kernel function and module from
`SymbolRefAttr` to `FlatSymbolRefAttr` to avoid nested symbol
references.
- Rename variables from camel case (kernelFunc, kernelModule) to lower
case (function, module) and update the syntax.
- `LaunchOp::build` supports passing `module` and `function` attributes.
DeltaFile
+49-2mlir/lib/Dialect/GPU/IR/GPUDialect.cpp
+17-13mlir/test/Dialect/GPU/outlining.mlir
+16-4mlir/include/mlir/Dialect/GPU/IR/GPUOps.td
+12-0mlir/test/Dialect/GPU/ops.mlir
+4-5mlir/lib/Dialect/GPU/Transforms/KernelOutlining.cpp
+98-245 files

LLVM/project ffdaf85lld/ELF BPSectionOrderer.cpp, lld/test/ELF bp-section-orderer.s

[lld][ELF] filter out section symbols when use BP reorder (#151685)

When using Temporal Profiling with the BP algorithm, we encounter an
issue with the internal function reorder. In cases where the symbol
table contains entries like:
```
Symbol table '.symtab' contains 45 entries:
   Num:    Value          Size Type    Bind   Vis       Ndx Name
    10: 0000000000000000     0 SECTION LOCAL  DEFAULT    18 .text.L1
    11: 0000000000000000    24 FUNC    LOCAL  DEFAULT    18 L1
```
The zero-sized section symbol .text.L1 gets stored in the secToSym map
first. However, when the function lookup searches for L1 (as seen in
[BPSectionOrdererBase.inc:191](https://github.com/llvm/llvm-project/blob/main/lld/include/lld/Common/BPSectionOrdererBase.inc#L191)),
it fails to find the correct entry in rootSymbolToSectionIdxs because
the section symbol has already claimed that slot.
This patch fixes the issue by skipping zero-sized symbols during the
addSections process, ensuring that function symbols are properly
registered for lookup.
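
The failure mode described above can be reproduced with a simplified model in Python. The `Symbol` tuple and `build_sec_to_sym` function are hypothetical stand-ins, not lld's data structures. They only mirror the "first symbol for a section claims the slot" behaviour and the effect of skipping zero-sized symbols.

```python
from collections import namedtuple

Symbol = namedtuple("Symbol", "name size kind section")


def build_sec_to_sym(symbols, skip_zero_sized):
    """Map each section index to the first symbol that claims it."""
    sec_to_sym = {}
    for sym in symbols:
        if skip_zero_sized and sym.size == 0:
            continue  # ignore zero-sized entries such as section symbols
        # setdefault keeps the first symbol seen for a section.
        sec_to_sym.setdefault(sym.section, sym)
    return sec_to_sym


# The two entries from the symbol table excerpt above.
symbols = [
    Symbol(".text.L1", 0, "SECTION", 18),
    Symbol("L1", 24, "FUNC", 18),
]

before = build_sec_to_sym(symbols, skip_zero_sized=False)
after = build_sec_to_sym(symbols, skip_zero_sized=True)
```

In this model, `before[18].name` is `.text.L1`, so a lookup keyed on the function name `L1` misses, while `after[18].name` is `L1`.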
DeltaFile
+67-23lld/test/ELF/bp-section-orderer.s
+6-4lld/ELF/BPSectionOrderer.cpp
+73-272 files

LLVM/project b9ca01bllvm/lib/Target/RISCV/Disassembler RISCVDisassembler.cpp

[RISCV] Move the decoder table for XCV, Xqci and XRivos from standard section to vendor section. NFC
DeltaFile
+3-3llvm/lib/Target/RISCV/Disassembler/RISCVDisassembler.cpp
+3-31 files

LLVM/project 3769ce0llvm/lib/Target/RISCV/MCTargetDesc RISCVAsmBackend.cpp, llvm/test/MC/RISCV align.s cfi-advance.s

MC: Refine ALIGN relocation conditions

Each section now tracks the index of the first linker-relaxable
fragment, enabling two changes:

* Delete redundant ALIGN relocations before the first linker-relaxable
  instruction in a section. The primary example is the offset 0
  R_RISCV_ALIGN relocation for a text section aligned by 4.
* For alignments larger than the NOP size after the first
  linker-relaxable instruction, ALIGN relocations are now generated, even in
  norelax regions. This fixes the issue #150159.
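
One way to read the two bullets above is as a single predicate, sketched here in Python. This is a simplification based only on the text of this message. The function name, its parameters, and the treatment of sections without any linker-relaxable fragment (modelled as `None`, producing no relocation) are assumptions, not the backend's actual logic.

```python
def needs_align_reloc(frag_index, first_relaxable_index, alignment, nop_size):
    """Decide whether an alignment directive gets an ALIGN relocation.

    frag_index: index of the alignment fragment within its section.
    first_relaxable_index: index of the first linker-relaxable fragment in
        the section, or None if there is none (assumption of this sketch).
    alignment, nop_size: both in bytes.
    """
    # Before the first linker-relaxable fragment, the relocation is redundant.
    if first_relaxable_index is None or frag_index < first_relaxable_index:
        return False
    # After it, alignments larger than the NOP size need a relocation,
    # regardless of whether the region is relax or norelax.
    return alignment > nop_size
```

Under this reading, a text section aligned by 4 at offset 0 with a 4-byte NOP yields no relocation, while an 8-byte alignment that follows a linker-relaxable instruction does.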

The new test llvm/test/MC/RISCV/Relocations/align-after-relax.s
verifies the required ALIGN in a norelax region following
linker-relaxable instructions.
By using a fragment index within the subsection (which is less than or
equal to the section's index), the implementation may generate redundant
ALIGN relocations in lower-numbered subsections before the first
linker-relaxable instruction.

    [30 lines not shown]
DeltaFile
+50-0llvm/test/MC/RISCV/Relocations/align-after-relax.s
+18-27llvm/test/MC/RISCV/align.s
+22-8llvm/lib/Target/RISCV/MCTargetDesc/RISCVAsmBackend.cpp
+15-13llvm/test/MC/RISCV/cfi-advance.s
+18-5llvm/test/MC/RISCV/align-option-relax.s
+23-0llvm/test/MC/RISCV/Relocations/align-norvc.s
+146-538 files not shown
+172-8514 files

LLVM/project c9f3a70llvm/include/llvm/TextAPI Architecture.def, llvm/unittests/TextAPI TextStubV5Tests.cpp

[TextAPI] Add riscv32 as a supported arch (#152619)

DeltaFile
+27-0llvm/unittests/TextAPI/TextStubV5Tests.cpp
+5-0llvm/include/llvm/TextAPI/Architecture.def
+32-02 files

LLVM/project 05dd957llvm/lib/Analysis DependenceAnalysis.cpp, llvm/test/Analysis/DDG basic-loopnest.ll

[DA] Fix the check between Subscript and Size after delinearization (#151326)

Delinearization provides two values: the size of the array, and the
subscript of the access. DA checks their validity (`0 <= subscript <
size`), with some special handling. In particular, to ensure `subscript
< size`, calculate the maximum value of `subscript - size` and check if
it is negative. There was an issue in its process: when `subscript -
size` is expressed as an affine format like `start + step * i`, the value
in the last iteration (`start + step * (num_iterations - 1)`) was
assumed to be the maximum value. This assumption is incorrect in the
following cases:

- When `step` is negative
- When the AddRec wraps

This patch introduces extra checks to ensure the sign of `step` and
verify the existence of nsw/nuw flags.
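
The two failing cases are easy to check numerically. The Python sketch below compares the value in the last iteration with the true maximum found by brute force. The helper names are illustrative, and the wrap case is modelled with 8-bit signed arithmetic purely to keep the numbers small.

```python
def last_iteration_value(start, step, num_iterations):
    """Value of start + step * i in the final iteration."""
    return start + step * (num_iterations - 1)


def max_of_affine(start, step, num_iterations):
    """Brute-force maximum of start + step * i over all iterations."""
    return max(start + step * i for i in range(num_iterations))


def wrap_i8(x):
    """Wrap an integer into the signed 8-bit range [-128, 127]."""
    return ((x + 128) % 256) - 128


# Non-negative step, no wrap: the last iteration holds the maximum.
ok_last = last_iteration_value(-10, 2, 5)   # -2
ok_max = max_of_affine(-10, 2, 5)           # -2

# Negative step: values are 3, 1, -1, -3, -5, so the maximum is the first.
neg_last = last_iteration_value(3, -2, 5)   # -5
neg_max = max_of_affine(3, -2, 5)           # 3

# Wrapping: 100, 120, 140 becomes 100, 120, -116 in 8 bits.
wrapped = [wrap_i8(100 + 20 * i) for i in range(3)]
```

In both failing cases the last-iteration value is negative even though an earlier iteration is positive, so a check that only looks at the last iteration would wrongly conclude that `subscript < size` always holds.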

Also, `isKnownNonNegative(S - smax(1, Size))` was used as a regular

    [5 lines not shown]
DeltaFile
+82-3llvm/test/Analysis/DependenceAnalysis/DADelin.ll
+25-12llvm/lib/Analysis/DependenceAnalysis.cpp
+4-1llvm/test/Analysis/DDG/basic-loopnest.ll
+3-1llvm/test/Analysis/DependenceAnalysis/Coupled.ll
+114-174 files

LLVM/project af05428.ci generate_test_report_lib_test.py generate_test_report_lib.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6
DeltaFile
+274-5.ci/generate_test_report_lib_test.py
+132-21.ci/generate_test_report_lib.py
+4-2.ci/generate_test_report_github.py
+2-1.ci/utils.sh
+412-294 files

LLVM/project af808e6.ci generate_test_report_lib_test.py generate_test_report_lib.py

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.6

[skip ci]
DeltaFile
+212-5.ci/generate_test_report_lib_test.py
+118-17.ci/generate_test_report_lib.py
+330-222 files

LLVM/project 530b3fa.ci generate_test_report_lib_test.py generate_test_report_lib.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6
DeltaFile
+212-5.ci/generate_test_report_lib_test.py
+118-17.ci/generate_test_report_lib.py
+330-222 files

LLVM/project 27c5ba3.ci generate_test_report_lib_test.py generate_test_report_lib.py

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.6

[skip ci]
DeltaFile
+107-0.ci/generate_test_report_lib_test.py
+51-0.ci/generate_test_report_lib.py
+158-02 files

LLVM/project a33c377.ci generate_test_report_lib_test.py generate_test_report_lib.py

[𝘀𝗽𝗿] initial version

Created using spr 1.3.6
DeltaFile
+107-0.ci/generate_test_report_lib_test.py
+51-0.ci/generate_test_report_lib.py
+158-02 files

LLVM/project 9fbe3b6llvm/test/Analysis/CostModel/ARM cast.ll cast_ldst.ll, llvm/test/CodeGen/AMDGPU load-global-i8.ll

rebase

Created using spr 1.3.5-bogner
DeltaFile
+10,298-0llvm/test/CodeGen/Xtensa/atomic-rmw.ll
+2,025-3,649llvm/test/Analysis/CostModel/ARM/cast.ll
+1,525-2,819llvm/test/Analysis/CostModel/ARM/cast_ldst.ll
+3,001-1,193llvm/test/CodeGen/AMDGPU/load-global-i8.ll
+3,413-0llvm/test/MC/AMDGPU/gfx1250_asm_vop3cx.s
+3,413-0llvm/test/MC/Disassembler/AMDGPU/gfx1250_dasm_vop3cx.txt
+23,675-7,6612,133 files not shown
+137,026-47,7692,139 files