LLVM/project c13bf9ellvm/lib/Target/AMDGPU SIInstructions.td SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU insert_vector_dynelt.ll extract_vector_dynelt.ll

Reapply "[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323)" (#171838)

A buildbot failed for the original patch.

https://github.com/llvm/llvm-project/pull/171835 addresses the issue
raised by the buildbot.
After the fix is merged, the original patch is reapplied without any
change.
DeltaFile
+5,963-0llvm/test/CodeGen/AMDGPU/insert_vector_dynelt.ll
+3,310-0llvm/test/CodeGen/AMDGPU/extract_vector_dynelt.ll
+16-0llvm/lib/Target/AMDGPU/SIInstructions.td
+8-0llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+9,297-04 files

LLVM/project e309272llvm/test/tools/llvm-mca/AArch64/Neoverse N2-writeback.s N1-writeback.s, llvm/test/tools/llvm-mca/ARM m85-int.s

[AArch64][ARM] Regenerate llvm-mca tests. NFC
DeltaFile
+52-52llvm/test/tools/llvm-mca/AArch64/Neoverse/N2-writeback.s
+5-5llvm/test/tools/llvm-mca/AArch64/Neoverse/N1-writeback.s
+1-1llvm/test/tools/llvm-mca/ARM/m85-int.s
+58-583 files

LLVM/project 1e9e389llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

[AArch64] Add a performBICiCombine function.

This moves the code out of PerformDAGCombine into its own function and
changes the return value to SDValue(N, 0), matching other uses of
SimplifyDemandedBits.
DeltaFile
+15-12llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+15-121 files

LLVM/project 6ff3df8libcxx/include unordered_set, libcxx/test/libcxx/diagnostics unordered_set.nodiscard.verify.cpp

[libc++][unordered_set] Applied `[[nodiscard]]` (#170435)

[[nodiscard]] should be applied to functions where discarding the return
value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.set
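
As a rough illustration of the rule above, here is a minimal sketch using a hypothetical `TinyBag` class and a hypothetical `countAfterAdds` helper. Neither is libc++ code; they only show which kind of member typically gets the attribute.

```cpp
#include <cstddef>

// Hypothetical container used only for illustration; it is not libc++ code.
class TinyBag {
public:
  // Pure observers: ignoring the result almost always indicates a bug.
  [[nodiscard]] constexpr bool empty() const noexcept { return count_ == 0; }
  [[nodiscard]] constexpr std::size_t size() const noexcept { return count_; }

  // Mutators are called for their side effect, so they are left unmarked.
  constexpr void add() noexcept { ++count_; }
  constexpr void clear() noexcept { count_ = 0; }

private:
  std::size_t count_ = 0;
};

// Hypothetical helper: adds `n` items and reports the resulting size.
constexpr std::size_t countAfterAdds(std::size_t n) {
  TinyBag bag;
  for (std::size_t i = 0; i < n; ++i)
    bag.add();
  // bag.empty();  // would be diagnosed: the caller probably meant clear()
  return bag.size();
}
```

Uncommenting the `bag.empty();` line should make the compiler warn that a `[[nodiscard]]` result is being ignored.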
DeltaFile
+89-12libcxx/test/libcxx/diagnostics/unordered_set.nodiscard.verify.cpp
+47-37libcxx/include/unordered_set
+136-492 files

LLVM/project e22ff9blibcxx/include unordered_set, libcxx/test/libcxx/diagnostics unordered_multiset.nodiscard.verify.cpp

[libc++][unordered_multiset] Applied `[[nodiscard]]` (#171664)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.multiset
DeltaFile
+103-0libcxx/test/libcxx/diagnostics/unordered_multiset.nodiscard.verify.cpp
+47-37libcxx/include/unordered_set
+150-372 files

LLVM/project a5b7c42libcxx/include unordered_map, libcxx/test/libcxx/diagnostics unordered_multimap.nodiscard.verify.cpp

[libc++][unordered_multimap] Applied `[[nodiscard]]` (#171659)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.multimap
DeltaFile
+101-0libcxx/test/libcxx/diagnostics/unordered_multimap.nodiscard.verify.cpp
+49-37libcxx/include/unordered_map
+150-372 files

LLVM/project ffaa6f2llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV sadd_sat.ll ssub_sat.ll

[RISCV] Custom legalize i32 saddo/ssubo on RV64 to return a sign extended value for the data result. (#172112)

This is consistent with how we handle regular ADD/SUB and helps with
computeNumSignBits optimizations.
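
A scalar model of the idea, not the lowering code itself: the 32-bit result is kept sign-extended in a 64-bit value, and overflow is detected by comparing it with the full-width sum. The names `SAddOResult` and `saddo32On64` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical result pair: the data result (sign-extended to 64 bits)
// and the overflow flag.
struct SAddOResult {
  std::int64_t value;
  bool overflow;
};

// Models an i32 signed add-with-overflow carried out in 64-bit registers.
constexpr SAddOResult saddo32On64(std::int32_t a, std::int32_t b) {
  // Full-width sum of the sign-extended operands; it cannot overflow 64 bits.
  const std::int64_t wide =
      static_cast<std::int64_t>(a) + static_cast<std::int64_t>(b);
  // Truncate to 32 bits, then sign-extend back to 64 bits.
  const std::int64_t sext = static_cast<std::int64_t>(
      static_cast<std::int32_t>(static_cast<std::uint32_t>(wide)));
  // The add overflowed in 32 bits exactly when the two values differ.
  return {sext, sext != wide};
}
```

Because `value` is already sign-extended, a consumer that needs the upper bits to mirror bit 31 does not need a separate extension step in this model.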

Fixes #172089
DeltaFile
+29-22llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+6-7llvm/test/CodeGen/RISCV/sadd_sat.ll
+5-6llvm/test/CodeGen/RISCV/ssub_sat.ll
+5-5llvm/test/CodeGen/RISCV/ssub_sat_plus.ll
+5-5llvm/test/CodeGen/RISCV/sadd_sat_plus.ll
+50-455 files

LLVM/project 7fa062aclang/lib/AST ItaniumMangle.cpp, clang/test/CodeGenCXX riscv-mangle-rvv-fixed-vectors.cpp

[RISCV] Add BFloat16 to mangleRISCVFixedRVVVectorType. (#172095)

DeltaFile
+54-0clang/test/CodeGenCXX/riscv-mangle-rvv-fixed-vectors.cpp
+3-0clang/lib/AST/ItaniumMangle.cpp
+57-02 files

LLVM/project c878cf4llvm/include/llvm/CodeGen ISDOpcodes.h

[SelectionDAG] Consistently use doxygen comments in the NodeType enum. NFC (#172178)

DeltaFile
+63-63llvm/include/llvm/CodeGen/ISDOpcodes.h
+63-631 files

LLVM/project 61908c5orc-rt/include/orc-rt RTTI.h

[orc-rt] Prevent RTTIExtends from being used for errors. (#172250)

Custom error types (ErrorInfoBase subclasses) should use ErrorExtends as
of 8f51da369e6. Adding a static_assert allows us to enforce that at
compile-time.
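
A minimal sketch of how such a compile-time guard can look. `ErrorBaseSketch`, `PlainBase`, and `RTTIExtendsSketch` are hypothetical stand-ins, not the orc-rt declarations.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the real base classes.
struct ErrorBaseSketch { virtual ~ErrorBaseSketch() = default; };
struct PlainBase { virtual ~PlainBase() = default; };

// Hypothetical extension helper: refuses to extend error types.
template <typename ThisT, typename ParentT>
struct RTTIExtendsSketch : ParentT {
  static_assert(!std::is_base_of_v<ErrorBaseSketch, ParentT>,
                "error types should use the error-specific helper instead");
};

// Accepted: PlainBase is not an error type.
struct Widget : RTTIExtendsSketch<Widget, PlainBase> {};

// Rejected at compile time if uncommented:
// struct MyError : RTTIExtendsSketch<MyError, ErrorBaseSketch> {};
```

The check fires when the helper template is instantiated, so a misuse is reported where the offending class is defined.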
DeltaFile
+8-0orc-rt/include/orc-rt/RTTI.h
+8-01 files

LLVM/project 4cf98d1llvm/docs MemProf.rst

Fix indentation.
DeltaFile
+4-4llvm/docs/MemProf.rst
+4-41 files

LLVM/project 5a581acclang/include/clang/CIR/Dialect/IR CIROps.td, clang/test/CIR/CodeGen switch.cpp

[CIR] Rename allEnumCasesCovered to all_enum_cases_covered (#172153)

Use the conventional snake_case for MLIR assembly and align with the
operation documentation, which already mentions the snake_cased attribute.
DeltaFile
+2-2clang/include/clang/CIR/Dialect/IR/CIROps.td
+2-2clang/test/CIR/IR/switch.cir
+1-1clang/test/CIR/CodeGen/switch.cpp
+5-53 files

LLVM/project 35315a8offload/plugins-nextgen/cuda/dynamic_cuda cuda.h, offload/plugins-nextgen/cuda/src rtl.cpp

[offload] Fix CUDA args size by subtracting tail padding (#172249)

This commit makes the cuLaunchKernel call pass the total argument size without tail padding.
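
To illustrate what tail padding means here, the sketch below uses a hypothetical argument layout, `KernelArgs`, and a hypothetical helper, `sizeWithoutTailPadding`. It is not the plugin code, only the arithmetic.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical argument block for a kernel taking (int64_t, int32_t).
// On typical 64-bit ABIs the struct is padded past the end of `b`.
struct KernelArgs {
  std::int64_t a;
  std::int32_t b;
};

// Size of the argument block up to the end of the last argument,
// i.e. excluding any padding the compiler adds after it.
constexpr std::size_t sizeWithoutTailPadding(std::size_t lastOffset,
                                             std::size_t lastSize) {
  return lastOffset + lastSize;
}

// For KernelArgs this ends right after `b`.
constexpr std::size_t kArgsSize =
    sizeWithoutTailPadding(offsetof(KernelArgs, b), sizeof(KernelArgs::b));
```

`kArgsSize` never exceeds `sizeof(KernelArgs)`; the difference, if any, is the tail padding.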
DeltaFile
+31-2offload/plugins-nextgen/cuda/src/rtl.cpp
+14-0offload/unittests/OffloadAPI/kernel/olLaunchKernel.cpp
+3-0offload/unittests/OffloadAPI/device_code/multiargs.cpp
+0-3offload/test/offloading/CUDA/basic_launch_multi_arg.cu
+2-0offload/unittests/OffloadAPI/device_code/CMakeLists.txt
+1-0offload/plugins-nextgen/cuda/dynamic_cuda/cuda.h
+51-51 files not shown
+52-57 files

LLVM/project 35b2317llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 neon-dotreduce.ll aarch64-matmul.ll

[AArch64] Support USDOT in performAddDotCombine (#171864)

This function performs the fold
// ADD(UDOT(zero, x, y), A) -->  UDOT(A, x, y)

which applies equally to USDOT now that we have a node for it.
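
The identity behind the fold can be checked with a scalar model of one 32-bit lane. `usdotLane` is a hypothetical helper, not LLVM code, and the model assumes values small enough that the accumulator does not overflow.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of one lane of a mixed-sign dot product:
// acc + sum over i of (unsigned x[i]) * (signed y[i]).
constexpr std::int32_t usdotLane(std::int32_t acc,
                                 const std::array<std::uint8_t, 4> &x,
                                 const std::array<std::int8_t, 4> &y) {
  for (std::size_t i = 0; i < 4; ++i)
    acc += static_cast<std::int32_t>(x[i]) * static_cast<std::int32_t>(y[i]);
  return acc;
}

// ADD(USDOT(zero, x, y), A) and USDOT(A, x, y) agree in this model.
constexpr bool foldHolds(std::int32_t a,
                         const std::array<std::uint8_t, 4> &x,
                         const std::array<std::int8_t, 4> &y) {
  return usdotLane(0, x, y) + a == usdotLane(a, x, y);
}
```

Starting the accumulation from `A` instead of zero absorbs the separate add, which is what the combine exploits.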
DeltaFile
+44-70llvm/test/CodeGen/AArch64/neon-dotreduce.ll
+38-2llvm/test/CodeGen/AArch64/aarch64-matmul.ll
+2-1llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+84-733 files

LLVM/project ff5209amlir/lib/Dialect/XeGPU/Transforms XeGPUSubgroupDistribute.cpp

only distribute function op
DeltaFile
+9-57mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp
+9-571 files

LLVM/project 1d821b0llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AArch64 shuffle-transpose.ll

[AArch64] use `isTRNMask` to calculate shuffle costs (#171524)

This builds on #169858 to fix the divergence in codegen
(https://godbolt.org/z/a9az3h6oq) between two very similar
functions initially observed in #137447 (represented in the diff by test
cases `@transpose_splat_constants` and `@transpose_constants_splat`):
```
int8x16_t f(int8_t x)
{
  return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                       x, 4, x, 5, x, 6, x, 7 };
}

int8x16_t g(int8_t x)
{
  return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                       4, x, 5, x, 6, x, 7, x };
}
```

    [7 lines not shown]
DeltaFile
+252-0llvm/test/Analysis/CostModel/AArch64/shuffle-transpose.ll
+47-0llvm/test/Transforms/SLPVectorizer/AArch64/transpose-with-constants.ll
+5-6llvm/test/Transforms/SLPVectorizer/AArch64/extractelements-to-shuffle.ll
+6-1llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+310-74 files

LLVM/project 8f51da3orc-rt/include/orc-rt Error.h, orc-rt/lib/executor Error.cpp CMakeLists.txt

[orc-rt] Add Error / Exception interop. (#172247)

The ORC runtime needs to work in diverse codebases, both with and
without C++ exceptions enabled (e.g. most LLVM projects compile with
exceptions turned off, but regular C++ codebases will typically have
them turned on). This introduces a tension in the ORC runtime: If a C++
exception is thrown (e.g. by a client-supplied callback) it can't be
ignored, but orc_rt::Error values will assert if not handled prior to
destruction. That makes the following pattern fundamentally unsafe in
the ORC runtime:

```
if (auto Err = orc_rt_operation(...)) {
  log("failure, bailing out"); // <- may throw if exceptions enabled
  // Exception unwinds stack before Error is handled, triggers Error-not-checked
  // assertion here.
  return Err;
}
```

    [29 lines not shown]
DeltaFile
+213-0orc-rt/unittests/ErrorExceptionInteropTest.cpp
+190-20orc-rt/include/orc-rt/Error.h
+48-0orc-rt/lib/executor/Error.cpp
+5-5orc-rt/unittests/ErrorTest.cpp
+1-0orc-rt/lib/executor/CMakeLists.txt
+1-0orc-rt/unittests/CMakeLists.txt
+458-256 files

LLVM/project c24f66eclang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/non-policy/overloaded vfneg.c vfabs.c, clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/policy/non-overloaded vfneg.c vfabs.c

[llvm][RISCV] Add bf16 vfabs and vfneg intrinsics for zvfbfa. (#172130)

These are pseudoinstruction aliases for vfsgnjx and vfsgnjn.

Co-authored-by: Craig Topper <craig.topper at sifive.com>
DeltaFile
+249-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/policy/overloaded/vfneg.c
+249-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/policy/overloaded/vfabs.c
+249-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/policy/non-overloaded/vfneg.c
+249-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/policy/non-overloaded/vfabs.c
+129-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/non-policy/overloaded/vfneg.c
+129-0clang/test/CodeGen/RISCV/rvv-intrinsics-autogenerated/zvfbfa/non-policy/overloaded/vfabs.c
+1,254-03 files not shown
+1,516-09 files

LLVM/project 9a03a30libcxx/include unordered_map, libcxx/test/libcxx/containers/unord/unord.map at.const.abort.pass.cpp at.abort.pass.cpp

[libc++][unordered_map] Applied `[[nodiscard]]` (#170423)

[[nodiscard]] should be applied to functions where discarding the return
value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.map
DeltaFile
+94-12libcxx/test/libcxx/diagnostics/unordered_map.nodiscard.verify.cpp
+53-41libcxx/include/unordered_map
+1-1libcxx/test/libcxx/containers/unord/unord.map/at.const.abort.pass.cpp
+1-1libcxx/test/libcxx/containers/unord/unord.map/at.abort.pass.cpp
+149-554 files

LLVM/project 86a07e4llvm/lib/Target/AMDGPU VOP3PInstructions.td

[NFC][AMDGPU] Refactor the multiclass for WMMA_F8F6F4 instructions
DeltaFile
+34-13llvm/lib/Target/AMDGPU/VOP3PInstructions.td
+34-131 files

LLVM/project 7ac0177mlir/lib/CAPI/ExecutionEngine ExecutionEngine.cpp

[mlir][ExecutionEngine] Remove stderr printing when propagating errors (#171997)

DeltaFile
+0-4mlir/lib/CAPI/ExecutionEngine/ExecutionEngine.cpp
+0-41 files

LLVM/project 00b92e3orc-rt/include/orc-rt-c config.h.in

[orc-rt] Add config.h.in (missing from 7ccf968d0bf).

This file was accidentally left out of commit 7ccf968d0bf.
DeltaFile
+19-0orc-rt/include/orc-rt-c/config.h.in
+19-01 files

LLVM/project 274a44cclang/lib/Format ContinuationIndenter.cpp ContinuationIndenter.h, clang/unittests/Format FormatTest.cpp FormatTestObjC.cpp

Revert "[clang-format] Continue aligned lines without parentheses (#167979)"

This reverts commit 75c85bafb830e5a7bd7fda13d2648180538ff513.
DeltaFile
+36-85clang/lib/Format/ContinuationIndenter.cpp
+0-59clang/unittests/Format/FormatTest.cpp
+8-43clang/lib/Format/ContinuationIndenter.h
+19-24clang/lib/Format/WhitespaceManager.cpp
+0-26clang/unittests/Format/FormatTestObjC.cpp
+6-19clang/lib/Format/WhitespaceManager.h
+69-2561 files not shown
+72-2597 files

LLVM/project 59fb3bclibcxx/include/__utility pair.h, libcxx/test/libcxx/utilities/utility/pairs nodiscard.verify.cpp

[libc++][pair] Applied `[[nodiscard]]` (#171999)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/pairs
DeltaFile
+52-0libcxx/test/libcxx/utilities/utility/pairs/nodiscard.verify.cpp
+18-14libcxx/include/__utility/pair.h
+70-142 files

LLVM/project b6d940dlibcxx/include map, libcxx/test/libcxx/diagnostics multimap.nodiscard.verify.cpp

[libc++][multimap] Applied `[[nodiscard]]` (#171644)

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/multimap
DeltaFile
+109-0libcxx/test/libcxx/diagnostics/multimap.nodiscard.verify.cpp
+53-39libcxx/include/map
+162-392 files

LLVM/project b9d1432flang-rt/lib/runtime external-unit.cpp

[flang-rt][device] Use snprintf result for length (#172239)

The buffer might not be null-terminated on the device, which results in
a 1-byte invalid read when computing its length.
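
A minimal sketch of the general technique, assuming a hypothetical helper `formatUnitName` and an illustrative format string. It is not the flang-rt code; it only shows taking the length from `snprintf`'s return value instead of scanning the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats a name into `buf` and returns the number of
// characters stored (excluding the terminator), derived from snprintf's
// return value rather than from a later scan of the buffer.
inline std::size_t formatUnitName(char *buf, std::size_t bufSize, int unit) {
  assert(buf != nullptr && bufSize > 0);
  const int n = std::snprintf(buf, bufSize, "fort.%d", unit);
  if (n < 0)
    return 0; // encoding error: nothing usable was written
  const std::size_t wanted = static_cast<std::size_t>(n);
  // snprintf reports the untruncated length; clamp it to what fits.
  return wanted < bufSize ? wanted : bufSize - 1;
}
```

Since the length comes from the formatting call itself, no read of the buffer is needed to find its end.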
DeltaFile
+2-2flang-rt/lib/runtime/external-unit.cpp
+2-21 files

LLVM/project adaca13bolt/include/bolt/Core BinaryContext.h, bolt/include/bolt/Passes BinaryPasses.h

[BOLT] Introduce getOutputBinaryFunctions(). NFCI (#172174)

To gain better control over the functions that go into the output file
and their order, introduce `BinaryContext::getOutputBinaryFunctions()`.

The new API returns a modifiable list of functions in output order.

This list is filled by a new `PopulateOutputFunctions` pass and includes
emittable functions from the input file, plus functions added by BOLT
(injected functions).

The new functionality makes it possible to freely intermix input functions
with injected ones in the output, which will be used in new PRs.

The new function replaces `BinaryContext::getSortedFunctions()`, but
unlike its predecessor, it includes injected functions in the returned
list.
DeltaFile
+22-0bolt/lib/Passes/BinaryPasses.cpp
+4-10bolt/lib/Core/BinaryContext.cpp
+9-0bolt/include/bolt/Passes/BinaryPasses.h
+6-3bolt/include/bolt/Core/BinaryContext.h
+3-3bolt/lib/Passes/SplitFunctions.cpp
+1-5bolt/lib/Core/BinaryEmitter.cpp
+45-214 files not shown
+51-2310 files

LLVM/project a45da41llvm/docs MemProf.rst UserGuides.rst

Add documentation for MemProf.

Generated with the help of Gemini CLI, commands validated with a local
build of LLVM from head and tcmalloc.
DeltaFile
+276-0llvm/docs/MemProf.rst
+5-0llvm/docs/UserGuides.rst
+281-02 files

LLVM/project ca81d7corc-rt CMakeLists.txt, orc-rt/unittests CMakeLists.txt

[orc-rt] Ensure EH/RTTI=On overrides LLVM opts, applies to unit tests. (#172155)

When -DORC_RT_ENABLE_EXCEPTIONS=On and -DORC_RT_ENABLE_RTTI=On are
passed we need to ensure that the resulting compiler flags (e.g.
-fexceptions, -frtti for clang/GCC) are appended so that we override any
inherited options (e.g. -fno-exceptions, -fno-rtti) from LLVM.

Updates unit tests to ensure that these compiler options are applied to
them too.
DeltaFile
+6-2orc-rt/CMakeLists.txt
+1-0orc-rt/unittests/CMakeLists.txt
+7-22 files

LLVM/project 21a25f4clang-tools-extra/clang-tidy/modernize UseRangesCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Suggest `std::views::reverse` instead of `std::ranges::reverse_view` in `modernize-use-ranges` (#172199)

`std::views::FOO` should in almost all cases be preferred over
`std::ranges::FOO_view`. For a detailed explanation of why that is, see
https://brevzin.github.io/c++/2023/03/14/prefer-views-meow/. The TLDR is
that it's shorter to spell (which is obvious) and can in certain cases
be more efficient (which is less obvious; see the article if curious).
DeltaFile
+7-7clang-tools-extra/test/clang-tidy/checkers/modernize/loop-convert-reverse.cpp
+4-4clang-tools-extra/test/clang-tidy/checkers/modernize/use-ranges.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+2-3clang-tools-extra/clang-tidy/modernize/UseRangesCheck.cpp
+2-2clang-tools-extra/docs/clang-tidy/checks/modernize/loop-convert.rst
+2-2clang-tools-extra/docs/clang-tidy/checks/modernize/use-ranges.rst
+22-182 files not shown
+24-218 files