[clang-format] Add BreakFunctionDeclarationParameters option. (#196567)
Adds an option to break function declaration parameters, always putting
them on the next line after the function's opening parenthesis.
This is an equivalent of `BreakFunctionDefinitionParameters`, but for
function declarations.
---------
Co-authored-by: Lukas Jirkovsky <lukas.jirkovsky at aveco.com>
clang: Refactor handling of offload sanitizer arguments
Previously the AMDGPU toolchains hackily handled -fsanitize arguments.
They would lie and report that all host side sanitizers are available,
then TranslateArgs would filter out the device side cases that do not
work, providing diagnostics for the skipped cases. Move that logic
into the base sanitizer argument parsing.
This makes the produced diagnostics more consistent. Previously we
would get repeated warnings when a sanitizer is fully unsupported
by amdgpu; the warning should now be emitted once for the toolchain. These could
be further improved; we're printing the specific field of -fsanitize
in more cases where it could be skipped. In other cases we have the
opposite problem, where we aren't reporting the exact sanitizer
from the -f flag in the case that depends on a subtarget feature.
This will help fix other broken target specific flag forwarding bugs
in the future.
Co-authored-by: Claude Sonnet 4 <noreply at anthropic.com>
[AMDGPU] Add VOP1 DPP8 pseudo infrastructure
Add VOP_DPP8_Pseudo/VOP1_DPP8_Pseudo classes for DPP8 instructions, similar to
the existing VOP_DPP_Pseudo/VOP1_DPP_Pseudo pattern.
[libc++] Require the exact assignment expression to be trivial in __uninitialized_allocator_copy_impl
__uninitialized_allocator_copy_impl has an optimization that replaces allocator_traits::construct with std::copy for raw pointer ranges when the element type is trivially copy constructible and trivially copy assignable.
The copy-assignment trait only checks whether assignment from const T& is trivial. That is weaker than the expression used by std::copy, which evaluates *out = *in. If overload resolution selects a different non-trivial assignment operator for that expression, std::copy can call that operator on uninitialized storage.
Check is_trivially_assignable<_Out&, _In&> instead. This matches the assignment expression used by std::copy, preserves the optimized path when that assignment is actually trivial, and falls back to placement construction otherwise.
Add a regression test with a type whose defaulted copy assignment is trivial but whose templated assignment operator is selected for non-const lvalue sources.
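The gap between the two traits can be reproduced in isolation. The following is a minimal sketch, not the regression test from the patch: the type `Tricky` and the helpers `assignFromNonConst` and `assignFromConst` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical type: the defaulted copy assignment is trivial, but a
// templated operator= is the better match for non-const lvalue sources.
struct Tricky {
  int value = 0;
  Tricky() = default;
  Tricky(const Tricky&) = default;
  Tricky& operator=(const Tricky&) = default;

  // Not a copy assignment operator, so it does not affect the
  // copy-assignment trait. It is selected for `dst = src` when `src`
  // is a non-const lvalue.
  template <class U>
  Tricky& operator=(U&) {
    value = -1;  // visible side effect to show which overload ran
    return *this;
  }
};

// The traits checked before the fix are both satisfied...
static_assert(std::is_trivially_copy_constructible<Tricky>::value, "");
static_assert(std::is_trivially_copy_assignable<Tricky>::value, "");
// ...but the expression `*out = *in` with non-const lvalues is not trivial.
static_assert(std::is_assignable<Tricky&, Tricky&>::value, "");
static_assert(!std::is_trivially_assignable<Tricky&, Tricky&>::value, "");

// Assigning from a non-const lvalue picks the template.
int assignFromNonConst() {
  Tricky dst, src;
  src.value = 7;
  dst = src;
  return dst.value;
}

// Assigning from a const lvalue picks the trivial copy assignment.
int assignFromConst() {
  Tricky dst, tmp;
  tmp.value = 7;
  const Tricky src = tmp;
  dst = src;
  return dst.value;
}
```

With these definitions, `assert(assignFromNonConst() == -1)` and `assert(assignFromConst() == 7)` both hold, which shows why a trait that only inspects assignment from `const T&` is not enough to justify the `std::copy` fast path.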
Tested with:
~/llvm-project/build-libcxx-fresh/bin/llvm-lit ~/llvm-project/libcxx/test/libcxx/memory/uninitialized_allocator_copy_template_op_assign.pass.cpp ~/llvm-project/libcxx/test/libcxx/memory/uninitialized_allocator_copy.pass.cpp -q
[SelectionDAG] Don't convert sextload to zextload through a multi-use freeze (#196700)
Resolves #196590.
The patch https://github.com/llvm/llvm-project/pull/189317, which taught
DAGCombiner to look through freeze, incorrectly introduced a miscompile of
sext -> zext. This resolves the miscompile.
[clang-tidy] Correct `std::has_one_bit` to `std::has_single_bit` in `modernize-use-std-bit` (#196721)
There is no `std::has_one_bit` in the standard library; the function that checks
whether a number is an integral power of 2 is `std::has_single_bit`.
https://en.cppreference.com/cpp/header/bit
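For reference, a small sketch of the equivalence involved. It assumes the check targets a hand-written power-of-two test of roughly this shape; the helper names `isPowerOfTwoManual` and `isPowerOfTwoStd` are made up for illustration. The `std::has_single_bit` part is guarded by the `__cpp_lib_bitops` feature-test macro, so the snippet still compiles before C++20.

```cpp
#include <cassert>
#if defined(__has_include)
#if __has_include(<bit>)
#include <bit>
#endif
#endif

// Hand-written idiom: exactly one bit set means a non-zero power of two.
constexpr bool isPowerOfTwoManual(unsigned x) {
  return x != 0 && (x & (x - 1)) == 0;
}

#if defined(__cpp_lib_bitops)
// C++20 spelling of the same test.
constexpr bool isPowerOfTwoStd(unsigned x) { return std::has_single_bit(x); }

static_assert(isPowerOfTwoStd(64u) == isPowerOfTwoManual(64u), "");
static_assert(isPowerOfTwoStd(96u) == isPowerOfTwoManual(96u), "");
static_assert(isPowerOfTwoStd(0u) == isPowerOfTwoManual(0u), "");
#endif
```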
[VectorCombine] foldShuffleChainsToReduce - add support for partial vector reductions (#195119)
Extend foldShuffleChainsToReduce to recognize partial reduction patterns where only a subvector of the full vector is being reduced.
For example, a <16 x i16> vector where the shuffle chain only reduces the lower 8 elements can now be folded into:
shufflevector (extract lower <8 x i16>) + vector.reduce.smax
The detection works by noticing when the bottom-up walk through the
shuffle/op chain ends before consuming the full vector. The number of
levels visited determines the subvector size (2^levels), and an
extract_subvector + scalar reduction replaces the original chain when
profitable.
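As a scalar model of the idea (not the VectorCombine code itself), the sketch below mimics a shuffle/smax chain on a 16-lane vector and compares it with a direct reduction over the low `2^levels` lanes. The names `shuffleMaxStep`, `chainReduce` and `directReduce` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec16 = std::array<std::int16_t, 16>;

// One level of the chain: lane i takes max(v[i], v[i + stride]),
// modelling a shuffle that moves the upper half down followed by smax.
Vec16 shuffleMaxStep(const Vec16& v, std::size_t stride) {
  Vec16 out = v;
  for (std::size_t i = 0; i < stride; ++i)
    out[i] = std::max(v[i], v[i + stride]);
  return out;
}

// Runs `levels` shuffle/max steps and reads lane 0. Only the low
// 2^levels lanes can influence the result.
std::int16_t chainReduce(Vec16 v, unsigned levels) {
  for (std::size_t stride = (std::size_t{1} << levels) / 2; stride >= 1;
       stride /= 2)
    v = shuffleMaxStep(v, stride);
  return v[0];
}

// What the folded form computes: extract the low 2^levels lanes and
// reduce them directly.
std::int16_t directReduce(const Vec16& v, unsigned levels) {
  return *std::max_element(v.begin(), v.begin() + (std::size_t{1} << levels));
}
```

With three levels, only the lower 8 lanes participate, so a large value in lane 8 or above does not change the result; with four levels the whole vector is reduced.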
Fixes #194617
[BPF] Support Stack Arguments (#189060)
Currently, bpf programs and kfuncs only support 5 register parameters. As
the bpf community and its use cases keep expanding, there is a need to
go beyond 5 register parameters by passing additional parameters on the
stack. There are two main use cases here:
1. Currently a kfunc is limited to 5 register parameters. In some special
situations, people may want more than 5 parameters. One
example is sched_ext.
2. Allowing stack parameters makes life easier for bpf program writers, since
they do not need to carefully limit the number of parameters in their
programs.
The following is the high-level design:
- Use bpf register R11 as the frame pointer to stack parameters. This is
to avoid mixing stacks due to R10.
- Stack parameters must be after 5 register parameters.
- All parameters should be at most 16 bytes as ByVal parameters are not
supported.
[43 lines not shown]
[RISCV][NFC] Rename `Zvvmm` instruction file to `Zvvm` (#196692)
Renames `RISCVInstrInfoZvvmm.td` to `RISCVInstrInfoZvvm.td` so `Zvvmm`
and `Zvvfmm` share the same IME instruction file according to the spec.
All future instructions from the `Zvvm` family will be placed here
too.
This PR is required for reviewing #196486 so that GitHub shows
the diff correctly.
[CI] Ignore TidyFastChecks.inc for formatter CI. NFC. (#196682)
`TidyFastChecks.inc` is generated and its contents should not be checked
by the clang-format CI workflow. Add a local `.clang-format-ignore` entry so
the PR formatting check does not report diffs for this file.
Related run:
https://github.com/llvm/llvm-project/pull/194516#issuecomment-4332061836
[clang-tidy] Avoid `use-nodiscard` false positives for class templates (#196661)
Do not suggest adding `[[nodiscard]]` to functions returning a class
template specialization whose primary template is already marked
`[[nodiscard]]`.
Class template specializations do not carry the `[[nodiscard]]`
attribute on their own declarations, so `modernize-use-nodiscard`
previously missed this case and emitted redundant diagnostics for return
types such as:
```cpp
template <class T>
struct [[nodiscard]] Result;

struct S {
  // const member function, the kind modernize-use-nodiscard looks at
  Result<int> f() const;
};
```
Fixes #163425.
[ObjectYAML][NFC] Extract BBAddrMap YAML types into shared namespace (#196019)
Move BBAddrMapEntry and PGOAnalysisMapEntry out of namespace ELFYAML
into a new format-agnostic namespace BBAddrMapYAML so that COFF
YAML support can reuse the same schema and MappingTraits.
[AArch64][NFC] Remove unused TRI member from class (#184363)
Remove the TRI member and its initialization, leaving only MRI and
TII as the stored pointers.
---------
Co-authored-by: Benjamin Maxwell <benjamin.maxwell at arm.com>
[AArch64][GlobalISel] Enable BF16 legalization for fadd and friends. (#196081)
This enables bf16 promotion for the following operations in GISel,
promoting them to f32 and truncating the result back:
G_FADD, G_FSUB, G_FMUL, G_FDIV, G_FMA, G_FSQRT, G_FMAXNUM, G_FMINNUM,
G_FMAXIMUM, G_FMINIMUM, G_FCEIL, G_FFLOOR, G_FRINT, G_FNEARBYINT,
G_INTRINSIC_TRUNC, G_INTRINSIC_ROUND, G_INTRINSIC_ROUNDEVEN
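The promote-then-truncate scheme can be modelled on the host. This is only a sketch of the arithmetic, not the legalizer code: it stores bf16 values as raw `uint16_t` bit patterns, assumes round-to-nearest-even when truncating, and ignores NaN inputs. The helper names `bf16ToF32`, `f32ToBf16` and `bf16Add` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// bf16 is the top 16 bits of an IEEE-754 binary32 value, so extending
// is a shift into the high half.
float bf16ToF32(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float f;
  std::memcpy(&f, &wide, sizeof f);
  return f;
}

// Truncate f32 to bf16 with round-to-nearest-even (assumption of this
// sketch). NaN payloads are not handled specially.
std::uint16_t f32ToBf16(float f) {
  std::uint32_t wide;
  std::memcpy(&wide, &f, sizeof wide);
  std::uint32_t bias = 0x7FFFu + ((wide >> 16) & 1u);
  return static_cast<std::uint16_t>((wide + bias) >> 16);
}

// Models a bf16 add done by promotion: extend both operands to f32,
// add in f32, truncate the result back to bf16.
std::uint16_t bf16Add(std::uint16_t a, std::uint16_t b) {
  return f32ToBf16(bf16ToF32(a) + bf16ToF32(b));
}
```

For example, `bf16Add(0x3F80, 0x4000)` (1.0 + 2.0) gives `0x4040` (3.0), and `bf16Add(0x3F80, 0x3B80)` (1.0 + 2^-8) is an exact tie that rounds back to `0x3F80`.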