[ORC] Fix unchecked Expected<T> in ELFDebugObjectPlugin::FinalizePromise (#172904)
If `Alloc.finalize()` fails in the post-allocation pass, we store the
error in `FinalizePromise`. If we don't reach the post-fixup pass
afterwards, the error will leak. This patch adds another case in the
DebugObject destructor that will check the `Expected<T>` and report the
error.
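The idea of reporting a stored but never-consumed error from a destructor can be sketched roughly as follows. This is a minimal standalone illustration, not the ORC code: `PendingResult`, `DebugObjectSketch`, `storeResult`, `takeResult` and `reportedErrors` are hypothetical names, and a plain `std::optional<std::string>` stands in for LLVM's `Expected<T>`.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a stored finalize outcome: an empty Error means
// success, otherwise it holds a message that somebody must report.
struct PendingResult {
  std::optional<std::string> Error;
};

// Hypothetical owner, loosely modeled on the description above. It may hold
// a result that the normal path never consumes.
class DebugObjectSketch {
public:
  explicit DebugObjectSketch(std::function<void(const std::string &)> Report)
      : ReportError(std::move(Report)) {}

  // Earlier step: remember the outcome for later.
  void storeResult(PendingResult R) { Stored = std::move(R); }

  // Later step: consume the outcome on the normal path.
  std::optional<PendingResult> takeResult() {
    return std::exchange(Stored, std::nullopt);
  }

  // If the later step never ran, report a still-pending error instead of
  // dropping it silently.
  ~DebugObjectSketch() {
    if (Stored && Stored->Error)
      ReportError(*Stored->Error);
  }

private:
  std::function<void(const std::string &)> ReportError;
  std::optional<PendingResult> Stored;
};

// Returns the messages that reached the reporter for one object lifetime.
std::vector<std::string> reportedErrors(bool ReachLaterStep) {
  std::vector<std::string> Reported;
  {
    DebugObjectSketch Obj(
        [&Reported](const std::string &Msg) { Reported.push_back(Msg); });
    PendingResult R;
    R.Error = "finalize failed";
    Obj.storeResult(std::move(R));
    if (ReachLaterStep)
      (void)Obj.takeResult(); // the normal path would handle the error itself
  } // Obj is destroyed here
  return Reported;
}
```

In this sketch the destructor only reports when the later step was skipped; when `takeResult()` ran, the consumer owns the error and nothing is reported twice.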
[InstCombine] Propagate poison through fshl and fshr intrinsics (#172859)
Currently these intrinsics output `undef` on poison, which triggers CI
errors on PRs that want to add poison tests for funnel shifts (such as
#172723). Let's make `fshl` and `fshr` propagate poison instead.
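For reference, the funnel-shift semantics and the intended poison behaviour can be modeled in a few lines. This is only a sketch for 32-bit values: `std::nullopt` is an assumed stand-in for poison, and `fshl32`, `fshr32` and `foldFshl` are hypothetical helper names, not InstCombine functions.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Assumption of this sketch: std::nullopt models a poison operand or result.
using MaybePoison = std::optional<std::uint32_t>;

// fshl: concatenate Hi:Lo, shift left by Amt modulo 32, keep the high word.
std::uint32_t fshl32(std::uint32_t Hi, std::uint32_t Lo, std::uint32_t Amt) {
  Amt %= 32;
  if (Amt == 0)
    return Hi; // avoid an out-of-range shift by 32 below
  return (Hi << Amt) | (Lo >> (32 - Amt));
}

// fshr: concatenate Hi:Lo, shift right by Amt modulo 32, keep the low word.
std::uint32_t fshr32(std::uint32_t Hi, std::uint32_t Lo, std::uint32_t Amt) {
  Amt %= 32;
  if (Amt == 0)
    return Lo;
  return (Hi << (32 - Amt)) | (Lo >> Amt);
}

// Any poison operand makes the result poison rather than some other value.
MaybePoison foldFshl(MaybePoison Hi, MaybePoison Lo, MaybePoison Amt) {
  if (!Hi || !Lo || !Amt)
    return std::nullopt;
  return fshl32(*Hi, *Lo, *Amt);
}
```

For example, `fshl32(0x12345678, 0x9ABCDEF0, 8)` yields `0x3456789A`, while `foldFshl` with any `std::nullopt` operand yields `std::nullopt`.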
InstCombine: Fold out nanless canonicalize pattern
Pattern match a wrapper around llvm.canonicalize which
weakens the semantics to not require quieting signaling
nans. Depending on the denormal mode and FP type, we can
either drop the pattern entirely or reduce it only to
a canonicalize call. I'm inventing this pattern to deal
with LLVM's lax canonicalization model in math library
code.
The math library code currently has explicit checks for
the denormal mode, and conditionally canonicalizes the
result if there is flushing. Semantically, this could be
directly replaced with a simple call to llvm.canonicalize,
but doing so would incur an additional cost when using
standard IEEE behavior. If we do not care about quieting
a signaling nan, this should be a no-op unless the denormal
mode may flush. This will allow replacement of the
conditional code with a zero cost abstraction utility
[17 lines not shown]
[clang-tidy][NFC] Refactor `bugprone-use-after-move` check (#172219)
This change is a necessary step for a subsequent PR that will enhance
the `bugprone-use-after-move` check to correctly handle cases where
variables are re-initialized inside captured lambdas, which currently
lead to FPs.
Part of #172018
[lldb-dap] Migrate restart request to structured types (#172488)
This patch migrates `restart` request to structured types. Also, I added
some checks that at least one of the required fields was provided for
`launch` and `attach` requests. Maybe I missed some possible
configurations, so please double check.
[ADT] Refactor Bitset to Be More Constexpr-Usable (#172062)
This patch refactors some essential `Bitset` member functions to be
`constexpr` and adds more useful member functions. Unit tests have been
added to `BitsetTest.cpp` to cover both runtime and `consteval` context
correctness.
The idea for this refactor came up in this discussion:
https://discourse.llvm.org/t/rfc-out-of-lanebitmask-bits-again/88613.
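As a rough illustration of what "constexpr-usable" buys, here is a tiny standalone bitset whose members can all be evaluated at compile time. It assumes C++17 or later; `TinyBitset` and `makeExample` are hypothetical names and do not reflect the actual `llvm::Bitset` interface.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical minimal bitset; every member is usable in constant expressions.
template <std::size_t NumBits> class TinyBitset {
  static constexpr std::size_t WordBits = 64;
  static constexpr std::size_t NumWords = (NumBits + WordBits - 1) / WordBits;
  std::array<std::uint64_t, NumWords> Words{};

public:
  constexpr TinyBitset() = default;

  // Set bit I and return *this so calls can be chained.
  constexpr TinyBitset &set(std::size_t I) {
    Words[I / WordBits] |= std::uint64_t{1} << (I % WordBits);
    return *this;
  }

  constexpr bool test(std::size_t I) const {
    return ((Words[I / WordBits] >> (I % WordBits)) & 1u) != 0;
  }

  // Count set bits by clearing the lowest set bit until the word is zero.
  constexpr std::size_t count() const {
    std::size_t N = 0;
    for (std::uint64_t W : Words)
      for (; W != 0; W &= W - 1)
        ++N;
    return N;
  }
};

constexpr TinyBitset<100> makeExample() {
  TinyBitset<100> B;
  B.set(3).set(70);
  return B;
}

// Checked by the compiler, no runtime test needed.
static_assert(makeExample().test(70));
static_assert(!makeExample().test(4));
static_assert(makeExample().count() == 2);
```

The `static_assert` lines mirror the compile-time half of the testing mentioned above; the same functions remain callable at runtime.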
[libc++] Optimize vector<bool>::resize() (#172853)
This both simplifies the implementation and improves the performance,
since the compiler is better able to see through what's going on.
```
Benchmark                                                                     old    new  Difference  % Difference
--------------------------------------------------------------------------  -----  -----  ----------  ------------
vector<bool>(const_vector<bool>&)                                           11.99  12.26        0.27         2.25%
vector<bool>(size_type,_const_value_type&)                                   9.24   9.29        0.05         0.54%
vector<bool>(vector<bool>&&,_const_allocator_type&)_(different_allocators)  14.26  14.35        0.09         0.65%
vector<bool>(vector<bool>&&,_const_allocator_type&)_(equal_allocators)       2.67   2.67       -0.01        -0.29%
vector<bool>::reserve()                                                      9.30   9.29       -0.01        -0.12%
vector<bool>::resize()                                                      15.14  13.43       -1.71       -11.28%
Geomean                                                                      9.17   9.03       -0.14        -1.48%
```
[InstCombine] Don't fold struct-ret intrinsics into vector selects (#173062)
Folding struct-ret intrinsics like `@llvm.sincos.v4f32` into selects
with vector conditions is invalid (the result must be a vector).
Revert "[libc++] Don't try to be compatible with libstdc++ in __libcpp_refstring on iOS (#170816)" (#173099)
This reverts commit b2ddb909cf. Sadly, I was wrong when I said that
Apple didn't ship libstdc++.dylib on iOS. We actually still do, it's
just not part of the shared cache, which is why I missed it.
Hence, it is still possible to encounter libstdc++.dylib in processes
running on iOS.
[AMDGPU] Limit allocation of lo128 registers for occupancy
The parent change allows allocation of lo128 VGPRs from all 4 banks.
That may result in an undesired allocation that leaves a hole of up
to 128 registers, for example when v0-v127 are allocated and
v128-v255 are free.
Limit the available allocation order to the occupancy. Both hard
occupancy limits and occupancy achieved during scheduling are
considered. It is better to spill a register than to drop occupancy
in this case.
[AMDGPU] Allow allocation of lo128 registers from all banks
We can encode 16-bit operands in a short form for VGPRs [0..127].
When we have 1K registers available we can in fact allocate 4
times more from all 4 banks. That, however, requires an allocatable
class for these operands. While for most instructions it will
result in the longer VOP3 form, for V_FMAAMK/FMADAK_F16 it will
simply prohibit the encoding because these do not have VOP3 forms.
A straightforward solution would be to create a register class
with all registers having bit 8 of the encoding zero, i.e. to
create a register class with holes punched in it: [0-127, 256-383,
512-639, 768-895]. LLVM, however, does not like register classes
with punched holes when they also have subregisters. The
cross-product of all classes explodes, and some combinations of a
'class having a common subreg with another' become impossible. Just
doing so explodes our register info to 4+Gb, which is uncompilable too.
The solution proposed is to define _lo128 RC with contiguous 896
[17 lines not shown]
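The "punched holes" index set quoted above can be reproduced with a small standalone check. This is a sketch under an assumption: the mask `0x80` is chosen only because it matches the listed ranges (reading "bit 8" as counted from one), and `hasShortFormIndex` and `shortFormRanges` are hypothetical helpers, not part of the AMDGPU backend.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Assumption: a register index qualifies when the bit with value 0x80 is
// clear, which reproduces the ranges quoted in the commit message.
constexpr bool hasShortFormIndex(unsigned Reg) { return (Reg & 0x80u) == 0; }

// Collect maximal runs of qualifying indices in [0, NumRegs) as inclusive
// [first, last] pairs.
std::vector<std::pair<unsigned, unsigned>> shortFormRanges(unsigned NumRegs) {
  std::vector<std::pair<unsigned, unsigned>> Ranges;
  for (unsigned Reg = 0; Reg < NumRegs; ++Reg) {
    if (!hasShortFormIndex(Reg))
      continue;
    if (!Ranges.empty() && Ranges.back().second + 1 == Reg)
      Ranges.back().second = Reg; // extend the current run
    else
      Ranges.push_back({Reg, Reg}); // start a new run
  }
  return Ranges;
}
```

With 1024 registers, `shortFormRanges(1024)` produces four runs: 0-127, 256-383, 512-639 and 768-895.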
[clang] Add support for consteval null terminated strings
Adds support for null terminated strings produced by constexpr
evaluation. This makes it possible to perform analysis of format
strings that was previously not possible, and is needed in the
future to support __ptrauth qualifier options.
[FindGRPC.cmake] Make sure that `PACKAGE_VERSION` is not overwritten when doing `find_package(gRPC)` (#173115)
`PACKAGE_VERSION` is important since it sets the `LLVM_VERSION_STRING`
string.
[CIR] Add 'get element' for array index ops (#172897)
This is a refactor/upstream/etc of:
https://github.com/llvm/clangir/pull/1748
This modifies our array-index operations to use a specific operation
(GetElementOp). According to the original patch commit message, this
replaces nearly 50% of ptr_stride operations in single source tests!
[RISCV] Fix Zvfbfa tests from #171794 to mitigate UTC bug. NFCI (#173125)
Context:
https://github.com/llvm/llvm-project/pull/171794#discussion_r2614489484
For some reason, UTC is unable to merge the 'ZVFHMIN' and 'ZVFBFA'
CHECK lines in some of the test functions, and emits incorrect CHECK
lines for them once you run UTC again on the file.
This hinders the ability to update these tests in bulk, as one has to
manually remove the excess ZVFBFA lines. While I don't know how to fix
UTC at this moment, I found a workaround that simply re-orders these two
check prefixes.
This is effectively NFC.