[lld][COFF] Restore `lto-embed-bitcode` and `-fembed-bitcode` Bitcode Embedding Features (#188398)
Reverts the changes introduced by #150897, which broke the documented
LTO bitcode-embedding features used to create whole-program-bitcode
representations of executables in production analysis/rewriting
toolsets. The feature was available up to 21.1.8 and broke in the 22.x
release. It previously allowed users to embed a whole-program-bitcode
section, `.llvmbc`, inside the final executable.
(cherry picked from commit 1e99c9e4c7e82c8417e4bdb0d1cb3b86e6640c6c)
[NFC] Remove stray files from top level directory (#189563)
Untracked files were added to the top-level directory by mistake; this
PR removes them.
Co-authored-by: himadhith <himadhith.v at ibm.com>
[clang-repl] Fix C89 incompatible keywords (#189432)
The `restrict` and `inline` keywords are removed for the C89 interpreter
because they caused failures in the runtime preamble.
Fixes #189088
(cherry picked from commit 8bd83048084c27615e9536227fbb2545472915e7)
[analyzer][NFC] Reorganize bstring.cpp tests (#188709)
This change eliminates preprocessor-based suppression of test cases by
introducing multi-prefix verify options to run-lines. This slightly
increases coverage.
[lldb] Only create RegisterTypeBuilderClang plugin once (#189393)
This plugin creates types based on information from target XML, which is
parsed only once per session. It has internal logic to reuse created
types, but the plugin itself was being remade every time a type was
requested.
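A minimal sketch of the create-once pattern described above, written in Python rather than lldb's C++. The names `TypeBuilder`, `Target`, and `get_type_builder` are hypothetical stand-ins for illustration, not lldb APIs.

```python
class TypeBuilder:
    """Hypothetical stand-in for a plugin that builds and reuses types."""

    instances_created = 0  # counts constructions, for illustration only

    def __init__(self):
        TypeBuilder.instances_created += 1
        self._types = {}  # type name -> built type, reused across requests

    def get_type(self, name):
        # Build the type on the first request, then hand back the same object.
        if name not in self._types:
            self._types[name] = ("register-type", name)
        return self._types[name]


class Target:
    """Hypothetical owner that creates the plugin lazily, exactly once."""

    def __init__(self):
        self._builder = None

    def get_type_builder(self):
        # Recreating the builder here on every call would throw away its
        # internal type cache; keeping one instance preserves it.
        if self._builder is None:
            self._builder = TypeBuilder()
        return self._builder


target = Target()
first = target.get_type_builder().get_type("flags")
second = target.get_type_builder().get_type("flags")
```

With a single long-lived builder, the second request returns the type object built by the first.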
OpenMP: Reimplement getOffloadArch
This function made no sense at all. It was scanning through
the feature map looking for something that parsed as an OffloadArch.
Directly compute the arch from the target device.
I don't know why there isn't just an OffloadArch in TargetOpts;
this shouldn't really require parsing.
[clang] fix getReplacedTemplateParameter for function template specializations
This fixes the transformation of substituted constant template parameters by
providing the instantiated parameter type for the function template
specialization case.
This fixes a regression introduced in #161029 which will be backported to llvm-22, so there are no release notes.
Fixes #188759
[X86][GISel] Avoid creating subreg def operands in emitInsertSubreg (#189408)
emitInsertSubreg builds a COPY with a subregister def operand, but these
probably should not be allowed in SSA MIR. Change it to build an
equivalent use of INSERT_SUBREG instead.
Wasm: add support for `swifttailcc` calling convention (#188296)
The Wasm backend already supports tail calls where available, so we only
need to enable the corresponding branches for this calling convention.
[AMDGPU][SIMemoryLegalizer] Consider scratch operations as NV=1 if GAS is disabled
- Clarify that `thread-private` MMO flag is still useful.
- If GAS is not enabled (which is the default as of the last patch), consider an op as `NV=1` if it's a `scratch_` opcode, or if the MMO is in the private AS.
- Add tests for the new cases.
- Update AMDGPUUsage GFX12.5 memory model
[llvm-dwarfdump] Support R_AARCH64_TLS_DTPREL64 in Object/RelocationResolver.cpp (#187649)
In https://github.com/llvm/llvm-project/pull/146572 we plan to emit
R_AARCH64_TLS_DTPREL64. This produces the following warning when running
llvm-dwarfdump on an object file that has TLS variables:
warning: failed to compute relocation: R_AARCH64_TLS_DTPREL64, Invalid
data was encountered while parsing the file
To fix this warning, we mark the relocation as supported. The final
absolute address of a TLS variable is only determined at runtime, so
resolving to the symbol's section-relative offset in the object file
mitigates the warning.
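A rough sketch of the supports/resolve split this message describes, in Python rather than the C++ of Object/RelocationResolver.cpp. The function names, the `SUPPORTED` set, and the "offset plus addend" arithmetic are assumptions made for illustration; only the relocation names come from the AArch64 ELF ABI.

```python
# Hypothetical set of relocation types this toy resolver can handle.
SUPPORTED = {"R_AARCH64_ABS64", "R_AARCH64_TLS_DTPREL64"}


def supports(reloc_type):
    """Return True if the toy resolver knows how to compute this type."""
    return reloc_type in SUPPORTED


def resolve(reloc_type, symbol_offset, addend):
    """Compute a relocated value, or fail for unsupported types."""
    if not supports(reloc_type):
        raise ValueError("unsupported relocation: " + reloc_type)
    # Assumption: for the TLS case the section-relative offset of the symbol
    # (plus addend) stands in for an address only known at runtime.
    return symbol_offset + addend


tls_value = resolve("R_AARCH64_TLS_DTPREL64", 0x10, 0)
```

A type missing from the supported set is what would surface as a "failed to compute relocation" style diagnostic in a real tool.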
[SelectionDAG] Expand CTTZ_ELTS[_ZERO_POISON] and handle legalization (#188691)
This is a second attempt at "[SelectionDAG] Expand
CTTZ_ELTS[_ZERO_POISON] and handle splitting" (#188220)
That PR had to be reverted in 7d39664a6ae8daaf186b65578492244d96a50bf2
because we had crashes on AMDGPU since we didn't have scalarization
support, and other crashes on PowerPC because we didn't handle the case
when a vector needed to be widened. Tests for these are added in
AMDGPU/cttz-elts.ll, RISCV/rvv/cttz-elts-scalarize.ll and
PowerPC/cttz-elts.ll.
The first crash has been fixed by adding
DAGTypeLegalizer::ScalarizeVecOp_CTTZ_ELTS.
The second crash has been fixed by reworking
TargetLowering::expandCttzElts. The expansion for CTTZ_ELTS is nearly
identical to VECTOR_FIND_LAST_ACTIVE, except it uses a reverse step
vector and subtracts the result from VF. The easiest way to fix these
[6 lines not shown]
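The "reverse step vector, subtract from VF" expansion mentioned above can be sketched on plain Python lists. This is an illustration based only on that one-sentence description, not the SelectionDAG code; `cttz_elts` and its local names are hypothetical.

```python
def cttz_elts(mask):
    """Count trailing zero elements: index of the first active lane, or VF."""
    vf = len(mask)
    # Reverse step vector: [VF, VF-1, ..., 1].
    reverse_step = [vf - i for i in range(vf)]
    # Keep the step value in active lanes and zero elsewhere.
    selected = [s if m else 0 for s, m in zip(reverse_step, mask)]
    # Unsigned-max reduction; the earliest active lane has the largest value.
    highest = max(selected, default=0)
    # Subtracting from VF turns that value back into a lane index.
    return vf - highest
```

For the mask `[0, 0, 1, 0, 1]` the selected values are `[0, 0, 3, 0, 1]`, the maximum is 3, and the result is 5 - 3 = 2. An all-zero mask yields VF.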
AMDGPU: Match fract from compare and select and minimum
Implementing this with any of the minnum variants is overconstraining
for the actual use. Existing patterns use fmin, then have to manually
clamp nan inputs to get nan propagating behavior. It's cleaner to express
this with a nan propagating operation to start with.
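To illustrate the point about nan propagation, here is a small Python sketch comparing a C-style `fmin` that needs a manual nan clamp with a nan-propagating `minimum`. The helper names and the clamp constant (largest double below 1.0) are assumptions, and infinity handling is left out to keep the comparison focused.

```python
import math

# Assumption: clamp to the largest double below 1.0.
CLAMP = float.fromhex("0x1.fffffffffffffp-1")


def c_floor(x):
    """floor() with C semantics: NaN and infinities pass through unchanged."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x))


def fmin(a, b):
    """C-style fmin: if exactly one operand is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def minimum(a, b):
    """NaN-propagating minimum: any NaN operand yields NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return min(a, b)


def fract_fmin(x):
    r = fmin(x - c_floor(x), CLAMP)
    return x if math.isnan(x) else r  # manual clamp restores the NaN


def fract_minimum(x):
    return minimum(x - c_floor(x), CLAMP)  # NaN flows through on its own
```

Both functions agree on finite inputs and on nan, but only the `fmin` version needs the extra select.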
AMDGPU: Match fract pattern with swapped edge case check (#189081)
A fract implementation can equivalently be written as
r = fmin(x - floor(x));
r = isnan(x) ? x : r;
r = isinf(x) ? 0.0 : r;
or:
r = fmin(x - floor(x));
r = isinf(x) ? 0.0 : r;
r = isnan(x) ? x : r;
Previously this only matched the first form. Also match
the case where the isinf check is the inner clamp. There are
a few more ways to write this pattern (e.g., move the clamp of
infinity to the input) but I haven't encountered that in the wild.
The existing code seems to be trying too hard to match noncanonical
variants of the pattern. Only handle the result that instcombine
produces from all 4 permutations of compare and select.
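The equivalence of the two orderings can be checked with a short Python sketch. The source elides the second operand of `fmin`; this sketch assumes a clamp to the largest double below 1.0, and all helper names are hypothetical.

```python
import math

# Assumption: the clamp value is not given in the text above.
CLAMP = float.fromhex("0x1.fffffffffffffp-1")


def c_floor(x):
    """floor() with C semantics: NaN and infinities pass through unchanged."""
    if math.isnan(x) or math.isinf(x):
        return x
    return float(math.floor(x))


def fmin(a, b):
    """C-style fmin: if exactly one operand is NaN, return the other."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return min(a, b)


def fract_nan_inner(x):
    r = fmin(x - c_floor(x), CLAMP)
    r = x if math.isnan(x) else r
    r = 0.0 if math.isinf(x) else r
    return r


def fract_inf_inner(x):
    r = fmin(x - c_floor(x), CLAMP)
    r = 0.0 if math.isinf(x) else r
    r = x if math.isnan(x) else r
    return r


def same(a, b):
    """Equality that treats two NaNs as equal."""
    return (math.isnan(a) and math.isnan(b)) or a == b


samples = [1.5, -0.25, -1e-20, 0.0, math.inf, -math.inf, math.nan]
agree = all(same(fract_nan_inner(x), fract_inf_inner(x)) for x in samples)
```

The two selects commute because no input is both nan and infinite, so at most one of them ever replaces `r`.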
[libc++][AIX] Fix force_thread_creation_failure by using RLIMIT_THREADS (#188787)
This patch fixes the test `force_thread_creation_failure.cpp` on AIX by
using the platform-specific `RLIMIT_THREADS` to restrict thread
creation, since `RLIMIT_NPROC` on AIX limits processes, not threads.
---------
Co-authored-by: himadhith <himadhith.v at ibm.com>
[RISCV] Fix discarded return value in RISCVOperand::print for FRM (#189530)
The roundingModeToString() return value was not being written to the
output stream, causing FRM operands to print as "<frm: >" with no
rounding mode name in debug output.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
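The bug class fixed in the RISCVOperand::print commit above, a computed string that never reaches the output stream, can be shown with a small Python sketch. The function names and the exact fixed output format are hypothetical; only the `<frm: >` symptom comes from the commit message.

```python
import io


def rounding_mode_to_string(frm):
    """Hypothetical lookup from an encoded rounding mode to its name."""
    return {0: "rne", 1: "rtz"}.get(frm, "invalid")


def print_buggy(out, frm):
    out.write("<frm: ")
    rounding_mode_to_string(frm)  # bug: the returned name is discarded
    out.write(">")


def print_fixed(out, frm):
    out.write("<frm: ")
    out.write(rounding_mode_to_string(frm))  # fix: write the name out
    out.write(">")


def render(printer, frm):
    """Run a printer against an in-memory stream and return the text."""
    out = io.StringIO()
    printer(out, frm)
    return out.getvalue()
```

Here `render(print_buggy, 0)` gives `<frm: >`, while `render(print_fixed, 0)` includes the mode name.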