[SLP]Fix scheduling of copyable bundle with commutative op used outside parent PHI
The previous (V, Op) pair insert was a no-op since V is unique per iteration.
Replace it with a hasOneUse() fast path plus a check that bails only when I
has a user outside the grandparent PHI's Scalars. Uses within the same
vectorized PHI are tracked by the existing dep machinery; an external user
(e.g. a scalar PHI in a different block) is what trips scheduleBlock's
"must be scheduled at this point" assertion.
Fixes #193315.
Reviewers:
Pull Request: https://github.com/llvm/llvm-project/pull/193566
[CIR] Support guard COMDAT for weak linkage in LoweringPrepare (#193274)
Static locals inside inline functions get `linkonce_odr` linkage, and
their guard variables need their own COMDAT groups so the linker can
deduplicate them across TUs. We were hitting an NYI error for this case
in `LoweringPrepare`.
The fix is straightforward: set `guard.setComdat(true)`, which makes
`LowerToLLVM` create a per-symbol COMDAT selector — the same thing
classic codegen does at `ItaniumCXXABI.cpp:2798`.
I ran into this while trying to compile the Bullet physics engine
through CIR. Functions like `btMatrix3x3::getIdentity()` use this
pattern (return a reference to a function-local static from an inline
member function), and 6 of the 121 source files were failing because of
it. With this fix, all 121 compile cleanly.
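For illustration, the pattern described above can be sketched roughly as follows. `Matrix3` and `makeIdentity` are hypothetical stand-ins, not Bullet's actual code; the point is only that an inline member function with a dynamically initialized function-local static typically needs a guard variable, and both must be deduplicated across TUs.

```cpp
#include <cassert>

// Hypothetical stand-in for a type like btMatrix3x3 (not Bullet's code).
struct Matrix3 {
  float m[9];
};

// Non-constexpr helper, so the static below is dynamically initialized
// and a guard variable is typically emitted for it.
inline Matrix3 makeIdentity() {
  Matrix3 r{};                      // value-initialize all elements to 0
  r.m[0] = r.m[4] = r.m[8] = 1.0f;  // set the diagonal
  return r;
}

struct Matrix3Ops {
  // Inline member function returning a reference to a function-local
  // static. Every TU that includes this header may emit the static and
  // its guard; the linker has to fold them into a single copy.
  static const Matrix3 &getIdentity() {
    static const Matrix3 identity = makeIdentity();
    return identity;
  }
};

// Returns true when repeated calls refer to the same object.
inline bool identityIsUnique() {
  return &Matrix3Ops::getIdentity() == &Matrix3Ops::getIdentity();
}
```

Compiling a file like this with the CIR pipeline enabled is the kind of input that would previously have reached the NYI path, assuming the static ends up with `linkonce_odr` linkage as described.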
Made with [Cursor](https://cursor.com)
Reland: [MemProf] Dump inline call stacks as optimization remarks (#193545)
This iteration limits the test case to x86_64-linux to prevent bot
failures.
---
This patch teaches the MemProf matching pass to dump inline call
stacks as analysis remarks like so:
frame: 704e4117e6a62739 main:10:5
frame: 273929e54b9f1234 foo:2:12
inline call stack: 704e4117e6a62739,273929e54b9f1234
The output consists of two types of remarks:
- "frame": Acts as a dictionary mapping a unique MD5-based FrameID
to source information (function name, line offset, and column).
[5 lines not shown]
Revert "[clang] fix matching constrained out-of-line definitions of class specialization member function templates" (#193558)
Reverts llvm/llvm-project#192806, which is causing the compiler to
reject some valid code.
Loosen check for clang version string in test to work when setting CLANG_VENDOR. (#192961)
We are trying to update our buildbot to use the `-DCLANG_VENDOR` and
`-DCLANG_VENDOR_UTI` options, but need to fix some tests first. This is
one of them.
---------
Co-authored-by: Jannick Kremer <jannick.kremer at mailbox.org>
Co-authored-by: Vlad Serebrennikov <serebrennikov.vladislav at gmail.com>
IR: Allow !fpmath metadata on homogeneous float structs
This matches the logic for fast math flags / nofpclass, and allows
marking llvm.sincos calls with !fpmath.
[GlobalISel] Change SSUBO to do (LHS < RHS) XOR (RESULT < 0) (#191744)
Refactor lowerSADDO_SSUBO in LegalizerHelper so addition and subtraction
use separate, clearly named paths.
SADDO: unchanged meaning: overflow when (result < LHS) disagrees with
(RHS < 0) (signed compares).
SSUBO: use the equivalent formulation: overflow when (LHS < RHS)
disagrees with (result < 0) instead of (result < LHS) vs (RHS > 0).
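As a sanity check of the two formulations, here is a small standalone sketch (not the LegalizerHelper code) that compares them against a widened reference for every pair of 8-bit inputs. The function names are made up for this example, and the wrapping conversion to `int8_t` is assumed to be modular, which holds on mainstream compilers and is guaranteed from C++20.

```cpp
#include <cassert>
#include <cstdint>

// Reference answers computed in wider arithmetic.
inline bool addOverflowsRef(int8_t a, int8_t b) {
  int wide = int(a) + int(b);
  return wide < INT8_MIN || wide > INT8_MAX;
}
inline bool subOverflowsRef(int8_t a, int8_t b) {
  int wide = int(a) - int(b);
  return wide < INT8_MIN || wide > INT8_MAX;
}

// Wrapping 8-bit add/sub (assumes modular narrowing to int8_t).
inline int8_t wrapAdd(int8_t a, int8_t b) {
  return static_cast<int8_t>(static_cast<uint8_t>(uint8_t(a) + uint8_t(b)));
}
inline int8_t wrapSub(int8_t a, int8_t b) {
  return static_cast<int8_t>(static_cast<uint8_t>(uint8_t(a) - uint8_t(b)));
}

// SADDO: overflow when (result < LHS) disagrees with (RHS < 0).
inline bool saddoFormula(int8_t lhs, int8_t rhs) {
  int8_t result = wrapAdd(lhs, rhs);
  return (result < lhs) != (rhs < 0);
}

// SSUBO: overflow when (LHS < RHS) disagrees with (result < 0).
inline bool ssuboFormula(int8_t lhs, int8_t rhs) {
  int8_t result = wrapSub(lhs, rhs);
  return (lhs < rhs) != (result < 0);
}

// Exhaustively compare both formulas with the reference.
inline bool formulasMatchReference() {
  for (int a = INT8_MIN; a <= INT8_MAX; ++a)
    for (int b = INT8_MIN; b <= INT8_MAX; ++b) {
      int8_t l = static_cast<int8_t>(a), r = static_cast<int8_t>(b);
      if (saddoFormula(l, r) != addOverflowsRef(l, r)) return false;
      if (ssuboFormula(l, r) != subOverflowsRef(l, r)) return false;
    }
  return true;
}
```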
[libc] Replace check-libc with lit-based test execution (#184163)
Now that check-libc-lit has been validated alongside check-libc, make
lit the default test runner by renaming check-libc-lit to check-libc.
Remove the old CMake-driven check-libc custom target.
[VPlan] Use MaxRuntimeStep in materializeVectorTC to simplify middle br. (#193067)
For scalable vectors, pass the maximum runtime step to
materializeVectorTripCount. Use it to simplify the vector trip count to
the original trip count directly, if MaxRuntimeSteps divides the
original trip count without remainder.
In those cases, all lower power-of-2 vscales will divide the trip count
without remainder.
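The divisibility argument can be illustrated with a small sketch. This is not the VPlan implementation; `minStep` (the step at vscale 1) and `maxVScale` are hypothetical parameter names, and `maxVScale` is assumed to be a power of 2.

```cpp
#include <cassert>
#include <cstdint>

// Step covered by one vector iteration at a given vscale.
constexpr uint64_t runtimeStep(uint64_t minStep, uint64_t vscale) {
  return minStep * vscale;
}

// True if the maximum runtime step divides the trip count exactly.
constexpr bool maxStepDivides(uint64_t tripCount, uint64_t minStep,
                              uint64_t maxVScale) {
  return tripCount % runtimeStep(minStep, maxVScale) == 0;
}

// True if the step for every power-of-2 vscale up to maxVScale divides
// the trip count exactly.
inline bool allPow2StepsDivide(uint64_t tripCount, uint64_t minStep,
                               uint64_t maxVScale) {
  for (uint64_t vs = 1; vs <= maxVScale; vs *= 2)
    if (tripCount % runtimeStep(minStep, vs) != 0)
      return false;
  return true;
}

// Checks the implication "max step divides => all smaller power-of-2
// steps divide" for trip counts 1..limit.
inline bool implicationHolds(uint64_t minStep, uint64_t maxVScale,
                             uint64_t limit) {
  for (uint64_t tc = 1; tc <= limit; ++tc)
    if (maxStepDivides(tc, minStep, maxVScale) &&
        !allPow2StepsDivide(tc, minStep, maxVScale))
      return false;
  return true;
}
```

The implication holds because each smaller power-of-2 step is itself a divisor of the maximum step, so the vector trip count equals the original trip count for any of those vscales.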
PR: https://github.com/llvm/llvm-project/pull/193067
[compiler-rt] [Darwin] Enable arm64e tests on macOS (#193391)
This enables compiler-rt tests on Darwin arm64e (when supported by the
linker).
Note that arm64e is not enabled for sanitizers yet, but this does add
test coverage for builtins.
rdar://175303507
[NFC][MachineBlockHashInfo] Add static asserts to guard against hash_16_bytes changes (#192862)
`hashing::detail::hash_16_bytes` is not guaranteed to be stable across
different versions of LLVM; it can change at any time.
We put asserts here so that, if it changes, the author doesn't forget to
work around the change here.
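The general technique looks roughly like the sketch below. It uses 32-bit FNV-1a as a hypothetical stand-in, not LLVM's `hash_16_bytes`, and pins known outputs with `static_assert` so that any change to the hash breaks the build at the place that depends on it. It assumes C++14 or later for the loop in a constexpr function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in hash (32-bit FNV-1a); NOT LLVM's hash_16_bytes.
constexpr uint32_t fnv1a(const char *data, std::size_t len) {
  uint32_t h = 0x811c9dc5u;  // FNV offset basis
  for (std::size_t i = 0; i < len; ++i) {
    h ^= static_cast<unsigned char>(data[i]);
    h *= 0x01000193u;        // FNV prime; unsigned multiply wraps
  }
  return h;
}

// Pin known outputs. If someone changes the hash, compilation fails
// here and the author is forced to revisit the dependent code.
static_assert(fnv1a("", 0) == 0x811c9dc5u,
              "hash changed: update users that rely on stable values");
static_assert(fnv1a("a", 1) == 0xe40c292cu,
              "hash changed: update users that rely on stable values");
```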
Revert "[SelectionDAG] Salvage debuginfo when combining load and z|s ext instrs. (#188544)" (#193554)
This reverts commit fe5d5b762ab3b92b18b56f413965abb81a459ac6.
Reverted because of https://github.com/llvm/llvm-project/issues/193475
clang crashes on valid code at -O{2,3} on x86_64-linux-gnu: Assertion
`N->getOpcode() != ISD::DELETED_NODE && "DELETED_NODE in CSEMap!"'
failed
[lldb] Scope symbol lookups to specific modules in ObjC/SystemRuntime plugins (#193379)
This narrows `FindSymbolsWithNameAndType` calls from searching all
loaded images to the specific module that owns the symbol (Foundation,
CoreFoundation, libBacktraceRecording.dylib).
The arclite fallback in `CalculateHasNewLiteralsAndIndexing` still
searches all images because libarclite is a static library linked into
the main executable.
[Runtimes] Allow HandleLibc.cmake to be called multiple times (#193540)
Summary:
This needs to check whether it has already been called, now that we want
to use it in more places than just libcxx.
[LegalizeTypes][DAG] Use SHL(X,1) instead of ADD(X,X) for variable vector indices for extraction/insertion legalization (#188277)
Avoid ADD(X,X), as it doesn't correctly handle undef elements; using SHL(X,1) also helps avoid some FREEZE() fold headaches.
Resurrects #86857
[lldb] Decorate tests that use threading (#193117)
Add a new decorator `skipIfTargetDoesNotSupportThreads` to skip tests
that use threading. This is motivated by running the test suite
targeting WebAssembly, where `wasip1` does not support threads. There
are variants that do support threading (e.g. `wasip1-threading`) that
the current implementation accounts for.
[libc][NFC] Fix minor RPC warnings (#192997)
Summary:
Fix some warnings that show up with strict warnings set. This reduces noise
when used as a header-only library in projects.
[MLIR][XeGPU] Do not use ocloc lib if LLVM_BUILD_LLVM_DYLIB is ON (#193259)
This fixes LLVM dylib build in environments with installed ocloc.
The problem is that LLVM shared lib is never linked with ocloc and the
linker fails to resolve the symbols `oclocInvoke` and `oclocFreeOutput`.
[libc] Fix .params file generation for integration tests (#193544)
Update add_integration_test to include loader arguments in the .params
file. The lit format already supported three-part .params files, but
add_integration_test was only generating two parts.
[NFC][ADT] Make a few functions constexpr (#193302)
So we can use them in static_asserts in #192862.
It converts whatever is trivially possible. In the future, more can
be converted as well if we make fetch32/fetch64 constexpr.
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault at amd.com>
[ELF] Factor linker-script dispatch loops into helpers. NFC (#193547)
Extract the per-token dispatch inside readLinkerScript, readSections,
readOutputSectionDescription, and readMemory into four new helpers.
Preparatory for making INCLUDE run a nested parse (#193427).