Reland "[LTO][LLD] Prevent invalid LTO libfunc transforms (#164916)" (#196177)
This reverts commit 24154a55d698a98e7b6d2aae1778b79f15ce5b09.
The issue that led to this being reverted was subtle, but entirely
downstream. Note that by making LTO DCE more conservative, this patch
may uncover latent undefined references caused by build system issues.
[reland] [lit] [compiler-rt] Add llvm-lit global command cache to speed up test config (#196152)
Re-lands #195888
Fixes two issues:
- `date -Ins` is not available on older macOS versions (I think
pre-15.4). This caused the new `test_cache` test to fail. Switched to
just using `date` + a sleep (with a comment explaining why). Even if the
sleep is too long/short, the test should still pass.
- `functools.cache` is not available on Python 3.8. I've moved the
`_memoize` helper out of TestRunner.py into util.py, and switched to it
instead. I had to make a small change to the memoize helper to support
arbitrary args/kwargs.
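A minimal sketch of a memoize helper that accepts arbitrary args/kwargs and runs on Python 3.8. This is not the actual `_memoize` code from util.py; the name `memoize`, the cache-key layout, and the `probe` function are assumptions for illustration.

```python
import functools


def memoize(fn):
    """Cache results of fn keyed on positional and keyword arguments.

    Hypothetical stand-in for the helper described above. It avoids
    functools.cache, which Python 3.8 lacks. All arguments must be hashable.
    """
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        # Sort kwargs so that f(a=1, b=2) and f(b=2, a=1) share one entry.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = fn(*args, **kwargs)
        return cache[key]

    return wrapper


calls = []


@memoize
def probe(command, cwd=None):
    # Record every real invocation so cache hits can be observed.
    calls.append((command, cwd))
    return "{}@{}".format(command, cwd)


first = probe("uname", cwd="/tmp")
second = probe("uname", cwd="/tmp")  # served from the cache
third = probe("uname", cwd="/")      # different kwargs, new entry
```

Only two entries end up in `calls`, because the second call is answered from the cache.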
[clang][ssaf] Add `clang-ssaf-analyzer` (#196124)
This patch introduces `clang-ssaf-analyzer`, a new SSAF tool that runs whole-program analyses over an `LUSummary` and writes the resulting `WPASuite` to an output file.
[Clang] Produce deterministic hash for anonymous namespaces. (#194542)
This change adds a path substitution for the main module file during
anonymous namespace hash generation using the prefix map specified by
-fmacro-prefix-map option. That ensures deterministic symbol mangling
for reproducible builds.
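The idea can be sketched as follows: remap the main file path through the prefix map, then hash the remapped path. The function names, the first-match rule, and the use of MD5 are assumptions for illustration; they are not taken from the Clang implementation.

```python
import hashlib


def remap_prefix(path, prefix_map):
    """Replace the first matching old prefix with its new prefix.

    prefix_map is a list of (old, new) pairs, modelled on the OLD=NEW form
    of -fmacro-prefix-map. First-match-wins is a simplifying assumption.
    """
    for old, new in prefix_map:
        if path.startswith(old):
            return new + path[len(old):]
    return path


def anon_namespace_hash(main_file, prefix_map):
    # Hash the remapped path so the result does not depend on where the
    # source tree happens to be checked out. MD5 is a placeholder choice.
    remapped = remap_prefix(main_file, prefix_map)
    return hashlib.md5(remapped.encode("utf-8")).hexdigest()[:8]


# Two build machines with different checkout directories.
hash_a = anon_namespace_hash("/home/alice/src/lib/foo.cpp",
                             [("/home/alice/src", ".")])
hash_b = anon_namespace_hash("/build/ci-1234/lib/foo.cpp",
                             [("/build/ci-1234", ".")])
```

Both calls hash the same remapped path, `./lib/foo.cpp`, so `hash_a` and `hash_b` are equal.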
---------
Co-authored-by: Corentin Jabot <corentinjabot at gmail.com>
[AMDGPU] Validate forced lit64() on VOP3 instructions
Lit64 cannot be used with VOP3*, but we did not validate the case
where the literal can be encoded as lit32 and lit64 is forced with
the operand modifier.
[flang][FIRToMemRef] lower `fir.coordinate_of` on static-extent arrays to indexed memref (#195404)
`fir.coordinate_of` on static array components (e.g. A%v(i)) was falling
back to a rank-0 scalar memref, losing index information and blocking
affine parallelization. Lower these to a properly-shaped memref<dims x
T> with explicit indices. Dynamic arrays and struct-element arrays keep
the existing scalar fallback.
[clang][modules] Fix UAF in `InProcessModuleCache` (#196117)
Writing to the module cache would invalidate the read buffer. If the
timing works out just right, this is a use-after-free bug. This PR
prevents that situation by using two buffers in the module cache entry,
and adds a unit test that would previously fail under address sanitizer.
[OpenMP][NFC] Simplify rounding operations (#196155)
Summary:
There were a lot of these cases that did rounding up / down. Make
helpers for them and simplify.
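As an illustration of the kind of helper meant here, a rounding pair might look like the sketch below. The names `round_down` and `round_up` are hypothetical and not taken from the patch; the sketch assumes non-negative values and a positive multiple.

```python
def round_down(value, multiple):
    """Largest multiple of `multiple` that is <= value."""
    return (value // multiple) * multiple


def round_up(value, multiple):
    """Smallest multiple of `multiple` that is >= value."""
    return round_down(value + multiple - 1, multiple)


aligned_size = round_up(13, 8)    # pad a 13-byte request to 8-byte units
aligned_base = round_down(13, 8)  # start of the 8-byte unit containing 13
```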
[CIR][OpenMP][MLIR] Allow passing of vfs::FileSystem through ModuleTranslation (#195451)
This change optionally allows passing a pointer to a vfs::FileSystem
through ModuleTranslation down to the OpenMPToLLVMTranslation. This will
prevent IO sandbox errors when enabling OpenMP target regions in CIR,
since accessing the file system must go through the proper API.
Assisted-by: Cursor / claude-4.6-opus-high
[lldb] Add Policy infrastructure (#195762)
Add a generic thread-local policy stack and a Policy struct that
describes what view of the process a thread should see (private reality
vs public illusion) and what operations it is allowed to perform.
This is the infrastructure for replacing ad-hoc host thread identity
checks (CurrentThreadIsPrivateStateThread, IsOnThread, etc.) with a
unified, composable mechanism. No behavioral changes yet -- adoption
will follow in subsequent patches.
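A rough Python model of a thread-local policy stack, to show the shape of the mechanism. The field names `sees_private_state` and `can_run_expressions` and the helper names are hypothetical; the real Policy struct and its API may differ.

```python
import threading
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    # Hypothetical fields: which view the thread sees, and what it may do.
    sees_private_state: bool = False
    can_run_expressions: bool = True


_tls = threading.local()


def _stack():
    # Each thread lazily gets its own stack, seeded with a default policy.
    if not hasattr(_tls, "stack"):
        _tls.stack = [Policy()]
    return _tls.stack


def current_policy():
    return _stack()[-1]


@contextmanager
def scoped_policy(policy):
    # Push on entry and pop on exit, so policies nest and compose.
    stack = _stack()
    stack.append(policy)
    try:
        yield policy
    finally:
        stack.pop()


seen = {}


def worker():
    # Other threads are unaffected by a policy pushed on the main thread.
    seen["worker"] = current_policy().sees_private_state


with scoped_policy(Policy(sees_private_state=True, can_run_expressions=False)):
    seen["inside"] = current_policy().sees_private_state
    t = threading.Thread(target=worker)
    t.start()
    t.join()
seen["after"] = current_policy().sees_private_state
```

Code that needs to know which view applies would query `current_policy()` instead of checking which thread it is running on.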
rdar://176223894
Signed-off-by: Med Ismail Bennani <ismail at bennani.ma>
Adding the Formal Semantics Working Group to regular syncs. (#196154)
Adding the Formal Semantics Working Group to regular syncs.
The meeting notes Google document also includes the meeting link, meeting times (timezones), and links to the RFC and Discord channel.
[LoopPeel] prevent estimated trip count overflow before peel (#195610)
The addition in the `if (*EstimatedTripCount + AlreadyPeeled <= MaxPeelCount)`
check in `LoopPeel` can overflow, which causes `opt` to hang. Added
`llvm::checkedAddUnsigned()` to prevent this, along with a new
regression test from the IR reproducer of #173169.
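The effect of an overflow-checked addition can be modelled in Python by simulating fixed-width unsigned arithmetic. The 32-bit width, the function names, and the choice to reject peeling on overflow are assumptions for illustration; only the idea of returning "no value" on overflow mirrors `llvm::checkedAddUnsigned()`.

```python
def wrapping_add(lhs, rhs, bits=32):
    """Unsigned addition that silently wraps, like plain `+` in C++."""
    return (lhs + rhs) & ((1 << bits) - 1)


def checked_add_unsigned(lhs, rhs, bits=32):
    """Return lhs + rhs, or None if the sum does not fit in `bits` bits."""
    total = lhs + rhs
    return total if total < (1 << bits) else None


def within_budget_unchecked(estimated, already_peeled, max_peel):
    # A wrapped sum can look small and pass the comparison by accident.
    return wrapping_add(estimated, already_peeled) <= max_peel


def within_budget_checked(estimated, already_peeled, max_peel):
    # Assumption: an overflowing sum is treated as exceeding the budget.
    total = checked_add_unsigned(estimated, already_peeled)
    return total is not None and total <= max_peel


UINT32_MAX = (1 << 32) - 1
buggy = within_budget_unchecked(UINT32_MAX, 1, 7)  # sum wraps to 0
fixed = within_budget_checked(UINT32_MAX, 1, 7)    # overflow detected
```

Here `buggy` is `True` because the wrapped sum is 0, while `fixed` is `False`.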
[AArch64][GlobalISel] Clean up and extend BF16 tests. NFC (#196175)
This attempts to fill in the gap between the different bf16 test files,
making sure they all contain the same tests.
[clang] correctly handle +/- features when matching modules (#195743)
By sorting and then comparing, we made +sse2 -sse2 equal to
-sse2 +sse2, where the former has sse2 disabled, and the latter
enabled. I verified this is actually the case by compiling the
following:
```
#ifdef __SSE2__
#error X
#endif
```
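Why sorting loses information can be shown with a small model in which the last mention of a feature wins. This illustrates the problem only; the function name is hypothetical and the code is not the comparison the patch implements.

```python
def enabled_features(flags):
    """Resolve +feat/-feat flags left to right; the last mention wins."""
    state = {}
    for flag in flags:
        state[flag[1:]] = flag[0] == "+"
    return {name for name, on in state.items() if on}


a = ["+sse2", "-sse2"]  # sse2 ends up disabled
b = ["-sse2", "+sse2"]  # sse2 ends up enabled

# Sorting makes both lists identical, hiding the difference.
sorted_equal = sorted(a) == sorted(b)

# Resolving in order keeps the two configurations distinct.
resolved_equal = enabled_features(a) == enabled_features(b)
```

`sorted_equal` is `True` while `resolved_equal` is `False`, which is the mismatch described above.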
[SPIRV] Support `Volatile` memory semantics operand in atomic load/store (#195978)
The Vulkan memory model supports the `Volatile` memory semantics being
used with atomic operations. This patch adds the support to the SPIR-V
backend.
When the memory model is OpenCL, the `Volatile` memory semantics is not
supported. In this case, we ignore it and emit a regular `OpAtomicLoad`
or `OpAtomicStore` instruction. It should be safe, because the atomic
operations aren't eliminated anyway.
Assisted-by: Claude Opus 4.6 <noreply at anthropic.com>
[BOLT] Gadget scanner: add less strict version of tail call checker
During a tail call, it may be worth making sure the link register is as
trusted as during a regular call, though it may require inserting
expensive checking code by the compiler.
On the other hand, with pac-ret hardening enabled, there should be no
reason not to protect tail-calling functions at least as well as those
exited via regular return instruction.
This commit splits the tail call checker into two versions: the basic
one, which is suitable for making sure regular `PAC*` + `AUT*` are
emitted as needed, and the strict one, which additionally ensures that
the authentication (if any) succeeded.
[mlir][spirv] Allow CooperativeMatrixType in Bitcast (#196096)
This makes it consistent with the spec: "Allow the use of OpBitcast on
objects of cooperative matrix type whose Component Type are integer
types with the same Width."
Assisted-by: Codex
[clang-tidy] Overloaded Unresolved member function call can't be static (#191432)
readability-convert-member-functions-to-static incorrectly suggests
making an overloaded member function with a lambda function call
static (false positive).
Mark usage of "this" as true when a call to an "UnresolvedMemberExpr"
is observed.
Fixes https://github.com/llvm/llvm-project/issues/171626
[WebAssembly] Add call_ref (0x14), return_call_ref (0x15), and ref.cast (0xfb16) (#195942)
Add MC-layer support for the typed function references opcodes:
- Instruction definitions in WebAssemblyInstrCall.td and
WebAssemblyInstrRef.td. call_ref / return_call_ref / ref.cast came with
the function-references proposal which was folded into wasm-gc, so they
are gated on HasGC (and HasTailCall for return_call_ref).
- Asm-parser hook that accepts the (ty) -> (ty) signature syntax for
call_ref, return_call_ref, and ref.cast, mirroring call_indirect /
return_call_indirect.
- Stack-effect modeling in WebAssemblyAsmTypeCheck so non-trivial
signatures type-check correctly.
- Encoding and disassembly tests under test/MC/WebAssembly.
Codegen does not yet select these opcodes. My motivation is unblocking
LLDB, which uses LLVM's disassembler. We got a report that these
instructions show up as `<unknown>` in LLDB.
rdar://163141531
[AMDGPU] Rework VOPD constraints for gfx12+ with data deps (#191264)
Follow-up to #178772. Relax the constraint that blocks VOPD formation
when SecondMI writes to registers that FirstMI reads from, except if the
resulting VOPD would take multiple cycles to issue. That can happen if
the same source VGPR is used in the same position in the other Op
(AllowSameVGPR), or if one of the following opcodes is used for OpX:
- v_fma_f64
- v_add_f64
- v_mul_f64
- v_max_num_f64
- v_min_num_f64
De-duplicate the check for which instructions can be paired in the
scheduling and formation passes, and use the same check logic in both
passes (previously scheduling was looser).
---------
Co-authored-by: Claude Opus 4.6 <noreply at anthropic.com>
[flang][OpenMP] Support lowering of metadirective (part 2)
Lower non-constant user={condition(expr)} selectors in metadirectives
to a fir.if/else chain.
Only statically applicable when-clauses participate in dynamic
selection. Dynamic conditions are evaluated at runtime in declaration
order, with the best static match, an explicit otherwise/default
clause, or implicit nothing as the final fallback.
This patch is part of the feature work for #188820.
Assisted with copilot and GPT-5.4
Fix dynamic metadirective candidate selection
- Use one scored candidate path for static and dynamic metadirective variants.
- Dynamic user conditions are statically filtered and scored using their
non-user traits, then guarded at runtime with fir.if.
- Keeps construct/device/implementation traits enforced for dynamic
candidates and lets higher-scored static candidates beat lower-scored dynamic
candidates.
- Add regressions for construct mismatch, score ordering, and
implicit-nothing tie-breaking.
[SLP] Treat extracts from undef vectors as real, not free, extracts
tryToGatherSingleRegisterExtractElements classified an extractelement whose
vector operand was undef as a free undef extract via UndefVectorExtracts.
When the remaining extracts already filled the two-vector shuffle budget,
the resulting build vector contained a third distinct vector operand and
tripped the assertion "Expected only 1 or 2 vectors shuffle." in
processBuildVector.
Use isUndefVector with IsPoisonOnly=true so that only extracts from poison
vectors are still treated as free.
Fixes #196015.
Pull Request: https://github.com/llvm/llvm-project/pull/196150
[libc][stdlib] Add EnvironmentManager (#195260)
Introduced an EnvironmentManager singleton that centralises environment
variable state: the environ array, per-string ownership tracking, and
capacity management. The manager exposes a minimal public API (get,
begin/end iterators) and keeps all internal state private.
Refactored getenv to delegate to EnvironmentManager::get() rather than
directly iterating app.env_ptr.
The ownership tracking and capacity management are preparatory
infrastructure for setenv.
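A loose Python model of the design, to show how lookup can be routed through a single manager. Every name here except `get` is hypothetical, and the ownership list is only a placeholder for the tracking described above; the real class is C++ inside LLVM libc.

```python
class EnvironmentManager:
    """Hypothetical model: one object owns all environment state."""

    _instance = None

    def __init__(self, environ):
        self._environ = list(environ)               # "NAME=value" strings
        self._owned = [False] * len(self._environ)  # per-string ownership

    @classmethod
    def instance(cls, environ=()):
        # Create the singleton on first use; later calls reuse it.
        if cls._instance is None:
            cls._instance = cls(environ)
        return cls._instance

    def get(self, name):
        # Match the whole name followed by '=', so "HOM" never matches "HOME".
        prefix = name + "="
        for entry in self._environ:
            if entry.startswith(prefix):
                return entry[len(prefix):]
        return None

    def __iter__(self):
        # Stand-in for the begin/end iterators.
        return iter(self._environ)


def getenv(name):
    # Delegates to the manager instead of walking the array directly.
    return EnvironmentManager.instance().get(name)


manager = EnvironmentManager.instance(["HOME=/root", "EMPTY=", "PATH=/bin"])
home = getenv("HOME")
```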
Assisted-by: Automated tooling, human reviewed.