[DAGCombiner] Fold (or (seteq X, 0), (seteq X, -1)) to (setult (add X, 1), 2) (#192183)
This is the De Morgan dual of the existing fold:
(and (setne X, 0), (setne X, -1)) --> (setuge (add X, 1), 2)
The or-of-equalities version checks if X is either 0 or -1, which is
equivalent to (X+1) < 2 (unsigned). This reduces two comparisons and
an or to one add and one comparison.
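The equivalence can be checked exhaustively at a narrow width. The sketch below is a standalone check, not the DAGCombiner code: the helper names are hypothetical, and 8-bit values stand in for an arbitrary integer type.

```cpp
#include <cassert>
#include <cstdint>

// Original form: two equality comparisons joined by an or.
// 0xFF is -1 when read as an 8-bit two's-complement value.
bool isZeroOrAllOnes(uint8_t x) { return x == 0 || x == 0xFF; }

// Folded form: one wrapping add and one unsigned comparison.
bool isZeroOrAllOnesFolded(uint8_t x) {
  return static_cast<uint8_t>(x + 1) < 2;
}

// Returns true if both forms agree on every 8-bit input.
bool foldsAgree() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    uint8_t x = static_cast<uint8_t>(v);
    if (isZeroOrAllOnes(x) != isZeroOrAllOnesFolded(x))
      return false;
  }
  return true;
}
```

The add maps -1 to 0 and 0 to 1, so the two accepted values become the only ones below 2 in unsigned order.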
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
[AsmPrinter] Fix AsmPrinterAnalysis::Result::invalidate to take PreservedAnalyses by const reference (#191742)
The invalidate method was taking PreservedAnalyses by value instead of
by const reference, causing an unnecessary copy on every invalidation
query. All other analysis invalidate methods in LLVM use const
reference.
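The cost difference between the two signatures can be seen with a copy-counting stand-in. This is a minimal sketch: `CountingSet` and both functions are hypothetical and are not LLVM's `PreservedAnalyses` or any real `invalidate` method.

```cpp
#include <cassert>

// Number of CountingSet copies made so far.
static int g_copies = 0;

// Hypothetical stand-in for PreservedAnalyses that counts its copies.
struct CountingSet {
  CountingSet() = default;
  CountingSet(const CountingSet &) { ++g_copies; }
};

// By-value parameter: every call copies the caller's object.
bool invalidateByValue(CountingSet PA) {
  (void)PA;
  return false;
}

// By-const-reference parameter: the caller's object is used directly.
bool invalidateByConstRef(const CountingSet &PA) {
  (void)PA;
  return false;
}

// Returns how many copies `calls` invalidation queries produce.
int copiesMadeBy(bool byValue, int calls) {
  CountingSet PA;
  g_copies = 0;
  for (int i = 0; i < calls; ++i) {
    if (byValue)
      invalidateByValue(PA);
    else
      invalidateByConstRef(PA);
  }
  return g_copies;
}
```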
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
[clang] implement CWG2064: ignore value dependence for decltype
The 'decltype' for a value-dependent (but non-type-dependent) expression should be known,
so this patch makes such decltypes non-opaque instead.
This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.
This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.
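As a small illustration of the distinction: inside a template, `sizeof(T)` is value-dependent, but its type is `std::size_t` for every `T`, so it is not type-dependent. The sketch below uses hypothetical names and only shows that property. It is expected to compile with or without this patch and does not exercise the overloading change.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

template <class T> struct SizeType {
  // sizeof(T) is value-dependent: its value varies with T.
  // Its type does not vary: it is std::size_t for every T.
  using type = decltype(sizeof(T));
};

static_assert(std::is_same<SizeType<char>::type, std::size_t>::value,
              "decltype(sizeof(T)) is std::size_t");
static_assert(std::is_same<SizeType<long double>::type, std::size_t>::value,
              "decltype(sizeof(T)) is std::size_t");

// Runtime-visible form of the same check.
bool sizeTypeIsKnown() {
  return std::is_same<SizeType<int>::type, std::size_t>::value;
}
```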
Fixes #8740
Fixes #61818
Fixes #190388
[lld][MachO] Key branch-extension thunks on (referent, addend) (#191808)
TextOutputSection::finalize ignored branch relocation addends. Two call
sites branching to the same symbol with different addends therefore
collapsed onto a single thunk.
Key thunkMap on (isec, value, addend) so two call sites with different
addends get independent thunks. The addend is encoded in the thunk's
relocs and is zeroed at the call site after the callee is redirected to
the thunk. Thunk names carry a `+N` suffix when the addend is non-zero.
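The keying change can be sketched with a plain map. This is an illustrative model, not lld's code: `ThunkTable`, the integer section id, and the exact name format are assumptions, and only the (isec, value, addend) triple and the `+N` suffix come from the description above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Hypothetical key mirroring the (isec, value, addend) triple;
// isec is modeled as an opaque section id.
using ThunkKey = std::tuple<int, uint64_t, int64_t>;

struct ThunkTable {
  std::map<ThunkKey, std::string> thunks;

  // Returns the thunk name for a call site, creating it on first use.
  const std::string &getOrCreate(int isec, uint64_t value, int64_t addend,
                                 const std::string &symName) {
    ThunkKey key(isec, value, addend);
    auto it = thunks.find(key);
    if (it != thunks.end())
      return it->second;
    // Illustrative naming: add a +N suffix only for a non-zero addend.
    std::string name = symName + ".thunk";
    if (addend != 0)
      name += "+" + std::to_string(addend);
    return thunks.emplace(key, name).first->second;
  }
};
```

With the addend in the key, two call sites that target the same symbol at different offsets no longer share an entry.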
Reapply "[ObjC][Preprocessor] Handle @import directive as a pp-directive" (#189174)
This PR reapplies https://github.com/llvm/llvm-project/pull/157726.
Depends: https://github.com/llvm/llvm-project/pull/107168
This patch handles `@import` as a preprocessing directive. As of this
patch, the following import directive is ill-formed:
```
@import Foo\n;
```
---------
Signed-off-by: yronglin <yronglin777 at gmail.com>
[CIR] Implement EH handling for base class initializer (#192358)
This implements exception handling when a base class initializer is
called from a derived class's constructor. The cleanup handler to call
the base class dtor was already implemented. We just needed to push the
cleanup on the EH stack.
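The source-level behavior this covers can be sketched in plain C++. The types and the throwing helper below are hypothetical; the example only shows the language rule that a fully constructed base is destroyed when a later initializer throws, not how CIR emits the cleanup.

```cpp
#include <cassert>
#include <stdexcept>

// Number of times ~Base() has run.
static int baseDtorCalls = 0;

struct Base {
  ~Base() { ++baseDtorCalls; }
};

// Hypothetical helper that always throws, used to fail member initialization.
int makeMember() { throw std::runtime_error("member init failed"); }

struct Derived : Base {
  int member;
  // Base is fully constructed before makeMember() throws, so unwinding
  // must run ~Base() even though ~Derived() never runs.
  Derived() : Base(), member(makeMember()) {}
};

// Returns how many times ~Base() ran while the failed construction unwound.
int baseDtorCallsOnFailedConstruction() {
  baseDtorCalls = 0;
  try {
    Derived d;
    (void)d;
  } catch (const std::runtime_error &) {
    // Construction failed; only the Base subobject needed cleanup.
  }
  return baseDtorCalls;
}
```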
[clang][modules] Diagnose headers owned by multiple modules (#188538)
Add -Wduplicate-header-ownership, an off-by-default warning that fires
at include time when a header is owned by multiple top-level modules.
This helps catch overlapping module maps that can cause confusing module
resolution.
Assisted-by: claude-opus-4.6
[NFC] Move TimePasses globals from Pass.h to PassTimingInfo.h (#192352)
They don't belong in the legacy-pass-manager-specific header: they apply
to both pass managers, and the pass manager isn't the right layer for
these bools anyway.
[flang][cuda] Avoid false positive on multi device symbol with components (#192177)
Semantics was wrongly flagging derived-type components as two
device-resident objects. Update how we collect symbols and count the
number of device-resident objects.
[CIR][ABI] Handle callee-destructed params for trivial_abi (#191257)
Replace errorNYI for isParamDestroyedInCallee with working
implementation: create aggregate temp, mark externally destructed,
emit expr. Unblocks [[trivial_abi]] types on Itanium ABI.
Adds trivial-abi.cpp test covering 17 cases from
CodeGenCXX/trivial_abi.cpp with CIR/LLVM/OGCG checks.
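For context, the kind of source this affects looks roughly like the sketch below. The names are hypothetical, and the `TRIVIAL_ABI` macro is an assumption added so the example also builds on compilers without the Clang attribute. It shows the source pattern only, not the CIR lowering.

```cpp
#include <cassert>

// Apply the Clang-specific attribute only where it is supported.
#if defined(__clang__)
#define TRIVIAL_ABI [[clang::trivial_abi]]
#else
#define TRIVIAL_ABI
#endif

// Number of Handle destructor calls so far.
static int destroyed = 0;

// A type with a non-trivial destructor. With the attribute, Clang may pass
// it like a trivial type, and the callee then destroys the parameter.
struct TRIVIAL_ABI Handle {
  int id;
  explicit Handle(int i) : id(i) {}
  Handle(const Handle &o) : id(o.id) {}
  ~Handle() { ++destroyed; }
};

// Takes the parameter by value, so a parameter object must be destroyed.
int consume(Handle h) { return h.id; }

// Returns the value read through the by-value parameter.
int run() {
  Handle h(7);
  return consume(h);
}
```

Whichever side performs the destruction, the parameter copy and the local `h` are each destroyed exactly once.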
Made with [Cursor](https://cursor.com)
[CIR][ABI][NFC] Add x86_64 ABI parity tests (#191259)
Add three test files for CIR ABI parity on x86_64, all with
CIR/LLVM/OGCG checks:
- uncopyable-args.cpp — 24 functions covering non-copyable and
move-only types (trivial, default-ctor, move-ctor, etc.)
- x86_64-arguments.cpp — 26 functions covering C++ struct passing,
inheritance, member pointers, empty bases, packed structs
- attr-noundef.cpp — 26 functions covering noundef placement on
structs, unions, vectors, member pointers, _BitInt
Made with [Cursor](https://cursor.com)
[CIR][NFC] Convert MissingFeatures::requiresCleanups to errorNYI (#192350)
This change adds errorNYI calls in two places that we previously had
requiresCleanups() missing features markers, adds a more specific
missing feature marker for loops, removes one requiresCleanups() where
the handling was already implemented, and deletes a bunch of missing
feature markers that were never used.
[BOLT][Passes] use ADT containers for instrumentation spanning tree. (#192289)
Swap `std::unordered_map<…, std::set<…>>` for
`DenseMap<…, SmallVector<…>>` in `Instrumentation::instrumentFunction`
and switch read paths from `STOutSet[&BB]` to `find()`. This removes
per-set heap allocations, stops inserting empty buckets on every probe,
and replaces linear `is_contained()` scans over a red-black tree with
linear scans over inline `SmallVector` storage (most basic blocks have
at most a couple of spanning-tree out-edges). NFC.
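The difference between subscript probes and `find()` can be seen with standard containers. This sketch deliberately uses `std::unordered_map` and `std::vector` as stand-ins so it builds without LLVM headers; `EdgeMap` and both helpers are hypothetical names, not BOLT code.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Stand-in for the block -> out-edges map; blocks are modeled as ints.
using EdgeMap = std::unordered_map<int, std::vector<int>>;

// Probing with operator[] default-constructs an entry for a missing key.
std::size_t outEdgesViaSubscript(EdgeMap &m, int bb) { return m[bb].size(); }

// Probing with find() leaves the map untouched when the key is missing.
std::size_t outEdgesViaFind(const EdgeMap &m, int bb) {
  auto it = m.find(bb);
  return it == m.end() ? 0 : it->second.size();
}
```

Both helpers report zero out-edges for an unknown block, but only the subscript version grows the map as a side effect.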
[CIR][CUDA] Do Runtime Kernel Registration (#188926)
Related:
https://github.com/issues/assigned?issue=llvm%7Cllvm-project%7C179278,https://github.com/llvm/llvm-project/issues/175871
More registration shenanigans -> Generates `__cuda_register_globals`
that associates the fatbin with kernels that contain `__global__`
qualifiers with the runtime.
Generated equivalent runtime code:
``` C
// Called once per kernel to register it with the CUDA runtime.
void __cuda_register_globals(void **fatbinHandle) {
  __cudaRegisterFunction(
      fatbinHandle,
      (const char *)&_Z25__device_stub__kernelfunciii, // host-side stub ptr
      (char *)__cuda_kernelname_str, // device-side mangled name
      // [13 lines not shown]
```
[CIR] Add address space casts for pointer arguments when creating a call (#192303)
This patch checks whether the expected type for an argument is the same
as the actual type. If both types are pointers but differ in address
space, it adds an address space cast to make the pointer types match.
Assisted-by: Cursor / Claude Opus 4.6
[MLIR][XeGPU] Remove create tdesc & update offset op from xegpu dialect (#182804)
This PR removes create_tdesc and update_offset ops from the XeGPU
dialect, as scatter load/store/prefetch now accept memref+offsets
directly.