[HWASan] [compiler-rt] Add tag_bits option to HWASan alloc (#192386)
This can be used to make sure the allocator does not use the top bit of
the pointer. This is useful when HWASan is used in combination with
signed-integer-overflow detection. Some code uses arithmetic on intptr_t
that overflows for sufficiently large pointers.
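As a rough illustration of the overflow described above, the sketch below assumes the tag occupies the top byte of a 64-bit pointer and that `tag_bits` limits how many low bits of that byte may be set. The helpers `applyTag`, `asSigned`, and `subOverflows` are hypothetical names for this example and are not part of the HWASan runtime.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical model: put `tag` in bits 56..63, keeping only `tagBits` low
// bits of it (tagBits is assumed to be in the range 1..8).
uint64_t applyTag(uint64_t addr, uint8_t tag, unsigned tagBits) {
  const uint8_t mask = static_cast<uint8_t>((1u << tagBits) - 1u);
  const uint64_t untagged = addr & 0x00FFFFFFFFFFFFFFull;
  return untagged | (static_cast<uint64_t>(tag & mask) << 56);
}

// Reinterpret the pointer bits as a signed integer, as intptr_t code would.
int64_t asSigned(uint64_t bits) { return static_cast<int64_t>(bits); }

// Returns true if computing a - b in int64_t would overflow, without
// actually performing the overflowing subtraction.
bool subOverflows(int64_t a, int64_t b) {
  const int64_t lo = std::numeric_limits<int64_t>::min();
  const int64_t hi = std::numeric_limits<int64_t>::max();
  if (b > 0) return a < lo + b;
  if (b < 0) return a > hi + b;
  return false;
}
```

With 8 tag bits, a tag of 0x80 sets the top bit, so the signed value is negative and subtracting a pointer tagged 0x7F overflows. With 7 tag bits, the top bit stays clear and the same subtraction is representable.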
[AMDGPU] Add `.amdgpu.info` section for per-function metadata
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format. Each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
[4 lines not shown]
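The `[kind: u8] [len: u8] [payload]` layout described above could be walked roughly as follows. This is a minimal sketch, not the linker's actual reader: `InfoEntry`, `parseInfoEntries`, and `countEntries` are hypothetical names, and the numeric kind values are left uninterpreted because the log does not list them.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded entry: the kind byte plus its raw payload bytes.
struct InfoEntry {
  uint8_t kind;
  std::vector<uint8_t> payload;
};

// Walks the section bytes entry by entry. Returns false if the header or
// payload of any entry is truncated; `out` may then hold a partial result.
bool parseInfoEntries(const std::vector<uint8_t> &data,
                      std::vector<InfoEntry> &out) {
  std::size_t pos = 0;
  while (pos < data.size()) {
    if (data.size() - pos < 2)
      return false; // not enough bytes for kind + len
    const uint8_t kind = data[pos];
    const uint8_t len = data[pos + 1];
    pos += 2;
    if (data.size() - pos < len)
      return false; // payload runs past the end of the section
    InfoEntry entry;
    entry.kind = kind;
    entry.payload.assign(data.begin() + pos, data.begin() + pos + len);
    out.push_back(entry);
    pos += len;
  }
  return true;
}

// Convenience wrapper: number of entries, or -1 if the bytes are malformed.
int countEntries(const std::vector<uint8_t> &data) {
  std::vector<InfoEntry> entries;
  if (!parseInfoEntries(data, entries))
    return -1;
  return static_cast<int>(entries.size());
}
```

Because every entry carries its own length, a reader can skip kinds it does not recognize, which is the usual motivation for a tagged, length-prefixed encoding.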
[HWASan] [compiler-rt] Add tag_bits option to HWASan alloc (#191089)
This can be used to make sure the allocator does not use the top bit of
the pointer. This is useful when HWASan is used in combination with
signed-integer-overflow detection. Some code uses arithmetic on intptr_t
that overflows for sufficiently large pointers.
[X86][APX] Reset SubReg for dst and check isVirtual before getInterval/getPhys (#191765)
We have made sure the dst operand never has a SubReg. We also need to make sure
the register is virtual before calling getInterval/getPhys.
[clang] implement CWG2064: ignore value dependence for decltype
The 'decltype' of a value-dependent (but non-type-dependent) expression should be known,
so this patch makes such types non-opaque instead.
This patch also implements what's necessary to allow overloading
on pure differences in instantiation dependence, making `std::void_t`
usable for SFINAE purposes.
This also re-adds a few test cases from da98651, which was a previous attempt
at resolving CWG2064.
Fixes #8740
Fixes #61818
Fixes #190388
[RISCV] Prefer LUI over PLUI.H on RV32. (#192340)
I don't think any of the cases PLUI.H can handle would be eligible for
C.LUI, but still figured it was best to use base ISA instructions when
possible.
[clang] Make serenity.cpp tests pass on clang-with-thin-lto-ubuntu (#192231)
LTO_FULL-NOT was definitely too generic and prone to matching unrelated
content. It would, as an example, match against the build path on
clang-with-thin-lto-ubuntu builder [1].
Making the match more restrictive should avoid this kind of issue.
[1] https://lab.llvm.org/buildbot/#/builders/127/builds/6956
[mlir][NVGPU] Validate mmaShape has 3 elements in MmaSyncOp/MmaSparseSyncOp (#190928)
Add validation in MmaSyncOp::verify and MmaSparseSyncOp::verify to
ensure `mmaShape` contains exactly 3 elements before calling
getMmaShapeAsArray(), avoiding a crash. Fixes
https://github.com/llvm/llvm-project/issues/173378.
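The check-before-convert pattern described above can be sketched in plain C++ without MLIR. The functions `verifyShape` and `shapeAsArray` are hypothetical stand-ins for the verifier and for a fixed-size accessor; they are not the NVGPU dialect's real signatures.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Verifier step: reject any shape that does not have exactly 3 elements.
bool verifyShape(const std::vector<int64_t> &shape) {
  return shape.size() == 3;
}

// Fixed-size accessor. It is only safe to call after verifyShape()
// succeeded, which is why the size check has to run first.
std::array<int64_t, 3> shapeAsArray(const std::vector<int64_t> &shape) {
  assert(shape.size() == 3 && "shape must be verified before conversion");
  return {{shape[0], shape[1], shape[2]}};
}
```

A caller would run `verifyShape` first and report a diagnostic on failure, reaching `shapeAsArray` only for well-formed input.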
Redesign handling of anyAppleOS availability attribute (#190817)
Previously, when processing an anyAppleOS availability attribute, clang
replaced it with an implicit platform-specific attribute (e.g., ios,
macos) inferred for the current target. Only the introduced version of
the original anyAppleOS attribute was preserved (as a field on the
inferred attr). This was insufficient for clients such as Swift that
need access to the full original attribute, including deprecated,
obsoleted, and message fields.
This patch preserves the original anyAppleOS attribute on the decl and
attaches the inferred platform-specific attribute to it as a child via
the new InferredAttr field. Most callers use getEffectiveAttr() to
transparently get the inferred attr when present, preserving existing
behavior. Fix-it hints use the presence of an inferred attr to decide
whether to emit "anyAppleOS" or a platform-specific name in the
@available expression. The one behavioral change is in documentation
XML, where availability info is now emitted for both the anyAppleOS attr
and the inferred platform-specific attr.
[4 lines not shown]
[mlir][CSE] Fix dominanceInfo analysis preservation (#192279)
The CSE pass calls `markAnalysesPreserved<DominanceInfo,
PostDominanceInfo>()` at the end. While CSE erases operations, it does
not remove their corresponding dominator trees, causing them to be
unnecessarily preserved in memory. This PR addresses the issue by
explicitly calling invalidate within CSE to clean up the dominator trees
for those erased operations.
[offload] Fix kernel record/replay and add extensible mechanism (#190588)
This commit fixes kernel record/replay on both AMD and CUDA devices. It
also reorganizes the record/replay code, moves it into separate files, and
makes it extensible so that other record formats can be supported in the
future. The environment variables for controlling the recording have also
been modified.
[DAGCombiner] Fold (or (seteq X, 0), (seteq X, -1)) to (setult (add X, 1), 2) (#192183)
This is the De Morgan dual of the existing fold:
(and (setne X, 0), (setne X, -1)) --> (setuge (add X, 1), 2)
The or-of-equalities version checks if X is either 0 or -1, which is
equivalent to (X+1) < 2 (unsigned). This reduces two comparisons and
an or to one add and one comparison.
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
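The equivalence behind this fold and its existing dual can be checked exhaustively at a small bit width. The sketch below uses 8-bit values as an illustration; the function names are hypothetical, and the real combine operates on SelectionDAG nodes of arbitrary integer type.

```cpp
#include <cstdint>

// Original form: (or (seteq X, 0), (seteq X, -1)).
bool orOfEq(uint8_t x) { return x == 0 || x == 0xFF; }

// Folded form: (setult (add X, 1), 2), with the add wrapping at 8 bits.
bool foldedOr(uint8_t x) { return static_cast<uint8_t>(x + 1) < 2; }

// Existing dual: (and (setne X, 0), (setne X, -1)).
bool andOfNe(uint8_t x) { return x != 0 && x != 0xFF; }

// Its folded form: (setuge (add X, 1), 2).
bool foldedAnd(uint8_t x) { return static_cast<uint8_t>(x + 1) >= 2; }

// Exhaustively compares both folds against their originals for all
// 256 possible 8-bit inputs.
bool foldsAgree() {
  for (unsigned v = 0; v < 256; ++v) {
    const uint8_t x = static_cast<uint8_t>(v);
    if (orOfEq(x) != foldedOr(x) || andOfNe(x) != foldedAnd(x))
      return false;
  }
  return true;
}
```

The wrapping add maps -1 to 0 and 0 to 1, so exactly those two inputs land below 2 in an unsigned comparison.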
[AsmPrinter] Fix AsmPrinterAnalysis::Result::invalidate to take PreservedAnalyses by const reference (#191742)
The invalidate method was taking PreservedAnalyses by value instead of
by const reference, causing an unnecessary copy on every invalidation
query. All other analysis invalidate methods in LLVM use const
reference.
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
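The cost difference between the two parameter styles can be observed with a copy-counting stand-in type. Everything here is hypothetical: `Preserved` only models a copyable set object and is not LLVM's `PreservedAnalyses`, and the two `invalidate` functions mirror only the parameter passing, not the real logic.

```cpp
// Global counter of copy constructions, for illustration only.
int g_copies = 0;

// Stand-in for a copyable "preserved set" type.
struct Preserved {
  Preserved() {}
  Preserved(const Preserved &) { ++g_copies; }
};

// By-value parameter: each call with an lvalue copy-constructs the argument.
bool invalidateByValue(Preserved PA) {
  (void)PA;
  return false;
}

// By-const-reference parameter: no copy is made.
bool invalidateByConstRef(const Preserved &PA) {
  (void)PA;
  return false;
}

// Returns how many copies a single by-value call makes.
int copiesWhenPassedByValue() {
  Preserved p;
  g_copies = 0;
  invalidateByValue(p);
  return g_copies;
}

// Returns how many copies a single by-const-reference call makes.
int copiesWhenPassedByConstRef() {
  Preserved p;
  g_copies = 0;
  invalidateByConstRef(p);
  return g_copies;
}
```

In this model the by-value call performs one copy per invocation and the const-reference call performs none.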