[NFC][SPIRV] Refactor emitGlobalDI to use helpers and store CompileUnitRegMap
This commit refactors emitGlobalDI() to use the extracted helper
functions and adds CompileUnitRegMap, which stores the
DebugCompilationUnit registers.
This prepares for function-level debug info emission, which needs
the map to look up compile units.
[NFC][SPIRV] Extract helper functions in SPIRVEmitNonSemanticDI
This commit extracts reusable helper functions to improve code
organization and reduce duplication. This is a pure refactoring
that does not change behavior.
These helpers will be used in subsequent commits to refactor
emitGlobalDI and add function-level debug info emission.
Revert "[VPlan] Strengthen materializeFactors with assert (NFC) (#181665)" (#183014)
This PR did not solve the TODO as intended. Reverting so the TODO is not
lost.
This reverts commit aab9412a69a07787e9ec98b25709d709b7b537a6.
[bazel][libc][math] Fix fmul->mul for bf16mul family (#182018) (#183112)
#182018 made these header-only but added the build targets as fmul
instead of mul.
[X86] lowerV64I8Shuffle - avoid lowerShuffleAsRepeatedMaskAndLanePermute call on VBMI targets (#183109)
Shuffle combining fails to fold the inner shuffles first, but luckily the LanePermuteAnd* methods are enough if we have VPERMB as a fallback.
Fixes #137422
[CIR][NFC] Update the constructor sites of `CIRGenFPOptionsRAII` (#182187)
As support for RAII FP options has been upstreamed (#179121), this patch
removes `CIRGenFPOptionsRAII` from the `MissingFeatures` list and
updates its expected constructor sites.
[LLVM][LangRef] Restrict vscale to be a signed power-of-two integer. (#183080)
There is no known requirement to support non-power-of-two
values of vscale and yet said support is leading to unnecessary
complexity within LLVM.
lang/gforth: try to unbreak the port's build against GCC 15
... by pulling two upstream patches. While here, spell out
ANS Forth (1994) in the COMMENT and port description, fix a
typo, and provide a more meaningful MAKE_JOBS_UNSAFE reason.
PR: 293330
[Sema] Guard transformed loop-hint expression before use (#182752)
`TransformLoopHintAttr` called `TransformExpr(...).get()` without
checking that the transformed expression was usable.
For `#pragma GCC unroll v()` instantiated with `v = int`, expression
transformation fails and Clang can assert while validating the loop
hint.
Check `ExprResult::isUsable()` before calling `get()` and keep the
original attribute on failure.
Fixes https://github.com/llvm/llvm-project/issues/49502
[clang][ssaf] Add JSON serialization support for `TUSummary::LinkageTable`
This change adds full read/write support for the
`TUSummary::LinkageTable` field that maps each entity to its linkage
kind. The deserialization step validates that the set of entity ids in
`LinkageTable` exactly matches the set in `IdTable` in a single `O(N log
N)` pass. Existing tests have been updated and new tests have been added
to ensure 100% coverage of the new code.
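A minimal sketch of how such a set-equality check can stay within `O(N log N)`: sort both id lists and compare them. This is an illustration only, not the code from this change; `EntityId` and `sameIdSet` are hypothetical names, and the ids are assumed to be duplicate-free, as table keys would be.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an entity id; the real type is not shown here.
using EntityId = std::uint64_t;

// Returns true when both inputs hold exactly the same ids.
// Assumes neither input contains duplicates (ids act as table keys).
// The two sorts dominate the cost, so the check is O(N log N) overall.
bool sameIdSet(std::vector<EntityId> idTable,
               std::vector<EntityId> linkageTable) {
  // Different sizes can never describe the same duplicate-free set.
  if (idTable.size() != linkageTable.size())
    return false;
  std::sort(idTable.begin(), idTable.end());
  std::sort(linkageTable.begin(), linkageTable.end());
  // After sorting, equal sets compare equal element by element.
  return idTable == linkageTable;
}
```

Taking the vectors by value keeps the caller's tables in their original order at the cost of one copy each.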
[CUDA/HIP] Externalize __device__ const variables accessed by host code (#177292)
In standard C++, const variables at namespace scope have internal
linkage. For __device__ const variables, this makes them invisible to
runtime symbol lookup APIs (cudaGetSymbolAddress/hipGetSymbolAddress).
Reading a __device__ const variable from host code is a valid usage
pattern — the host may need to know the value at runtime. This is
also needed by libcudacxx's cuda::get_device_address.
This patch extends the existing CUDADeviceVarODRUsedByHost tracking
to cover __device__ const variables. When host code references such a
variable, it gets externalized (same mechanism used for static device
vars). Variables only used in device code keep internal linkage and
can still be constant-folded.
The fix is in SemaExpr: __device__ const variables are classified as
CVT_Both (due to an implicit CUDAConstantAttr), so the ODR-use
tracking is extended to include CVT_Both variables with an explicit
CUDADeviceAttr, distinguishing them from plain const variables.
[Clang] Fix coroutine promise with inherited allocation functions (#179141)
The compiler frontend crashed when the promise class overloads operator
new/delete without a regular function declaration. This happens when the
promise class derives from a base class and takes the allocation
functions from the base class with:
using Base::operator new;
using Base::operator delete;
This was initially introduced by
1cd59264aa2fb4b0ba70ff03c1298b1b5c21271e.
This should also fix #164088
[Hexagon] Avoid contracting predicates in createHvxPrefixPred (#183081)
The function createHvxPrefixPred should only need to expand a predicate
to match the result's bytes-per-bit. Otherwise, contracting of the
predicate may lead to an input that is shorter than 4 bytes, making it
unsuitable for VINSERTW0.
When calling createHvxPrefixPred for vector concatenation, re-group the
inputs to the concat to make sure that the resulting inputs to
createHvxPrefixPred would not need contraction.
Fixes https://github.com/llvm/llvm-project/issues/181362
NAS-139962 / 26.0.0-BETA.1 / Fix MatchNotFound when virt.global table is empty during container migration (#18290)
## Problem
On fresh installs of HM, the `virt.global` table has no row, and
`datastore.config` raises an exception rather than creating one.
## Solution
Make sure we gracefully handle this case and do not error out when
migration is triggered automatically on system boot.
clang/AMDGPU: Stop checking for finite only and unsafe math control libraries (#182865)
These will be imminently deleted. Just ignore them if they are not
present.
[SDPatternMatch] Add `m_ConstInt` overloads with `uint64_t`/`int64_t` operands (#182615)
Adds overloads
```cpp
auto m_ConstInt(uint64_t &);
auto m_ConstInt(int64_t &);
```
which behave analogously to `m_ConstInt(APInt &)`, but only match if the
captured integer fits within 64 bits.
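The "fits within 64 bits" condition can be pictured with a standalone sketch. This does not use the SDPatternMatch API: `Wide128`, `matchU64`, and `matchS64` are hypothetical stand-ins, and treating the signed case as a sign-extension check is an assumption about what "fits" means there.

```cpp
#include <cstdint>

// Hypothetical 128-bit two's-complement constant standing in for a wide
// APInt; hi holds the upper 64 bits and lo the lower 64 bits.
struct Wide128 {
  std::uint64_t hi;
  std::uint64_t lo;
};

// Unsigned capture: succeeds only when the upper 64 bits are all zero.
bool matchU64(const Wide128 &c, std::uint64_t &out) {
  if (c.hi != 0)
    return false;
  out = c.lo;
  return true;
}

// Signed capture: succeeds only when the upper word is the sign
// extension of the lower word, so the value fits in an int64_t.
bool matchS64(const Wide128 &c, std::int64_t &out) {
  const bool negative = (c.lo >> 63) != 0;
  const std::uint64_t expectedHi = negative ? ~std::uint64_t{0} : 0;
  if (c.hi != expectedHi)
    return false;
  out = static_cast<std::int64_t>(c.lo);
  return true;
}
```

In both helpers the output reference is left untouched when the match fails.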
[AMDGPU][NFC] Pre-commit memcpy test with complex constant length (#182170)
Test memcpy lowering with complex constant length. Length is given by:
`i64 add (i64 sub (i64 16, i64 ptrtoint (ptr addrspacecast (ptr
addrspace(4) null to ptr) to i64)), i64 13)`
Thus, a loop guard should not be needed.
---------
Signed-off-by: John Lu <John.Lu at amd.com>