[libc] Remove global printf_core StorageType declarations in float_inf_nan_converter.h (#196859)
fixed_converter.h and float_hex_converter.h have local declarations with
the same name that shadow the global one, causing -Wshadow warnings. The
global using declaration is used in only one function, so make it local
to that function.
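A minimal sketch of the pattern, not the actual libc code: the namespace `demo_core` and the functions `low_bit` and `storage_bits` are hypothetical names chosen for illustration. It shows the alias living inside the one function that needs it, so other functions can declare their own alias of the same name without shadowing anything.

```cpp
#include <cstdint>

namespace demo_core {

// Before (sketch): a namespace-scope alias such as
//   using StorageType = std::uint64_t;
// is visible to every function in the namespace, so a function that declares
// its own local `StorageType` shadows it and trips -Wshadow.

// After: the alias is scoped to the only function that uses it.
inline std::uint64_t low_bit(std::uint64_t bits) {
  using StorageType = std::uint64_t; // local alias, invisible elsewhere
  return bits & StorageType(1);
}

// A different function can now pick its own StorageType without shadowing.
inline int storage_bits() {
  using StorageType = std::uint32_t;
  return static_cast<int>(sizeof(StorageType) * 8);
}

} // namespace demo_core
```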
[clang] Don't warn on __COUNTER__ in system macros
The introduction of extension and compatibility warnings means
that __COUNTER__ has started causing warnings (and -Werror= build
failures) due to use of system APIs.
This PR ensures that these diagnostics are also suppressed for system
macro expansions.
[AMDGPU] Account for inline asm size in inst_pref_size calculation (#192306)
`SIProgramInfo::getFunctionCodeSize()` with `IsLowerBound=true` was
completely skipping inline assembly instructions, treating them as zero
bytes. This caused `amdhsa_inst_pref_size` to be severely underestimated
for kernels containing inline asm, defeating instruction prefetch on
gfx11+.
Use MCExpr label subtraction (`.Lfunc_end - func_sym`) to compute exact
function code size, resolved at assembly time. This avoids inline asm
string parsing which cannot reliably estimate code size and risks
overestimation (which causes prefetch of unmapped memory and a fatal
segfault).
Add a new `AMDGPUMCExpr` variant (`AGVK_InstPrefSize`) to compute
`min(divideCeil(codeSize, cacheLineSize), maxFieldVal)` as a custom
MCExpr, following the same pattern as `AGVK_Occupancy` and
`AGVK_AlignTo`. The cache line size and field width are derived from the
subtarget via `IsaInfo::getInstCacheLineSize` and feature-bit checks
[24 lines not shown]
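The clamped ceiling division quoted above, `min(divideCeil(codeSize, cacheLineSize), maxFieldVal)`, can be sketched in plain C++ as follows. This is only an arithmetic illustration, not the MCExpr implementation: `divide_ceil` and `inst_pref_size` are hypothetical helper names, and the 128-byte cache line and field maximum of 63 used in the checks are assumed values, not taken from any subtarget.

```cpp
#include <algorithm>
#include <cstdint>

// Ceiling division for unsigned values (assumes d != 0).
constexpr std::uint64_t divide_ceil(std::uint64_t n, std::uint64_t d) {
  return (n + d - 1) / d;
}

// Hypothetical helper: number of cache lines covering the function's code,
// clamped to the largest value the descriptor field can hold.
constexpr std::uint64_t inst_pref_size(std::uint64_t code_size,
                                       std::uint64_t cache_line_size,
                                       std::uint64_t max_field_val) {
  return std::min(divide_ceil(code_size, cache_line_size), max_field_val);
}

// Illustrative values only: 128-byte lines, field maximum of 63.
static_assert(inst_pref_size(129, 128, 63) == 2, "partial line rounds up");
static_assert(inst_pref_size(100000, 128, 63) == 63, "clamped to field max");
```

Treating inline asm as zero bytes shrinks `code_size` and therefore the result, which is the underestimation the message describes.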
[libc] Fix -Wshadow warning in sqrtf128.h (#196851)
sqrtf128() contained both `using namespace sqrtf128_internal;` and
`using FPBits = fputil::FPBits<float128>;`, but sqrtf128_internal also
had a `using FPBits = fputil::FPBits<float128>;`. The outer `using`
wasn't actually used, so remove that one.
[AMDGPU] Add `.amdgpu.info` section for per-function metadata (#192384)
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the
compiler must emit per-function metadata and call graph edges in the
relocatable object so the linker can compute whole-program resource
requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged,
length-prefixed binary format. Each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags,
private segment size) and relational edges (direct calls, LDS uses,
indirect call signatures). String data such as function type signatures
[4 lines not shown]
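To make the `[kind][len][payload]` layout concrete, here is a minimal sketch of a writer and reader for such records. It is not the emitter or linker code: `InfoEntry`, `append_entry`, and `parse_entries` are hypothetical names, and no real kind values (such as the one for `INFO_FUNC`) are assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One decoded record: a kind byte plus its payload bytes.
struct InfoEntry {
  std::uint8_t kind;
  std::vector<std::uint8_t> payload;
};

// Appends one [kind: u8][len: u8][payload] record to `out`.
// Returns false if the payload does not fit in the one-byte length field.
inline bool append_entry(std::vector<std::uint8_t> &out, std::uint8_t kind,
                         const std::vector<std::uint8_t> &payload) {
  if (payload.size() > 0xFF)
    return false;
  out.push_back(kind);
  out.push_back(static_cast<std::uint8_t>(payload.size()));
  out.insert(out.end(), payload.begin(), payload.end());
  return true;
}

// Decodes every record in `buf`. Returns false if a record is truncated.
inline bool parse_entries(const std::vector<std::uint8_t> &buf,
                          std::vector<InfoEntry> &entries) {
  std::size_t pos = 0;
  while (pos < buf.size()) {
    if (buf.size() - pos < 2)
      return false; // header needs a kind byte and a length byte
    const std::uint8_t kind = buf[pos];
    const std::size_t len = buf[pos + 1];
    pos += 2;
    if (buf.size() - pos < len)
      return false; // payload runs past the end of the buffer
    const auto first = buf.begin() + static_cast<std::ptrdiff_t>(pos);
    const auto last = first + static_cast<std::ptrdiff_t>(len);
    entries.push_back(InfoEntry{kind, std::vector<std::uint8_t>(first, last)});
    pos += len;
  }
  return true;
}
```

Because every record carries its own length, a reader can skip kinds it does not understand by advancing `len` bytes.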
[FileCheck] Fix -Wunused-variable in 48c864a (#197022)
The variable is only used in an assertion. Mark it maybe_unused rather
than inlining it into the assertion: the variable name makes the code a
bit more readable, even though inlining is a common idiom.
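A minimal sketch of the idiom, with a hypothetical function `sum_nonempty` that is unrelated to the FileCheck code. The named variable is read only inside `assert()`, so in an NDEBUG build it would otherwise be reported as unused.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

inline int sum_nonempty(const std::vector<int> &values) {
  // Read only by the assert below; [[maybe_unused]] keeps release builds,
  // where assert() expands to nothing, free of -Wunused-variable.
  [[maybe_unused]] const std::size_t count = values.size();
  assert(count > 0 && "expected a non-empty input");

  int total = 0;
  for (int v : values)
    total += v;
  return total;
}
```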
[clangd] Fix out-of-bounds read in `packedLookup` (#197021)
The `packedLookup` function doesn't work correctly with byte values over
0x7F (e.g. UTF-8 high bytes) on platforms where `char` is signed (like
x86-64). The character is treated as a negative `char`, which gets
converted to a negative `int`, which makes `I >> 2` negative, which
gives a negative index, and thus an out-of-bounds read.
The fix is to change the `int` parameter type to `unsigned char`, to
always get the value in the 0x00..0xFF range.
The issue was discovered by the sanitizer buildbot after the
#187623 merge, which first introduced tests with non-ASCII source
content into the code-completion path.
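The sign problem can be reproduced in isolation. The sketch below is not the clangd code: `index_with_int_param` and `index_with_uchar_param` are hypothetical names, and only the `I >> 2` index computation mentioned above is modeled. `signed char` is used explicitly so the behavior does not depend on the platform's default `char` signedness; it assumes a two's-complement target with arithmetic right shift, as on x86-64.

```cpp
// Index computation with the old parameter type: a negative char is
// promoted to a negative int, and shifting keeps it negative.
inline int index_with_int_param(int c) { return c >> 2; }

// Index computation with the fixed parameter type: the argument is first
// converted to 0x00..0xFF, so the index is always in 0..63.
inline int index_with_uchar_param(unsigned char c) { return c >> 2; }

// 0xC3 is a typical UTF-8 lead byte; stored in a signed char it is negative.
inline signed char utf8_high_byte() { return static_cast<signed char>(0xC3); }
```

With the byte 0xC3, `index_with_int_param(utf8_high_byte())` is negative, while `index_with_uchar_param(utf8_high_byte())` is 48 (0xC3 is 195, and 195 >> 2 is 48).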
[lld][WebAssembly] Allow defining arbitrary symbol types in LTO objects (#196552)
Bitcode files don't contain precise symbol type information, so we
always allow the post-LTO defined symbols (from the LTO object file) to
overwrite bitcode symbols. We don't want to report type mismatches in
these cases.
Fixes: #195311
[GlobalISel][AArch64] Add lowering for G_SMULFIX (#196757)
Add lowering for the G_SMULFIX opcode. It is needed to compile
`libc/src/stdfix/expk.cpp` with `-O3`.
[lldb] Assert that CommandObject::DoExecute sets a return status (#196589)
Change the default value of CommandReturnObject::m_status from
eReturnStatusStarted to eReturnStatusInvalid, and add a debug-only RAII
check in CommandObjectParsed::Execute and CommandObjectRaw::Execute that
asserts the status is no longer Invalid after DoExecute returns.
This catches commands that forget to call SetStatus on a success or
failure path. Succeeded() still returns true when the status is Invalid
(0 sorts below eReturnStatusSuccessContinuingResult), so helpers that
read result.Succeeded() as a precondition before any explicit SetStatus
(e.g. StopProcessIfNecessary) continue to work.
rdar://176506732
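A minimal sketch of the RAII check, using simplified stand-ins rather than the lldb classes: `ReturnStatus`, `ReturnObject`, `StatusWasSetCheck`, and `Execute` are hypothetical, and the enum is reduced to three values with Invalid as 0 to mirror the ordering described above.

```cpp
#include <cassert>

// Hypothetical, reduced status enum; Invalid sorts below the success value.
enum ReturnStatus { kInvalid = 0, kSuccess = 1, kFailed = 2 };

struct ReturnObject {
  ReturnStatus status = kInvalid; // default is "nothing set yet"
  void SetStatus(ReturnStatus s) { status = s; }
  // Invalid compares below kSuccess, so it still counts as "not failed".
  bool Succeeded() const { return status <= kSuccess; }
};

// Checks on scope exit that someone replaced the default status.
// The check is debug-only because assert() vanishes under NDEBUG.
class StatusWasSetCheck {
public:
  explicit StatusWasSetCheck(const ReturnObject &result) : result_(result) {}
  ~StatusWasSetCheck() {
    assert(result_.status != kInvalid && "command did not set a status");
  }

private:
  const ReturnObject &result_;
};

// Stand-in for an Execute wrapper around a command's DoExecute.
template <typename Fn> void Execute(ReturnObject &result, Fn do_execute) {
  StatusWasSetCheck check(result);
  do_execute(result);
} // `check` is destroyed here and asserts if the status is still Invalid
```

A command body that returns on some path without calling `SetStatus` trips the assertion in debug builds, while helpers that read `Succeeded()` before any status is set still see `true`.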
DAGCombiner: (srl/sra (add nuw/nsw X, c), d) --> (add nuw/nsw (srl/sra X, d), c >> d)
Additional precondition:
* The low d bits of c are 0; equivalently: c >> d is exact
Alive2 for
* unsigned case: https://alive2.llvm.org/ce/z/YcJ8qA
* signed case: https://alive2.llvm.org/ce/z/fgpvyE
We already canonicalize (shl (add ...) ...) to (add (shl ...) ...).
Restrict this combine to the single-use case to minimize risk for now.
The main target of this combine is a fan-out tree of `add`s that all end
up being shifted by the same amount at the leaves. This change happens to
improve a bunch of existing CodeGen tests in AMDGPU.
v2:
- remove a redundant check on the shift amount -- large shift amounts
result in poison anyway
[2 lines not shown]
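As an informal sanity check alongside the Alive2 proofs, the unsigned (srl, add nuw) case can be brute-forced over 8-bit values. This sketch is not DAGCombiner code: `rewrite_holds` and `count_counterexamples` are hypothetical helpers, and the 8-bit width is an assumption made to keep the search exhaustive.

```cpp
// Compares both sides of the rewrite for one (x, c, d) triple:
//   (x + c) >> d   versus   (x >> d) + (c >> d)
inline bool rewrite_holds(unsigned x, unsigned c, unsigned d) {
  return ((x + c) >> d) == ((x >> d) + (c >> d));
}

// Exhaustively checks all 8-bit inputs that satisfy the preconditions:
// the add does not wrap (nuw) and the low d bits of c are zero.
inline unsigned count_counterexamples() {
  unsigned bad = 0;
  for (unsigned d = 0; d < 8; ++d) {
    for (unsigned c = 0; c < 256; ++c) {
      if (c & ((1u << d) - 1))
        continue; // c >> d would not be exact
      for (unsigned x = 0; x < 256; ++x) {
        if (x + c > 255)
          continue; // would violate nuw in 8 bits
        if (!rewrite_holds(x, c, d))
          ++bad;
      }
    }
  }
  return bad;
}
```

`count_counterexamples()` returns 0, while dropping the exactness precondition breaks the rewrite: for x = 1, c = 1, d = 1 the left side is 1 and the right side is 0.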
[FileCheck] Handle directives at EOF without a trailing newline (#196576)
FileCheck could assert when a check directive ended at EOF without a
trailing newline. After parsing the directive suffix, EOF can be a valid
continuation point, so parsing now continues directly from
`AfterSuffix`.
Fixes #101582
[RISCV] Check for null LIS before trying to move AVL in canMutatePriorConfig. (#196673)
If LIS is null, the VNInfo pointers are also null, so we cannot tell
whether the AVL needs to be moved.
Fixes an assertion like
RegAllocFast.cpp:729: void (anonymous
namespace)::RegAllocFastImpl::reloadAtBegin(MachineBasicBlock &):
Assertion `(&MBB != &MBB.getParent()->front() || IgnoreMissingDefs) &&
"no reload in start block. Missing vreg def?"' failed.