[libc++] Finish converting __wrap_iter::operator<,== to C++20 (#193287)
+ operator< was overlooked in #179590
+ operator< was not marked constexpr in C++11
+ operator== should be defaulted when possible in C++20
Fixes #193283
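A rough illustration of the two operators this commit touches, using a hypothetical `WrapIter` class that is not libc++'s `__wrap_iter`. In C++20 mode the sketch defaults `operator==` and derives ordering from `operator<=>`; the use of `<=>` is an assumption of this sketch and is not taken from the patch. In older modes both operators are written by hand and marked `constexpr`.

```cpp
#if __cplusplus >= 202002L
#include <compare>
#endif

// Hypothetical stand-in for an iterator wrapper; this is not libc++'s __wrap_iter.
template <class Iter>
class WrapIter {
public:
  constexpr explicit WrapIter(Iter it) : it_(it) {}
  constexpr Iter base() const { return it_; }

#if __cplusplus >= 202002L
  // C++20: memberwise equality can be defaulted; <=> supplies <, <=, >, >=.
  friend constexpr bool operator==(const WrapIter&, const WrapIter&) = default;
  friend constexpr auto operator<=>(const WrapIter& a, const WrapIter& b) {
    return a.it_ <=> b.it_;
  }
#else
  // Pre-C++20: spell the operators out, keeping them constexpr.
  friend constexpr bool operator==(const WrapIter& a, const WrapIter& b) {
    return a.it_ == b.it_;
  }
  friend constexpr bool operator!=(const WrapIter& a, const WrapIter& b) {
    return !(a == b);
  }
  friend constexpr bool operator<(const WrapIter& a, const WrapIter& b) {
    return a.it_ < b.it_;
  }
#endif

private:
  Iter it_;
};
```

Both branches accept the same comparisons on `WrapIter<int*>` values that point into one array, so callers do not need to know which language mode is active.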
[flang][hlfir] Extend InlineHLFIRCopy to inline copy_out with copy-back
Rename `InlineHLFIRCopyIn` to `InlineHLFIRCopy` and extend it to inline
the paired `hlfir.copy_out` operation. The copy_out is inlined at its
original location, after the call, ensuring proper ordering of copy-back
and deallocation.
Only inlines when no copy-back is required (intent(in)); intent(inout/out)
pairs are left untransformed.
Based on https://github.com/llvm/llvm-project/pull/179096.
Co-Authored-By: Kazuaki Matsumura <kmatsumura at nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply at anthropic.com>
[Clang][test] Add inferred nofree to attr-counted-by-issue200014.c checks (#202491)
Commit 89905ff21441 ("[FunctionAttrs] Add support for nofree argument
inference", #201591) infers a `nofree` parameter attribute at `-O2`. The
test `attr-counted-by-issue200014.c` (added later in #201161) was
generated before that change, so its `O2-SAME` lines omit `nofree` and
currently fail on `main`:
```
O2-SAME: ptr noundef readonly captures(none) ... (expected by test)
ptr nofree noundef readonly captures(none) ... (actual codegen)
```
This regenerates the checks with `update_cc_test_checks.py`. Test-only,
NFC.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-authored-by: Claude Opus 4.8 (1M context) <noreply at anthropic.com>
[clang][ExprConst] Remove `State::getBottomFrame()` (#202277)
This is not necessary since `Frame` already has a `getCaller()`
function, which can be used to identify the bottom frame.
And the current code never needs the bottom frame for anything other
than checking if another frame is the bottom frame.
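The idea can be sketched with a much simplified, hypothetical `Frame` class. Only `getCaller()` is named in the commit message; the rest, including the assumption that the bottom frame is the one frame without a caller, is illustrative and not taken from Clang's sources.

```cpp
// Hypothetical, simplified frame type; the real interpreter frames carry far more state.
class Frame {
public:
  explicit Frame(const Frame* caller = nullptr) : caller_(caller) {}
  const Frame* getCaller() const { return caller_; }

private:
  const Frame* caller_;
};

// Assumption of this sketch: the bottom frame is the only frame with no caller,
// so no separate accessor on the evaluation state is needed to recognise it.
inline bool isBottomFrame(const Frame& f) { return f.getCaller() == nullptr; }
```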
[OpenMP] Mark critical region lock variables as dso_local (#201157)
OpenMP named critical regions use lock variables of the form
.gomp_critical_user_<name>.var, which are created through
CGOpenMPRuntime::getCriticalRegionLock().
These variables are created via
OpenMPIRBuilder::getOrCreateInternalVariable() and bypass the normal
CodeGenModule::setDSOLocal() path used for other Clang-generated
globals. As a result, OpenMP critical lock variables do not receive the
usual frontend dso_local inference.
Apply CodeGenModule::setDSOLocal() to critical lock variables after
creation. This matches the existing frontend
dso_local inference logic.
On ELF targets with a static relocation model, this results in direct
accesses to the lock variable instead of GOT-based accesses. For
example, x86-64 code generation changes from R_X86_64_REX_GOTPCRELX
[4 lines not shown]
[RISCV] Don't require specific extensions to use altfmt with vset(i)vli. (#202458)
The list of extensions that use altfmt is increasing and is becoming
unsustainable. The likelihood of the bit being repurposed is decreasing
with each new use. Let's just allow it unconditionally.
There is some risk here since all of these extensions are still
unratified and experimental so it could be that all of these extensions
get redefined so that this bit doesn't become altfmt. But I think that's
unlikely.
[libc++][span][test] Various cleanups for <span> tests (#202319)
This patch does:
- Simplify some test implementations
- Polish comments and synopsis
---------
Co-authored-by: A. Jiang <de34 at live.cn>
[BPF] Emit ABI-accurate BTF prototypes for DW_CC_nocall (#198426)
DW_CC_nocall subprograms can end up with an optimized IR signature that no
longer matches the original source-level DISubroutineType. Dead argument
elimination may drop source parameters, and the return value may be
removed entirely, while the debug type still describes the original
prototype. In that case BTFDebug emits a FUNC_PROTO that no longer
matches the real BPF ABI.
Teach BTFDebug to derive a filtered FUNC_PROTO for nocall functions.
Detecting surviving arguments (collectNocallEntryArgRegs):
Scan all DBG_VALUE instructions in the entry block while tracking which
registers have been redefined by non-debug instructions:
- A DBG_VALUE whose register has not been redefined records a
register-passed argument (R1-R5 at function entry).
- A DBG_VALUE whose register was most recently loaded via LDD $r11,
[35 lines not shown]
Revert "[NVPTX] Support lowering of `(l)lround`" (#202500)
Reverts llvm/llvm-project#183901
Looks like using removeFromUseLists from Transforms doesn’t work in
certain configurations.
[LLDB] Fix DW_OP_implicit_value GetOpcodeDataSize() error (#201344)
LLDB does not handle `DW_OP_implicit_value` correctly, causing a "cannot get
opcode data size for Unknown DW_OP constant" error when LLDB parses
location expressions containing this opcode.
`DW_OP_implicit_value` takes two operands: a ULEB128-encoded length
followed by a byte sequence of that length. The current
`GetOpcodeDataSize` implementation has no case for this opcode, only
skip. This prevents LLDB from correctly determining opcode boundaries
when scanning multi-operation location expressions.
From DWARFv5:
> The DW_OP_implicit_value operation specifies an immediate value using
two operands: an unsigned LEB128 length, followed by a sequence of bytes
of the given length that contain the value.
Although the evaluation path (`DWARFExpression::Evaluate`) handles this
opcode correctly and produces the right result, the validation/parsing
path emits a confusing error message to the user.
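The operand layout quoted above (a ULEB128 length followed by that many bytes) can be sized as in the following sketch. `implicitValueOperandSize` is a hypothetical standalone helper, not LLDB's `GetOpcodeDataSize`, and its signature is an assumption made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not LLDB code. `data` points at the first operand byte
// (just past the DW_OP_implicit_value opcode) and `size` is the number of bytes
// available. On success, `operandSize` receives the ULEB128 length field's own
// size plus the number of value bytes it announces.
inline bool implicitValueOperandSize(const std::uint8_t* data, std::size_t size,
                                     std::size_t& operandSize) {
  std::uint64_t length = 0;
  unsigned shift = 0;
  std::size_t i = 0;
  while (true) {
    if (i >= size || shift >= 64) return false;  // truncated or overlong ULEB128
    const std::uint8_t byte = data[i++];
    length |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0) break;               // high bit clear: last byte
  }
  if (length > size - i) return false;           // value bytes run past the buffer
  operandSize = i + static_cast<std::size_t>(length);
  return true;
}
```

With the operand size known, a scanner can step over the whole operation and continue at the next opcode.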
[NVPTX] Support lowering of `(l)lround` (#183901)
These intrinsics should have the same semantics as libm `round`, but
with an integer return type. Nits appreciated.
Fixes #182378
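For reference, the host-side libm behaviour these intrinsics are meant to match can be sketched with the standard `<cmath>` functions: halfway cases round away from zero and the result is an integer type. The wrapper names below are hypothetical and exist only for illustration; nothing here is NVPTX-specific.

```cpp
#include <cmath>

// Hypothetical wrappers around the standard library functions.
inline long roundToLong(double x) { return std::lround(x); }
inline long long roundToLongLong(double x) { return std::llround(x); }
```

For example, `roundToLong(2.5)` yields 3 and `roundToLong(-2.5)` yields -3.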
Reland [RISCV] Prefer SP over FP for frame index access when offset fits within compressed immediate range. (#201541)
Before this change, we would use fp/s0/x8 for most stack accesses when
frame pointers were present. This is an over-approximation when a
stack slot is reachable from both SP and FP with no scalable offset.
This patch replaces the unconditional getFrameRegister() call in
getFrameIndexReference with an explicit register selection decision
tree.
When both SP and FP are available (no stack realignment, no
variable-sized objects), prefer SP if the SP-relative offset fits in
the compressed instruction immediate range (<=252 for RV32, <=504 for
RV64). This enables compression for sp-relative instructions to
c.swsp/c.lwsp (RV32) and c.sdsp/c.ldsp (RV64) thereby reducing code
size.
The SP preference is guarded by hasReservedCallFrame(MF) to ensure SP
is stable throughout the function body. This is necessary because
[9 lines not shown]
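The selection rule described above can be sketched as follows. The function and enum names are hypothetical and this is not the code in `getFrameIndexReference`; the limits (252 for RV32, 504 for RV64) come from the commit message, while the fallback to FP and the omission of immediate alignment checks are simplifications of this sketch.

```cpp
#include <cstdint>

enum class BaseReg { SP, FP };

// Hypothetical helper: does an SP-relative offset fit the compressed
// stack-pointer load/store immediate range given in the commit message?
constexpr bool fitsCompressedSPOffset(std::int64_t offset, bool isRV64) {
  return offset >= 0 && offset <= (isRV64 ? 504 : 252);
}

// Hypothetical decision function. `spAndFpBothUsable` stands for "no stack
// realignment and no variable-sized objects"; `hasReservedCallFrame` mirrors
// the guard that keeps SP stable throughout the function body.
constexpr BaseReg chooseBaseReg(bool spAndFpBothUsable, bool hasReservedCallFrame,
                                std::int64_t spOffset, bool isRV64) {
  return (spAndFpBothUsable && hasReservedCallFrame &&
          fitsCompressedSPOffset(spOffset, isRV64))
             ? BaseReg::SP
             : BaseReg::FP;
}
```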
Reland HIP offload PGO compiler support and link the device-profile runtime (#201607)
This mostly relands the compiler part of #177665 (approved and merged,
then reverted in #201416). The first commit restores it as merged: the
AMDGPU instrumentation in LLVM and the HIP codegen in Clang.
#177665 was reverted because of a Windows CRT problem, fixed by
splitting the ROCm runtime into a separate library clang_rt.profile_rocm
(see the compiler-rt PR). The second commit links that library on the
host for HIP device PGO, in addOffloadRTLibs for the Linux and MSVC
toolchains, gated on HIP + profiling + the library being present. It is
a superset of clang_rt.profile and is linked first, so the base library
stays inert. Non-HIP links are unaffected.
Depends on the compiler-rt PR that adds clang_rt.profile_rocm.
[HLSL] Set visibility of cbuffer global variables to internal (#200312)
Global variables for all resources except `cbuffer` are already emitted
with internal linkage (since #166844). This change adds internal linkage
to the `cbuffer` handle globals as well.
One problem is that the `cbuffer` handle globals appear unused between
Clang CodeGen and the `{DXIL|SPIRV}CBufferAccess` pass, which replaces
individual `cbuffer` constant globals with accesses through the
`cbuffer` handle globals. Before this pass runs, the unused globals
could get optimized away in `GlobalOptPass` with `-O3`.
To solve this, the `cbuffer` handle globals are added to the
`@llvm.compiler.used` list to make sure they stay in the module until
the `{DXIL|SPIRV}CBufferAccess` pass, which then removes them from the
list.
Reland "[clang-tidy] Preserve line endings in macro-to-enum fixes" (#202271)
Use StringRef::detectEOL() when inserting enum braces so fix-its do not
mix LF into CRLF source files.
This reland fixes the previous buildbot failure by adding `--` in the test
file.
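The idea of matching the file's existing line ending can be sketched without LLVM. `detectLineEnding` below is a hypothetical, simplified helper that only distinguishes CRLF from LF; it is not the implementation of `StringRef::detectEOL()`, and `openEnumBlock` is likewise invented for illustration.

```cpp
#include <string>

// Hypothetical, simplified helper: report "\r\n" if the first line break in
// `text` is CRLF, otherwise fall back to "\n".
inline std::string detectLineEnding(const std::string& text) {
  const std::string::size_type pos = text.find('\n');
  if (pos != std::string::npos && pos > 0 && text[pos - 1] == '\r')
    return "\r\n";
  return "\n";
}

// Hypothetical fix-it text builder: end the inserted line the same way the
// surrounding source does, so CRLF files do not pick up a bare LF.
inline std::string openEnumBlock(const std::string& source) {
  return "enum {" + detectLineEnding(source);
}
```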
[rtsan][clang] Add Hexagon support for RTSan (#200313)
Enable RTSan for the Hexagon architecture.
* Add Hexagon to ALL_RTSAN_SUPPORTED_ARCH in cmake
* Add a clang driver test for hexagon-unknown-linux-musl
* Guard a static_assert(sizeof(unsigned long) >= sizeof(off_t)) with
SANITIZER_WORDSIZE >= 64, since off_t syscall args are split into two
regs.
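The guard in the last bullet might look roughly like the sketch below. `SKETCH_WORDSIZE` is a hypothetical stand-in for `SANITIZER_WORDSIZE`, derived here from `UINTPTR_MAX` so that the example stays self-contained; it assumes a POSIX system that provides `off_t` in `<sys/types.h>`.

```cpp
#include <cstdint>
#include <sys/types.h>  // off_t (POSIX)

// Hypothetical stand-in for SANITIZER_WORDSIZE.
#if UINTPTR_MAX > 0xFFFFFFFFu
#define SKETCH_WORDSIZE 64
#else
#define SKETCH_WORDSIZE 32
#endif

// Only check the size relation on 64-bit targets. On 32-bit targets a 64-bit
// off_t is passed as two registers, so the assertion would not apply.
#if SKETCH_WORDSIZE >= 64
static_assert(sizeof(unsigned long) >= sizeof(off_t),
              "off_t must fit in a single syscall argument register");
#endif
```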
[test][Support] Disable CFI-icall for DynamicLibrary Overload test (#202446)
The test performs manual symbol lookup and calls, which triggers
Control Flow Integrity indirect call checks.
[Clang][counted_by] Honor counted_by in __bdos on direct struct access (#201161)
__builtin_dynamic_object_size on a flexible array member must consult
the 'counted_by' attribute even when the containing struct is accessed
directly (a local or global variable) rather than through a pointer
dereference. The pointer-deref form (p->fam) already worked because the
constant evaluator could not determine the LValue for an opaque
parameter and fell through to the counted_by-aware runtime path in
CGBuiltin. The direct form (af.fam, gaf.fam) was being folded by
tryEvaluateBuiltinObjectSize to a layout-derived size (e.g. trailing
struct padding for locals, trailing initializer data for globals)
silently bypassing emitCountedBySize.
Make the AST constant evaluator refuse to fold __bdos on the same
operands that CGBuiltin's __bdos lowering classifies as a counted_by
FAM access. The check runs after the existing negative-offset early
return so that obviously out-of-bounds operands like &p->array[-42]
still fold to 0, preserving the behavior the sanitizer-bounds test in
attr-counted-by.c (test35) relies on.
[25 lines not shown]
[clang-cl] Fix friend class warning on Windows (#201720)
clang-cl warned on "friend class CallInst;" because MSVC may resolve
that to "friend llvm::CallInst" instead of the sbox IR mirrored
hierarchy. Drop the class tag and refer to forward declared names
instead.