[libc][threads] adjust futex library and expose requeue API (#192478)
Make futex a common abstraction layer across platforms.
(Linux, wasm, macOS, Windows, and Fuchsia all have futex-like support, so
we can align their implementations later on.)
This patch also exposes a requeue API that returns ENOSYS on unsupported
platforms. The requeue operation will be needed to reimplement a strict
FIFO-style condvar similar to musl's.
Additional cleanup is done to change raw syscall return value to
`ErrorOr<int>`.
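To illustrate what a requeue operation does, the sketch below models two wait queues in plain Python. It is not the libc implementation. It assumes Linux `FUTEX_REQUEUE`-style semantics: wake up to `wake_count` waiters, then move up to `requeue_count` of the remaining ones onto a second queue without waking them. `ToyFutex`, `requeue`, and the `supported` flag are hypothetical names; the ENOSYS result mirrors the behavior described above for unsupported platforms.

```python
import errno
from collections import deque


class ToyFutex:
    """Hypothetical stand-in for a futex word: a FIFO queue of waiter names."""

    def __init__(self, waiters=()):
        self.waiters = deque(waiters)


def requeue(src, dst, wake_count, requeue_count, supported=True):
    """Wake up to wake_count waiters on src, then move up to requeue_count
    of the remaining waiters to dst without waking them.

    Returns (woken, error): error is 0 on success, or errno.ENOSYS when the
    simulated platform has no requeue support.
    """
    if not supported:
        return [], errno.ENOSYS
    woken = [src.waiters.popleft() for _ in range(min(wake_count, len(src.waiters)))]
    for _ in range(min(requeue_count, len(src.waiters))):
        dst.waiters.append(src.waiters.popleft())  # preserves FIFO order
    return woken, 0


# A condvar signal could wake one waiter and park the rest on the mutex queue.
cond = ToyFutex(["A", "B", "C"])
mutex = ToyFutex()
woken, err = requeue(cond, mutex, wake_count=1, requeue_count=2)
```

Because the moved waiters keep their original order, a condvar built on this primitive can hand them to the mutex in arrival order, which is the FIFO property mentioned above.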
Assisted-by: Codex with gpt-5.4 medium fast
[mlir][memref] Remove unit-stride restriction in SubViewOp folding (#192437)
This PR replaces manual offset/size resolution with `affine::mergeOffsetsSizesAndStrides`, simplifying the code and extending subview-of-subview folding to support non-unit strides.
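The arithmetic behind folding a subview of a subview can be sketched for one dimension with static values. This is an assumption about what the merge computes, not the signature or behavior of `affine::mergeOffsetsSizesAndStrides` itself; `compose_subviews` and `apply_subview` are hypothetical helpers, and dynamic values and rank reduction are ignored.

```python
def compose_subviews(outer, inner):
    """Fold inner-of-outer into one (offset, size, stride) on the source.

    Each view is (offset, size, stride) along a single dimension.
    """
    outer_offset, _outer_size, outer_stride = outer
    inner_offset, inner_size, inner_stride = inner
    return (
        outer_offset + inner_offset * outer_stride,  # inner offset is scaled by outer stride
        inner_size,                                  # result has the inner view's size
        outer_stride * inner_stride,                 # strides multiply
    )


def apply_subview(data, view):
    """Materialize a 1-D view as a list, for checking the fold."""
    offset, size, stride = view
    return [data[offset + i * stride] for i in range(size)]


source = list(range(32))
outer = (2, 8, 3)  # picks source elements 2, 5, 8, ..., 23
inner = (1, 3, 2)  # every second element of the outer view, starting at index 1

folded = compose_subviews(outer, inner)
two_step = apply_subview(apply_subview(source, outer), inner)
one_step = apply_subview(source, folded)
```

With a unit outer stride the offset term reduces to a plain sum, which is the case the earlier restriction covered.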
[RISCV] Support MachineOutlinerRegSave for RISCV (#191351)
This patch adds support for the RegSave strategy in the RISC-V
MachineOutliner pass. It uses t1–t6 to preserve the t0 value across the
outlined function call when t0 is unavailable. This enables more
potential outlining candidates.
---------
Co-authored-by: Craig Topper <craig.topper at sifive.com>
[DAGCombiner] Extend convertBuildVecZextToZext to sign extends (#192372)
Generalize the existing fold that collapses a BUILD_VECTOR of ZERO_EXTEND
(or ANY_EXTEND) of EXTRACT_VECTOR_ELTs into a single vector extend so that
it also handles SIGN_EXTEND. Mixed sign and zero extends remain unsupported
because their high-bit semantics differ, so the combine bails out in that
case.
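A toy model of the fold, using Python integers as lanes, shows why a uniform extend kind can be collapsed while a mixed one cannot. Everything here is illustrative: `fold_build_vector` is a hypothetical function, ANY_EXTEND is left out, and the model only accepts lanes extracted in order from a single source vector.

```python
def sext(value, from_bits, to_bits):
    """Sign-extend a from_bits-wide value to to_bits (result kept unsigned)."""
    sign = 1 << (from_bits - 1)
    return ((value ^ sign) - sign) & ((1 << to_bits) - 1)


def zext(value, from_bits, to_bits):
    """Zero-extend: the high bits are simply zero."""
    return value & ((1 << from_bits) - 1)


def fold_build_vector(elements, vec, from_bits, to_bits):
    """elements is a list of (kind, index), meaning kind(extract_elt(vec, index)).

    Returns the result of one whole-vector extend, or None when the fold
    does not apply.
    """
    kinds = {kind for kind, _ in elements}
    if len(kinds) != 1:
        return None  # mixed sign and zero extends: bail out
    if [index for _, index in elements] != list(range(len(vec))):
        return None  # toy restriction: lanes must be taken in order
    ext = sext if kinds == {"sext"} else zext
    return [ext(lane, from_bits, to_bits) for lane in vec]


vec = [0x7F, 0x80, 0xFF, 0x01]
all_sext = [("sext", i) for i in range(4)]
mixed = [("sext", 0), ("zext", 1), ("sext", 2), ("sext", 3)]

folded = fold_build_vector(all_sext, vec, 8, 16)
lane_by_lane = [sext(vec[i], 8, 16) for _, i in all_sext]
```

The lane `0x80` is where the two extend kinds disagree (`0xFF80` versus `0x0080`), so no single vector extend can reproduce a mixed BUILD_VECTOR.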
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply at anthropic.com>
[AMDGPU] Add `.amdgpu.info` section for per-function metadata
AMDGPU object linking requires the linker to propagate resource usage
(registers, stack, LDS) across translation units. To support this, the compiler
must emit per-function metadata and call graph edges in the relocatable object
so the linker can compute whole-program resource requirements.
This PR introduces a `.amdgpu.info` ELF section using a tagged, length-prefixed
binary format. Each entry is encoded as:
```
[kind: u8] [len: u8] [payload: <len> bytes]
```
A function scope is opened by an `INFO_FUNC` entry (containing a symbol
reference), followed by per-function attributes (register counts, flags, private
segment size) and relational edges (direct calls, LDS uses, indirect call
signatures). String data such as function type signatures is stored in a
companion `.amdgpu.strtab` section.
[4 lines not shown]
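The `[kind] [len] [payload]` layout described above can be sketched as a small encoder and decoder. The kind numbers, the attribute name `KIND_NUM_VGPR`, the little-endian payload, and the 4-byte placeholder standing in for the symbol reference are all assumptions made for illustration; only the one-byte kind, one-byte length, and `INFO_FUNC` opening a function scope come from the description.

```python
import struct

# Hypothetical kind values; the real numbering is not given in the text above.
KIND_INFO_FUNC = 1
KIND_NUM_VGPR = 2


def encode_entry(kind, payload):
    """Encode one entry as [kind: u8] [len: u8] [payload: len bytes]."""
    if not 0 <= kind <= 0xFF or len(payload) > 0xFF:
        raise ValueError("kind and length must each fit in one byte")
    return struct.pack("BB", kind, len(payload)) + payload


def decode_entries(blob):
    """Walk the section bytes and return a list of (kind, payload) pairs."""
    entries, pos = [], 0
    while pos < len(blob):
        if pos + 2 > len(blob):
            raise ValueError("truncated entry header")
        kind, length = struct.unpack_from("BB", blob, pos)
        pos += 2
        if pos + length > len(blob):
            raise ValueError("truncated entry payload")
        entries.append((kind, blob[pos:pos + length]))
        pos += length
    return entries


section = (
    encode_entry(KIND_INFO_FUNC, b"\x00\x00\x00\x00")       # placeholder symbol reference
    + encode_entry(KIND_NUM_VGPR, struct.pack("<I", 24))   # assumed 32-bit attribute
)
decoded = decode_entries(section)
```

A length-prefixed layout lets a reader skip entry kinds it does not understand, since the length alone is enough to find the next entry.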
[CIR][NFC] Upstream IR roundtrip tests for branch and loop ops (#189006)
Add `clang/test/CIR/IR` roundtrip tests for `cir.br`, `cir.brcond`,
`cir.for`, `cir.while`, and `cir.do`.
This adds parser/printer coverage for the textual forms of these
control-flow operations.
Partially addresses #156747.
[flang][cuda] Add missing pointer deallocation entry point (#192566)
We were missing the deallocation entry point for pointers and were wiring
everything to the allocatable deallocate, which triggers an Invalid
descriptor error.
[NFC][OpenMP] Make map ordering tests for no host->tgt transfer more robust (#192571)
They were relying on the host value not being seen on the device, but
the value being matched was small enough that the probability of a
successful match against garbage data was relatively high.
Now we just rely on the LIBOMPTARGET_DEBUG logs to ensure there wasn't
any transfer.
[libclc] Fix atomic_fetch_add/sub overloads for uintptr_t (#192570)
The overloads taking the memory order and/or scope parameters should
have the `_explicit` suffix, according to the OpenCL C specification.
[WebAssembly] Improve FP16 load and store generation. (#191274)
Previously, these LL instructions were expanded to software emulation
calls, causing performance overhead in benchmarks. By making these
operations legal and providing patterns, we can generate efficient code
using the new instructions.
[lldb] Add synthetic variable support to Get*VariableList.
This patch adds a new flag to the lldb_private::StackFrame API to get variable lists: `include_synthetic_vars`. This allows ScriptedFrame (and other future synthetic frames) to construct 'fake' variables and return them in the VariableList, so that commands like `fr v` and `SBFrame::GetVariables` can show them to the user as requested.
This patch includes all changes necessary to call the API the new way - I tried to use my best judgement on when to include synthetic variables or not and leave comments explaining the decision.
As a consequence of producing synthetic variables, ScriptedFrame can now produce Variable objects whose ValueType contains a ValueTypeExtendedMask in a high bit. This necessarily complicates some of the switch/case handling in places where we would expect to find such variables, and this patch makes a best effort to address all such cases as well. From experience, they tend to show up whenever we're checking whether a Variable is in a specified scope, which means we basically have to check the high bit against some user input saying "yes/no synthetic variables".
stack-info: PR: https://github.com/llvm/llvm-project/pull/181501, branch: users/bzcheeseman/stack/9
[lldb] Scaffolding for synthetic variable support. (#181500)
This patch handles most of the scaffolding for synthetic variable support that isn't directly tied to functional changes. This patch will be used by one following patch that actually modifies the lldb_private::StackFrame API to allow us to fetch synthetic variables.
There were a couple of important/interesting decisions made in this patch that should be noted:
- Any value type may be synthetic, which is why it's a mask applied over the top of another value type.
- When printing frame variables with `fr v`, default to showing synthetic variables.
This new value type mask makes some of the ValueType handling more interesting, but since nothing generates objects with this mask until the next patch, we can land the concept in this patch in some amount of isolation.
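The idea of a mask layered over another value type can be sketched as follows. The numeric values and function names are hypothetical and are not LLDB's actual definitions; only the name `ValueTypeExtendedMask`, its placement in a high bit, and the `include_synthetic_vars` flag come from the descriptions above.

```python
# Hypothetical values; LLDB's real enum values may differ.
VALUE_TYPE_EXTENDED_MASK = 1 << 31
VALUE_TYPE_VARIABLE_ARGUMENT = 3
VALUE_TYPE_VARIABLE_LOCAL = 4


def is_synthetic(value_type):
    """True when the high mask bit is set on top of the base type."""
    return bool(value_type & VALUE_TYPE_EXTENDED_MASK)


def base_type(value_type):
    """Strip the mask so ordinary switch-style comparisons still work."""
    return value_type & ~VALUE_TYPE_EXTENDED_MASK


def in_scope(value_type, wanted_base, include_synthetic_vars):
    """Scope check that compares the base type and honors the synthetic flag."""
    if is_synthetic(value_type) and not include_synthetic_vars:
        return False
    return base_type(value_type) == wanted_base


synthetic_local = VALUE_TYPE_VARIABLE_LOCAL | VALUE_TYPE_EXTENDED_MASK
```

Comparing `synthetic_local` directly against `VALUE_TYPE_VARIABLE_LOCAL` would fail, which is the kind of switch/case pitfall that masking off the high bit first avoids.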
stack-info: PR: https://github.com/llvm/llvm-project/pull/181500, branch: users/bzcheeseman/stack/8
[LoopBoundSplit] Fix edge connections during transformation (#192106)
Fixed #190672.
The issue is caused by invalid intermediate IR when `getSCEV()` is
called during transformation: the exiting block of the `pre-loop` was not
reconnected to the preheader of the `post-loop`, leaving `LI.verify()`
unable to correctly recompute another LoopInfo for verification.
To fix this, reconnect the edge before calling `getSCEV()`.
Also moved the DT updates to more appropriate places, right after the IR
control flow has changed, and added a few LI and DT verifications to
improve the robustness of the pass.