[BOLT][AArch64] Only reserve constant-island space when an island exists (#204261)
`tentativeLayout()` aligns every function's tail to its constant island
alignment even when the function has no constant island. This over-padded
nearly every function, so the tentative layout drifted non-trivially from
the emitted layout: the emitter only pads when it emits a real constant
island. Guard the padding with `estimateConstantIslandSize() > 0` so the
tentative layout better matches the emitted one.
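A minimal sketch of the guarded padding arithmetic, written in Python rather
than BOLT's C++. `align_to`, `tentative_end`, and their parameters are
hypothetical names for illustration, not BOLT APIs:

```python
def align_to(offset: int, alignment: int) -> int:
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def tentative_end(start: int, code_size: int,
                  island_size: int, island_alignment: int) -> int:
    """Estimate where a function ends in the tentative layout (hypothetical)."""
    end = start + code_size
    # Only pad and reserve space when the function really has an island.
    if island_size > 0:
        end = align_to(end, island_alignment) + island_size
    return end
```

With no island the function ends right after its code; with an island the
tail is aligned first and the island size is added on top.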
[RISCV] Remove unused CHECK prefixes in `rvp-simd-32.ll` (#204383)
They were probably introduced by merge conflicts + UTC script changes.
---------
Co-authored-by: Craig Topper <craig.topper at sifive.com>
[OpenMP] Control KMP_CANCEL_THREADS via CMake and detect pthread_cancel (#193681)
KMP_CANCEL_THREADS was a preprocessor switch in kmp.h, with Android and
WASI explicitly #undef-ing it. Move the control to CMake:
- Detect pthread_cancel via check_symbol_exists()
- Drive KMP_CANCEL_THREADS from that result, emitted as a 0/1 #define in
kmp_config.h.cmake (overridable with -DLIBOMP_USE_CANCEL_THREADS=OFF).
- Drop the Android/WASM special-casing
[OpenMP] Improve dladdr error handling in ompd_init() (#201043)
Guard dlerror() result against NULL before passing to fprintf to avoid
confusing "(null)" output. Also guard dli_fname against NULL on the
success path before calling strrchr.
Assisted-by: Claude Sonnet 4.6
[lldb][test] Skip wasm-unsupported API tests (#204625)
WebAssembly inferiors are built with -fno-exceptions and run on a wasip1
runtime with no exec/fork/setpgid, no setjmp/longjmp (needs the
exception-handling proposal), no memory-protection faults/signals, and
no _Float16/__bf16 support. Mark the corresponding API tests as skipped
on wasm so they report unsupported instead of failing to build or run.
[Dexter] Switch to using script-mode by default
This patch changes the default mode of Dexter from heuristic-mode to
script-mode. The --use-script argument is replaced with --use-heuristic,
some comments/docs/error messages are updated accordingly, and tests have
their flags switched accordingly.
[Dexter] Add ability to check float values within a range
Adds a new node type, !float, which can be used to match debugger output as
float values rather than as strings, optionally allowing a range to be
specified for inexact matches. This new node allows a list of values to be
given, effectively a shorthand for a list of individual !float nodes.
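The idea can be sketched as follows. This is not Dexter's implementation:
`matches_float` is a hypothetical helper, and modelling the range as a
symmetric tolerance is an assumption, since the !float syntax for ranges is
not shown here.

```python
def matches_float(debugger_output: str, expected, tolerance: float = 0.0) -> bool:
    """Compare debugger output to expected value(s) numerically (hypothetical)."""
    try:
        actual = float(debugger_output)
    except ValueError:
        # Output that is not a number can never match a float expectation.
        return False
    # A list of expected values behaves like several single-value checks.
    candidates = expected if isinstance(expected, (list, tuple)) else [expected]
    return any(abs(actual - float(e)) <= tolerance for e in candidates)
```

A string comparison would reject "3.14159" against 3.14, whereas
`matches_float("3.14159", 3.14, 0.01)` accepts it.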
[Dexter] Allow matching lists of values for aggregate members
This patch slightly extends the matching of aggregate members to allow for
lists of expected values for individual members, functioning the same as
lists of expected values for scalar values.
[Dexter] Update lldb-based dexter-tests to use script-mode
This patch replaces uses of heuristic-mode Dexter in the dexter-tests suite
with uses of the script-mode, for tests that use DAP (via lldb-dap). The
updates are largely straightforward but occasionally non-trivial, and in
some cases some slight modifications have been made to keep the "spirit" of
the test intact.
[Dexter] Document the structured script model
This patch adds documentation for the script model to the Dexter README,
shunting heuristic-mode information into a separate doc, creating a new
doc for script-mode, and linking to both (with a brief summary of the
differences) from the base README.
[Dexter] Add support for writing !step values
Following from the previous patch, this patch adds support to Dexter for
generating expected values for !step nodes. This is relatively limited:
the kind of !step to which this is best suited is !step exactly, as the
!step order behaviour of ignoring extra lines is redundant (all lines are
added as expected values), and !step never cannot know which lines could
have been stepped on but weren't without some extra work (e.g. finding
viable breakpoint locations in the enclosing state node).
[Dexter] Add at_frame_idx to check values in frames above current
This patch adds a new attribute for !and nodes, `at_frame_idx`, which
matches against frames above its parent node; for example, in the script:
```
!where {function: foo}:
  !where {function: bar}:
    !and {at_frame_idx: 1}:
      !value x: 0
```
The `!value x` node checks the value of 'x' in 'foo' while the debugger is
inside 'bar'. Use of this attribute comes with some restrictions: a !where
node can never be nested under a !and{at_frame_idx} node, and neither can
another !and{at_frame_idx} node.
[Dexter] Enable after_hit_count for state nodes
The after_hit_count attribute for a state node causes it to become active
only after it would have become active N times. This uses the existing logic
for incrementing hit counts, i.e. after the node becomes "active", we will
not add another hit count until it stops being active for at least one step.
Since state nodes with after_hit_count do not become active before reaching
the required hit count, this requires us to keep track of an "early" set of
state nodes, meaning nodes that would be active if not for their
after_hit_count.
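A rough sketch of the hit-counting rule described above. `HitCounter` is a
hypothetical class, not Dexter code, and it assumes that "after N times" means
the node first becomes active on hit N+1:

```python
class HitCounter:
    """Tracks hits for one state node with an after_hit_count (hypothetical)."""

    def __init__(self, after_hit_count: int):
        self.after_hit_count = after_hit_count
        self.hits = 0
        # True if the node was in the "early" (would-be-active) set last step.
        self.was_early = False

    def step(self, would_be_active: bool) -> bool:
        """Feed one debugger step; return whether the node is really active."""
        # Count a hit only on the transition from inactive to would-be-active,
        # so staying active across several steps counts as a single hit.
        if would_be_active and not self.was_early:
            self.hits += 1
        self.was_early = would_be_active
        return would_be_active and self.hits > self.after_hit_count
```

With `after_hit_count` set to 1, the first run of would-be-active steps is
suppressed and the second run activates the node.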
[Dexter] Add !step node for testing stepping behaviour
This patch adds a node for generating metrics based on lines stepped on. The
new node has 3 versions: !step exactly, !step order, and !step never, which
check an expected list of line numbers against the actual line numbers seen
while the expect is active.
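One possible reading of the three versions, sketched as boolean checks. Dexter
reports metrics rather than booleans, and these semantics are inferred from
the node names, so treat the functions below as hypothetical:

```python
def step_exactly(expected, seen) -> bool:
    """The lines stepped on are exactly the expected lines, in order."""
    return list(seen) == list(expected)


def step_order(expected, seen) -> bool:
    """The expected lines appear in order; extra lines in between are ignored."""
    remaining = iter(seen)
    # `line in remaining` consumes the iterator, so order is enforced.
    return all(line in remaining for line in expected)


def step_never(expected, seen) -> bool:
    """None of the listed lines was ever stepped on."""
    return not set(expected) & set(seen)
```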
[Dexter] Add !type and !type/all nodes to test variable types
This patch adds the second kind of variable expect, !type, which tests the
type of a variable as reported by the debugger. As with !value, this is a
string comparison of the debugger output with the script expected value -
this means that even if two types are identical (e.g. typedef), a !type node
will only match the one that the debugger displays by default.
Script writing and aggregates work the same for !type as for !value, and the
metrics reported are largely similar, with the exception that "unexpected",
"seen", and "missing" metrics are reported separately for values and types.
[Dexter] Add condition check to state nodes
This patch enables the ability for state nodes to check conditions, meaning
they will be active only if the condition is met.
Condition evaluation is somewhat language specific; we directly check
whether the value of the evaluated expression is "true" (case-insensitive),
which works for the languages we actually use Dexter with, but may require
generalizing in future.
We also cache conditions as they are evaluated. Each time we step, we clear
all cached conditions for the current frame and any expired frames, but keep
the cached conditions for any frames rootwards from the current frame. This
prevents us from unexpectedly exiting out of a callee frame because of
debug info not surviving a stack unwind. If the early exit is desired, an
!and{at_frame_idx, condition} under the lower frame may suffice.
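The truth check and the cache invalidation rule could look roughly like this.
`is_true` and `ConditionCache` are hypothetical names, and identifying frames
by their depth from the root (0 is the outermost frame) is an assumption:

```python
def is_true(evaluated: str) -> bool:
    """Case-insensitive check of the debugger's evaluated expression."""
    return evaluated.lower() == "true"


class ConditionCache:
    """Caches condition results per frame depth (hypothetical sketch)."""

    def __init__(self):
        self._cache = {}  # (frame_depth, expression) -> bool

    def get(self, frame_depth: int, expression: str, evaluate) -> bool:
        """Return the cached result, evaluating via the callback on a miss."""
        key = (frame_depth, expression)
        if key not in self._cache:
            self._cache[key] = is_true(evaluate(expression))
        return self._cache[key]

    def on_step(self, current_depth: int) -> None:
        """Drop results for the current and deeper (expired) frames only."""
        self._cache = {k: v for k, v in self._cache.items()
                       if k[0] < current_depth}
```

After `on_step`, conditions for rootward frames are served from the cache,
while conditions in the current frame are evaluated again.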
[HIP] Remove default `-flto-partitions=8` in the HIP toolchain (#203948)
Summary:
This was added and made it into a release, but it never should've been a
default argument. Partitioning the LTO is a fundamentally different
compilation model and has real impacts on the generated code. Right now
it is added silently, which breaks non-Hostcall printf and degrades
performance due to split uselists.
This is a contract that should not be made default. "Compile times" is
not a justification to silently change compilation semantics; that is
the user's build system's job. Partitioning to a magic number is not an
appropriate solution when passing -flto-partitions=8 or `-Xarch_device
-flto-partitions=8` is perfectly viable and not hidden from the user.
This resolves the 12% performance regression observed when switching to
the LTO toolchain in HIP for dcsrgemm.
[lit] Migrate lit to ProcessPoolExecutor (#202681)
This PR is a foundational refactor for the lit single-process
re-architecture.
It migrates test execution from `multiprocessing.Pool` to
`concurrent.futures.ProcessPoolExecutor`. While the process model
remains unchanged (this is purely correctness and API modernization with
no behavior change on a passing suite), this migration establishes the
`concurrent.futures` API foundation required to introduce a
`ThreadPoolExecutor` backend in future PRs.
By collecting results with `as_completed` via an explicit `{future:
test}` map, this refactor also fixes two latent bugs:
1. **Stale timeout bug**: The per-iteration timeout budget was
previously computed once and reused. It is now correctly anchored to an
absolute deadline.
2. **Submission-order coupling**: Results are now safely routed by
future identity rather than submission index.
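The collection pattern described above can be sketched as follows. This is not
lit's code: `run_all` and `run_one` are hypothetical names, and the usage below
passes a `ThreadPoolExecutor` only to keep the sketch self-contained, since
any `concurrent.futures` executor exposes the same `submit` API.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_all(executor, tests, run_one, timeout_s: float) -> dict:
    """Run every test and return {test: result}, routed by future identity."""
    # Anchor the budget to one absolute deadline instead of reusing a
    # fixed per-iteration timeout.
    deadline = time.monotonic() + timeout_s
    future_to_test = {executor.submit(run_one, t): t for t in tests}
    results = {}
    remaining = max(0.0, deadline - time.monotonic())
    # as_completed measures its timeout from this call, so the whole loop
    # shares the remaining budget; it raises TimeoutError once it runs out.
    for future in as_completed(future_to_test, timeout=remaining):
        results[future_to_test[future]] = future.result()
    return results


with ThreadPoolExecutor(max_workers=2) as pool:
    lengths = run_all(pool, ["a", "bb", "ccc"], len, timeout_s=5.0)
```

Because results are looked up through the `{future: test}` map, the order in
which futures complete does not affect which test a result is attributed to.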
Signed-off-by: Prasoon Kumar <prasoonkumar054 at gmail.com>
[AArch64] Combine undef UZP and NVCAST away.
These are used to lower insert_subvec nodes quite early in SDAG. After
DAG combines run, it's possible that the inputs to these AArch64 nodes
become UNDEF.
Added a check for unsigned integer before accessing (#204276)
Possible fix for #203862.
The value is assumed to be unsigned, but a signed value is possible. Check
whether it is unsigned before accessing it as unsigned; otherwise access it
as a signed integer and cast it to unsigned.
[LLVM] Fix a bug in auto upgrading lifetime start/end intrinsics (#204601)
When creating the new intrinsic declaration, use the correct pointer
argument (arg #1) from the existing call. Currently, we use arg #0
(size) and end up creating an invalid intrinsic declaration. However,
later on we do not use this declaration directly and instead call
`CreateLifetimeStart` or `CreateLifetimeEnd` IRBuilder functions that
end up creating valid intrinsic declarations. The net result is that we
are left with a stray unused invalid declaration.
Fix this issue by creating the intrinsic with the right pointer argument
type.