[lldb] Iterate over a copy of the ModuleList in SearchFilter (#189009)
Avoid a potential deadlock caused by the search filter callback
acquiring the target's module lock by iterating over a copy of the list.
Fixes #188766
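A minimal sketch of the copy-then-iterate idea, not lldb's actual code. `Target`, `snapshot`, and `search` are hypothetical names, and the non-reentrant `threading.Lock` is an assumption standing in for whatever locking the real module list uses. The lock is held only while the copy is made, so a callback that needs the same lock can run without blocking.

```python
import threading


class Target:
    """Hypothetical stand-in for a target that owns a lock-guarded module list."""

    def __init__(self, modules):
        self._lock = threading.Lock()  # assumed non-reentrant for illustration
        self._modules = list(modules)

    def snapshot(self):
        # Hold the lock only long enough to copy the list.
        with self._lock:
            return list(self._modules)

    def add_module(self, module):
        # Any operation that needs the module lock, e.g. one made by a callback.
        with self._lock:
            self._modules.append(module)


def search(target, callback):
    # Iterate over the copy: the lock is not held while callbacks run,
    # so a callback that takes the lock itself cannot deadlock here.
    for module in target.snapshot():
        callback(module)
```

Iterating the live list while holding the lock would block forever in this sketch as soon as a callback called `add_module`.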
[AMDGPU] Add HWUI pressure heuristics to coexec strategy (#184929)
Adds basic support for new heuristics for the CoExecSchedStrategy.
InstructionFlavor provides a way to map instructions to different
"Flavors". These "Flavors" all have special scheduling considerations --
either they map to different HardwareUnits, or have unique scheduling
properties like fences.
HardwareUnitInfo provides a way to track and analyze the usage of some
hardware resource across the current scheduling region.
CandidateHeuristics holds the state for new heuristics, as well as the
implementations.
In addition, this adds new heuristics to use the various support pieces
listed above. tryCriticalResource attempts to schedule instructions that
use the most demanded HardwareUnit. If no such instructions are ready to
be scheduled, tryCriticalResourceDependency attempts to schedule
[4 lines not shown]
[MemProf] Dump inline call stacks as optimization remarks (#188678)
This patch teaches the MemProf matching pass to dump inline call
stacks as analysis remarks like so:
frame: 704e4117e6a62739 main:10:5
frame: 273929e54b9f1234 foo:2:12
inline call stack: 704e4117e6a62739,273929e54b9f1234
The output consists of two types of remarks:
- "frame": Acts as a dictionary mapping a unique MD5-based FrameID
to source information (function name, line offset, and column).
- "inline call stack": Provides the full call stack for a call site
as a sequence of FrameIDs.
Both types of remarks are deduplicated to reduce the output size.
This patch is intended to be a debugging aid.
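The deduplicated two-remark layout described above can be sketched as follows. This is an illustration of the output format only, not the pass's implementation; `emit_remarks` and the tuple layout of a frame are assumptions.

```python
def emit_remarks(call_stacks):
    """Return remark lines for a list of call stacks.

    Each call stack is a list of (frame_id, function, line_offset, column)
    tuples. This tuple layout is hypothetical and chosen for the sketch.
    """
    seen_frames = set()
    seen_stacks = set()
    lines = []
    for stack in call_stacks:
        for frame_id, function, line, column in stack:
            # "frame" remarks act as a dictionary, so emit each ID once.
            if frame_id not in seen_frames:
                seen_frames.add(frame_id)
                lines.append(f"frame: {frame_id:016x} {function}:{line}:{column}")
        key = tuple(frame[0] for frame in stack)
        # "inline call stack" remarks are deduplicated as well.
        if key not in seen_stacks:
            seen_stacks.add(key)
            ids = ",".join(f"{frame_id:016x}" for frame_id in key)
            lines.append(f"inline call stack: {ids}")
    return lines
```

Feeding the same two-frame stack twice yields the three lines shown in the commit message, because both the frames and the stack are emitted only on first sight.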
AMDGPU: Match fract from compare and select and minimum
Implementing this with any of the minnum variants is overconstraining
for the actual use. Existing patterns use fmin, then have to manually
clamp nan inputs to get nan propagating behavior. It's cleaner to express
this with a nan propagating operation to start with.
AMDGPU: Match fract pattern with swapped edge case check
A fract implementation can equivalently be written as
r = fmin(x - floor(x));
r = isnan(x) ? x : r;
r = isinf(x) ? 0.0 : r;
or:
r = fmin(x - floor(x));
r = isinf(x) ? 0.0 : r;
r = isnan(x) ? x : r;
Previously this only matched the first form. Also match
the case where the isinf check is the inner clamp. There are
a few more ways to write this pattern (e.g., move the clamp of
infinity to the input) but I haven't encountered that in the wild.
The existing code seems to be trying too hard to match noncanonical
variants of the pattern. This change only handles the form that all 4
permutations of compare and select produce out of instcombine.
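To see why the order of the two edge-case checks does not matter, here is a small model of both forms. It is a sketch under assumptions: the commit message elides fmin's second argument, so `BELOW_ONE` (the largest double below 1.0) is an illustrative clamp value, and all function names are hypothetical.

```python
import math

# Assumed clamp constant; the commit message does not state the real one.
BELOW_ONE = 1.0 - 2.0 ** -53


def _clamped_fraction(x):
    # Models fmin(x - floor(x), BELOW_ONE). For NaN or infinite x the
    # subtraction would be NaN and a NaN-avoiding fmin returns the constant.
    if math.isnan(x) or math.isinf(x):
        return BELOW_ONE
    return min(x - math.floor(x), BELOW_ONE)


def fract_nan_check_first(x):
    r = _clamped_fraction(x)
    r = x if math.isnan(x) else r
    r = 0.0 if math.isinf(x) else r
    return r


def fract_inf_check_first(x):
    r = _clamped_fraction(x)
    r = 0.0 if math.isinf(x) else r
    r = x if math.isnan(x) else r
    return r
```

An input cannot be both NaN and infinite, so at most one of the two selects fires and the order has no effect on the result.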
[compiler-rt] Add PTX feature specifically when CUDA is not available (#189083)
Summary:
People need to be able to build this without a CUDA installation.
Long term we should bump up the minimum version as I'm pretty sure every
architecture before this has been deprecated by NVIDIA.
[Scudo] Disable ScudoCombinedTests.NewType (#189070)
This is failing in some configurations on AArch64 Linux. Given there are
a lot of follow-up commits that make this hard to revert, just disable
it for now pending future investigation.
[clang-format] Add pre-commit CI env var support to git-clang-format (#188816)
When git-clang-format is invoked with no explicit commit arguments and
both PRE_COMMIT_FROM_REF and PRE_COMMIT_TO_REF are set, the script
automatically uses those refs as the diff range and implies --diff. If
the variables are absent, existing behavior is fully preserved.
This allows projects to use `git-clang-format` directly inside CI
pipelines via the [pre-commit](https://pre-commit.com/) framework
without any wrapper scripts or extra configuration.
Closes: #188813
No existing lit test suite for this script. Verified manually that env
vars activate two-commit diff mode, existing behavior is preserved
without them, and explicit CLI args always override them.
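The selection rule described above can be sketched like this. It is not the script's actual code: `resolve_commits` and its parameters are hypothetical, and only the two environment variable names come from the commit message.

```python
def resolve_commits(cli_commits, diff_flag, env):
    """Pick the commit range and diff mode.

    cli_commits: commits given explicitly on the command line (may be empty).
    diff_flag:   whether --diff was passed explicitly.
    env:         a mapping such as os.environ.
    """
    from_ref = env.get("PRE_COMMIT_FROM_REF")
    to_ref = env.get("PRE_COMMIT_TO_REF")
    # The env vars apply only when no commits were given and both are set.
    if not cli_commits and from_ref and to_ref:
        return [from_ref, to_ref], True  # implies --diff
    # Otherwise behave exactly as before.
    return list(cli_commits), diff_flag
```

Explicit command-line commits take priority because the environment is consulted only when `cli_commits` is empty.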
Update clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
Replace "const char * const" with "llvm::StringLiteral"
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
[OFFLOAD] Add spirv implementation for named barrier (#180393)
This change adds an implementation of named barriers for the SPIRV backend.
Since SPIRV has no built-in API or intrinsics for named barriers,
the implementation loosely follows the AMD implementation.
Update clang/lib/ScalableStaticAnalysisFramework/Analyses/UnsafeBufferUsage/UnsafeBufferUsage.cpp
adjust the position of the file title
Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
[AMDGPU][MC] Improving assembler error message for unsupported instructions (#185778)
The updated error message shows both the instruction name and the GPU
target name.
[mlir][OpenMP] Add iterator support to depend clause
Extend the depend clause to support `!omp.iterated<Ty>` handles
alongside plain depend vars, so the IR can represent both forms.
Assisted with copilot
[HLSL] Fix up Texture2D-mips-errors test
The Texture2D-mips-errors test was supposed to test for an error when the mips
types are used as templates. It was initially disabled because of a
crash. On further investigation, the crash was related to int2(0,0), and
not the mips type.
Follow-up issue for the int2(0,0) crash: #189086
Fixes #188556
[MLIR][TableGen] Fix ArrayRefParameter in struct format roundtrip (#189065)
When an ArrayRefParameter (or OptionalArrayRefParameter) appears in a
non-last position within a struct() assembly format directive, the
printed output is ambiguous: the comma-separated array elements are
indistinguishable from the struct-level commas separating key-value
pairs.
Fix this by wrapping such parameters in square brackets in both the
generated printer and parser. The printer emits '[' before and ']' after
the array value; the parser calls parseLSquare()/parseRSquare() around
the FieldParser call. Parameters with a custom printer or parser are
unaffected (the user controls the format in that case).
Fixes #156623
Assisted-by: Claude Code
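The ambiguity and the bracket fix can be illustrated with a toy printer. This is a sketch only: `print_struct` is hypothetical, and the `key = value` syntax is a simplified stand-in for the generated MLIR output.

```python
def print_struct(fields, bracket_arrays):
    """Print (key, value) pairs in a struct-like, comma-separated form.

    List values model array parameters. When bracket_arrays is true, an
    array in a non-last position is wrapped in square brackets.
    """
    parts = []
    for index, (key, value) in enumerate(fields):
        if isinstance(value, list):
            text = ", ".join(str(v) for v in value)
            # Only arrays followed by another field need delimiters.
            if bracket_arrays and index != len(fields) - 1:
                text = "[" + text + "]"
        else:
            text = str(value)
        parts.append(f"{key} = {text}")
    return ", ".join(parts)
```

Without brackets, `dims = 1, 2, 3, flag = 7` gives a parser no way to tell where the array ends; with them, `dims = [1, 2, 3], flag = 7` can be parsed back unambiguously.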
libclc: Simplify fract implementation
This is nan propagating, so it's unnatural to implement it
in terms of the nan avoiding fmin. Implement with compare and
select, which is the least constrained way to implement the clamp.
ValueTracking: x - floor(x) cannot introduce overflow
This returns a value with an absolute value less than 1 so it
should be possible to propagate no-infs.
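A quick numeric check of the bound, written as a sketch with hypothetical helper names. It uses Python doubles rather than LLVM IR semantics, and it only samples a few finite inputs, so it illustrates the claim rather than proving it. Note that in floating point the subtraction can round up to exactly 1.0 for tiny negative inputs, which is still far from overflow.

```python
import math


def fractional_part(x):
    # x - floor(x) for a finite float.
    return x - math.floor(x)


def stays_bounded(samples):
    # True if every result is finite and lies within [0, 1].
    results = [fractional_part(x) for x in samples]
    return all(math.isfinite(r) and 0.0 <= r <= 1.0 for r in results)


# Finite inputs spanning small, large, and extreme magnitudes.
SAMPLES = [0.0, 2.5, -2.5, 1e300, -1e300, 1.7976931348623157e308, -1e-20]
```

Even for inputs near the largest finite double, the result stays within [0, 1], so no infinity can appear unless the input was already infinite.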