[SPIRV] Add bitreverse expansion for kernel (#186412)
OpBitReverse is available only when the Shader capability or the
SPV_KHR_bit_instructions extension is enabled. For targets without
either, introduce software emulation of G_BITREVERSE based on the
parallel bit-reversal algorithm:
https://graphics.stanford.edu/~seander/bithacks.html#ReverseParallel
The emulation supports 8/16/32/64-bit scalars and vectors using bitwise
operations (shifts, AND, OR). A helper lambda avoids undefined behavior
when computing masks for 64-bit types.
Add tests for both the emulation and native paths across all supported
types.
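
The parallel bit-reversal scheme referenced above can be sketched in
plain C++ as follows. This is only an illustration of the technique on
host integers, not the SPIR-V legalizer code from the patch: the names
`altMask` and `bitReverse` are hypothetical, and the mask helper shown
here divides an all-ones value instead of shifting by the full width,
which is one possible way (an assumption, not necessarily the patch's
approach) to avoid the undefined `1 << 64`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: alternating mask for blocks of `Shift` bits inside a
// `Bits`-wide value (Shift=1 gives 0x55..., Shift=2 gives 0x33..., and so on).
// The all-ones value is special-cased for 64 bits because `1ull << 64` is
// undefined behavior.
inline uint64_t altMask(unsigned Bits, unsigned Shift) {
  assert(Bits <= 64 && Shift < Bits);
  uint64_t AllOnes = Bits == 64 ? ~0ull : ((1ull << Bits) - 1);
  return AllOnes / ((1ull << Shift) + 1);
}

// Swap adjacent blocks of 1, 2, 4, ... bits until the whole value is reversed.
// Uses only shifts, AND, and OR, like the emulation described above.
template <typename T> T bitReverse(T V) {
  static_assert(std::is_unsigned<T>::value, "sketch handles unsigned types");
  const unsigned Bits = sizeof(T) * 8;
  uint64_t X = V;
  for (unsigned Shift = 1; Shift < Bits; Shift <<= 1) {
    uint64_t M = altMask(Bits, Shift);
    X = ((X >> Shift) & M) | ((X & M) << Shift);
  }
  return static_cast<T>(X);
}
```

For example, `bitReverse<uint8_t>(0x01)` yields `0x80`, and applying the
function twice returns the original value.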
Assisted-by: Claude Code
libclc: Partially implement nonuniform subgroup reduce functions (#188929)
For AMDGPU these are identical to the uniform case. Stub out the missing
cases with traps to avoid test failures from undefined symbols while
keeping the structure consistent.
[mlir][spirv][gpu] Add lowering for gpu.subgroup_broadcast (#187947)
Add lowering for `gpu.subgroup_broadcast` and
`gpu.subgroup_broadcast_first` to `spirv.GroupNonUniformBroadcast` and
`spirv.GroupNonUniformBroadcastFirst`.
Fixes #157940
PatternMatch: Add matchers for positive or negative infinity
The existing m_Inf deceptively matches both positive and negative
infinities. Add variants that match the specific sign.
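
To illustrate the distinction, here is a minimal standalone sketch using
plain doubles rather than LLVM values. The predicate names are
hypothetical stand-ins and are not the matcher names added by this
change; the first predicate mirrors the sign-agnostic behavior described
for m_Inf.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical stand-ins for the matchers, operating on host doubles.
inline bool isAnyInf(double V) { return std::isinf(V); }
inline bool isPosInf(double V) { return std::isinf(V) && !std::signbit(V); }
inline bool isNegInf(double V) { return std::isinf(V) && std::signbit(V); }

// Small usage demo: the sign-agnostic check accepts both infinities,
// while the signed variants accept exactly one each.
inline void demoInfPredicates() {
  const double Inf = std::numeric_limits<double>::infinity();
  assert(isAnyInf(Inf) && isAnyInf(-Inf));
  assert(isPosInf(Inf) && !isPosInf(-Inf));
  assert(isNegInf(-Inf) && !isNegInf(Inf));
}
```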
[clang-tidy] Add missing #include insertion in macros for modernize-use-std-print (#188394)
Follow-up to #188247
---------
Co-authored-by: Victor Chernyakin <chernyakin.victor.j at outlook.com>
Support Serializing/Deserializing Extended Generic Selection Expressions
Clang supports using a type as the predicate for generic selection
expressions but lacked support for serializing/deserializing these
extended generic selection expressions.
Signed-off-by: Will Hawkins <hawkinsw at obs.cr>
[lldb] use the Py_REFCNT() macro instead of directly accessing member (#188161)
[PyObject members are not to be accessed
directly](https://docs.python.org/3/c-api/structures.html#c.PyObject),
but rather through macros, in this case `Py_REFCNT()`.
In most CPython builds, i.e. those with the Global Interpreter Lock
(GIL) enabled, `Py_REFCNT()` expands to an access of `ob_refcnt` anyway.
However, in a free-threaded CPython build with the limited API disabled
(the limited API requires the GIL for now), the `ob_refcnt` member does
not exist, so the build fails. The macro expands to the correct access
method in the free-threaded configuration.
(cherry picked from commit 2a7b0f06d2060dbab8fa38fae7689f2d9048fa9d)
[llvm] Attempt to re-enable llvm-debuginfod-find test on Windows bots (#188810)
Next attempt after https://github.com/llvm/llvm-project/pull/187753 to
record headers for HTTP requests from llvm-debuginfod-find in Python on
the Windows bots. It has always worked for me locally, but not on the
bots. This time we skip the no-headers test case to check if we get
output in the actual headers case.
[LoongArch] Fix incorrect indexing of incoming byval arguments in tail call eligibility check (#188006)
The loop that validates byval arguments in
`isEligibleForTailCallOptimization()` incorrectly used the loop index
`i` when accessing `getIncomingByValArgs()`, even though `j` is the
index tracking the number of encountered byval arguments.
This mismatch could lead to out-of-bounds access or incorrect type
comparisons when non-byval arguments are interleaved with byval
arguments, causing the tail call eligibility check to fail or behave
incorrectly.
Fix this by using `j` consistently as the index into the incoming byval
argument list and only incrementing it after the bounds check.
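
The indexing pattern can be sketched with a simplified standalone model.
This is not the LoongArch code: the `Arg` struct, the size comparison,
and the function name are assumptions made for illustration. Only the
use of a separate counter `J` for the byval list, bounds-checked before
use and incremented afterwards, reflects the fix described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical simplified model of an outgoing call argument.
struct Arg {
  bool IsByVal;
  unsigned Size;
};

// OutArgs holds every outgoing argument; IncomingByVal holds one entry per
// incoming byval argument only, so it must be indexed by J, never by I.
inline bool byValArgsCompatible(const std::vector<Arg> &OutArgs,
                                const std::vector<unsigned> &IncomingByVal) {
  std::size_t J = 0; // number of byval arguments seen so far
  for (std::size_t I = 0; I < OutArgs.size(); ++I) {
    if (!OutArgs[I].IsByVal)
      continue;
    if (J >= IncomingByVal.size()) // bounds check before touching the list
      return false;
    if (IncomingByVal[J] != OutArgs[I].Size)
      return false;
    ++J; // advance only after the check and comparison
  }
  return true;
}
```

With arguments `{non-byval, byval}` and a single incoming byval entry,
indexing by `I` would read element 1 of a one-element list, whereas
indexing by `J` reads element 0.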
Fixes #187832
(cherry picked from commit ab17b5408ac83a03807b6f0ea22f51dfb84b0b8a)
[clang][AST] Preserve qualifiers in getFullyQualifiedType for AutoType (#187717)
A previous change (86c4e96) dropped the qualifiers attached to an
AutoType QualType when replacing it with the type returned by
`getDeducedType()`. Preserve the qualifiers from the original QualType
and reapply them to the deduced type.
(cherry picked from commit 1f9c54a15a87f72ca45fb47ec006d1eae63f4eb0)