[AMDGPU] Replace hardcoded register class IDs with [[#]] (NFC)
Replace hardcoded register class ID numbers in INLINEASM check
patterns with FileCheck's [[#]] numeric substitution pattern.
This makes tests resilient to changes in TableGen-generated
register class IDs when new register classes are added.
[Clang] define memory scopes as a builtin enum
Clang currently represents memory scopes as pre-defined preprocessor macros that
evaluate to integers. But so far, there are three sets of conflicting scopes:
"common" clang scopes, HIP scopes and OpenCL scopes. These sets use the same
integers in different orders, making it impossible to validate their use. A
better approach is to represent these scopes as enum types, so that the integer
values become less significant. Sema can now validate the scope argument by its
type instead.
Both C and C++ define an enum for memory_order, but there is no standard enum
for memory_scope. This change introduces a Clang-specific enum "memory_scope".
The pre-defined macros are now mapped to this enum. Later changes can add
similar enums for other languages.
enum __memory_scope {
__memory_scope_system,
__memory_scope_device,
__memory_scope_workgroup,
[19 lines not shown]
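A minimal sketch of why typed scopes are easier to validate than integers. This is not Clang's implementation: the names CommonScope, OtherScope, isWiderThanWorkgroup and acceptsScope are hypothetical, and the sketch uses C++ scoped enums to get a compile-time rejection, whereas the change above describes a builtin enum checked by Sema.

```cpp
#include <type_traits>

// Hypothetical stand-ins for two scope sets that reuse the same integers
// in different orders, as described for the macro-based scheme.
enum class CommonScope { System = 0, Device = 1, Workgroup = 2 };
enum class OtherScope { Workgroup = 0, Device = 1, System = 2 };

// A typed parameter: the raw integer value no longer decides the meaning,
// the enum type does.
constexpr bool isWiderThanWorkgroup(CommonScope S) {
  return S == CommonScope::System || S == CommonScope::Device;
}

// True if isWiderThanWorkgroup can be called with an argument of type T.
template <typename T>
constexpr bool acceptsScope =
    std::is_invocable_v<decltype(&isWiderThanWorkgroup), T>;
```

With plain integers, the value 0 would mean "system" in one set and "workgroup" in the other. Here a mismatched enum or a bare int is rejected by type rather than by value.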
[Clang][NFC] Introduce LanguageID::HIP_LANG and reclassify AtomicBuiltins
The TableGen class AtomicBuiltin is currently used for both OpenCL and HIP
atomic builtins, but there is no way to classify them. That class now takes
language as an argument. HIP is represented by a new enum member
LanguageID::HIP_LANG in this scheme.
Assisted-By: Claude Sonnet 4.5
[Clang][NFC] Use const ASTContext reference in Decl Create methods
Update Create() static factory methods, CreateDeserialized() methods, and
constructors in Decl.h to accept const ASTContext& instead of ASTContext&.
This change makes ASTContext parameters const-correct for declaration
creation and deserialization, affecting all Decl subclasses declared in
Decl.h.
Exceptions kept as non-const (only 3 methods):
- TranslationUnitDecl::Create() and constructor: stores non-const
ASTContext& member that is returned by getASTContext()
- DefaultedOrDeletedFunctionInfo::Create(): calls Context.Allocate()
which requires non-const access
Assisted-By: Claude Sonnet 4.5
[clang-tidy][NFC] Add missing Option tests in cppcoreguidelines and performance [3/N] (#185210)
This PR adds testcases for untested Options in `cppcoreguidelines` and
`performance` modules for better test coverage, specifically:
- `cppcoreguidelines-init-variables`: `IncludeStyle`, `MathHeader`.
- `cppcoreguidelines-pro-bounds-constant-array-index`: `IncludeStyle`.
- `performance-inefficient-string-concatenation`: `StrictMode`.
- `performance-no-automatic-move`: `AllowedTypes`.
- `performance-type-promotion-in-math-fn`: `IncludeStyle`.
- `performance-unnecessary-value-param`: `IncludeStyle`.
AI usage: assisted by Gemini 3 and Claude (writing part of the
test cases and pre-commit review).
[clang][test] Add missing FileCheck pipe in cxx20-module-directive.cpp (#185315)
The test had CHECK directives that were never executed because the RUN
line did not pipe output to FileCheck.
[LLVM][AArch64] Allow vector converts to run in streaming mode with FPRCVT (#177375)
Vector Saturated converts fp->int and converts int->fp are now allowed
to run in streaming mode when the FEAT_FPRCVT feature is available.
Therefore the patch replaces HasNEONandIsSME2p2StreamingSafe with
HasNEONandIsFPRCVTStreamingSafe, following the latest update in [1] for
the Vector CVT instructions.
It also allows the ISD node FP_TO_SINT_SAT to use custom lowering instead
of expansion when it is a fixed-length data type, because it can use the
Vector CVT instructions. I believe this is correct because there is
always a compatible CVT instruction for the SME core.
Because the compiler now allows fixed-length CVT to run in streaming
mode, I needed to fix the function LowerVectorFP_TO_INT_SAT: the FRINT
instruction was creating an illegal size for CVT instructions that was
[4 lines not shown]
[mlir][gpu] Fix crash in gpu-to-llvm with unranked memref and bare-ptr calling convention (#185062)
When using `--gpu-to-llvm` with `use-bare-pointers-for-kernels=true` and
a `gpu.launch_func` whose kernel has an `UnrankedMemRefType` argument,
`LLVMTypeConverter::promoteOperands` would hit an `llvm_unreachable`
because unranked memrefs are not supported with the bare-pointer calling
convention.
Fix this by checking for unranked memref kernel arguments before calling
`promoteOperands` and returning a proper conversion failure with a
diagnostic instead of crashing.
Fixes #184939
Assisted-by: Claude Code
[InstCombine] Handle fixed-width results in get_active_lane_mask fold (#185317)
The optimization introduced in #183329 incorrectly assumed that any
extraction from a scalable active lane mask used a scalable index. When
the result of a `llvm.vector.extract` is a fixed-width vector, the index
should not be multiplied by vscale.
This PR adds a check to ensure the index is only scaled by VScaleMin
when the return type of the extraction is a scalable vector, not
fixed-width.
Fixes #185271
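A minimal sketch of the indexing rule the fix relies on, assuming the extract index is implicitly multiplied by vscale only when the extracted result is a scalable vector. The helper name minFirstLane and its parameters are hypothetical and do not come from the InstCombine source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the smallest lane number an extract at index Idx
// can start from. For a scalable result the index is scaled by vscale, so
// the known minimum vscale gives a lower bound. For a fixed-width result
// the index is used as written.
uint64_t minFirstLane(uint64_t Idx, bool ResultIsScalable,
                      uint64_t VScaleMin) {
  assert(VScaleMin >= 1 && "vscale is at least 1");
  return ResultIsScalable ? Idx * VScaleMin : Idx;
}
```

For example, with index 4 and a minimum vscale of 2, a scalable result starts no earlier than lane 8, while a fixed-width result starts at lane 4. Scaling the fixed-width case as well is the mistake described above.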
[mlir][linalg] Emit proper diagnostic instead of crashing in SelectOp with index type (#183652)
`buildTernaryFn` for `TernaryFn::select` called `llvm_unreachable` when
the operand types were not `i1`, integer, or floating-point (e.g.,
`index` type).
Instead delegate this checking to the IR verifier: we don't need to
duplicate the checks from the verifier in an assertion here.
Fixes #179046
Assisted-by: Claude Code
[clang][bytecode][NFC] Name all expressions E (#185379)
At least the ones we visit directly via the visitor. This was always the
case, except for BinaryOperator and CastExpr.
kaleidoscope: add missing check in the FunctionAST::codegen (#76322)
The Kaleidoscope chapter 03 explanation describes this function
redefinition check, but it was missing from the code sample.
Signed-off-by: amila <amila.15 at cse.mrt.ac.lk>
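A self-contained sketch of the shape of such a redefinition check. FunctionStub, Module and codegenFunction are hypothetical stand-ins so the example compiles without LLVM; the real check lives in FunctionAST::codegen and operates on llvm::Function.

```cpp
#include <cassert>
#include <cstdio>
#include <map>
#include <string>

// Stand-in for llvm::Function: only tracks whether a body was emitted.
struct FunctionStub {
  bool HasBody = false;
  bool empty() const { return !HasBody; }
};

// Stand-in for the module's function table.
static std::map<std::string, FunctionStub> Module;

// Returns the function on success, or nullptr if Name already has a body.
FunctionStub *codegenFunction(const std::string &Name) {
  assert(!Name.empty() && "function needs a name");
  FunctionStub &F = Module[Name]; // look up or create the declaration
  if (!F.empty()) {
    std::fprintf(stderr, "Error: Function cannot be redefined.\n");
    return nullptr;
  }
  F.HasBody = true; // body emission would happen here
  return &F;
}
```

Defining the same name twice succeeds the first time and returns nullptr the second time, instead of silently appending a second body.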
[NVPTX][NFC] Fix TODO comment style in NVPTXMCAsmInfo.cpp (#185332)
Replace `@TODO` with `TODO` to follow LLVM comment conventions.
This is an NFC change with no functional impact.
clang/AMDGPU: Fix workgroup size builtins for nonuniform work group sizes (#185098)
These were assuming uniform work group sizes. Emit the v4 and v5
sequences to take the remainder group for the nonuniform case.
Currently the device libs use this builtin on the legacy ABI path with
the same sequence to calculate the remainder, and fully implements the
v5 path. If you perform a franken-build of the library with the updated
builtin, the result is worse. The duplicate sequence does not fully fold out.
However, it does not appear to be wrong. The relevant conformance tests still
pass.
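A minimal sketch of the remainder arithmetic for one dimension, assuming the usual rule that the last work-group along a dimension covers whatever is left of the grid. The function workGroupSize and its parameters are hypothetical; this is not the v4 or v5 instruction sequence that Clang emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Size of work-group GroupId along one dimension when GridSize need not
// be a multiple of the nominal work-group size.
uint32_t workGroupSize(uint32_t GridSize, uint32_t NominalSize,
                       uint32_t GroupId) {
  assert(NominalSize > 0 && "work-group size must be nonzero");
  uint64_t Start = uint64_t(GroupId) * NominalSize; // first work-item
  assert(Start < GridSize && "group must lie inside the grid");
  uint64_t Remaining = GridSize - Start;
  // Full groups get NominalSize; the trailing group gets the remainder.
  return static_cast<uint32_t>(std::min<uint64_t>(Remaining, NominalSize));
}
```

With a grid of 100 work-items and a nominal size of 32, groups 0 to 2 have 32 work-items and group 3 has 4. Assuming uniform sizes would report 32 for every group.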
[llvm][tools] Improve llvm-gpu-loader checks (#184791)
When devices are not properly initialized, llvm-gpu-loader follows corrupt pointers,
which results in hard-to-debug crashes.
This change improves the checks to avoid such situations.
[PPC64][Linux] Watchpoint configuration for PPC64 (#185192)
On PPC64, SIGTRAP is delivered before the triggering instruction
completes. The previous implementation wrongly assumed that the
instruction had already executed, so qHostInfo reported the wrong
information. As a result, the single step was skipped on Linux for PPC64
and execution got stuck at the store instruction.
This patch corrects this, making watchpoints work.
[clang][ssaf] Add --ssaf-extract-summaries= and --ssaf-tu-summary-file= options (#184421)
Along the way, it also fixes the static registration of the builtin
(upstream) formats and extractors.
libclc: Remove old mesa amdgcn targets (#185385)
amdgcn-- was probably dead when clover was being maintained, since
it switched to using amdgcn-mesa-mesa3d. Also remove amdgcn-mesa-mesa3d,
since clover is no longer in mesa.
[flang][NFC] Converted five tests from old lowering to new lowering (part 27) (#185340)
Tests converted from test/Lower/Intrinsics: date_and_time.f90,
dconjg.f90, dim.f90, dimag.f90, dprod.f90