clang: Move __builtin_amdgcn_processor_is diagnostic test to sema
This wasn't checking the codegen result, so move it to the right place
and use -verify instead of FileChecking stderr.
Co-authored-by: Claude (Opus 4.8) <noreply at anthropic.com>
[clang][bytecode] Pass AccessKinds to Check{Constant,Mutable} (#205720)
So we can pass them on to `diagnoseNonConstVariable`.
This doesn't make a difference right now but is needed for a future
commit.
[Flang][Driver] Add support for option '-fpseudo-probe-for-profiling' in flang (#205046)
Added support for option `-fpseudo-probe-for-profiling` in flang.
- When the option `-fpseudo-probe-for-profiling` is passed, the compiler
sets the `PseudoProbeForProfiling` flag and triggers the
`SampleProfileProbePass`. This pass inserts `llvm.pseudoprobe(..)`
intrinsic calls and `!llvm.pseudo_probe_desc` metadata into the IR.
[flang][OpenMP] Fix declare reduction lookup for USE...ONLY imports
CheckSymbolSupportsType walked every module in the global scope to find
declare-reduction declarations. That accepted reductions from modules
that were never USE'd, or were excluded via USE...ONLY, and it still
rejected some valid imports such as a renamed operator.
Replace the global scan with FindUserReduction(), which resolves the
reduction the way name resolution resolves the operator. It checks a
directly visible reduction first, then follows the operator's USE
associations and merged-generic sources to the declaring modules,
re-deriving the source module's mangled name for renamed operators. The
search recurses through re-exporting (facade) modules and is type-aware,
so an operator that carries reductions for several types resolves to the
one supporting the requested type. A locally declared reduction is
authoritative and shadows reductions reachable through the operator.
Consolidate the duplicated GetReductionFortranId() (formerly static in
both resolve-names.cpp and mod-file.cpp) into a shared utility, fixing a
[11 lines not shown]
clang/AMDGPU: Simplify cpu name checks for __builtin_amdgcn_is_processor
Instead of trying to figure out which TargetInfo to use, skip it and
directly use the source of truth from TargetParser. This avoids regressions
in future commits where isValidCPUName will be conditionally filtered.
[NewPM][AArch64] Port AArch64SRLTDefineSuperRegs pass to NewPassManager (#202803)
Standard port for the AArch64SRLTDefineSuperRegs pass.
Assisted by Gemini
[WebAssembly] Represent reference types as TargetExtType (#203165)
Originally #71540 by Paolo Matos; I picked it up and finished it.
Resolves https://github.com/llvm/llvm-project/issues/69894.
Model WebAssembly externref and funcref as target("wasm.externref") /
target("wasm.funcref") TargetExtTypes instead of pointers in
non-integral address spaces 10 and 20.
The entire WebAssemblyLowerRefTypesIntPtrConv can be removed.
This breaks the GlobalISel handling for reference types, so I just disabled
GlobalISel handling for functions that use them.
I added intrinsics for `wasm.ptr.to_funcref` and `wasm.funcref.to_ptr`.
ptr.to_funcref does a table.get from the indirect function pointer
table. As a special case, 0 is converted to the null funcref rather than
doing table.get on 0. `wasm.funcref.to_ptr` is only handled when we call
it immediately, otherwise it will fail to lower. We could dynamically
[13 lines not shown]
[clang-doc] Test more language constructs (#205585)
We're missing several different language constructs in our tests. This
patch simply adds the basic tests and captures the output without trying
to fix or adjust any behavior, and can be considered a sort of precommit
test for future fixes to the various documentation components.
[clang][bytecode] Ignore indeterminate APValues (#205555)
They don't produce a value and for us, that means we just need to ignore
them and not initialize anything.
[flang][cuda] Accept cuf kernel do without scalar (#205705)
The base compiler accepts `!$cuf kernel do()` instead of raising an
error. Update the parser to accept the same syntax.
`!$cuf kernel do()` is equivalent to `!$cuf kernel do`
[AArch64] Add missing SubtargetFeature for hip12 core (#205246)
The initial patch for the hip12 core had omitted several subtarget
features:
FeatureFP16FML, FeatureFlagM, FeaturePredRes, FeatureSB, FeatureSSBS,
FeatureCCIDX, FeatureRandGen.
[SampleProfileMatcher] Sample profile duplication to avoid stale CFG profile matching conflicts (#202460)
Stale profile matching may map multiple different IR anchors into one
profile anchor because of the common function basename. For example,
`foo(int)` and `foo<bar>(float)` can both be mapped to `foo()` if
`foo()` is the only function that has a profile. This creates
conflicting CFG matching for `foo(int)` and `foo<bar>(float)` when they
each run stale profile matching. The CFG matching results will be
overwritten among the conflicting functions, which triggers the
following assertion failure:
https://github.com/llvm/llvm-project/blob/7087094b05a1bba64a99474cc501328919e11b4a/llvm/lib/Transforms/IPO/SampleProfileMatcher.cpp#L332-L333
This patch tries to detect this conflict during the stale CG matching,
and create duplicated profiles to avoid CFG matching conflicts.
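The duplication idea can be sketched in a few lines. This is only an illustration under assumptions: `Profile`, `duplicateConflictingProfiles`, and the plain string maps are hypothetical stand-ins, not the data structures used in SampleProfileMatcher.cpp. The sketch gives every IR function its own profile object, so CFG matching results written for one function cannot overwrite those of another.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a function's sample profile.
struct Profile {
  std::string name;
  std::vector<int> counts;
};

// irToProfile maps an IR function name to the profile name that stale
// matching picked for it. The result holds one profile object per IR
// function. When several IR functions share a profile (a conflict), each
// copy is renamed after its IR function so the copies stay distinct.
std::map<std::string, Profile> duplicateConflictingProfiles(
    const std::map<std::string, std::string> &irToProfile,
    const std::map<std::string, Profile> &profiles) {
  // Count how many IR functions were mapped onto each profile.
  std::map<std::string, int> users;
  for (const auto &entry : irToProfile)
    ++users[entry.second];

  std::map<std::string, Profile> perFunction;
  for (const auto &entry : irToProfile) {
    auto it = profiles.find(entry.second);
    assert(it != profiles.end() && "matched profile must exist");
    Profile copy = it->second;
    if (users[entry.second] > 1)
      copy.name = entry.first; // conflict: private copy under the IR name
    perFunction.emplace(entry.first, copy);
  }
  return perFunction;
}
```

With `foo(int)` and `foo<bar>(float)` both mapped to `foo`, the result contains two independent copies of the `foo` profile, one per IR function.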
Fix ProcessElfCore::FindModuleUUID() so it works with symlinks. (#205235)
ProcessElfCore was reading the NT_FILE list and using that to help
FindModuleUUID to provide UUID information when loading core files. The
NT_FILE list contains resolved paths only, while the
DynamicLoaderPOSIXDYLD plug-in was using paths found in the r_debug
structure which contains a linked list of all of the shared libraries in
a process. The issue was that these paths could be symlinks, which would
cause ProcessElfCore::FindModuleUUID(...) to fail because the paths wouldn't
match up. This led to ProcessElfCore often not being able to provide
UUIDs for shared libraries, causing the incorrect binaries to be loaded
from the current machine even when the shared library UUIDs don't match.
The solution was to add the ability for a ModuleSpec to contain a load
address for the shared library. This allows ProcessElfCore to uniquely
identify a library regardless of the name used in NT_FILE. We can now
correctly supply the UUID from the .gnu-build-id to any binaries which
use symlinks when linking, but have differing resolved paths to the
libraries.
[13 lines not shown]
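To make the address-based lookup concrete, here is a minimal sketch. It assumes each NT_FILE entry records an address range, a resolved path, and a UUID; `CoreFileEntry` and this free-standing `findModuleUUID` are hypothetical and are not the LLDB API. Matching the load address against the mapped range is also an assumption about how the comparison could be done.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical view of one NT_FILE entry plus the UUID found for it.
struct CoreFileEntry {
  uint64_t start;           // first mapped address
  uint64_t end;             // one past the last mapped address
  std::string resolvedPath; // NT_FILE stores resolved paths only
  std::string uuid;
};

// Returns the UUID for a module, or an empty string if none matches.
// A load address, when available, identifies the library regardless of
// whether `path` is a symlink; the path comparison is only a fallback.
std::string findModuleUUID(const std::vector<CoreFileEntry> &entries,
                           const std::string &path, uint64_t loadAddress,
                           bool hasLoadAddress) {
  if (hasLoadAddress) {
    for (const CoreFileEntry &entry : entries) {
      assert(entry.start < entry.end && "malformed mapping");
      if (loadAddress >= entry.start && loadAddress < entry.end)
        return entry.uuid;
    }
  }
  for (const CoreFileEntry &entry : entries)
    if (entry.resolvedPath == path)
      return entry.uuid;
  return "";
}
```

A symlinked name such as `/usr/lib/libfoo.so` fails the path comparison against a resolved `/usr/lib/libfoo.so.1.2.3`, but the same query succeeds once the load address is supplied.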
[Clang] Transform SubstNonTypeTemplateParmExpr replacements in a constant-evaluated context (#196791)
Fixes #175831.
When transforming a `SubstNonTypeTemplateParmExpr`,
`TreeTransform::TransformSubstNonTypeTemplateParmExpr` calls
`Sema::CheckTemplateArgument` so that any sema annotations (such as
implicit casts) that were stripped from the replacement are recovered.
This is done in whatever evaluation context the node happens to appear
in after substitutions.
Since the normalization of constraints, a `SubstNonTypeTemplateParmExpr`
can end up inside an unevaluated operand, so the replacement gets
rebuilt in an unevaluated context.
Entities it refers to are then not odr-used: for example, when a call
materializes a by-value function parameter of class type, the copy
constructor is never marked odr-used and its definition is never
instantiated.
The constant evaluation performed by `CheckTemplateArgument` afterwards
[11 lines not shown]
[Clang] Fixed an assertion in constant evaluation when using a defaulted comparison operator in a union (#198830)
Fixes an assertion failure by decoupling `IsTrivialMemoryOperation` from
assignment operators.
fix #147127
[libc][complex] Add cargf and carg functions to libc complex math (#204087)
This PR adds the carg and cargf functions to libc complex and also adds
test cases to cover some special inputs.
---------
Signed-off-by: jinge90 <ge.jin at intel.com>
[MLIR][XeGPU] Fix order remapping in layout transpose (#205212)
LayoutAttr::transposeDims and LayoutAttr::isTransposeOf mishandled the
`order` field when transposing a layout. The `order` field is
fundamentally different from the size-valued fields (sg_layout, sg_data,
inst_data, lane_layout, lane_data): its values are dimension indices
(order[0] is the fastest-varying dim), not per-position sizes. The two
require different transpose rules:
- Size fields — reindex by position: new[i] = orig[perm[i]]
- order — relabel values through the inverse permutation: newOrder[i] =
inversePerm[origOrder[i]]
Both functions incorrectly applied the size-field rule to `order`.
Because the bug was applied consistently in both places, it stayed
hidden for trivial/symmetric (e.g. 2D [1,0]) permutations, where the two
rules happen to coincide. It only surfaces for non-trivial permutations
such as the 3D [1,0,2] produced by a broadcast→transpose chain.
Assist-by-Claude
[3 lines not shown]
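The two transpose rules from the XeGPU entry above can be checked with plain vectors. This is a sketch only: `transposeSizes`, `invertPermutation`, and `transposeOrder` are hypothetical helpers written for illustration, not the `LayoutAttr` methods.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Size-valued fields: reindex by position, new[i] = orig[perm[i]].
std::vector<int> transposeSizes(const std::vector<int> &orig,
                                const std::vector<int> &perm) {
  assert(orig.size() == perm.size() && "rank mismatch");
  std::vector<int> result(perm.size());
  for (std::size_t i = 0; i < perm.size(); ++i)
    result[i] = orig[perm[i]];
  return result;
}

// Builds the inverse permutation: inverse[perm[i]] = i.
std::vector<int> invertPermutation(const std::vector<int> &perm) {
  std::vector<int> inverse(perm.size());
  for (std::size_t i = 0; i < perm.size(); ++i)
    inverse[perm[i]] = static_cast<int>(i);
  return inverse;
}

// order holds dimension indices, so its values are relabeled:
// newOrder[i] = inversePerm[origOrder[i]].
std::vector<int> transposeOrder(const std::vector<int> &order,
                                const std::vector<int> &perm) {
  assert(order.size() == perm.size() && "rank mismatch");
  const std::vector<int> inverse = invertPermutation(perm);
  std::vector<int> result(order.size());
  for (std::size_t i = 0; i < order.size(); ++i)
    result[i] = inverse[order[i]];
  return result;
}
```

For the 2D permutation [1,0] and order [1,0], both rules give [0,1], which is why the bug stayed hidden. For the 3D permutation [1,0,2] and order [2,1,0], the size-field rule gives [1,2,0] while the relabeling rule gives [2,0,1].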
[RISCV] Emit .option arch extensions without the "experimental-" prefix (#205471)
We currently emit the "experimental-" prefix in .option arch, e.g.
`.option arch, +experimental-zicfiss`, but the assembler can't parse
that back.
There are two ways to fix this:
1. Teach the assembler to accept `.option arch, +experimental-zicfiss`.
2. Emit `.option arch, +zicfiss` instead of `.option arch,
+experimental-zicfiss`.
This patch takes the second approach, which better fits the .option arch
syntax we defined. Experimental extensions are still guarded by
`-menable-experimental-extensions`.
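The second approach amounts to dropping the prefix before printing the directive. A minimal sketch, assuming a hypothetical helper `optionArchDirective` that receives the feature name as a string (the actual emitter code in the RISC-V backend is not shown here):

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds an `.option arch, +ext` line and drops a
// leading "experimental-" so the assembler can parse the output back.
std::string optionArchDirective(const std::string &featureName) {
  assert(!featureName.empty() && "feature name required");
  const std::string prefix = "experimental-";
  std::string name = featureName;
  // compare() clips the length, so names shorter than the prefix are safe.
  if (name.compare(0, prefix.size(), prefix) == 0)
    name.erase(0, prefix.size());
  return ".option arch, +" + name;
}
```

Calling it with `experimental-zicfiss` yields `.option arch, +zicfiss`; names without the prefix pass through unchanged.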
[DAG] Fix illegal type in srl(bitcast(build_vector)) fold (#205074)
The fold
```
(srl (bitcast (build_vector e1, ..., eN)), (N-1) * eltsize) -> (zext eN)
```
added in #181412 built the result through a narrow element integer type,
which can be illegal (e.g. i16 on RV32 with the P extension, where
`<2 x i16>` is legal). When the fold runs in the last DAG combine, that
illegal type hits the "Unexpected illegal type!" assert.
Build the result directly in the result type `VT` and mask off the high
bits instead:
[13 lines not shown]
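The arithmetic behind the fold can be checked with ordinary integers. The sketch below is not SelectionDAG code and does not reproduce the elided pattern; it assumes 16-bit elements and a little-endian layout (element 0 in the low bits), and the helper names are hypothetical. It shows that shifting out the low elements leaves the last element zero-extended, and that the same value can be obtained entirely in the wide type by masking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Models bitcasting a vector of i16 elements to one wide integer,
// assuming element 0 occupies the lowest 16 bits.
uint64_t packLittleEndian(const std::vector<uint16_t> &elts) {
  assert(!elts.empty() && elts.size() <= 4 && "must fit in 64 bits");
  uint64_t packed = 0;
  for (std::size_t i = 0; i < elts.size(); ++i)
    packed |= static_cast<uint64_t>(elts[i]) << (16 * i);
  return packed;
}

// Left-hand side of the fold: srl by (N-1) * eltsize.
uint64_t shiftOutLowElements(const std::vector<uint16_t> &elts) {
  return packLittleEndian(elts) >> (16 * (elts.size() - 1));
}

// Wide-type-only alternative: start from a wide value whose high bits
// are arbitrary and clear everything above the element width.
uint64_t maskToElement(uint64_t wide) { return wide & 0xFFFFu; }
```

For the pair (0x1234, 0xABCD), the shift yields 0xABCD, the same value as zero-extending the last element, and masking a wide value that carries 0xABCD in its low 16 bits gives the same result without ever forming a 16-bit integer.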
[libclc] Delete wrong implementation nvptx clc_isinf (#205699)
The file calls __nv_isinf, which returns 1 for true on vector input, while
the generic clc_isinf returns -1 for true on vector input. Using the
nvptx clc_isinf in OpenCL isinf violates the OpenCL spec.
Found the issue in https://github.com/intel/llvm/pull/22413
[x86] Handle implicit sections when determining if a global is large (#204247)
Just like explicit sections.
We were seeing globals with implicit sections marked large under the
medium code model.
Assisted-by: Gemini
[libc][stat] Move internal statx type definition into OSUtil/linux (#203975)
This PR refactors the internally defined `statx` buffer to a shareable
location so other LLVM-libc linux entrypoints may call `statx` without
concern for name conflicts around `linux/stat.h`.
Specifically, this PR moves `libc/src/sys/stat/linux/kernel_statx.h` to
`libc/src/__support/OSUtil/linux/stat/` and splits it into two files,
`kernel_statx_types.h` + `stat_via_statx.h`.
This will be used by `realpath`.
[Instrumentor] Add runtime examples: [1/N] A flop counter
This adds an instrumentor-tools folder to compiler-rt to showcase
use cases of the instrumentor. The initial example is a program that,
via instrumentation, counts the number of flops performed. Call and
intrinsic support will follow after #198042.
Partially developed by Claude (AI), tested and verified by me.
[Driver][SYCL] Treat stdin as C++ when -fsycl is active (#204968)
1723b7a30145 added a frontend check that rejects C inputs when SYCL mode
is active (since SYCL requires C++). The stdin path in BuildInputs
hardcoded TY_C regardless of driver mode, so `-fsycl -dM -E -` would
pass -x c to cc1 and trigger the new diagnostic.
Fix: use TY_CXX for stdin when IsSYCL.
Also, upstream a downstream test that fails due to 1723b7a30145.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply at anthropic.com>