LLVM/project 486370cclang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/Sema SemaAMDGPU.cpp

[AMDGPU][Clang] refactor addrspace and scope checks [NFC] (#199175)

Assisted-By: Claude Opus 4.6
DeltaFile
+30-23clang/lib/Sema/SemaAMDGPU.cpp
+1-1clang/test/SemaOpenCL/builtins-amdgcn-error-gfx1250-cooperative-atomics.cl
+1-1clang/include/clang/Basic/DiagnosticSemaKinds.td
+32-253 files

LLVM/project 423d126llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll accvgpr-spill-scc-clobber.mir

Merge branch 'main' into revert-197745-jn/clang-avoid-casts
DeltaFile
+7,612-6,640llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+8,268-12llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+2,501-2,502llvm/test/CodeGen/AMDGPU/gfx-callable-argument-types.ll
+2,151-2,154llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+1,981-1,979llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+1,802-1,805llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+24,315-15,0921,577 files not shown
+100,185-37,5691,583 files

LLVM/project 191aa29llvm/docs LangRef.rst

[LangRef] Do not allow free via synchronization in nofree (#195658)

The nofree attribute is currently specified to only forbid direct free
calls inside the function. A nofree function is still allowed to compel
a pointer to be freed by a different thread through synchronization.

This is currently only spelled out for the function-level nofree
attribute, but I assume the same semantics also hold for argument nofree
(and this matches how the Attributor implementation infers it).

The original motivation for this definition was to keep the attributes
orthogonal and independently inferable. However, the problem is that
nosync is too strong a condition: it excludes *any* synchronization, not
just synchronization that results in the free of a pointer.

Some frontends like Rust can guarantee that most pointer arguments
cannot be freed for the duration of a function call, including via
synchronization. However, they cannot guarantee that no synchronization
takes place at all. The current definition of nofree makes this

    [16 lines not shown]
DeltaFile
+26-15llvm/docs/LangRef.rst
+26-151 files

LLVM/project 8059bc5clang/lib/AST/ByteCode InterpBuiltin.cpp Compiler.cpp, clang/test/AST/ByteCode cxx11.cpp

[clang][bytecode] Fix a diagnostic difference in bitcasts (#197174)

Don't immediately return failure and let it be handled by later checks.
DeltaFile
+8-1clang/lib/AST/ByteCode/InterpBuiltin.cpp
+2-3clang/test/AST/ByteCode/cxx11.cpp
+3-1clang/lib/AST/ByteCode/Compiler.cpp
+0-1clang/lib/AST/ByteCode/Interp.cpp
+13-64 files

LLVM/project 690a25fclang/lib/CodeGen CGBuiltin.cpp, clang/test/CodeGen/AArch64 mskernel-interlocked.c

[clang] Don't optimize out no-op atomics in kernel mode (#193562)

No-op atomics such as InterlockedAnd(addr, (UINT32)-1) don't modify
the underlying value, but kernel code depends on these accesses
to touch the pool page virtual address and intentionally trigger a page
fault during page migration. This patch also fixes an LLVM issue where
idempotent volatile atomics were incorrectly lowered into memory fences.
DeltaFile
+13-0llvm/test/CodeGen/X86/volatile-atomicrmw.ll
+11-0clang/test/CodeGen/AArch64/mskernel-interlocked.c
+10-0clang/test/CodeGen/X86/mskernel-interlocked.c
+3-0clang/lib/CodeGen/CGBuiltin.cpp
+2-0llvm/lib/CodeGen/AtomicExpandPass.cpp
+39-05 files
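
For illustration, a portable analogue of the no-op atomic described above can be sketched with `std::atomic`. This is not the MSVC `InterlockedAnd` intrinsic and not code from the patch; `touchWord` is a hypothetical helper name, and the sketch only shows that an AND with an all-ones mask leaves the value unchanged while still being written as a read-modify-write on a volatile object.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): atomically ANDs the word with an
// all-ones mask. The stored value never changes, but the operation is still
// expressed as a read-modify-write on a volatile atomic object.
inline std::uint32_t touchWord(volatile std::atomic<std::uint32_t> *word) {
  assert(word != nullptr);
  return word->fetch_and(0xFFFFFFFFu); // returns the previous value
}
```

Kernel-style code would call such a helper purely for the memory access, which is why folding it away or turning it into a fence changes observable behavior.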

LLVM/project c55c177llvm/lib/Transforms/InstCombine InstCombineAndOrXor.cpp, llvm/test/Transforms/InstCombine or-bitmask.ll

[InstCombine] Fix type mismatch in `foldBitmaskMul`
DeltaFile
+42-0llvm/test/Transforms/InstCombine/or-bitmask.ll
+6-2llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+48-22 files

LLVM/project 2539d07clang/lib/Sema SemaAMDGPU.cpp

add braces to long statements
DeltaFile
+6-3clang/lib/Sema/SemaAMDGPU.cpp
+6-31 files

LLVM/project a990851clang/docs ReleaseNotes.rst, clang/lib/Sema SemaTemplateInstantiateDecl.cpp

[clang] Validate return type when instantiating `returns_nonnull` attribute
DeltaFile
+21-0clang/test/CodeGenCXX/returns-nonnull-nonptr-crash.cpp
+11-0clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+1-0clang/docs/ReleaseNotes.rst
+33-03 files

LLVM/project 9cd3c0bclang/lib/CodeGen CGHLSLRuntime.cpp CGExpr.cpp, clang/test/CodeGenHLSL/resources resources-in-structs-array-to-local.hlsl res-array-global-to-local.hlsl

[HLSL] Codegen for handling global resource array initialization (#198891)

When a global resource array is accessed - whether it is declared at a
global scope or as part of a global struct instance - all of its
resource elements should be initialized from binding into a temporary
local resource array. This change intercepts the Clang codegen at the
relevant places to allow `CGHLSLRuntime` to handle this special global
resource array initialization.

Fixes #187087
Fixes #198888
DeltaFile
+162-0clang/test/CodeGenHLSL/resources/resources-in-structs-array-to-local.hlsl
+157-0clang/test/CodeGenHLSL/resources/res-array-global-to-local.hlsl
+76-43clang/lib/CodeGen/CGHLSLRuntime.cpp
+17-2clang/lib/CodeGen/CGExpr.cpp
+13-4clang/test/CodeGenHLSL/resources/res-array-local2.hlsl
+12-1clang/lib/CodeGen/CGHLSLRuntime.h
+437-502 files not shown
+445-568 files

LLVM/project 48a1ee7llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU wait-xcnt-drain.mir wait-xcnt.mir

[AMDGPU] Remove redundant s_wait_xcnt after implicit XCNT drains (#198823)

On gfx1250 several instructions implicitly drain XCNT in hardware:
`s_barrier_wait`/`signal`/`signal_isfirst`, `s_sendmsg`, PC-changes etc.
This patch removes redundant `s_wait_xcnt` instructions after implicit XCNT
drains.

Pre-commit tests on #198772
Fix: LCOMPILER-1665
DeltaFile
+29-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0-19llvm/test/CodeGen/AMDGPU/wait-xcnt-drain.mir
+5-0llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+2-1llvm/test/CodeGen/AMDGPU/wait-xcnt.mir
+0-2llvm/test/CodeGen/AMDGPU/flat-saddr-atomics.ll
+39-226 files

LLVM/project 1da70adclang-tools-extra/clang-tidy/readability RedundantParenthesesCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Fix false positive of parentheses removal for overloaded operator (#192254)

Fixes #189217

Don't remove necessary parentheses around an overloaded operator when
the parentheses occur in the context of a binary operation.

E.g. (E1 & E2) != E3       // the parentheses aren't redundant here
E.g. (E1 & E2)             // the parentheses are redundant here
DeltaFile
+87-0clang-tools-extra/test/clang-tidy/checkers/readability/redundant-parentheses.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+1-1clang-tools-extra/clang-tidy/readability/RedundantParenthesesCheck.cpp
+93-13 files
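
A minimal sketch of why the parentheses in `(E1 & E2) != E3` matter for overloaded operators. The `Mask` type and all function names are hypothetical, and the `operator&(Mask, bool)` overload exists only so that the regrouped form compiles; the point is that `!=` binds tighter than `&`, so dropping the parentheses selects different overloads.

```cpp
#include <cassert>

struct Mask { unsigned bits; };

inline Mask operator&(Mask a, Mask b) { return Mask{a.bits & b.bits}; }
inline bool operator!=(Mask a, Mask b) { return a.bits != b.bits; }
// Hypothetical extra overload, present only so the regrouped form compiles.
inline Mask operator&(Mask a, bool keep) { return keep ? a : Mask{0u}; }

// The form from the commit message: parentheses are required.
inline bool parenthesized(Mask e1, Mask e2, Mask e3) { return (e1 & e2) != e3; }

// What `e1 & e2 != e3` would parse as, because != binds tighter than &.
inline Mask regrouped(Mask e1, Mask e2, Mask e3) { return e1 & (e2 != e3); }

inline void demo() {
  const Mask e1{6u}, e2{3u}, e3{2u};
  assert(!parenthesized(e1, e2, e3));       // (6 & 3) is 2, so "!= 2" is false
  assert(regrouped(e1, e2, e3).bits == 6u); // 3 != 2 is true, so e1 is kept
}
```

The two functions do not even share a return type, so removing the parentheses would silently change the meaning of the expression.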

LLVM/project 846bfd7clang/include/clang/Analysis/Analyses/LifetimeSafety FactsGenerator.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

[LifetimeSafety] Avoid assert on variadic placement new (#199588)

Avoid assuming that a placement allocation function has a second
`ParmVarDecl` before checking whether that parameter is `void*`.
Variadic `operator new(size_t, ...)` can have a placement argument
matched by the ellipsis instead.

AI usage: Codex was used to help rephrase part of the new comments.

Closes https://github.com/llvm/llvm-project/issues/199584
DeltaFile
+37-27clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+11-0clang/test/Sema/warn-lifetime-safety.cpp
+2-0clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+50-273 files
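
A minimal sketch of the shape of code involved, assuming a class-scoped allocation function rather than the global one mentioned in the message. `Widget` and `constructIn` are hypothetical names. The allocation function has no second named parameter, so the placement argument is passed through the ellipsis.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>

struct Widget {
  int value;

  // Variadic placement allocation function: there is no second ParmVarDecl,
  // so the placement argument arrives through the ellipsis.
  static void *operator new(std::size_t size, ...) {
    va_list args;
    va_start(args, size);
    void *where = va_arg(args, void *);
    va_end(args);
    return where;
  }
};

// Constructs a Widget in caller-provided storage via the variadic overload.
inline Widget *constructIn(void *storage, int value) {
  assert(storage != nullptr);
  return new (storage) Widget{value};
}
```

An analysis that indexes the second parameter of the allocation function without checking the parameter count would go out of bounds on a declaration like this one.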

LLVM/project e710121llvm/test lit.cfg.py, llvm/utils/lit/lit TestRunner.py cl_arguments.py

refactor to move function-selection from lit core into llvm/test config
DeltaFile
+23-0llvm/utils/lit/lit/llvm/fn_selection.py
+0-19llvm/utils/lit/lit/TestRunner.py
+0-17llvm/utils/lit/lit/cl_arguments.py
+7-6llvm/utils/lit/tests/fn-selection.py
+4-0llvm/test/lit.cfg.py
+3-0llvm/utils/lit/tests/Inputs/fn-selection/lit.cfg
+37-422 files not shown
+37-458 files

LLVM/project 15d1a5dllvm/test lit.cfg.py, llvm/utils/lit/lit TestRunner.py cl_arguments.py

refactor to move function-selection from lit core into llvm/test config
DeltaFile
+23-0llvm/utils/lit/lit/llvm/fn_selection.py
+0-18llvm/utils/lit/lit/TestRunner.py
+0-17llvm/utils/lit/lit/cl_arguments.py
+7-6llvm/utils/lit/tests/fn-selection.py
+4-0llvm/test/lit.cfg.py
+3-0llvm/utils/lit/tests/Inputs/fn-selection/lit.cfg
+37-412 files not shown
+37-448 files

LLVM/project 5acb952clang/lib/Driver/ToolChains Clang.cpp, clang/test/CodeGenHIP profile-coverage-mapping.hip

[HIP][Driver] Forward -fcoverage-mapping flags to device compiler (#198872)

Add `-fcoverage-mapping`, `-fno-coverage-mapping`,
`-fcoverage-compilation-dir=`, `-ffile-compilation-dir=`, and
`-fcoverage-prefix-map=` to the LinkerWrapper `CompilerOptions`
forwarding list. Without this, passing `-fprofile-instr-generate
-fcoverage-mapping` to clang for a HIP program silently omits the
coverage mapping flags from the embedded device recompilation, so
`__llvm_covmap`/`__llvm_covfun` sections are never emitted for device
code.
DeltaFile
+48-15clang/lib/Driver/ToolChains/Clang.cpp
+25-0clang/test/Driver/hip-options.hip
+22-0clang/test/CodeGenHIP/profile-coverage-mapping.hip
+95-153 files

LLVM/project da4894cllvm/lib/Transforms/Scalar LoopFuse.cpp, llvm/test/Transforms/LoopFusion pr191238.ll loop_invariant.ll

[LoopFusion] reject unsafe scalar flow dependences (#195895)

`loop-fusion` treats any loop-invariant scalar non-anti dependence as
safe to fuse. In the linked issue, it incorrectly allows scalar flow
dependences where the first loop writes a loop-invariant location and
the second loop later reads that same location. Fusion interleaves the
producer and consumer and this changes the value observed by the second
loop.

Example C source would look like:
```C
for (int i = 0; i < N; i++) {
    ptr[0] = i;
}
for (int j = 0; j < N; j++) {
    out[j] = ptr[0];
}
=>
for (int i = 0; i < N; i++) {

    [14 lines not shown]
DeltaFile
+57-0llvm/test/Transforms/LoopFusion/pr191238.ll
+11-5llvm/lib/Transforms/Scalar/LoopFuse.cpp
+1-1llvm/test/Transforms/LoopFusion/loop_invariant.ll
+69-63 files
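
The hazard described in this commit can be reproduced with a small self-contained sketch. The function names are hypothetical, and the fused variant is a hand-written naive fusion, not the output of the `loop-fusion` pass. With separate loops every element of `out` sees the final value written to `ptr[0]`; after fusion each iteration sees only the value written so far.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original program: the producer loop finishes before the consumer loop runs.
std::vector<int> runSeparateLoops(int n) {
  assert(n >= 0);
  int cell[1] = {0};
  int *ptr = cell;
  std::vector<int> out(static_cast<std::size_t>(n));
  for (int i = 0; i < n; i++)
    ptr[0] = i;
  for (int j = 0; j < n; j++)
    out[j] = ptr[0]; // always reads the final value, n - 1
  return out;
}

// Naive fusion: producer and consumer are interleaved in one loop.
std::vector<int> runNaivelyFused(int n) {
  assert(n >= 0);
  int cell[1] = {0};
  int *ptr = cell;
  std::vector<int> out(static_cast<std::size_t>(n));
  for (int i = 0; i < n; i++) {
    ptr[0] = i;
    out[i] = ptr[0]; // reads the value from the current iteration
  }
  return out;
}
```

For `n = 4` the separate loops yield `{3, 3, 3, 3}` while the fused loop yields `{0, 1, 2, 3}`, which is the change in observed value that makes this flow dependence unsafe to fuse.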

LLVM/project a20e85fclang/test/CXX/basic/basic.link p11.cpp

[clang] NFC: add test cases from #111561 (#200105)

This adds those test cases while #111561 gathers dust.
DeltaFile
+41-0clang/test/CXX/basic/basic.link/p11.cpp
+41-01 files

LLVM/project d5e97d7clang/lib/Driver/ToolChains CommonArgs.cpp, clang/test/Driver split-debug.c

[Driver] Honor /Fo when deriving the split-dwarf .dwo path (#199613)

SplitDebugName checked -o and /o but not /Fo, so clang-cl /Fo<path> /c
fell through to the cwd-relative fallback and every .dwo landed in cwd
under <source-stem>.dwo regardless of the .obj location.
DeltaFile
+7-5clang/lib/Driver/ToolChains/CommonArgs.cpp
+8-0clang/test/Driver/split-debug.c
+15-52 files
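
A simplified model of the path derivation described above, not the actual `SplitDebugName` code. All function names are hypothetical, and an empty `objectPath` stands in for "no output option was given". When an object path is known (from `-o`, `/o`, or now `/Fo`), the `.dwo` file sits next to it; otherwise the name falls back to `<source-stem>.dwo` relative to the working directory.

```cpp
#include <cassert>
#include <string>

// Replaces the extension of the last path component with ".dwo".
inline std::string withDwoExtension(const std::string &path) {
  const std::size_t slash = path.find_last_of("/\\");
  const std::size_t dot = path.find_last_of('.');
  const bool hasExtension =
      dot != std::string::npos &&
      (slash == std::string::npos || dot > slash);
  return (hasExtension ? path.substr(0, dot) : path) + ".dwo";
}

// Strips any leading directories from a path.
inline std::string baseName(const std::string &path) {
  const std::size_t slash = path.find_last_of("/\\");
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Hypothetical model: an empty objectPath means no output option was seen.
inline std::string deriveDwoPath(const std::string &sourceFile,
                                 const std::string &objectPath) {
  assert(!sourceFile.empty());
  if (!objectPath.empty())
    return withDwoExtension(objectPath);        // next to the object file
  return withDwoExtension(baseName(sourceFile)); // cwd-relative fallback
}
```

In this model, ignoring `/Fo` is equivalent to always passing an empty `objectPath`, which is how every `.dwo` ended up in the working directory.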

LLVM/project f94bb4bllvm/test/CodeGen/AMDGPU accvgpr-spill-scc-clobber.mir pei-build-av-spill.mir, mlir/lib/Dialect/XeGPU/Transforms XeGPUSubgroupDistribute.cpp

Merge branch 'upstream-main' into users/ssahasra/refactor-sema-helpers
DeltaFile
+5,568-0llvm/test/CodeGen/AMDGPU/accvgpr-spill-scc-clobber.mir
+3,000-96llvm/test/CodeGen/AMDGPU/pei-build-av-spill.mir
+3,075-0llvm/test/CodeGen/AMDGPU/debug-frame.ll
+0-2,280mlir/lib/Dialect/XeGPU/Transforms/XeGPUSubgroupDistribute.cpp
+2,208-72llvm/test/CodeGen/AMDGPU/pei-build-spill.mir
+2,196-0llvm/test/CodeGen/AMDGPU/eliminate-frame-index-s-mov-b32.mir
+16,047-2,4482,596 files not shown
+91,848-39,3292,602 files

LLVM/project 635e120compiler-rt/lib/profile InstrProfilingFile.c InstrProfilingPlatformROCm.cpp

[PGO][HIP] Stop pulling ROCm.o into every PGO host link (#200101)

PR #177665 added an unconditional `extern` reference to
`__llvm_profile_hip_collect_device_data` from `InstrProfilingFile.c`,
which forces `InstrProfilingPlatformROCm.o` (and its sanitizer_common /
interception dependencies) out of `libclang_rt.profile.a` in every PGO
binary. That breaks bots without `-lpthread` and races dlsym/PLT state
in non-HIP programs via the interceptor constructor.

Fix:
- Declare the hook `COMPILER_RT_WEAK` and gate the call on its address.
No `COMPILER_RT_VISIBILITY`: a hidden weak-undef function would be
non-preemptible and the address test would fold to true.
- Gate `installHipModuleInterceptors` on `dlsym(hipModuleLoad)` so the
constructor is a no-op if `ROCm.o` is still pulled in.

Fixes:
- https://lab.llvm.org/buildbot/#/builders/66/builds/31311
- https://lab.llvm.org/buildbot/#/builders/174/builds/36180

    [7 lines not shown]
DeltaFile
+22-5compiler-rt/lib/profile/InstrProfilingFile.c
+7-0compiler-rt/lib/profile/InstrProfilingPlatformROCm.cpp
+29-52 files
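
The weak-hook pattern from the fix can be sketched as follows, assuming GCC or Clang on an ELF target. The hook name is hypothetical and stands in for the real one; `COMPILER_RT_WEAK` is approximated here with the plain `weak` attribute. Because the reference is weak and has default visibility, it does not pull the defining object out of a static archive, and its address is null when nothing defines it.

```cpp
#include <cassert>

// Hypothetical hook, declared weak with default visibility. Nothing in this
// file defines it, so the linker leaves its address as null.
extern "C" __attribute__((weak)) int hypothetical_collect_device_data(void);

// Calls the hook only when some linked object actually provides it.
inline bool collectDeviceDataIfLinked() {
  if (!hypothetical_collect_device_data)
    return false; // hook absent: skip the call entirely
  hypothetical_collect_device_data();
  return true;
}

inline void demo() {
  // No definition is linked in this sketch, so the call is skipped.
  assert(!collectDeviceDataIfLinked());
}
```

As the commit message notes, marking such a declaration hidden would make it non-preemptible and let the compiler fold the address test to true, so the sketch keeps default visibility.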

LLVM/project ec2fc4aclang/test/CXX/basic/basic.link p11.cpp

[clang] NFC: add test cases from #111561

This adds those test cases while the PR gathers dust.
DeltaFile
+41-0clang/test/CXX/basic/basic.link/p11.cpp
+41-01 files

LLVM/project 2d5dac5utils/bazel/llvm-project-overlay/compiler-rt BUILD.bazel, utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[Bazel] Fixes 5db1364 (#200104)

The commit added a dep from profile -> interception, so define that
target too

Fixes 5db13643f4b7038db0ca304d9f8900122502935c
DeltaFile
+37-0utils/bazel/llvm-project-overlay/compiler-rt/BUILD.bazel
+1-0utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+38-02 files

LLVM/project f918545clang/include/clang/Sema SemaObjC.h Sema.h, clang/lib/Sema SemaExprObjC.cpp SemaExpr.cpp

[clang][AMDGPU] Fix -ast-print crash on expanded predicate builtins (#199963)

ExpandAMDGPUPredicateBuiltIn synthesized an IntegerLiteral typed
_Bool/bool — a shape no other producer creates, and one that
StmtPrinter::VisitIntegerLiteral has no case for. -ast-print on the
resulting if-condition hit llvm_unreachable.

Emit the canonical boolean literal instead:

- C++, C23, OpenCL, HIP: CXXBoolLiteralExpr 'bool'
- pre-C23 C: IntegerLiteral 'int'

In the C case this matches what <stdbool.h>'s true/false macros expand
to.

Fixes #199563
DeltaFile
+88-0clang/test/AST/ast-print-amdgcn-predicate.c
+0-18clang/lib/Sema/SemaExprObjC.cpp
+14-0clang/lib/Sema/SemaExpr.cpp
+2-6clang/lib/Sema/SemaAMDGPU.cpp
+4-1clang/include/clang/Sema/SemaObjC.h
+3-0clang/include/clang/Sema/Sema.h
+111-256 files

LLVM/project c80e0a8clang/include/clang/AST DeclTemplate.h, clang/lib/AST DeclTemplate.cpp

[clang] fix getTemplateInstantiationArgs

This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.

This greatly simplifies the signature of that function by removing a bunch
of workarounds and simplifying a couple that weren't removed yet.

Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.

This also makes the explicit specialization AST nodes stop abusing the template
parameter lists: each node now stores its own template parameter list in a
dedicated field, similar to partial specializations.
DeltaFile
+197-433clang/lib/Sema/SemaTemplateInstantiate.cpp
+257-164clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+161-161clang/lib/Sema/SemaTemplate.cpp
+100-99clang/include/clang/AST/DeclTemplate.h
+59-129clang/lib/Sema/SemaConcept.cpp
+60-92clang/lib/AST/DeclTemplate.cpp
+834-1,07852 files not shown
+1,496-1,74258 files

LLVM/project fc66de1llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeTypes.h, llvm/test/CodeGen/X86 atomic-load-store.ll

[SelectionDAG] Widen <2 x T> vector types for atomic store

Vector types with two elements must be widened. This change widens such
vector types for atomic stores in SelectionDAG so that it can
translate aligned vectors with more than one element.
DeltaFile
+198-0llvm/test/CodeGen/X86/atomic-load-store.ll
+50-0llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+1-0llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+249-03 files

LLVM/project 6619c82llvm/lib/Target/RISCV RISCVMoveMerger.cpp, llvm/test/CodeGen/RISCV move-merge-zdinx-mvsa-regression.mir

[RISCV] Fix incorrect CM.MVSA01/QC_CM_MVSA01 generation with Zdinx (#200000)

The `RISCVMoveMerger` pass was incorrectly forming
`CM_MVSA01/QC_CM_MVSA01` when `Zdinx` was enabled. The pass attempted CM
merge for copy pairs even when the first copy was not an `a0/a1-based`
CM candidate.

Fix by only running `findMatchingInst` when the current copy is a valid
CM candidate.
DeltaFile
+28-0llvm/test/CodeGen/RISCV/move-merge-zdinx-mvsa-regression.mir
+3-3llvm/lib/Target/RISCV/RISCVMoveMerger.cpp
+31-32 files

LLVM/project 5b17cdbllvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rvp-simd-64.ll rvp-narrowing-shift-trunc.ll

[RISCV][P-ext] Split v4i16/v8i8 INSERT/EXTRACT_VECTOR_ELT on RV32. (#199917)

With a constant lane index, split the vector and recurse on the
single-GPR half containing Idx (already Custom-lowered).
DeltaFile
+315-410llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+32-0llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+2-10llvm/test/CodeGen/RISCV/rvp-narrowing-shift-trunc.ll
+349-4203 files

LLVM/project 0381a09llvm/test/Transforms/SLPVectorizer/RISCV runtime-strided-stores.ll

[SLP] Precommit tests for runtime strided stores (#200019)

Accompanies #200018
DeltaFile
+995-0llvm/test/Transforms/SLPVectorizer/RISCV/runtime-strided-stores.ll
+995-01 files

LLVM/project dbe6800llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt gfx12_dasm_vop3.txt

[AMDGPU] This reverts patches to use fp16 inline constants for i16 (#200091)

Patches reverted:

commit c315c662cd2d33e0c7f962fed742ee53626d8005
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date:   Wed May 27 12:51:13 2026

    [AMDGPU] Fix codesize estimate after #198005 (#200033)

    This fixes failure in libc tests which checks the exact encoding
    size. Encoding is now shorter, but it did not recognize fp16
    immediates as an inlinable constant and assumes literal encoding.

    Shorter encodings were created here:
    https://github.com/llvm/llvm-project/pull/198005

commit 2b3bc03b5ef00e7eaa245420ca981c700e1c05c4
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>

    [15 lines not shown]
DeltaFile
+228-228llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+200-200llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3.txt
+200-200llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3.txt
+194-194llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3.txt
+144-144llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3c.txt
+128-128llvm/test/MC/Disassembler/AMDGPU/gfx8_vop3cx.txt
+1,094-1,09481 files not shown
+3,882-3,67287 files

LLVM/project d658972clang/lib/StaticAnalyzer/Checkers/WebKit RawPtrRefLocalVarsChecker.cpp, clang/test/Analysis/Checkers/WebKit uncounted-local-vars.cpp unretained-local-vars.mm

[alpha.webkit.UncountedLocalVarsChecker] Detect an assignment to a guardian argument (#198695)

A function parameter of type RefPtr<T>& should not be used as a guardian
variable of a raw pointer/reference variable if the function body
contains an assignment to it since such an assignment can shorten the
lifetime of the guarded object.
DeltaFile
+17-6clang/lib/StaticAnalyzer/Checkers/WebKit/RawPtrRefLocalVarsChecker.cpp
+13-0clang/test/Analysis/Checkers/WebKit/uncounted-local-vars.cpp
+7-0clang/test/Analysis/Checkers/WebKit/unretained-local-vars.mm
+37-63 files
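
A minimal sketch of the hazard this check targets, using `std::shared_ptr` as a stand-in for WebKit's `RefPtr` (an assumption; the real type is intrusive and the checker works on WebKit types). `Node` and both function names are hypothetical. A `std::weak_ptr` observes the old object so the sketch can show it being destroyed without dereferencing a dangling pointer.

```cpp
#include <cassert>
#include <memory>

struct Node { int value; };

// `guardian` is a reference parameter, so assigning to it replaces the
// caller's smart pointer and can release the object it was keeping alive.
inline void replaceGuardian(std::shared_ptr<Node> &guardian) {
  assert(guardian != nullptr);
  Node *raw = guardian.get();                 // raw pointer "guarded" by guardian
  guardian = std::make_shared<Node>(Node{2}); // drops the last owner of the old Node
  (void)raw; // raw now dangles; dereferencing it would be a use-after-free
}

// Returns true when the original object died during the reassignment.
inline bool oldObjectDiesOnReassignment() {
  auto owner = std::make_shared<Node>(Node{1});
  std::weak_ptr<Node> observer = owner;
  replaceGuardian(owner);
  return observer.expired() && owner->value == 2;
}
```

Since the guardian can be reassigned inside the function body, it no longer guarantees the lifetime of whatever the raw pointer refers to.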