LLVM/project a990851clang/docs ReleaseNotes.rst, clang/lib/Sema SemaTemplateInstantiateDecl.cpp

[clang] Validate return type when instantiating `returns_nonnull` attribute
DeltaFile
+21-0clang/test/CodeGenCXX/returns-nonnull-nonptr-crash.cpp
+11-0clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+1-0clang/docs/ReleaseNotes.rst
+33-03 files

LLVM/project 9cd3c0bclang/lib/CodeGen CGHLSLRuntime.cpp CGExpr.cpp, clang/test/CodeGenHLSL/resources resources-in-structs-array-to-local.hlsl res-array-global-to-local.hlsl

[HLSL] Codegen for handling global resource array initialization (#198891)

When a global resource array is accessed - whether it is declared at a
global scope or as part of a global struct instance - all of its
resource elements should be initialized from binding into a temporary
local resource array. This change intercepts the Clang codegen at the
relevant places to allow `CGHLSLRuntime` to handle this special global
resource array initialization.

Fixes #187087
Fixes #198888
DeltaFile
+162-0clang/test/CodeGenHLSL/resources/resources-in-structs-array-to-local.hlsl
+157-0clang/test/CodeGenHLSL/resources/res-array-global-to-local.hlsl
+76-43clang/lib/CodeGen/CGHLSLRuntime.cpp
+17-2clang/lib/CodeGen/CGExpr.cpp
+13-4clang/test/CodeGenHLSL/resources/res-array-local2.hlsl
+12-1clang/lib/CodeGen/CGHLSLRuntime.h
+437-502 files not shown
+445-568 files

LLVM/project 48a1ee7llvm/lib/Target/AMDGPU SIInstrInfo.cpp SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU wait-xcnt-drain.mir wait-xcnt.mir

[AMDGPU] Remove redundant s_wait_xcnt after implicit XCNT drains (#198823)

On gfx1250 several instructions implicitly drain XCNT in hardware:
`s_barrier_wait`/`signal`/`signal_isfirst`, `s_sendmsg`, PC-changes etc.
This patch will remove redundant `s_wait_xcnt` after implicit XCNT
drains.

Pre-commit tests on #198772
Fix: LCOMPILER-1665
DeltaFile
+29-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+0-19llvm/test/CodeGen/AMDGPU/wait-xcnt-drain.mir
+5-0llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+2-1llvm/test/CodeGen/AMDGPU/wait-xcnt.mir
+0-2llvm/test/CodeGen/AMDGPU/flat-saddr-atomics.ll
+39-226 files

LLVM/project 1da70adclang-tools-extra/clang-tidy/readability RedundantParenthesesCheck.cpp, clang-tools-extra/docs ReleaseNotes.rst

[clang-tidy] Fix false positive of parentheses removal for overloaded operator (#192254)

Fixes #189217
    
Don't remove necessary parentheses around an overloaded operator
expression when the parentheses occur in the context of a binary operation.
    
E.g. (E1 & E2) != E3       // the brackets aren't redundant here
E.g. (E1 & E2)             // brackets are redundant here
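
A minimal sketch of why the parentheses matter, using a hypothetical `Mask` type with overloaded `&` and `!=` (the type and function names are illustrative and not taken from the patch or its tests). Because `!=` binds tighter than `&`, removing the parentheses would change which overload is applied first:

```cpp
#include <cassert>

// Hypothetical flag type for illustration only.
struct Mask {
  unsigned bits;
};

// Overloaded bitwise AND: combines two masks.
constexpr Mask operator&(Mask a, Mask b) { return Mask{a.bits & b.bits}; }

// Overloaded inequality: compares two masks.
constexpr bool operator!=(Mask a, Mask b) { return a.bits != b.bits; }

// `!=` binds tighter than `&`, so the parentheses decide which overload
// runs first. Dropping them would turn this into `a & (b != c)`, which
// does not compile here because there is no operator&(Mask, bool).
constexpr bool differsAfterMasking(Mask a, Mask b, Mask c) {
  return (a & b) != c;
}
```

With built-in integer operands the unparenthesized form would still compile but silently compute something else, which is why the check has to treat these parentheses as necessary.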
DeltaFile
+87-0clang-tools-extra/test/clang-tidy/checkers/readability/redundant-parentheses.cpp
+5-0clang-tools-extra/docs/ReleaseNotes.rst
+1-1clang-tools-extra/clang-tidy/readability/RedundantParenthesesCheck.cpp
+93-13 files

LLVM/project 846bfd7clang/include/clang/Analysis/Analyses/LifetimeSafety FactsGenerator.h, clang/lib/Analysis/LifetimeSafety FactsGenerator.cpp

[LifetimeSafety] Avoid assert on variadic placement new (#199588)

Avoid assuming that a placement allocation function has a second
`ParmVarDecl` before checking whether that parameter is `void*`.
Variadic `operator new(size_t, ...)` can have a placement argument
matched by the ellipsis instead.
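
For illustration, a sketch of the kind of code that has a placement argument but no second named parameter. The `Widget` type and helper below are hypothetical and not taken from the patch's test; the sketch only assumes standard C++ variadic argument handling:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>

// Hypothetical type for illustration only.
struct Widget {
  int value = 7;

  // Variadic allocation function: only the size is a named parameter, so
  // the declaration has no second ParmVarDecl to inspect.
  static void* operator new(std::size_t size, ...) {
    va_list args;
    va_start(args, size);
    void* where = va_arg(args, void*);  // placement argument arrives via `...`
    va_end(args);
    return where;
  }
};

// Constructs a Widget inside caller-provided storage and reports whether
// the object landed at the start of that storage.
inline bool constructsInPlace() {
  alignas(Widget) unsigned char storage[sizeof(Widget)];
  void* target = storage;
  Widget* w = new (target) Widget;  // `target` is matched by the ellipsis
  return static_cast<void*>(w) == target && w->value == 7;
}
```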

AI usage: Codex was used to help rephrase part of the new comments.

Closes https://github.com/llvm/llvm-project/issues/199584
DeltaFile
+37-27clang/lib/Analysis/LifetimeSafety/FactsGenerator.cpp
+11-0clang/test/Sema/warn-lifetime-safety.cpp
+2-0clang/include/clang/Analysis/Analyses/LifetimeSafety/FactsGenerator.h
+50-273 files

LLVM/project e710121llvm/test lit.cfg.py, llvm/utils/lit/lit TestRunner.py cl_arguments.py

refactor to move function-selection from lit core into llvm/test config
DeltaFile
+23-0llvm/utils/lit/lit/llvm/fn_selection.py
+0-19llvm/utils/lit/lit/TestRunner.py
+0-17llvm/utils/lit/lit/cl_arguments.py
+7-6llvm/utils/lit/tests/fn-selection.py
+4-0llvm/test/lit.cfg.py
+3-0llvm/utils/lit/tests/Inputs/fn-selection/lit.cfg
+37-422 files not shown
+37-458 files

LLVM/project 15d1a5dllvm/test lit.cfg.py, llvm/utils/lit/lit TestRunner.py cl_arguments.py

refactor to move function-selection from lit core into llvm/test config
DeltaFile
+23-0llvm/utils/lit/lit/llvm/fn_selection.py
+0-18llvm/utils/lit/lit/TestRunner.py
+0-17llvm/utils/lit/lit/cl_arguments.py
+7-6llvm/utils/lit/tests/fn-selection.py
+4-0llvm/test/lit.cfg.py
+3-0llvm/utils/lit/tests/Inputs/fn-selection/lit.cfg
+37-412 files not shown
+37-448 files

LLVM/project 5acb952clang/lib/Driver/ToolChains Clang.cpp, clang/test/CodeGenHIP profile-coverage-mapping.hip

[HIP][Driver] Forward -fcoverage-mapping flags to device compiler (#198872)

Add `-fcoverage-mapping`, `-fno-coverage-mapping`,
`-fcoverage-compilation-dir=`, `-ffile-compilation-dir=`, and
`-fcoverage-prefix-map=` to the LinkerWrapper `CompilerOptions`
forwarding list. Without this, passing `-fprofile-instr-generate
-fcoverage-mapping` to clang for a HIP program silently omits the
coverage mapping flags from the embedded device recompilation, so
`__llvm_covmap`/`__llvm_covfun` sections are never emitted for device
code.
DeltaFile
+48-15clang/lib/Driver/ToolChains/Clang.cpp
+25-0clang/test/Driver/hip-options.hip
+22-0clang/test/CodeGenHIP/profile-coverage-mapping.hip
+95-153 files

LLVM/project da4894cllvm/lib/Transforms/Scalar LoopFuse.cpp, llvm/test/Transforms/LoopFusion pr191238.ll loop_invariant.ll

[LoopFusion] reject unsafe scalar flow dependences (#195895)

`loop-fusion` treats any loop-invariant scalar non-anti dependence as
safe to fuse. In the linked issue, it incorrectly allows scalar flow
dependences where the first loop writes a loop-invariant location and
the second loop later reads that same location. Fusion interleaves the
producer and consumer and this changes the value observed by the second
loop.

Example C source would look like:
```C
for (int i = 0; i < N; i++) {
    ptr[0] = i;
}
for (int j = 0; j < N; j++) {
    out[j] = ptr[0];
}
=>
for (int i = 0; i < N; i++) {

    [14 lines not shown]
DeltaFile
+57-0llvm/test/Transforms/LoopFusion/pr191238.ll
+11-5llvm/lib/Transforms/Scalar/LoopFuse.cpp
+1-1llvm/test/Transforms/LoopFusion/loop_invariant.ll
+69-63 files

LLVM/project a20e85fclang/test/CXX/basic/basic.link p11.cpp

[clang] NFC: add test cases from #111561 (#200105)

This adds those test cases while #111561 gathers dust.
DeltaFile
+41-0clang/test/CXX/basic/basic.link/p11.cpp
+41-01 files

LLVM/project d5e97d7clang/lib/Driver/ToolChains CommonArgs.cpp, clang/test/Driver split-debug.c

[Driver] Honor /Fo when deriving the split-dwarf .dwo path (#199613)

SplitDebugName checked -o and /o but not /Fo, so clang-cl /Fo<path> /c
fell through to the cwd-relative fallback and every .dwo landed in cwd
under <source-stem>.dwo regardless of the .obj location.
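
As a rough sketch of the intended outcome, assuming the .dwo file should sit next to the object file named by the output option: the helper below is hypothetical, is not the driver's `SplitDebugName`, and only shows the extension swap on an already-resolved object path.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper: place the .dwo next to the object file by swapping
// the object's extension. Requires C++17 for std::filesystem.
inline std::string dwoPathForObject(const std::string& objectPath) {
  std::filesystem::path p(objectPath);
  p.replace_extension(".dwo");
  return p.generic_string();
}
```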
DeltaFile
+7-5clang/lib/Driver/ToolChains/CommonArgs.cpp
+8-0clang/test/Driver/split-debug.c
+15-52 files

LLVM/project 635e120compiler-rt/lib/profile InstrProfilingFile.c InstrProfilingPlatformROCm.cpp

[PGO][HIP] Stop pulling ROCm.o into every PGO host link (#200101)

PR #177665 added an unconditional `extern` reference to
`__llvm_profile_hip_collect_device_data` from `InstrProfilingFile.c`,
which forces `InstrProfilingPlatformROCm.o` (and its sanitizer_common /
interception dependencies) out of `libclang_rt.profile.a` in every PGO
binary. That breaks bots without `-lpthread` and races dlsym/PLT state
in non-HIP programs via the interceptor constructor.

Fix:
- Declare the hook `COMPILER_RT_WEAK` and gate the call on its address.
No `COMPILER_RT_VISIBILITY`: a hidden weak-undef function would be
non-preemptible and the address test would fold to true.
- Gate `installHipModuleInterceptors` on `dlsym(hipModuleLoad)` so the
constructor is a no-op if `ROCm.o` is still pulled in.
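
The weak-reference gating described above can be sketched as follows. This assumes GCC or Clang on an ELF platform; the hook name is hypothetical, and the raw attribute stands in for compiler-rt's `COMPILER_RT_WEAK` macro:

```cpp
#include <cassert>

// Hypothetical hook name; the real symbol and macros live in compiler-rt.
// Weak with default visibility: if no linked object defines it, the
// reference resolves to a null address instead of causing a link error
// or pulling an archive member into the link.
extern "C" __attribute__((weak)) int hypothetical_collect_device_data(void);

// Calls the hook only when some linked object actually provides it.
inline int collectIfAvailable() {
  if (hypothetical_collect_device_data)
    return hypothetical_collect_device_data();
  return -1;  // hook absent: nothing to collect
}
```

In a program that never defines the hook, `collectIfAvailable()` takes the fallback path.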

Fixes:
- https://lab.llvm.org/buildbot/#/builders/66/builds/31311
- https://lab.llvm.org/buildbot/#/builders/174/builds/36180

    [7 lines not shown]
DeltaFile
+22-5compiler-rt/lib/profile/InstrProfilingFile.c
+7-0compiler-rt/lib/profile/InstrProfilingPlatformROCm.cpp
+29-52 files

LLVM/project ec2fc4aclang/test/CXX/basic/basic.link p11.cpp

[clang] NFC: add test cases from #111561

This adds those test cases while the PR gathers dust.
DeltaFile
+41-0clang/test/CXX/basic/basic.link/p11.cpp
+41-01 files

LLVM/project 2d5dac5utils/bazel/llvm-project-overlay/compiler-rt BUILD.bazel, utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[Bazel] Fixes 5db1364 (#200104)

The commit added a dep from profile -> interception, so define that
target too

Fixes 5db13643f4b7038db0ca304d9f8900122502935c
DeltaFile
+37-0utils/bazel/llvm-project-overlay/compiler-rt/BUILD.bazel
+1-0utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+38-02 files

LLVM/project f918545clang/include/clang/Sema SemaObjC.h Sema.h, clang/lib/Sema SemaExprObjC.cpp SemaExpr.cpp

[clang][AMDGPU] Fix -ast-print crash on expanded predicate builtins (#199963)

ExpandAMDGPUPredicateBuiltIn synthesized an IntegerLiteral typed
_Bool/bool — a shape no other producer creates, and one that
StmtPrinter::VisitIntegerLiteral has no case for. -ast-print on the
resulting if-condition hit llvm_unreachable.

Emit the canonical boolean literal instead:

- C++, C23, OpenCL, HIP: CXXBoolLiteralExpr 'bool'
- pre-C23 C: IntegerLiteral 'int'

In the C case this matches what <stdbool.h>'s true/false macros expand
to.

Fixes #199563
DeltaFile
+88-0clang/test/AST/ast-print-amdgcn-predicate.c
+0-18clang/lib/Sema/SemaExprObjC.cpp
+14-0clang/lib/Sema/SemaExpr.cpp
+2-6clang/lib/Sema/SemaAMDGPU.cpp
+4-1clang/include/clang/Sema/SemaObjC.h
+3-0clang/include/clang/Sema/Sema.h
+111-256 files

LLVM/project c80e0a8clang/include/clang/AST DeclTemplate.h, clang/lib/AST DeclTemplate.cpp

[clang] fix getTemplateInstantiationArgs

This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.

This greatly simplifies the signature of that function, by removing a bunch
of workarounds, and simplifying a couple that weren't removed yet.

Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.

Also makes the explicit specialization AST nodes stop abusing the template
parameter lists by storing their own template parameter list, creating a
dedicated field for them, similar to partial specializations.
DeltaFile
+197-433clang/lib/Sema/SemaTemplateInstantiate.cpp
+257-164clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+161-161clang/lib/Sema/SemaTemplate.cpp
+100-99clang/include/clang/AST/DeclTemplate.h
+59-129clang/lib/Sema/SemaConcept.cpp
+60-92clang/lib/AST/DeclTemplate.cpp
+834-1,07852 files not shown
+1,496-1,74258 files

LLVM/project fc66de1llvm/lib/CodeGen/SelectionDAG LegalizeVectorTypes.cpp LegalizeTypes.h, llvm/test/CodeGen/X86 atomic-load-store.ll

[SelectionDAG] Widen <2 x T> vector types for atomic store

Vector types of 2 elements must be widened. This change does this
for vector types of atomic store in SelectionDAG so that it can
translate aligned vectors of >1 size.
DeltaFile
+198-0llvm/test/CodeGen/X86/atomic-load-store.ll
+50-0llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
+1-0llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h
+249-03 files

LLVM/project 6619c82llvm/lib/Target/RISCV RISCVMoveMerger.cpp, llvm/test/CodeGen/RISCV move-merge-zdinx-mvsa-regression.mir

[RISCV] Fix incorrect CM.MVSA01/QC_CM_MVSA01 generation with Zdinx (#200000)

The `RISCVMoveMerger` pass was incorrectly forming
`CM_MVSA01/QC_CM_MVSA01` when `Zdinx` was enabled. The pass attempted CM
merge for copy pairs even when the first copy was not an `a0/a1-based`
CM candidate.

Fix by only running `findMatchingInst` when the current copy is a valid
CM candidate.
DeltaFile
+28-0llvm/test/CodeGen/RISCV/move-merge-zdinx-mvsa-regression.mir
+3-3llvm/lib/Target/RISCV/RISCVMoveMerger.cpp
+31-32 files

LLVM/project 5b17cdbllvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rvp-simd-64.ll rvp-narrowing-shift-trunc.ll

[RISCV][P-ext] Split v4i16/v8i8 INSERT/EXTRACT_VECTOR_ELT on RV32. (#199917)

With a constant lane index, split the vector and recurse on the
single-GPR half containing Idx (already Custom-lowered).
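
A scalar model of the split, written as a sketch rather than the actual lowering code. It assumes a v4i16 value is held as two 32-bit halves with lane 0 in the low 16 bits of half 0; the type and function names are hypothetical:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// A v4i16 value on RV32 modeled as two 32-bit halves (one GPR each).
// Assumption: lane 0 sits in the low 16 bits of half 0.
using V4I16 = std::array<std::uint32_t, 2>;

// Extract from a single-GPR half holding two 16-bit lanes.
inline std::uint16_t extractFromHalf(std::uint32_t half, unsigned lane) {
  return static_cast<std::uint16_t>(half >> (16 * lane));
}

// Pick the half containing idx, then operate on that half only.
inline std::uint16_t extractLane(const V4I16& v, unsigned idx) {
  return extractFromHalf(v[idx / 2], idx % 2);
}

// Insert by rewriting only the half containing idx.
inline V4I16 insertLane(V4I16 v, unsigned idx, std::uint16_t value) {
  std::uint32_t& half = v[idx / 2];
  const unsigned shift = 16 * (idx % 2);
  half = (half & ~(std::uint32_t{0xFFFF} << shift)) |
         (std::uint32_t{value} << shift);
  return v;
}
```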
DeltaFile
+315-410llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+32-0llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+2-10llvm/test/CodeGen/RISCV/rvp-narrowing-shift-trunc.ll
+349-4203 files

LLVM/project 0381a09llvm/test/Transforms/SLPVectorizer/RISCV runtime-strided-stores.ll

[SLP] Precommit tests for runtime strided stores (#200019)

Accompanies #200018
DeltaFile
+995-0llvm/test/Transforms/SLPVectorizer/RISCV/runtime-strided-stores.ll
+995-01 files

LLVM/project dbe6800llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt gfx12_dasm_vop3.txt

[AMDGPU] This reverts patches to use fp16 inline constants for i16 (#200091)

Patches reverted:

commit c315c662cd2d33e0c7f962fed742ee53626d8005
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date:   Wed May 27 12:51:13 2026

    [AMDGPU] Fix codesize estimate after #198005 (#200033)

    This fixes a failure in libc tests which check the exact encoding
    size. The encoding is now shorter, but the estimate did not recognize
    fp16 immediates as inlinable constants and assumed a literal encoding.

    Shorter encodings were created here:
    https://github.com/llvm/llvm-project/pull/198005

commit 2b3bc03b5ef00e7eaa245420ca981c700e1c05c4
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>

    [15 lines not shown]
DeltaFile
+228-228llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+200-200llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3.txt
+200-200llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3.txt
+194-194llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3.txt
+144-144llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3c.txt
+128-128llvm/test/MC/Disassembler/AMDGPU/gfx8_vop3cx.txt
+1,094-1,09481 files not shown
+3,882-3,67287 files

LLVM/project d658972clang/lib/StaticAnalyzer/Checkers/WebKit RawPtrRefLocalVarsChecker.cpp, clang/test/Analysis/Checkers/WebKit uncounted-local-vars.cpp unretained-local-vars.mm

[alpha.webkit.UncountedLocalVarsChecker] Detect an assignment to a guardian argument (#198695)

A function parameter of type RefPtr<T>& should not be used as a guardian
variable of a raw pointer/reference variable if the function body
contains an assignment to it since such an assignment can shorten the
lifetime of the guarded object.
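
The hazard can be reproduced with a simplified model. The `RefPtr` and `Counted` classes below are minimal stand-ins written for this sketch, not WebKit's real classes, and the function names are hypothetical:

```cpp
#include <cassert>
#include <utility>

int g_destroyed = 0;  // counts destroyed objects, for the demonstration only

// Stand-in for a ref-counted object (hypothetical, simplified).
struct Counted {
  int refs = 0;
  void ref() { ++refs; }
  void deref() {
    if (--refs == 0) {
      ++g_destroyed;
      delete this;
    }
  }
};

// Minimal stand-in for a RefPtr; the real class has a richer interface.
template <class T>
class RefPtr {
 public:
  RefPtr() = default;
  explicit RefPtr(T* p) : ptr_(p) { if (ptr_) ptr_->ref(); }
  RefPtr(const RefPtr& other) : ptr_(other.ptr_) { if (ptr_) ptr_->ref(); }
  RefPtr& operator=(RefPtr other) {
    std::swap(ptr_, other.ptr_);  // old pointee is released when `other` dies
    return *this;
  }
  ~RefPtr() { if (ptr_) ptr_->deref(); }
  T* get() const { return ptr_; }

 private:
  T* ptr_ = nullptr;
};

// The pattern at issue: `guardian` is assigned in the body, so it cannot be
// trusted to keep `raw` alive for the whole function.
bool assignmentDestroysGuardedObject(RefPtr<Counted>& guardian) {
  Counted* raw = guardian.get();  // raw pointer "guarded" by the parameter
  const int before = g_destroyed;
  guardian = RefPtr<Counted>(new Counted);  // drops the last ref to *raw
  (void)raw;  // dangling from here on; must not be dereferenced
  return g_destroyed == before + 1;
}

bool runGuardianDemo() {
  RefPtr<Counted> owner(new Counted);
  return assignmentDestroysGuardedObject(owner);
}
```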
DeltaFile
+17-6clang/lib/StaticAnalyzer/Checkers/WebKit/RawPtrRefLocalVarsChecker.cpp
+13-0clang/test/Analysis/Checkers/WebKit/uncounted-local-vars.cpp
+7-0clang/test/Analysis/Checkers/WebKit/unretained-local-vars.mm
+37-63 files

LLVM/project 740e52bllvm/lib/Target/DirectX/DirectXIRPasses DXILDebugInfo.cpp, llvm/test/tools/dxil-dis di-label.ll

[DirectX] Drop debug labels (#197490)

Debug labels did not exist in LLVM 3.7 and have no equivalent.
DeltaFile
+36-0llvm/test/tools/dxil-dis/di-label.ll
+25-0llvm/lib/Target/DirectX/DirectXIRPasses/DXILDebugInfo.cpp
+61-02 files

LLVM/project 830c8d6llvm/test/tools/llvm-symbolizer wasm-basic.s lit.local.cfg, llvm/test/tools/llvm-symbolizer/Inputs wasm-basic.yaml

test(llvm-symbolizer): fix Wasm layering violation by using YAML (#200080)

Avoid using wasm-ld in LLVM tests by prebuilding the test binary
as a YAML file and using yaml2obj at test time.

This matches the approach taken in
4bce216e6b550c770f2e536422c3d95333f65ba3.
Because yaml2obj always uses 5-byte LEBs, the CODE section offset
shifted from 0x37 to 0x4b, so the file offsets passed to llvm-symbolizer
were updated accordingly.
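
To illustrate why fixed-width LEBs move offsets, here is a sketch of ULEB128 encoding with optional padding. The function is written for this note and is not yaml2obj's implementation; it only shows that a small value takes one byte in minimal form and five bytes when padded:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encodes `value` as ULEB128. When `padTo` exceeds the minimal length, the
// encoding is extended with continuation bytes so it occupies exactly
// `padTo` bytes while still decoding to the same value.
inline std::vector<std::uint8_t> encodeULEB128(std::uint64_t value,
                                               unsigned padTo = 0) {
  std::vector<std::uint8_t> out;
  do {
    std::uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0 || out.size() + 1 < padTo)
      byte |= 0x80;  // more bytes follow
    out.push_back(byte);
  } while (value != 0);
  while (out.size() < padTo) {
    std::uint8_t byte = 0x00;
    if (out.size() + 1 < padTo)
      byte |= 0x80;  // padding byte that still signals continuation
    out.push_back(byte);
  }
  return out;
}
```

Every size or count field written in the padded form grows by up to four bytes, so everything that follows it in the file lands at a higher offset.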

Replaces #200046

Assisted-by: Gemini
DeltaFile
+61-0llvm/test/tools/llvm-symbolizer/Inputs/wasm-basic.yaml
+6-6llvm/test/tools/llvm-symbolizer/wasm-basic.s
+0-4llvm/test/tools/llvm-symbolizer/lit.local.cfg
+67-103 files

LLVM/project 9d50a39llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt gfx11_dasm_vop3.txt

[AMDGPU] This reverts patches to use fp16 inline constants for i16

Patches reverted:

commit c315c662cd2d33e0c7f962fed742ee53626d8005
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>
Date:   Wed May 27 12:51:13 2026

    [AMDGPU] Fix codesize estimate after #198005 (#200033)

    This fixes a failure in libc tests which check the exact encoding
    size. The encoding is now shorter, but the estimate did not recognize
    fp16 immediates as inlinable constants and assumed a literal encoding.

    Shorter encodings were created here:
    https://github.com/llvm/llvm-project/pull/198005

commit 2b3bc03b5ef00e7eaa245420ca981c700e1c05c4
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin at amd.com>

    [16 lines not shown]
DeltaFile
+228-228llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+200-200llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3.txt
+200-200llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vop3.txt
+194-194llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3.txt
+144-144llvm/test/MC/Disassembler/AMDGPU/gfx10_vop3c.txt
+128-128llvm/test/MC/Disassembler/AMDGPU/gfx8_vop3c.txt
+1,094-1,09481 files not shown
+3,882-3,67287 files

LLVM/project 20e1831clang/include/clang/AST DeclTemplate.h, clang/lib/Sema SemaTemplate.cpp SemaDecl.cpp

[clang] fix member specializations of class and variable partial specializations

A partial specialization may be a member specialization even if it is not
an instantiation of a member partial specialization.

For example:
```C++
template<class> struct X {
  template<class> struct Inner;
};
template<> template<class T>
struct X<int>::Inner<T*> {};
```

Make sure this state is represented, so that [temp.spec.partial.member]p2
can be applied.

Split off from #199528
DeltaFile
+12-16clang/lib/Sema/SemaTemplate.cpp
+9-6clang/lib/Sema/SemaDecl.cpp
+12-0clang/test/SemaTemplate/class-template-spec.cpp
+4-6clang/test/CXX/temp/temp.decls/temp.spec.partial/temp.spec.partial.member/p2.cpp
+4-4clang/include/clang/AST/DeclTemplate.h
+3-4clang/lib/Sema/SemaTemplateInstantiate.cpp
+44-362 files not shown
+47-408 files

LLVM/project 4ccab64clang/include/clang/AST DeclTemplate.h, clang/lib/AST DeclTemplate.cpp

[clang] fix getTemplateInstantiationArgs

This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.

This greatly simplifies the signature of that function, by removing a bunch
of workarounds, and simplifying a couple that weren't removed yet.

Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.

Also makes the explicit specialization AST nodes stop abusing the template
parameter lists by storing their own template parameter list, creating a
dedicated field for them, similar to partial specializations.
DeltaFile
+197-433clang/lib/Sema/SemaTemplateInstantiate.cpp
+257-164clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+161-161clang/lib/Sema/SemaTemplate.cpp
+100-99clang/include/clang/AST/DeclTemplate.h
+59-129clang/lib/Sema/SemaConcept.cpp
+60-92clang/lib/AST/DeclTemplate.cpp
+834-1,07851 files not shown
+1,488-1,74257 files

LLVM/project 1a5822ellvm/test/Transforms/Coroutines coro-await-suspend-handle-in-ramp.ll coro-split-sink-lifetime-01.ll

[coro] Use C calling convention for C++20 coroutines (#198943)

Change the calling convention for resume / destroy functions of C++
coroutines from `fastcc` to the C calling convention.

The resume / destroy functions are exposed as part of the coroutine ABI
and must be compatible with other compilers and other versions of LLVM.
fastcc is an LLVM-internal, unstable calling convention, though.

In practice, fastcc and the C calling convention are in sync for `void
func(void*)` function signatures on almost all platforms. Therefore, I
think we can still make this change without widespread ABI breakage.

`fastcc` and `ccc` do differ for i686 (x86-32), MIPS O32, PowerPC64
ELFv1 and Lanai. As far as I know, those are all legacy ABIs, and a
recent feature like C++20 coroutines is unlikely to be used by projects
still targeting them.

Historical context: I tried to figure out why `fastcc` was used. It is

    [6 lines not shown]
DeltaFile
+9-7llvm/test/Transforms/Coroutines/coro-await-suspend-handle-in-ramp.ll
+6-6llvm/test/Transforms/Coroutines/coro-split-sink-lifetime-01.ll
+5-5llvm/test/Transforms/Coroutines/coro-split-musttail1.ll
+5-5llvm/test/Transforms/Coroutines/coro-split-musttail-chain-pgo-counter-promo.ll
+5-5llvm/test/Transforms/Coroutines/coro-split-musttail3.ll
+4-4llvm/test/Transforms/Coroutines/coro-await-suspend-lower-invoke.ll
+34-3249 files not shown
+147-13255 files

LLVM/project 9cc6c93libclc/clc/include/clc/relational clc_signbit.h, libclc/clc/lib/generic/relational clc_signbit.cl clc_signbit.inc

[libclc] Optimize and vectorize signbit (#199497)

Replace element-wise scalarizing implementation with bitcast and shift.
For example,
define hidden range(i32 -1, 1) <2 x i32> @_Z7signbitDv2_f(<2 x float>
noundef %0) #0 {
  %2 = bitcast <2 x float> %0 to <2 x i32>
  %3 = extractelement <2 x i32> %2, i64 0
  %4 = lshr i32 %3, 31
  %5 = insertelement <2 x i32> poison, i32 %4, i64 0
  %6 = extractelement <2 x i32> %2, i64 1
  %7 = lshr i32 %6, 31
  %8 = insertelement <2 x i32> %5, i32 %7, i64 1
  %9 = icmp ne <2 x i32> %8, zeroinitializer
  %10 = sext <2 x i1> %9 to <2 x i32>
  ret <2 x i32> %10
}
is changed to:
define hidden noundef range(i32 -1, 1) <2 x i32> @_Z7signbitDv2_f(<2 x

    [4 lines not shown]
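
The elided IR is not reproduced here. As a scalar sketch of the bitcast-and-shift idea in C++, assuming 32-bit IEEE-754 `float` and the vector convention of -1 for a set sign bit (the function name is hypothetical):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Per-lane result: -1 when the sign bit is set, else 0.
inline std::int32_t signbitLane(float x) {
  std::uint32_t bits;
  static_assert(sizeof bits == sizeof x, "expects 32-bit float");
  std::memcpy(&bits, &x, sizeof bits);            // bitcast float -> i32
  return -static_cast<std::int32_t>(bits >> 31);  // move sign bit down, negate
}
```

Operating on the raw bits also classifies `-0.0f` as negative, which a plain `x < 0` comparison would not.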
DeltaFile
+3-85libclc/clc/lib/generic/relational/clc_signbit.cl
+23-0libclc/clc/lib/generic/relational/clc_signbit.inc
+2-0libclc/clc/include/clc/relational/clc_signbit.h
+28-853 files

LLVM/project 1670d39llvm/lib/Target/DirectX/DXILWriter DXILBitcodeWriter.cpp, llvm/lib/Target/DirectX/DirectXIRPasses PointerTypeAnalysis.cpp

[DirectX] Handle undef the same way as null (#197507)
DeltaFile
+33-0llvm/test/tools/dxil-dis/dbg-declare-undef.ll
+8-10llvm/lib/Target/DirectX/DXILWriter/DXILBitcodeWriter.cpp
+5-4llvm/lib/Target/DirectX/DirectXIRPasses/PointerTypeAnalysis.cpp
+46-143 files