LLVM/project f60f84fllvm/include/llvm/CodeGen CommandFlags.h

Apply suggestions from code review

Co-authored-by: Matt Arsenault <arsenm2 at gmail.com>
DeltaFile
+1-0llvm/include/llvm/CodeGen/CommandFlags.h
+1-01 files

LLVM/project 67ab8b4flang/lib/Optimizer/Transforms/CUDA CUFPredefinedVarToGPU.cpp, flang/test/Fir/CUDA predefined-variables.mlir

[flang][cuda] Fix predefined variable processing with inlining (#205888)

The pass was skipping some variables, for example when they were
inlined inside a cuf kernel.
DeltaFile
+83-0flang/test/Fir/CUDA/predefined-variables.mlir
+13-11flang/lib/Optimizer/Transforms/CUDA/CUFPredefinedVarToGPU.cpp
+96-112 files

LLVM/project 53e0a1fllvm/lib/Transforms/Vectorize LoopVectorize.cpp, llvm/test/Transforms/LoopVectorize/AArch64 indvar-overflow-check-scalable-tc.ll

[LV] Use getSmallBestKnownTC in IV-overflow-check (#195226)

It has the benefit of also handling scalable TCs.

Co-authored-by: Florian Hahn <flo at fhahn.com>
DeltaFile
+158-0llvm/test/Transforms/LoopVectorize/AArch64/indvar-overflow-check-scalable-tc.ll
+97-0llvm/test/Transforms/LoopVectorize/RISCV/indvar-overflow-check-scalable-tc.ll
+22-11llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+277-113 files

LLVM/project b05f34cflang/lib/Lower Bridge.cpp, flang/test/Lower/CUDA cuda-data-transfer.cuf

[flang][cuda] Do not add shape when no references are involved (#205919)
DeltaFile
+17-0flang/test/Lower/CUDA/cuda-data-transfer.cuf
+8-3flang/lib/Lower/Bridge.cpp
+25-32 files

LLVM/project 52de711llvm/lib/Target/AArch64 AArch64InstrInfo.cpp, llvm/test/CodeGen/AArch64 tiny-model-static.ll

[AArch64] Fix stack protectors with tiny code model. (#205668)

The tiny code model was using the address of the stack protector as the
stack protector, which doesn't provide the expected protection. Fix it
to use the usual adrp+ldr.
DeltaFile
+79-2llvm/test/CodeGen/AArch64/tiny-model-static.ll
+0-8llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+79-102 files

LLVM/project abe675fclang/test/CIR/CodeGen global-tls-dyn-init.cpp global-init.cpp

[CIR] Fix return type of __cxa_atexit (#205905)

The return type should be 'int', not 'void'. We even have a comment
above the code that generates this saying that it should be an int.

This patch changes it and updates all the affected tests.
DeltaFile
+6-6clang/test/CIR/CodeGen/global-tls-dyn-init.cpp
+6-6clang/test/CIR/CodeGen/global-init.cpp
+6-6clang/test/CIR/CodeGen/thread-local-in-func.cpp
+4-5clang/test/CIR/CodeGen/global-temp-dtor.cpp
+3-3clang/test/CIR/CodeGen/static-members.cpp
+2-2clang/test/CIR/CodeGen/global-array-dtor.cpp
+27-286 files not shown
+36-3612 files

LLVM/project 16f4acaflang/lib/Semantics check-omp-loop.cpp check-omp-structure.cpp, flang/test/Semantics/OpenMP clause-validity01.f90

Fix merge issues
DeltaFile
+0-6flang/lib/Semantics/check-omp-loop.cpp
+0-4flang/lib/Semantics/check-omp-structure.cpp
+1-1flang/test/Semantics/OpenMP/clause-validity01.f90
+1-113 files

LLVM/project c9ca42fllvm/docs MergeFunctions.md HowToCrossCompileLLVM.md

[docs] Finish MyST migration for selected docs
DeltaFile
+117-120llvm/docs/MergeFunctions.md
+96-96llvm/docs/HowToCrossCompileLLVM.md
+19-20llvm/docs/MyFirstTypoFix.md
+22-16llvm/docs/Vectorizers.md
+14-15llvm/docs/HowToBuildWithPGO.md
+13-10llvm/docs/CycleTerminology.md
+281-27712 files not shown
+330-34218 files

LLVM/project 46cd60bllvm/docs MergeFunctions.md MyFirstTypoFix.md, llvm/docs/Frontend PerformanceTips.md

[docs] Convert selected rst docs with rst2myst
DeltaFile
+279-302llvm/docs/MergeFunctions.md
+194-245llvm/docs/MyFirstTypoFix.md
+208-228llvm/docs/Vectorizers.md
+166-203llvm/docs/MemProf.md
+153-195llvm/docs/Frontend/PerformanceTips.md
+154-153llvm/docs/AdvancedBuilds.md
+1,154-1,32614 files not shown
+2,225-2,49120 files

LLVM/project 4f28579llvm/docs MergeFunctions.rst MergeFunctions.md

[docs] Rename selected rst docs to Markdown
DeltaFile
+0-785llvm/docs/MergeFunctions.rst
+785-0llvm/docs/MergeFunctions.md
+0-522llvm/docs/MyFirstTypoFix.rst
+522-0llvm/docs/MyFirstTypoFix.md
+511-0llvm/docs/Vectorizers.md
+0-511llvm/docs/Vectorizers.rst
+1,818-1,81837 files not shown
+5,461-5,46143 files

LLVM/project b2410caclang/lib/CIR/Dialect/Transforms/TargetLowering CIRABIRewriteContext.cpp, clang/lib/UnifiedSymbolResolution USRGeneration.cpp

Merge branch 'main' into users/kparzysz/c02-directive-wording
DeltaFile
+232-87clang/lib/CIR/Dialect/Transforms/TargetLowering/CIRABIRewriteContext.cpp
+280-0clang/test/CIR/Transforms/abi-lowering/expand-struct-arg.cir
+85-0clang/unittests/Index/IndexTests.cpp
+83-0libcxx/test/libcxx/ranges/range.adaptors/range.take_while/nodiscard.verify.cpp
+53-0libcxx/test/libcxx/ranges/range.adaptors/range.split/nodiscard.verify.cpp
+42-11clang/lib/UnifiedSymbolResolution/USRGeneration.cpp
+775-98138 files not shown
+1,140-511144 files

LLVM/project ddc1588llvm/test/CodeGen/AMDGPU div_v2i128.ll bf16.ll, llvm/test/CodeGen/AMDGPU/GlobalISel udiv.i64.ll urem.i64.ll

Merge branch 'main' into users/kparzysz/unresolved-critical
DeltaFile
+2,592-2,587llvm/test/CodeGen/AMDGPU/div_v2i128.ll
+1,940-1,931llvm/test/CodeGen/AMDGPU/bf16.ll
+1,410-1,359llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll
+1,351-1,351llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll
+899-888llvm/test/CodeGen/AMDGPU/load-constant-i8.ll
+523-1,180llvm/test/CodeGen/AMDGPU/indirect-addressing-si.ll
+8,715-9,296791 files not shown
+29,560-27,059797 files

LLVM/project 9b29fb8flang/test/Semantics cuf05.cuf

[flang] Fix testcase after 0b328bc6a50becb35ebfb158dd3de69bd32c5563 (#205914)
DeltaFile
+5-5flang/test/Semantics/cuf05.cuf
+5-51 files

LLVM/project 5126770clang/lib/CIR/CodeGen CIRGenModule.cpp, clang/test/CIR/CodeGenCUDA address-spaces.cu

[CIR][CUDA] Replace poison attr usages with undef for global shared/shadow/device-shadow instantiation (#204663)
DeltaFile
+17-17clang/test/CIR/CodeGenCUDA/address-spaces.cu
+1-1clang/lib/CIR/CodeGen/CIRGenModule.cpp
+18-182 files

LLVM/project d990688clang/lib/CIR/Dialect/Transforms/TargetLowering CIRABIRewriteContext.cpp CIRABIRewriteContext.h, clang/test/CIR/Transforms/abi-lowering expand-struct-arg.cir

[CIR] Implement ArgKind::Expand in CallConvLowering (#201718)

ArgKind::Expand classifies a struct argument for flattening: each field
becomes a separate scalar argument at the ABI level.  Classic CodeGen
calls this "struct expansion" — used on targets like MIPS and some ARM
calling conventions.

CIRABIRewriteContext previously emitted errorNYI at both classification
sites.  The replacement covers three call paths.  In buildNewArgTypes,
the original struct type is replaced by one wire type per field.  In
insertArgCoercion, the single struct block argument is replaced by N
scalar block arguments and an alloca+get_member+store+load sequence at
the entry block reassembles them for body uses; a running block-argument
index (rather than classIdx + sretOffset) correctly tracks the expanded
slot count when multiple Expand args or sret+Expand combinations appear.
The Ignore-drop loop gains a classToBlockArg pre-computation so that
Ignore args following Expand args are erased at the correct index.  In
rewriteCallSite, cir.extract_member decomposes the struct operand into
its constituent fields, which become separate call arguments.
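
The running block-argument index described above can be sketched as
follows. This is a minimal illustration, not the code from the patch:
`ArgKind`, `ArgClass` and `computeClassToBlockArg` are hypothetical
stand-ins, and the sketch assumes that every non-Expand argument
(including Ignore, before it is dropped) occupies exactly one block
argument and that sret, when present, takes slot 0.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the real CIR types.
enum class ArgKind { Direct, Ignore, Expand };

struct ArgClass {
  ArgKind kind;
  std::size_t numFields; // only meaningful when kind == ArgKind::Expand
};

// For each classified argument, return the index of its first block argument.
// An Expand argument occupies one slot per field; every other kind is assumed
// to occupy exactly one slot. An sret pointer, if present, takes slot 0.
std::vector<std::size_t>
computeClassToBlockArg(const std::vector<ArgClass> &classes, bool hasSRet) {
  std::vector<std::size_t> firstSlot;
  firstSlot.reserve(classes.size());
  std::size_t next = hasSRet ? 1 : 0; // running block-argument index
  for (const ArgClass &c : classes) {
    firstSlot.push_back(next);
    next += (c.kind == ArgKind::Expand) ? c.numFields : 1;
  }
  return firstSlot;
}
```

With sret plus the sequence Expand (two fields), Ignore, Direct, this
yields slots 1, 3 and 4, whereas classIdx + sretOffset would give 1, 2
and 3 and so point the Ignore erase at the second expanded field.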

    [2 lines not shown]
DeltaFile
+232-87clang/lib/CIR/Dialect/Transforms/TargetLowering/CIRABIRewriteContext.cpp
+280-0clang/test/CIR/Transforms/abi-lowering/expand-struct-arg.cir
+2-2clang/lib/CIR/Dialect/Transforms/TargetLowering/CIRABIRewriteContext.h
+514-893 files

LLVM/project a18c09aclang/lib/UnifiedSymbolResolution USRGeneration.cpp, clang/test/Analysis/Scalable/PointerFlow entity-name-no-conflict.cpp

[clang][index][USR] GenLoc prints file entry at most once, allow repeated offsets (#205430)

`GenLoc` previously printed the source location at most once per USR,
gated by a member flag toggled on the first call. During the recursive
visit, if both an outer and an inner decl needed to print the location,
only the outer one was printed. When the outer decl did not need the
offset, no offset was ever printed. For example, the USR of
`Holder<decltype([]{})>::method` depends on the location of the type of
the lambda but the outer decl prints the file entry only, which disables
offset printing.

Change the logic so the file-entry part of the location is printed at
most once (it must be identical), while offsets of sub-decl locations
may be printed multiple times.
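
The new behavior can be pictured with a small sketch. It is only an
illustration under assumptions: `LocWriter` is a hypothetical class, and
the `@file@offset` text it produces is made up for the example rather
than being the actual USR encoding.

```cpp
#include <string>

// Hypothetical illustration: the file entry is written at most once,
// while offsets may be appended on every call that asks for one.
class LocWriter {
  std::string out;
  bool filePrinted = false; // gates only the file-entry part

public:
  void genLoc(const std::string &file, unsigned offset, bool includeOffset) {
    if (!filePrinted) {
      out += "@" + file;
      filePrinted = true;
    }
    // The offset is no longer gated by the flag above.
    if (includeOffset)
      out += "@" + std::to_string(offset);
  }

  const std::string &str() const { return out; }
};
```

Here an outer decl that prints only the file entry no longer prevents an
inner decl, such as a lambda type, from adding its own offset later.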

rdar://180654884

---------

Co-authored-by: Balázs Benics <benicsbalazs at gmail.com>
DeltaFile
+85-0clang/unittests/Index/IndexTests.cpp
+42-11clang/lib/UnifiedSymbolResolution/USRGeneration.cpp
+2-3clang/test/Analysis/Scalable/PointerFlow/entity-name-no-conflict.cpp
+129-143 files

LLVM/project 1774a65clang/lib/CodeGen CGCUDANV.cpp, clang/test/CodeGenCUDA offload_via_llvm.cu

[Offload] Unify the kernel argument passing (#205224)

Summary:
Currently we have two conflicting methods of passing kernel arguments, a
flat pointer + size and an array of pointers. We recently decided to
move the offload API to the latter because it is more generic and lets
you construct the other formats.
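
As a rough sketch of why the array-of-pointers form is the more generic
one, the flat buffer can be built from it. `packFlat` and the separate
`sizes` array are assumptions made for this example, not part of the
offload API, and the sketch packs the arguments back to back without any
alignment padding.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: build a flat argument buffer from an array of
// pointers to the individual arguments. The per-argument sizes are an
// assumption of this sketch; no alignment padding is inserted.
std::vector<unsigned char> packFlat(void *const *args,
                                    const std::size_t *sizes,
                                    std::size_t numArgs) {
  std::vector<unsigned char> buf;
  for (std::size_t i = 0; i < numArgs; ++i) {
    if (sizes[i] == 0)
      continue; // nothing to copy for an empty argument
    const std::size_t old = buf.size();
    buf.resize(old + sizes[i]);
    std::memcpy(buf.data() + old, args[i], sizes[i]);
  }
  return buf;
}
```

Going the other way, from a flat buffer back to individual arguments,
would need layout information that the flat pointer + size form does not
carry by itself.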

This PR primarily just changes the format and the one existing core use.
The uses should be simpler now. Future changes will change the OpenMP
argument parsing.
DeltaFile
+28-21clang/test/CodeGenCUDA/offload_via_llvm.cu
+3-43offload/plugins-nextgen/cuda/src/rtl.cpp
+8-38offload/plugins-nextgen/level_zero/src/L0Kernel.cpp
+16-27offload/plugins-nextgen/amdgpu/src/rtl.cpp
+13-16clang/lib/CodeGen/CGCUDANV.cpp
+5-7offload/include/Shared/APITypes.h
+73-1526 files not shown
+83-16312 files

LLVM/project 1f46c12clang/lib/Driver/ToolChains Clang.cpp, clang/test/Driver gpu-exceptions.cpp

[Clang] Disable C++ exceptions by default for GPU targets (#205402)

Summary:
Exceptions are not supported, and likely will never be supported, on
GPUs. The SIMT model makes context switching nearly impossible, and
building an unwinder would require stashing a register file that is
over 8 KiB these days. There's precedent to disable this for the target,
so we should just do this.

Offloading languages have their own weird handling, some chimera that
just accepts exceptions but turns them into traps on the device side, so
we leave that unaffected.
DeltaFile
+26-0clang/test/Driver/gpu-exceptions.cpp
+6-3clang/lib/Driver/ToolChains/Clang.cpp
+32-32 files

LLVM/project 29bf2c4mlir/test/IR visitors.mlir, mlir/test/lib/IR TestVisitors.cpp

[mlir] Fix visitor block erasure (#205854)

I made the visitor test drop block-defined value uses before erasing the
block and added a small regression case.
Fixes #205717
DeltaFile
+26-0mlir/test/IR/visitors.mlir
+1-0mlir/test/lib/IR/TestVisitors.cpp
+27-02 files

LLVM/project 8d758c0flang/test/Semantics cuf05.cuf

[flang] Fix testcase after 0b328bc6a50becb35ebfb158dd3de69bd32c5563
DeltaFile
+5-5flang/test/Semantics/cuf05.cuf
+5-51 files

LLVM/project ad850b8libcxx/include/__ranges split_view.h, libcxx/test/libcxx/ranges/range.adaptors/range.split nodiscard.verify.cpp

[libc++][ranges] Applied `[[nodiscard]]` to `split_view` (#205161)

Towards #172124

# References:

- https://wg21.link/range.split
- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant

---------

Co-authored-by: Hristo Hristov <zingam at outlook.com>
DeltaFile
+53-0libcxx/test/libcxx/ranges/range.adaptors/range.split/nodiscard.verify.cpp
+6-6libcxx/include/__ranges/split_view.h
+59-62 files

LLVM/project 9b84d3blibcxx/include/__ranges take_while_view.h, libcxx/test/libcxx/ranges/range.adaptors/range.take_while nodiscard.verify.cpp

[libc++][ranges] Applied `[[nodiscard]]` to `take_while_view` (#205172)

Towards #172124

# References:
- https://wg21.link/range.take_while
- https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant

Co-authored-by: Hristo Hristov <zingam at outlook.com>
DeltaFile
+83-0libcxx/test/libcxx/ranges/range.adaptors/range.take_while/nodiscard.verify.cpp
+8-8libcxx/include/__ranges/take_while_view.h
+91-82 files

LLVM/project c9f4ba0llvm/test/CodeGen/AMDGPU use-sgpr-multiple-times.ll gep-address-space.ll

AMDGPU: Avoid default subtarget in hand-written codegen tests (9/9) (#205792)

Fix some manual test checks using amdgcn triples without -mcpu. These
require the most careful consideration. The highest impact changes are the
optimizations removing execz branch now that there's a sched model.
DeltaFile
+6-14llvm/test/CodeGen/AMDGPU/use-sgpr-multiple-times.ll
+6-10llvm/test/CodeGen/AMDGPU/gep-address-space.ll
+7-7llvm/test/CodeGen/AMDGPU/setcc.ll
+4-6llvm/test/CodeGen/AMDGPU/si-lower-control-flow-unreachable-block.ll
+4-4llvm/test/CodeGen/AMDGPU/multi-divergent-exit-region.ll
+3-3llvm/test/CodeGen/AMDGPU/schedule-amdgpu-trackers.ll
+30-441 files not shown
+32-467 files

LLVM/project f9aa739llvm/test/CodeGen/AMDGPU wave_dispatch_regs.ll zext-lid.ll

AMDGPU: Avoid default subtarget in hand-written codegen tests (8/9) (#205791)

Introduce the missing -mcpu argument to some tests which are not
autogenerated.

Co-Authored-By: Claude <noreply at anthropic.com> (Claude-Opus-4.8)
DeltaFile
+2-2llvm/test/CodeGen/AMDGPU/wave_dispatch_regs.ll
+2-2llvm/test/CodeGen/AMDGPU/zext-lid.ll
+1-1llvm/test/CodeGen/AMDGPU/vop-shrink.ll
+1-1llvm/test/CodeGen/AMDGPU/waitcnt-trailing.mir
+1-1llvm/test/CodeGen/AMDGPU/waitcnt-no-redundant.mir
+1-1llvm/test/CodeGen/AMDGPU/zext-i64-bit-operand.ll
+8-86 files

LLVM/project 988a1a2llvm/test/CodeGen/AMDGPU trap.ll ran-out-of-registers-errors.ll

AMDGPU: Avoid default subtarget in hand-written codegen tests (7/9) (#205790)

Introduce an -mcpu argument to tests missing it to avoid codegening
the default dummy target. These are cases that didn't require adjusting
the check lines.

Co-Authored-By: Claude <noreply at anthropic.com> (Claude-Opus-4.8)
DeltaFile
+18-18llvm/test/CodeGen/AMDGPU/trap.ll
+5-5llvm/test/CodeGen/AMDGPU/ran-out-of-registers-errors.ll
+4-4llvm/test/CodeGen/AMDGPU/set-wave-priority.ll
+2-2llvm/test/CodeGen/AMDGPU/madak-inline-constant.mir
+2-2llvm/test/CodeGen/AMDGPU/subreg-intervals.mir
+2-2llvm/test/CodeGen/AMDGPU/machinelicm-convergent.mir
+33-3394 files not shown
+142-142100 files

LLVM/project c002097llvm/lib/Transforms/Vectorize VPlan.cpp VPlanHelpers.h

[VPlan] Remove unused VPlanPrinter::getOrCreateName(const VPBlockBase*) (#205896)

The block-name printers use getName()/getUID() directly; this overload
has no callers.
DeltaFile
+0-7llvm/lib/Transforms/Vectorize/VPlan.cpp
+0-2llvm/lib/Transforms/Vectorize/VPlanHelpers.h
+0-92 files

LLVM/project c6608a2clang/test/AST ast-dump-openmp-teams-distribute-parallel-for.c, llvm/test/CodeGen/AMDGPU div_v2i128.ll bf16.ll

Merge branch 'main' into users/kparzysz/c02-directive-wording
DeltaFile
+2,592-2,587llvm/test/CodeGen/AMDGPU/div_v2i128.ll
+1,940-1,931llvm/test/CodeGen/AMDGPU/bf16.ll
+1,410-1,359llvm/test/CodeGen/AMDGPU/GlobalISel/udiv.i64.ll
+1,351-1,351llvm/test/CodeGen/AMDGPU/GlobalISel/urem.i64.ll
+1,701-810llvm/test/CodeGen/AMDGPU/llvm.set.rounding.ll
+0-2,193clang/test/AST/ast-dump-openmp-teams-distribute-parallel-for.c
+8,994-10,2312,239 files not shown
+89,921-85,0272,245 files

LLVM/project 6cdf11cclang-tools-extra/test/clang-doc basic-project.mustache.test, clang-tools-extra/test/clang-doc/html compound-constraints.cpp enum.cpp

Fix whitespace in test files
DeltaFile
+8-8clang-tools-extra/test/clang-doc/html/compound-constraints.cpp
+3-3clang-tools-extra/test/clang-doc/json/class-template.cpp
+2-2clang-tools-extra/test/clang-doc/basic-project.mustache.test
+2-2clang-tools-extra/test/clang-doc/json/templates.cpp
+0-3clang-tools-extra/test/clang-doc/html/enum.cpp
+1-2clang-tools-extra/test/clang-doc/yaml/conversion_function.cpp
+16-2018 files not shown
+26-3924 files

LLVM/project 4b7b7ecflang/lib/Semantics check-omp-structure.cpp

format
DeltaFile
+2-1flang/lib/Semantics/check-omp-structure.cpp
+2-11 files

LLVM/project 0f1abfeflang/lib/Semantics check-omp-structure.cpp check-omp-structure.h, flang/test/Semantics/OpenMP clause-validity01.f90 device-constructs.f90

[flang][OpenMP] Move clause validity checks into OpenMP-specific code (#205607)

The checks for syntactic properties of clauses (e.g. uniqueness, being
required, etc.) were originally handled by infrastructure common to
OpenMP and OpenACC. That infrastructure, however, is not fully equipped
to handle OpenMP needs: being unable to express version-based properties
or clause set properties being two prominent examples.

The first step towards fulfilling the OpenMP requirements is to
transfer the handling of clause validity checks into OpenMP-specific
code, which can then be modified without interfering with OpenACC.

In addition to that, this PR also changes the way that clauses on end-
directives are handled: first, a clause appearing on an end-directive is
checked to be allowed to appear on an end-directive, then all clauses
from the begin- and the end-directives are tested together. This unifies
checks for uniqueness of clauses that can appear in both places.
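
The two-step handling of end-directive clauses might look roughly like
the sketch below. It is not the flang implementation: `CheckResult`,
`checkClauses`, and the use of plain strings and sets for clause names
are all hypothetical simplifications.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: clause names that failed each check.
struct CheckResult {
  std::vector<std::string> notAllowedOnEnd;
  std::vector<std::string> duplicated;
};

CheckResult checkClauses(const std::vector<std::string> &beginClauses,
                         const std::vector<std::string> &endClauses,
                         const std::set<std::string> &allowedOnEnd,
                         const std::set<std::string> &uniqueClauses) {
  CheckResult result;

  // Step 1: a clause on the end-directive must be allowed to appear there.
  for (const std::string &c : endClauses)
    if (!allowedOnEnd.count(c))
      result.notAllowedOnEnd.push_back(c);

  // Step 2: test the clauses from both directives together, so a unique
  // clause repeated across begin and end is caught by the same check.
  std::vector<std::string> all = beginClauses;
  all.insert(all.end(), endClauses.begin(), endClauses.end());
  std::set<std::string> seen;
  for (const std::string &c : all)
    if (uniqueClauses.count(c) && !seen.insert(c).second)
      result.duplicated.push_back(c);

  return result;
}
```

For instance, with `nowait` treated as both unique and allowed on the
end-directive, a `nowait` on the begin-directive and another on the
end-directive is reported once as a duplicate.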
DeltaFile
+183-91flang/lib/Semantics/check-omp-structure.cpp
+12-6flang/lib/Semantics/check-omp-structure.h
+10-7flang/test/Semantics/OpenMP/clause-validity01.f90
+10-0llvm/include/llvm/Frontend/OpenMP/OMP.h
+1-1flang/test/Semantics/OpenMP/device-constructs.f90
+1-1flang/test/Semantics/OpenMP/declarative-directive01.f90
+217-1064 files not shown
+221-10910 files