LLVM/project fc20c57llvm/lib/CodeGen/AsmPrinter DwarfDebug.cpp

[CodeGen] Avoid ambiguous Register comparison in C++20; NFC (#205814)

Fix an "ambiguous overload for ‘operator==’" error when compiling with
`-std=c++20`, caused by C++20's rewritten operator== candidate rules.
DeltaFile
+1-1llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+1-11 files

LLVM/project dd9a822llvm/docs LangRef.rst, llvm/lib/Bitcode/Reader BitcodeReader.cpp

Update for comments
DeltaFile
+10-7llvm/docs/LangRef.rst
+8-0llvm/test/Assembler/invalid-load-store-atomic-elementwise.ll
+1-3llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+19-103 files

LLVM/project 3a25d9fllvm/lib/Transforms/Vectorize VectorCombine.cpp, llvm/test/Transforms/VectorCombine/AMDGPU shuffles-of-length-changing-shuffles.ll

[VectorCombine] Bail out on all-poison leaves in shuffle transform (#206503)

foldShufflesOfLengthChangingShuffles() skips undef sources when
determining Y, so if all the leaves are undef, we can end up with Y
being nullptr after the loop. Bail out in this degenerate case.
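
The degenerate case described above can be sketched in isolation. This is a minimal model, not the code from `VectorCombine.cpp`: `Leaf`, `pickSource` and `canTransform` are hypothetical names, and the only behaviour taken from the commit message is that undef sources are skipped and that a null `Y` must lead to a bail-out.

```cpp
#include <vector>

// Hypothetical stand-in for an IR value; only the undef flag matters here.
struct Leaf {
  bool IsUndef;
  int Id;
};

// Returns the first non-undef leaf, or nullptr when every leaf is undef.
const Leaf *pickSource(const std::vector<Leaf> &Leaves) {
  const Leaf *Y = nullptr;
  for (const Leaf &L : Leaves) {
    if (L.IsUndef)
      continue; // undef sources are skipped when determining Y
    if (!Y)
      Y = &L;
  }
  return Y;
}

// Mirrors the bail-out: refuse to transform when no usable source was found.
bool canTransform(const std::vector<Leaf> &Leaves) {
  const Leaf *Y = pickSource(Leaves);
  if (!Y)
    return false; // degenerate case: all leaves are undef
  return true;
}
```

With two undef leaves `canTransform` returns `false`; as soon as one leaf is not undef it returns `true`.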
DeltaFile
+18-0llvm/test/Transforms/VectorCombine/AMDGPU/shuffles-of-length-changing-shuffles.ll
+4-0llvm/lib/Transforms/Vectorize/VectorCombine.cpp
+22-02 files

LLVM/project 77de767llvm/lib/Target/X86 X86ScheduleC864GM8.td, llvm/test/tools/llvm-mca/X86/C864GM8 resources-avx512vl.s resources-avx512.s

[X86] Hygon C86-4G-M8 Initial enablement (#204587)

This patch adds initial support for Hygon C86-4G-M8 architectures:

- Added C86-4G-M8 CPU targets recognition in Clang and LLVM
- Added C86-4G-M8 to target parser and host CPU detection
- Updated compiler-rt CPU model detection for C86-4G-M8
- Added C86-4G-M8 to various optimizer tests
- Added scheduler models and llvm-mca tests for C86-4G-M8 CPU targets
DeltaFile
+5,297-0llvm/test/tools/llvm-mca/X86/C864GM8/resources-avx512vl.s
+3,733-0llvm/lib/Target/X86/X86ScheduleC864GM8.td
+3,267-0llvm/test/tools/llvm-mca/X86/C864GM8/resources-avx512.s
+2,979-0llvm/test/tools/llvm-mca/X86/C864GM8/resources-avx512bwvl.s
+2,897-0llvm/test/tools/llvm-mca/X86/C864GM8/resources-x86_64.s
+2,449-0llvm/test/tools/llvm-mca/X86/C864GM8/resources-avx1.s
+20,622-088 files not shown
+38,368-094 files

LLVM/project 20100a2llvm/test/CodeGen/Mips/GlobalISel/legalizer zextLoad_and_sextLoad.mir

[Mips][GISel][NFC] Regenerate CHECK lines in zextLoad_and_sextLoad.mir (#206631)
DeltaFile
+55-45llvm/test/CodeGen/Mips/GlobalISel/legalizer/zextLoad_and_sextLoad.mir
+55-451 files

LLVM/project 1d634ebclang/include/clang/AST OpenMPClause.h, clang/lib/Parse ParseOpenMP.cpp

[Clang][OpenMP] Add parsing for dims modifier in num_teams and thread_limit
DeltaFile
+161-70clang/lib/Sema/SemaOpenMP.cpp
+124-0clang/test/OpenMP/dims_modifier_messages.cpp
+63-21clang/lib/Parse/ParseOpenMP.cpp
+62-2clang/include/clang/AST/OpenMPClause.h
+40-0clang/test/OpenMP/dims_modifier_ast_print.cpp
+23-11clang/lib/Sema/TreeTransform.h
+473-10414 files not shown
+596-14920 files

LLVM/project 06cac12clang/lib/AST/ByteCode Pointer.cpp Interp.h, clang/test/AST/ByteCode cxx20.cpp

[clang][bytecode] Use ASTRecordLayout offsets when subtracting pointers (#206496)

What we did here didn't work properly for pointers cast to bases. Add
`Pointer::computeLayoutOffset()` and use that to return the proper
values.
DeltaFile
+98-0clang/lib/AST/ByteCode/Pointer.cpp
+70-0clang/test/AST/ByteCode/cxx20.cpp
+12-16clang/lib/AST/ByteCode/Interp.h
+20-1clang/lib/AST/ByteCode/Descriptor.cpp
+3-0clang/lib/AST/ByteCode/Pointer.h
+1-1clang/lib/AST/ByteCode/Opcodes.td
+204-181 files not shown
+205-197 files

LLVM/project ffa0279llvm/lib/Transforms/Utils LoopUnroll.cpp, llvm/test/Transforms/LoopUnroll runtime-unroll-reductions.ll partial-unroll-reductions.ll

[LoopUnroll] Skip called function in constant-op reduction filter (#200868)

canParallelizeReductionWhenUnrolling iterates the latch instruction's
operands and rejects the reduction if any is a Constant. For calls the
called function is itself a Constant, falsely rejecting every intrinsic
form (fmuladd, smin/smax/umin/umax, etc.). Use CallBase::args() to
restrict the check to data operands.
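
A toy model may make the false rejection easier to see. It is only a sketch: `Val`, `Call` and the helper functions are hypothetical, and the model assumes the callee is stored as the trailing operand of a non-empty operand list, with `args()` returning everything before it.

```cpp
#include <vector>

// Hypothetical miniature of an IR value: only "is this a Constant?" matters.
struct Val {
  bool IsConstant;
};

// Hypothetical miniature of a call: data arguments first, callee last.
// Assumes Operands is never empty (a call always has a callee).
struct Call {
  std::vector<Val> Operands;

  // Data operands only, i.e. everything except the trailing callee.
  std::vector<Val> args() const {
    return std::vector<Val>(Operands.begin(), Operands.end() - 1);
  }
};

bool anyConstant(const std::vector<Val> &Vs) {
  for (const Val &V : Vs)
    if (V.IsConstant)
      return true;
  return false;
}

// Scanning every operand also sees the callee, which is a Constant.
bool rejectedByAllOperands(const Call &C) { return anyConstant(C.Operands); }

// Scanning only the data arguments ignores the callee.
bool rejectedByArgsOnly(const Call &C) { return anyConstant(C.args()); }
```

For a call with two non-constant arguments, `rejectedByAllOperands` returns `true` because of the callee alone, while `rejectedByArgsOnly` returns `false`.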
DeltaFile
+142-0llvm/test/Transforms/LoopUnroll/runtime-unroll-reductions.ll
+62-56llvm/test/Transforms/LoopUnroll/partial-unroll-reductions.ll
+58-40llvm/test/Transforms/LoopUnroll/runtime-unroll-reductions-min-max.ll
+12-9llvm/lib/Transforms/Utils/LoopUnroll.cpp
+274-1054 files

LLVM/project f06918dllvm/lib/CodeGen/SelectionDAG DAGCombiner.cpp, llvm/lib/Target/NVPTX NVPTXISelLowering.h

[DAGCombiner][NVPTX] Avoid forming illegal-typed shuffles after type-legalization (#205056)

Currently, `combineInsertEltToShuffle` can create a shuffle of an
illegal type after type legalization. When that shuffle reaches the
operation legalizer, it asserts ("Unexpected illegal type!").

https://github.com/llvm/llvm-project/pull/198259 fixed a crash resulting
from this in NVPTX but resulted in regressions with some types due to
the check blocking pre-type-legalization folds in addition to the
illegal post-type-legalization shuffle.

This change removes the TTI override in NVPTX and adds a guard in the
`combineInsertEltToShuffle` pattern to avoid forming illegal-typed
shuffles after type legalization.
DeltaFile
+33-0llvm/test/CodeGen/NVPTX/insert-vector-elt-shuffle-i8.ll
+4-2llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+0-4llvm/lib/Target/NVPTX/NVPTXISelLowering.h
+37-63 files

LLVM/project aba63c5clang/docs ReleaseNotes.md, clang/lib/Sema SemaTypeTraits.cpp

[clang] The `__reference_meows_from_temporary` builtins should be SFINAE-friendly when the 1st type is not a reference type (#206527)

Suppose that `__reference_constructs_from_temporary` is defined as:

```cpp
__reference_constructs_from_temporary(_Tp, _Up);
```
A non-reference type can never bind to a temporary, so the result is
always `false` for such a `_Tp`. We should short-circuit before reaching
the instantiations by checking the type of `_Tp`. But clang's
`__reference_constructs_from_temporary` eagerly instantiates the
construction of `_Up` (including the element's constructor exception
specification) even when `_Tp` is not a reference, which can hard-error
on misbehaved types.

The following code should be accepted, but clang raises a hard error:

```cpp
struct NoConv {};

    [13 lines not shown]
DeltaFile
+10-8clang/lib/Sema/SemaTypeTraits.cpp
+11-1clang/test/SemaCXX/type-traits.cpp
+3-0clang/docs/ReleaseNotes.md
+24-93 files

LLVM/project d08e24aclang/lib/AST ExprConstant.cpp, clang/test/Misc constexpr-source-ranges.cpp

[clang][ExprConst] Add a source range to invalid cast diagnostics (#206456)

Also fix this test to not have absolute line numbers
DeltaFile
+15-7clang/test/Misc/constexpr-source-ranges.cpp
+1-1clang/lib/AST/ExprConstant.cpp
+16-82 files

LLVM/project a93f37bflang/test CMakeLists.txt

[flang][cmake] Order flang profdata generation after clang's (#206023)

The clang and flang PGO pipelines clean and regenerate the same shared
profraw directories, so running them concurrently can truncate a profraw
while the other merge has it mmap'd. Add an ordering edge so flang's
pipeline runs after clang's.

Fixes issues introduced by
https://github.com/llvm/llvm-project/pull/198863
DeltaFile
+11-0flang/test/CMakeLists.txt
+11-01 files

LLVM/project 673ddc9mlir/lib/Dialect/Linalg/TransformOps LinalgTransformOps.cpp, mlir/test/Dialect/Linalg transform-op-rewrite-in-destination-passing-style.mlir

[mlir][linalg] Handle existing destination-passing-style ops in `transform.structured.rewrite_in_destination_passing_style` (#205034)

`transform.structured.rewrite_in_destination_passing_style` may be
applied to an operation that is already in destination-passing style,
e.g. `linalg.add`. In this case, the operation does not need to be
rewritten, but the current `TypeSwitch` does not handle
`DestinationStyleOpInterface` and falls through to the unreachable case.

Such operations can be handled by returning them unchanged. This makes
the transform accept already destination-style operations and avoids the
crash.

A regression test that applies `rewrite_in_destination_passing_style`
to `linalg.add` is added.

Fixes #204099
DeltaFile
+21-0mlir/test/Dialect/Linalg/transform-op-rewrite-in-destination-passing-style.mlir
+1-0mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
+22-02 files

LLVM/project ceed988llvm/lib/Target/WebAssembly WebAssemblyTargetMachine.cpp WebAssemblyAsmPrinter.cpp

[WebAssembly][NFC] Remove direct access to FeatureKV (#206232)

This is preparatory work for changing the representation of
FeatureKV/SubTypeKV, after which they will no longer be as easily
accessible as global variables. Therefore, get them from the subtarget
instead.
DeltaFile
+21-20llvm/lib/Target/WebAssembly/WebAssemblyTargetMachine.cpp
+8-3llvm/lib/Target/WebAssembly/WebAssemblyAsmPrinter.cpp
+0-4llvm/lib/Target/WebAssembly/WebAssemblySubtarget.h
+29-273 files

LLVM/project 3b04eaaflang/lib/Optimizer/Transforms/CUDA CUFDeviceGlobal.cpp, flang/test/Fir/CUDA cuda-device-global-internal-linkage.fir

[flang][cuda] Strip linkage from cloned gpu globals (#206624)
DeltaFile
+19-0flang/test/Fir/CUDA/cuda-device-global-internal-linkage.fir
+6-1flang/lib/Optimizer/Transforms/CUDA/CUFDeviceGlobal.cpp
+25-12 files

LLVM/project 02de6f8clang/lib/AST ExprConstant.cpp, clang/lib/AST/ByteCode Compiler.cpp Context.cpp

[clang][bytecode] Implement support for `Expr::EvaluateWithSubstitution()` (#204781)

This regresses `Sema/enable_if.c`, which now fails when run with the
bytecode interpreter. We also get 14 more diagnostic differences in
`SemaCXX/builtin-object-size-cxx14.cpp`.



Fixes https://github.com/llvm/llvm-project/issues/138473
DeltaFile
+202-0clang/test/AST/ByteCode/enable_if.c
+116-2clang/lib/AST/ByteCode/Compiler.cpp
+27-0clang/lib/AST/ByteCode/Context.cpp
+17-0clang/lib/AST/ByteCode/EvalEmitter.cpp
+11-0clang/lib/AST/ExprConstant.cpp
+9-0clang/lib/AST/ByteCode/EvalEmitter.h
+382-26 files not shown
+400-512 files

LLVM/project 4597411mlir/cmake/modules AddMLIRPython.cmake

[mlir][python][NFC] Clean up nanobind compile options (#206559)

Follow-up to #204230.

Refactor nanobind warning suppression flags into `build_nanobind_lib`.
Drop duplicate RTTI and exception flags.
DeltaFile
+33-20mlir/cmake/modules/AddMLIRPython.cmake
+33-201 files

LLVM/project 442c59cllvm/lib/Transforms/IPO MergeFunctions.cpp, llvm/test/Transforms/MergeFunc merge-functions-entry-count-no-alias.ll merge-functions-entry-count-alias.ll

Revert "[MergeFunctions] Preserve entry counts on folds" (#206640)

Reverts llvm/llvm-project#202218

Causes build failures and needs to be rebased on top of main before
relanding.
DeltaFile
+0-157llvm/test/Transforms/MergeFunc/merge-functions-entry-count-no-alias.ll
+0-51llvm/test/Transforms/MergeFunc/merge-functions-entry-count-alias.ll
+0-25llvm/lib/Transforms/IPO/MergeFunctions.cpp
+0-2333 files

LLVM/project 6600ad0clang/include/clang/Basic BuiltinsRISCV.td, clang/lib/CodeGen/TargetBuiltins RISCV.cpp

[Clang][RISCV] packed reduction sum intrinsics (#206441)

Add the __riscv_predsum/predsumu_* header wrappers over new
__builtin_riscv_* builtins, lowering to the llvm.riscv.predsum/predsumu
intrinsics.
DeltaFile
+254-0clang/test/CodeGen/RISCV/rvp-intrinsics.c
+114-0cross-project-tests/intrinsic-header-tests/riscv_packed_simd.c
+42-0clang/lib/CodeGen/TargetBuiltins/RISCV.cpp
+25-0clang/lib/Headers/riscv_packed_simd.h
+18-0clang/include/clang/Basic/BuiltinsRISCV.td
+453-05 files

LLVM/project 869c459llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV rvp-simd-32.ll rvp-simd-64.ll

[RISCV][P-ext] Avoid redundant accumulator extend for reduction sum (#206430)

For a reduction sum with an i32 accumulator on RV64, the result is
computed at i64 and truncated, so the accumulator's upper bits are
unused. Any-extend it instead of sign-/zero-extending, dropping a
redundant sext.w/zext.w. Follow-up to #206004.
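
The reasoning that the accumulator's upper bits are unused can be checked with plain integer arithmetic. The sketch below is not the lowering code; the function names are hypothetical, and "any-extend" is modelled by filling the upper 32 bits with an arbitrary junk pattern.

```cpp
#include <cstdint>

// Adds a 64-bit partial sum to an extended accumulator and keeps only the
// low 32 bits, modelling "computed at i64 and truncated".
uint32_t reduceAndTruncate(uint64_t ExtendedAcc, uint64_t Sum) {
  return static_cast<uint32_t>(ExtendedAcc + Sum);
}

uint64_t signExtend(uint32_t V) {
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(V)));
}

uint64_t zeroExtend(uint32_t V) { return V; }

// Any-extend: the upper 32 bits are unspecified; a fixed junk pattern is
// used here purely for illustration.
uint64_t anyExtend(uint32_t V) { return (0xDEADBEEFull << 32) | V; }
```

Because addition carries only propagate upward, the low 32 bits of the sum depend only on the low 32 bits of each input, so all three extensions give the same truncated result.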
DeltaFile
+5-1llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+0-4llvm/test/CodeGen/RISCV/rvp-simd-32.ll
+0-4llvm/test/CodeGen/RISCV/rvp-simd-64.ll
+5-93 files

LLVM/project 0b664d9llvm/lib/Transforms/Utils FixIrreducible.cpp, llvm/test/Transforms/FixIrreducible pr191979.ll

[FixIrreducible] Handle conditional branch with both successors as header (#206057)

A conditional branch redirecting edges to the cycle header may have both
successors equal to the header (e.g. `br i1 %c, label %h, label %h`),
which the previous `Succ1 = Succ0 ? nullptr : Header` logic mishandled
by dropping the second edge.

Check each successor independently against the header instead.
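
The difference between the two approaches can be sketched with integers standing in for blocks. This is an illustration built from the expression quoted above, not the code in `FixIrreducible.cpp`: `Block`, `None`, `oldEdges` and `newEdges` are hypothetical, and `oldEdges` is one plausible reading of the old logic.

```cpp
#include <utility>

using Block = int;        // hypothetical block id
constexpr Block None = 0; // stands in for nullptr: no edge to redirect

// Old logic: the second slot is derived from the first, so it can never
// report that both successors are the header.
std::pair<Block, Block> oldEdges(Block S0, Block /*S1*/, Block Header) {
  Block Succ0 = (S0 == Header) ? Header : None;
  Block Succ1 = Succ0 ? None : Header;
  return {Succ0, Succ1};
}

// Fixed logic: each successor is compared with the header on its own.
std::pair<Block, Block> newEdges(Block S0, Block S1, Block Header) {
  Block Succ0 = (S0 == Header) ? Header : None;
  Block Succ1 = (S1 == Header) ? Header : None;
  return {Succ0, Succ1};
}
```

For a branch whose successors are both the header, `oldEdges` yields `{Header, None}` and loses the second edge, while `newEdges` yields `{Header, Header}`. When only one successor is the header, the two functions agree.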

Fixes https://github.com/llvm/llvm-project/issues/191979.
DeltaFile
+40-0llvm/test/Transforms/FixIrreducible/pr191979.ll
+1-3llvm/lib/Transforms/Utils/FixIrreducible.cpp
+41-32 files

LLVM/project fb20f9fllvm/lib/Transforms/IPO MergeFunctions.cpp, llvm/test/Transforms/MergeFunc merge-functions-entry-count-no-alias.ll merge-functions-entry-count-alias.ll

[MergeFunctions] Preserve entry counts on folds (#202218)

**Summary**

`MergeFunctions` can fold equivalent functions into a single retained
implementation. When that happens, the retained body may be reached by
callers of both original functions, but its `function_entry_count`
metadata previously preserved only one side of the profile data.

For example, folding functions with entry counts `2000` and `1000` could
leave the retained body with only `2000`. This patch updates the
retained implementation after a successful merge, so the entry count
becomes `3000`, using saturating add.
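
For reference, a saturating add on 64-bit counts can be sketched as follows. This is a standalone illustration with a hypothetical `saturatingAdd` helper, not the helper the patch itself uses.

```cpp
#include <cstdint>
#include <limits>

// Adds two unsigned counts and clamps to the maximum value instead of
// wrapping around on overflow.
uint64_t saturatingAdd(uint64_t A, uint64_t B) {
  uint64_t Sum = A + B; // unsigned wrap-around is well defined
  return Sum < A ? std::numeric_limits<uint64_t>::max() : Sum;
}
```

With the counts from the example, `saturatingAdd(2000, 1000)` gives `3000`; adding anything to the maximum `uint64_t` value leaves it at the maximum.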

For ODR/double-thunk merges, the private backing body gets the combined
count while the thunks keep their own entry counts. For alias-backed
merges, the backing function carries the combined count.

**AI Assistance Disclosure**

    [3 lines not shown]
DeltaFile
+157-0llvm/test/Transforms/MergeFunc/merge-functions-entry-count-no-alias.ll
+51-0llvm/test/Transforms/MergeFunc/merge-functions-entry-count-alias.ll
+25-0llvm/lib/Transforms/IPO/MergeFunctions.cpp
+233-03 files

LLVM/project bdc1c87llvm/lib/Transforms/IPO ThinLTOBitcodeWriter.cpp LowerTypeTests.cpp, llvm/test/ThinLTO/X86 devirt_function_alias2.ll

[CFI] Create an external linkage alias instead of promoting internals
DeltaFile
+19-33llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
+35-0llvm/lib/Transforms/IPO/LowerTypeTests.cpp
+29-0llvm/test/Transforms/LowerTypeTests/promoted-internal.ll
+20-5llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
+10-7llvm/test/Transforms/ThinLTOBitcodeWriter/comdat.ll
+6-4llvm/test/ThinLTO/X86/devirt_function_alias2.ll
+119-494 files not shown
+130-5610 files

LLVM/project 69d1e4ellvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp SIISelLowering.cpp

[AMDGPU] Guard more intrinsics with target features
DeltaFile
+1-51llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+0-42llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0-24llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+15-2llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+9-6llvm/lib/Target/AMDGPU/AMDGPU.td
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-store.ll
+29-12915 files not shown
+61-15521 files

LLVM/project 40e6de4llvm/include/llvm/IR IntrinsicsAMDGPU.td, llvm/lib/Target/AMDGPU AMDGPUInstructionSelector.cpp SIISelLowering.cpp

[AMDGPU] Guard more intrinsics with target features
DeltaFile
+1-51llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
+0-42llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+0-24llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
+15-2llvm/include/llvm/IR/IntrinsicsAMDGPU.td
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-store.ll
+4-4llvm/test/CodeGen/AMDGPU/unsupported-av-load.ll
+24-12712 files not shown
+51-14218 files

LLVM/project d3b8ff3llvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchISelLowering.h, llvm/test/CodeGen/LoongArch/ir-instruction double-convert.ll float-convert.ll

Revert "[LoongArch] Custom scalar UINT_TO_FP and FP_TO_UINT with LSX instructions" (#206632)

Reverts llvm/llvm-project#200901

buildbot: https://lab.llvm.org/staging/#/builders/20/builds/28603
DeltaFile
+2-51llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+26-7llvm/test/CodeGen/LoongArch/ir-instruction/double-convert.ll
+25-7llvm/test/CodeGen/LoongArch/ir-instruction/float-convert.ll
+0-1llvm/lib/Target/LoongArch/LoongArchISelLowering.h
+53-664 files

LLVM/project 93da31bclang/lib/CodeGen CodeGenAction.cpp, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[RFC][CodeGen] Add generic target feature checks for intrinsics (#201470)

This PR adds target-independent infrastructure for annotating LLVM
intrinsics with required subtarget feature expressions.

It introduces a TargetFeatures string field to intrinsic TableGen
records. TableGen emits an intrinsic-to-feature mapping table.

Both SelectionDAG and GlobalISel now perform this check before lowering
target intrinsics. This allows targets to opt in by annotating intrinsic
definitions directly, rather than adding custom checks during lowering,
legalization, or instruction selection.

This PR uses one AMDGPU intrinsic as an example.
DeltaFile
+96-3llvm/lib/MC/MCSubtargetInfo.cpp
+38-0clang/lib/CodeGen/CodeGenAction.cpp
+33-1llvm/utils/TableGen/Basic/IntrinsicEmitter.cpp
+31-0llvm/lib/IR/DiagnosticInfo.cpp
+28-0llvm/test/TableGen/intrinsic-target-features.td
+25-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+251-414 files not shown
+361-920 files

LLVM/project 2705c08clang/docs LanguageExtensions.rst LanguageExtensions.md, llvm/test/CodeGen/ARM vector-lrint.ll

Merge branch 'main' into users/ikudrin/clang-findallocationfunction-simplify
DeltaFile
+10,260-9,388llvm/test/MC/AMDGPU/gfx11_asm_vopc.s
+0-7,392clang/docs/LanguageExtensions.rst
+0-7,069llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vopc.txt
+6,970-0clang/docs/LanguageExtensions.md
+5,907-0llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vopc-fake16.txt
+1,833-1,841llvm/test/CodeGen/ARM/vector-lrint.ll
+24,970-25,6902,497 files not shown
+124,691-73,1252,503 files

LLVM/project 71113ebclang/lib/Sema SemaExprCXX.cpp

[Sema][NFC] Extract allocation overload diagnostics (#206219)

This extracts the code that emits diagnostics when no viable function is
found for allocation overload resolution to reduce the diff in #203824.
DeltaFile
+68-65clang/lib/Sema/SemaExprCXX.cpp
+68-651 files

LLVM/project efd5fe3clang/lib/Driver/ToolChains Linux.cpp, clang/test/Driver linux-ld.c

[Driver][RISCV] Fix musl dynamic linker path for RISC-V sf/sp ABI (#202513)

Musl adds an -sf or -sp suffix to the dynamic linker path (e.g.,
ld-musl-riscv64-sf.so.1):


https://git.musl-libc.org/cgit/musl/tree/configure?h=v1.2.6&id=9fa28ece75d8a2191de7c5bb53bed224c5947417#n732
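
The naming scheme can be sketched as a small helper. This is not the code added to `Linux.cpp`: `muslRISCVLoader` is a hypothetical function, the `/lib/` prefix is assumed, and the mapping from ABI names to suffixes (soft-float ABIs to `-sf`, single-precision ABIs to `-sp`, double-precision ABIs to no suffix) is an assumption based on reading the musl configure script linked above.

```cpp
#include <string>

// Hypothetical helper that builds the musl dynamic linker path for RISC-V.
// The ABI-to-suffix mapping below is an assumption, not taken from the patch.
std::string muslRISCVLoader(bool Is64Bit, const std::string &ABI) {
  std::string Arch = Is64Bit ? "riscv64" : "riscv32";
  std::string Suffix;
  if (ABI == "lp64" || ABI == "ilp32")
    Suffix = "-sf"; // soft-float ABI
  else if (ABI == "lp64f" || ABI == "ilp32f")
    Suffix = "-sp"; // single-precision float ABI
  // Double-precision ABIs (lp64d, ilp32d) keep the unsuffixed name.
  return "/lib/ld-musl-" + Arch + Suffix + ".so.1";
}
```

Under these assumptions, `muslRISCVLoader(true, "lp64")` produces `/lib/ld-musl-riscv64-sf.so.1`, matching the example name in the commit message.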

---------

Co-authored-by: Chih-Mao Chen <cmchen at andestech.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply at anthropic.com>
DeltaFile
+18-0clang/test/Driver/linux-ld.c
+8-0clang/lib/Driver/ToolChains/Linux.cpp
+26-02 files