LLVM/project 614e4a7 clang/docs ReleaseNotes.rst, clang/lib/Sema SemaTemplate.cpp

[clang] check constant template parameters in dependent contexts

This patch makes sure constant template parameters are checked even
in dependent contexts.

For example, this can diagnose narrowing conversions earlier. That is
permitted, because such templates would have no valid instantiations.
Delta    File
+7 -9    clang/test/SemaTemplate/temp_arg_template.cpp
+2 -3    clang/lib/Sema/SemaTemplate.cpp
+2 -1    clang/docs/ReleaseNotes.rst
+11 -13  3 files

LLVM/project 7cc5989 libc/src/__support big_int.h math_extras.h, libc/src/__support/CPP new.h bit.h

[libc] Some MSVC compatibility fixes in src/__support. (#159428)

Delta    File
+13 -13  libc/src/__support/big_int.h
+7 -1    libc/src/__support/math_extras.h
+5 -0    libc/src/__support/CPP/new.h
+2 -1    libc/src/__support/CPP/bit.h
+2 -0    libc/src/__support/CPP/CMakeLists.txt
+29 -15  5 files

LLVM/project 6faad70 flang/lib/Optimizer/Builder IntrinsicCall.cpp, flang/test/Lower/Intrinsics pow_complex16i.f90 pow_complex16k.f90

Add fastmath attribute.
Update op description.
Update tests.
Delta    File
+29 -31  flang/test/Transforms/convert-complex-pow.fir
+15 -3   mlir/lib/Dialect/Math/Transforms/AlgebraicSimplification.cpp
+9 -5    mlir/include/mlir/Dialect/Complex/IR/ComplexOps.td
+6 -4    flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+1 -1    flang/test/Lower/Intrinsics/pow_complex16i.f90
+1 -1    flang/test/Lower/Intrinsics/pow_complex16k.f90
+61 -45  2 files not shown
+63 -47  8 files

LLVM/project 5f105fe llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation about DWARF registers mapping. NFC (#159447)

Delta    File
+4 -4    llvm/docs/AMDGPUUsage.rst
+4 -4    1 files

LLVM/project 221f8ee clang/test/CodeGenOpenCL builtins-amdgcn-gfx1250-cooperative-atomics.cl, llvm/test/CodeGen/AMDGPU llvm.amdgcn.cooperative.atomic-basic.ll

[AMDGPU] Add gfx1251 runlines to cooperative atomics tests. NFC (#159437)

Delta    File
+33 -23  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cooperative.atomic-basic.ll
+1 -0    llvm/test/Verifier/AMDGPU/llvm.amdgcn.cooperative.atomic.ll
+1 -0    clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-cooperative-atomics.cl
+35 -23  3 files

LLVM/project 4a8008c llvm/lib/Target/AMDGPU AMDGPUSearchableTables.td, llvm/test/Analysis/UniformityAnalysis/AMDGPU always_uniform.ll

[AMDGPU] Mark cluster_workgroup_id_* intrinsics always uniform (#159439)

Delta    File
+72 -0   llvm/test/Analysis/UniformityAnalysis/AMDGPU/always_uniform.ll
+8 -0    llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
+80 -0   2 files

LLVM/project 1c4c7bd llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp DAGCombiner.cpp, llvm/test/CodeGen/AArch64 concat-vector-add-combine.ll

[SelectionDAG] Deal with POISON for INSERT_VECTOR_ELT/INSERT_SUBVECTOR (#143102)

As reported in https://github.com/llvm/llvm-project/issues/141034,
SelectionDAG::getNode had some unexpected
behaviors when trying to create vectors with UNDEF elements. Since
we treat both UNDEF and POISON as undefined (when using isUndef()),
we can't just fold away INSERT_VECTOR_ELT/INSERT_SUBVECTOR based on
isUndef(), as that could make the resulting vector more poisonous.

The same kind of bug existed in DAGCombiner::visitINSERT_SUBVECTOR.

Here are some examples:

This fold was done even if vec[idx] was POISON:
  INSERT_VECTOR_ELT vec, UNDEF, idx -> vec

This fold was done even if any of vec[idx..idx+size] was POISON:
  INSERT_SUBVECTOR vec, UNDEF, idx -> vec


    [8 lines not shown]
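
The INSERT_VECTOR_ELT example above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `Lane`, `insertElt`, `naiveFold` and `safeFold` are hypothetical names invented for illustration and are not the SelectionDAG API or the code in this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of one vector lane (hypothetical; not LLVM's SDValue).
enum class Lane { Defined, Undef, Poison };
using Vec = std::vector<Lane>;

// Mirrors the idea that isUndef() is true for both UNDEF and POISON.
bool isUndefLike(Lane L) { return L != Lane::Defined; }

// Reference semantics of INSERT_VECTOR_ELT: lane Idx becomes Elt.
Vec insertElt(Vec V, Lane Elt, std::size_t Idx) {
  V[Idx] = Elt;
  return V;
}

// The problematic fold: drop the insert whenever Elt is undef-like,
// without looking at what is currently stored in V[Idx].
Vec naiveFold(const Vec &V, Lane Elt, std::size_t Idx) {
  return isUndefLike(Elt) ? V : insertElt(V, Elt, Idx);
}

// A conservative variant (an assumption, not the patch's exact logic):
// only drop an UNDEF insert when the overwritten lane is not POISON.
Vec safeFold(const Vec &V, Lane Elt, std::size_t Idx) {
  if (Elt == Lane::Undef && V[Idx] != Lane::Poison)
    return V;
  return insertElt(V, Elt, Idx);
}

bool hasPoison(const Vec &V) {
  return std::any_of(V.begin(), V.end(),
                     [](Lane L) { return L == Lane::Poison; });
}

// Returns true when the naive fold keeps a poison lane that the
// unfolded insert (and the conservative fold) would have removed.
bool demoFoldDifference() {
  Vec V = {Lane::Defined, Lane::Poison};
  assert(!hasPoison(insertElt(V, Lane::Undef, 1)));
  return hasPoison(naiveFold(V, Lane::Undef, 1)) &&
         !hasPoison(safeFold(V, Lane::Undef, 1));
}
```

In this model the naive fold returns a vector that still contains a POISON lane, while the unfolded insert would have replaced it with UNDEF, which is the "more poisonous" result the message warns about.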
Delta      File
+60 -30    llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vfw-web-simplification.ll
+73 -10    llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+38 -30    llvm/test/CodeGen/X86/vector-shuffle-combining.ll
+36 -19    llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vw-web-simplification.ll
+31 -7     llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+5 -7      llvm/test/CodeGen/AArch64/concat-vector-add-combine.ll
+243 -103  5 files not shown
+262 -108  11 files

LLVM/project 96f2ab2 llvm/lib/Target/RISCV RISCVSchedSiFive7.td

[RISCV][NFC] Merge some WriteRes entries in SiFive7 scheduling model (#159448)

NFC.
Delta    File
+6 -14   llvm/lib/Target/RISCV/RISCVSchedSiFive7.td
+6 -14   1 files

LLVM/project 52182f1 flang/lib/Optimizer/Transforms ConvertComplexPow.cpp

Remove unused function.
Delta    File
+0 -7    flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
+0 -7    1 files

LLVM/project 8f71488 mlir/lib/Conversion/ComplexToROCDLLibraryCalls ComplexToROCDLLibraryCalls.cpp

Fix clang-format.
Delta    File
+2 -3    mlir/lib/Conversion/ComplexToROCDLLibraryCalls/ComplexToROCDLLibraryCalls.cpp
+2 -3    1 files

LLVM/project 6976910 flang/lib/Optimizer/Builder IntrinsicCall.cpp, flang/lib/Optimizer/Transforms ConvertComplexPow.cpp

Add complex.powi op.
Delta     File
+45 -51   flang/lib/Optimizer/Transforms/ConvertComplexPow.cpp
+37 -4    mlir/lib/Conversion/ComplexToROCDLLibraryCalls/ComplexToROCDLLibraryCalls.cpp
+26 -0    mlir/include/mlir/Dialect/Complex/IR/ComplexOps.td
+17 -7    mlir/lib/Dialect/Math/Transforms/AlgebraicSimplification.cpp
+13 -7    flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+20 -0    mlir/test/Dialect/Complex/powi-simplify.mlir
+158 -69  7 files not shown
+189 -77  13 files

LLVM/project 2632942 flang/lib/Frontend FrontendActions.cpp, flang/test/Transforms convert-complex-pow.fir

Propagate fastmath in ComplexToROCDL.
Fix TargetMachine build error.
Delta    File
+68 -59  flang/test/Transforms/convert-complex-pow.fir
+6 -3    mlir/lib/Conversion/ComplexToROCDLLibraryCalls/ComplexToROCDLLibraryCalls.cpp
+2 -2    flang/lib/Frontend/FrontendActions.cpp
+76 -64  3 files

LLVM/project 5bb542f llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation about DWARF registers mapping. NFC
Delta    File
+4 -4    llvm/docs/AMDGPUUsage.rst
+4 -4    1 files

LLVM/project 6c8fcd6 llvm/include/llvm/MC MCSection.h, llvm/lib/MC MCSFrame.cpp MCAssembler.cpp

Revert "[SFrames] Emit and relax FREs (#158154)" (#159436)

Breaks some buildbots

This reverts commit c9285166214db4236f26312f68bba91f6437bd6f.
Delta     File
+16 -156  llvm/lib/MC/MCSFrame.cpp
+0 -114   llvm/test/MC/ELF/cfi-sframe-fre-cases.s
+0 -87    llvm/test/MC/ELF/cfi-sframe-encoding.s
+27 -32   llvm/test/MC/ELF/cfi-sframe.s
+0 -25    llvm/include/llvm/MC/MCSection.h
+0 -24    llvm/lib/MC/MCAssembler.cpp
+43 -438  7 files not shown
+44 -476  13 files

LLVM/project be62583 llvm/test/CodeGen/AMDGPU memory-legalizer-flat-cluster.ll memory-legalizer-global-cluster.ll, llvm/test/CodeGen/PowerPC vector-popcnt-128-ult-ugt.ll

Merge branch 'main' into users/bjope/insertundef_2
Delta              File
+22,442 -22,438    llvm/test/CodeGen/PowerPC/vector-popcnt-128-ult-ugt.ll
+25,726 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-cluster.ll
+23,810 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-global-cluster.ll
+23,347 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-private-cluster.ll
+21,350 -0         llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll
+15,306 -3,325     llvm/test/CodeGen/AMDGPU/frem.ll
+131,981 -25,763   8,508 files not shown
+580,794 -229,495  8,514 files

LLVM/project c3d22d0 llvm/include/llvm/BinaryFormat ELF.h, llvm/include/llvm/IR PatternMatch.h

Merge branch 'main' into users/rampitec/09-17-_amdgpu_add_gfx1251_runlines_to_cooperative_atomcis_tests._nfc
Delta     File
+54 -12   llvm/tools/llvm-readobj/ELFDumper.cpp
+14 -12   llvm/include/llvm/BinaryFormat/ELF.h
+6 -10    llvm/lib/Analysis/InstructionSimplify.cpp
+13 -0    llvm/include/llvm/IR/PatternMatch.h
+6 -7     llvm/include/llvm/Support/FormatVariadicDetails.h
+4 -6     mlir/include/mlir/Conversion/ArithToAMDGPU/ArithToAMDGPU.h
+97 -47   17 files not shown
+151 -89  23 files

LLVM/project 7fb3a91 llvm/include/llvm/IR PatternMatch.h, llvm/lib/Analysis InstructionSimplify.cpp

[PatternMatch] Introduce match functor (NFC) (#159386)

A common idiom is the usage of the PatternMatch match function within a
functional algorithm like all_of. Introduce a match functor to shorten
this idiom.

Co-authored-by: Luke Lau <luke at igalia.com>
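
The idiom described above can be sketched outside of LLVM. The following is a minimal illustration, assuming toy stand-ins: `Value`, `OpcodePattern`, the free function `match` and the helper `matchFn` are hypothetical names written for this example, not the declarations added to PatternMatch.h.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for an IR value and a pattern.
struct Value {
  int Opcode;
};

struct OpcodePattern {
  int Expected;
  bool match(const Value *V) const { return V->Opcode == Expected; }
};

// Free function in the style of match(V, P).
template <typename Pattern> bool match(const Value *V, const Pattern &P) {
  return P.match(V);
}

// The functor idea: bind the pattern once and get back a unary
// predicate that can be handed straight to an algorithm.
template <typename Pattern> auto matchFn(const Pattern &P) {
  return [P](const Value *V) { return match(V, P); };
}

// Before: a hand-written lambda wraps every call to match().
bool allMatchLong(const std::vector<const Value *> &Vs, int Op) {
  return std::all_of(Vs.begin(), Vs.end(), [&](const Value *V) {
    return match(V, OpcodePattern{Op});
  });
}

// After: the functor replaces the lambda.
bool allMatchShort(const std::vector<const Value *> &Vs, int Op) {
  return std::all_of(Vs.begin(), Vs.end(), matchFn(OpcodePattern{Op}));
}

// Small self-check: both spellings agree on the same inputs.
bool demoMatchFunctor() {
  Value A{1}, B{1}, C{2};
  std::vector<const Value *> Same = {&A, &B};
  std::vector<const Value *> Mixed = {&A, &C};
  assert(allMatchLong(Same, 1) == allMatchShort(Same, 1));
  return allMatchShort(Same, 1) && !allMatchShort(Mixed, 1);
}
```

Both functions compute the same result; the second only removes the boilerplate lambda, which is consistent with the change being marked NFC.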
Delta    File
+6 -10   llvm/lib/Analysis/InstructionSimplify.cpp
+13 -0   llvm/include/llvm/IR/PatternMatch.h
+4 -4    llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+2 -6    llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+3 -4    llvm/lib/Transforms/Scalar/LICM.cpp
+3 -4    llvm/lib/CodeGen/InterleavedAccessPass.cpp
+31 -28  5 files not shown
+38 -38  11 files

LLVM/project 835d6b3 llvm/utils profcheck-xfail.txt

[profcheck] exclude LV test introduced in #155547 (#159443)

Delta    File
+1 -0    llvm/utils/profcheck-xfail.txt
+1 -0    1 files

LLVM/project e556dc0 clang/test/Driver amdgpu-macros.cl, llvm/docs AMDGPUUsage.rst

[AMDGPU] Add gfx1251 subtarget (#159430)

Delta    File
+346 -0  llvm/test/MC/AMDGPU/hsa-gfx1251-v4.s
+16 -0   llvm/docs/AMDGPUUsage.rst
+9 -0    llvm/test/tools/llvm-readobj/ELF/AMDGPU/elf-headers.test
+7 -0    clang/test/Driver/amdgpu-macros.cl
+7 -0    llvm/test/Object/AMDGPU/elf-header-flags-mach.yaml
+5 -0    llvm/test/tools/llvm-objdump/ELF/AMDGPU/subtarget.ll
+390 -0  21 files not shown
+427 -1  27 files

LLVM/project f992b5b llvm/utils profcheck-xfail.txt

[profcheck] exclude test introduced in #158328 (#159441)

Delta    File
+1 -0    llvm/utils/profcheck-xfail.txt
+1 -0    1 files

LLVM/project 7dc8753 llvm/utils profcheck-xfail.txt

[profcheck] Exclude LoopVectorize tests introduced in #155301 (#159440)

Delta    File
+2 -0    llvm/utils/profcheck-xfail.txt
+2 -0    1 files

LLVM/project f1d9f8e llvm/lib/Transforms/Scalar IndVarSimplify.cpp

fixup

Created using spr 1.3.4
Delta    File
+1 -1    llvm/lib/Transforms/Scalar/IndVarSimplify.cpp
+1 -1    1 files

LLVM/project a0bfcb0 llvm/test/CodeGen/AMDGPU memory-legalizer-flat-cluster.ll memory-legalizer-global-cluster.ll, llvm/test/CodeGen/PowerPC vector-popcnt-128-ult-ugt.ll

do not hardcode traps

Created using spr 1.3.4
Delta              File
+22,442 -22,438    llvm/test/CodeGen/PowerPC/vector-popcnt-128-ult-ugt.ll
+40,677 -0         llvm/test/CodeGen/RISCV/rvv/nontemporal-vp-scalable.ll
+25,726 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-flat-cluster.ll
+23,810 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-global-cluster.ll
+23,565 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-private-workgroup.ll
+23,413 -0         llvm/test/CodeGen/AMDGPU/memory-legalizer-private-singlethread.ll
+159,633 -22,438   7,270 files not shown
+710,174 -191,204  7,276 files

LLVM/project b357ab5 llvm/lib/Target/AMDGPU AMDGPUSearchableTables.td, llvm/test/Analysis/UniformityAnalysis/AMDGPU always_uniform.ll

[AMDGPU] Mark cluster_workgroup_id_* intrinsics always uniform
Delta    File
+72 -0   llvm/test/Analysis/UniformityAnalysis/AMDGPU/always_uniform.ll
+8 -0    llvm/lib/Target/AMDGPU/AMDGPUSearchableTables.td
+80 -0   2 files

LLVM/project 1b3d480 clang/test/CodeGenOpenCL builtins-amdgcn-gfx1250-cooperative-atomics.cl, llvm/test/CodeGen/AMDGPU llvm.amdgcn.cooperative.atomic-basic.ll

[AMDGPU] Add gfx1251 runlines to cooperative atomics tests. NFC
Delta    File
+33 -23  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.cooperative.atomic-basic.ll
+1 -0    clang/test/CodeGenOpenCL/builtins-amdgcn-gfx1250-cooperative-atomics.cl
+1 -0    llvm/test/Verifier/AMDGPU/llvm.amdgcn.cooperative.atomic.ll
+35 -23  3 files

LLVM/project dffd7f3 llvm/include/llvm/BinaryFormat ELF.h, llvm/lib/Object ELFObjectFile.cpp

[LLVM] Fix offload and update CUDA ABI for all SM values (#159354)

Summary:
Turns out the new CUDA ABI now applies retroactively to all the other
SMs if you upgrade to CUDA 13.0. This patch changes the scheme, keeping
all the SM flags consistent but using an offset.

Fixes: https://github.com/llvm/llvm-project/issues/159088
Delta    File
+54 -12  llvm/tools/llvm-readobj/ELFDumper.cpp
+14 -12  llvm/include/llvm/BinaryFormat/ELF.h
+7 -1    llvm/lib/Object/ELFObjectFile.cpp
+1 -1    offload/plugins-nextgen/cuda/src/rtl.cpp
+76 -26  4 files

LLVM/project 5cb7bf6 llvm/include/llvm/Support FormatVariadicDetails.h

[Support] Simplify has_StreamOperator (NFC) (#159242)

Without this patch, we are doing a roundtrip on types.  Specifically,
if decltype(...) is well formed, std::is_same_v evaluates to a boolean
value.  We then pass the boolean value to std::enable_if_t, go through
the sizeof(char)/sizeof(double) trick, and then come back to a boolean
value.

This patch simplifies all this by having test() return
std::is_same<...>.  The "caller" attaches ::value, so effectively we
are using std::is_same<...>::value when decltype(...) is well formed,
bypassing std::enable_if_t and the sizeof(char)/sizeof(double) trick.

If we did not care about the return type of the shift operator, we
could use llvm::is_detected, but the return type check doesn't allow
us to simplify things that far.
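
The simplified shape described above can be sketched as a standalone trait. This is an illustration only: `has_stream_operator`, `NoStream` and `WrongReturn` are hypothetical names, and `std::ostream` stands in for the stream type used in FormatVariadicDetails.h.

```cpp
#include <ostream>
#include <string>
#include <type_traits>
#include <utility>

template <typename T> class has_stream_operator {
  // Chosen when `os << value` is well formed. The return type itself
  // carries the answer: std::is_same<...> is true only if the shift
  // operator returns std::ostream&.
  template <typename U>
  static auto test(int)
      -> std::is_same<decltype(std::declval<std::ostream &>()
                               << std::declval<U>()),
                      std::ostream &>;

  // Fallback when the expression above is ill formed.
  template <typename U> static std::false_type test(...);

public:
  // The "caller" attaches ::value; no enable_if_t or sizeof trick.
  static constexpr bool value = decltype(test<T>(0))::value;
};

// Hypothetical test types.
struct NoStream {};    // no operator<< at all
struct WrongReturn {}; // operator<< exists but returns int
inline int operator<<(std::ostream &, const WrongReturn &) { return 0; }

static_assert(has_stream_operator<int>::value, "int is streamable");
static_assert(has_stream_operator<std::string>::value,
              "std::string is streamable");
static_assert(!has_stream_operator<NoStream>::value,
              "NoStream has no operator<<");
static_assert(!has_stream_operator<WrongReturn>::value,
              "the return type check rejects WrongReturn");
```

The `WrongReturn` case shows why a plain detection idiom would not be enough here: the expression is well formed, and only the return type check rejects it.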
Delta    File
+6 -7    llvm/include/llvm/Support/FormatVariadicDetails.h
+6 -7    1 files

LLVM/project 2a25f3a bolt/include/bolt/Core MCInstUtils.h, bolt/lib/Core MCInstUtils.cpp

Use Index inside the BB instead of direct MCInst* in RefInBB
Delta    File
+26 -14  bolt/include/bolt/Core/MCInstUtils.h
+9 -13   bolt/lib/Core/MCInstUtils.cpp
+35 -27  2 files

LLVM/project 0d989b2 clang/lib/Sema SemaStmtAttr.cpp SemaDeclAttr.cpp

[clang] Avoid warnings about enum mismatch in ternary expressions. NFC. (#159338)

This avoids the following kind of warning when built with GCC:

    ../../clang/lib/Sema/SemaStmtAttr.cpp: In function ‘clang::Attr* ProcessStmtAttribute(clang::Sema&, clang::Stmt*, const clang::ParsedAttr&, clang::SourceRange)’:
    ../../clang/lib/Sema/SemaStmtAttr.cpp:677:30: warning: enumerated mismatch in conditional expression: ‘clang::diag::<unnamed enum>’ vs ‘clang::diag::<unnamed enum>’ [-Wenum-compare]
      676 |       S.Diag(A.getLoc(), A.isRegularKeywordAttribute()
          |                          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      677 |                              ? diag::err_keyword_not_supported_on_targe
          |                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      678 |                              : diag::warn_unhandled_ms_attribute_ignore )
          |                              ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

These enums are non-overlapping, but they are defined in different
enum scopes due to how they are generated with tablegen.
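
A minimal reproduction of the pattern, with one possible way to silence the warning, might look like the sketch below. The enumerator names and values are made up for illustration, and casting both arms to `unsigned` is an assumption about a workable fix rather than a description of what this patch does.

```cpp
#include <cassert>

namespace diag {
// Two separate unnamed enums, loosely mimicking diagnostic IDs that
// are generated into different enum scopes (hypothetical values).
enum { err_example_keyword = 100 };
enum { warn_example_ignored = 200 };
} // namespace diag

// Writing `Cond ? diag::err_example_keyword : diag::warn_example_ignored`
// mixes two distinct enum types in one conditional expression, which is
// the situation GCC reports under -Wenum-compare. Converting both arms
// to a common integer type first avoids the mismatch.
unsigned pickDiag(bool IsKeyword) {
  return IsKeyword ? unsigned(diag::err_example_keyword)
                   : unsigned(diag::warn_example_ignored);
}

// Small self-check of the selected IDs.
bool demoPickDiag() {
  assert(pickDiag(true) == 100u);
  return pickDiag(false) == 200u;
}
```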
Delta    File
+5 -4    clang/lib/Sema/SemaStmtAttr.cpp
+5 -4    clang/lib/Sema/SemaDeclAttr.cpp
+10 -8   2 files

LLVM/project aa5558d mlir/include/mlir/Conversion/ArithToAMDGPU ArithToAMDGPU.h, mlir/lib/Conversion/ArithToAMDGPU ArithToAMDGPU.cpp

[mlir][ArithToAMDGPU] limit scaling truncf/extf support to gfx950 (#155431)

The current chip guard fails to prevent scaling_extf/truncf patterns
from being applied on gfx1100 which does not have scaling support.

---------

Signed-off-by: Muzammiluddin Syed <muzasyed at amd.com>
Delta    File
+4 -6    mlir/include/mlir/Conversion/ArithToAMDGPU/ArithToAMDGPU.h
+5 -4    mlir/lib/Conversion/ArithToAMDGPU/ArithToAMDGPU.cpp
+4 -0    mlir/test/Conversion/ArithToAMDGPU/scaling-truncf.mlir
+4 -0    mlir/test/Conversion/ArithToAMDGPU/scaling-extf.mlir
+17 -10  4 files