LLVM/project d09fac4libclc/cmake/modules AddLibclc.cmake

[libclc] compile w/o linking builtins with SPIRV backend (#176732)

As we're only building a single file, there is no need to link. This
avoids a dependency on spirv-link when we're using the native SPIRV
backend.
DeltaFile
+1-1libclc/cmake/modules/AddLibclc.cmake
+1-11 files

LLVM/project c4dd363llvm/lib/Target/RISCV RISCVInstrInfoY.td RISCVInstrFormatsY.td, llvm/lib/Target/RISCV/AsmParser RISCVAsmParser.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
DeltaFile
+242-0llvm/test/MC/RISCV/rvy/rvy-valid-mode-independent.s
+108-0llvm/lib/Target/RISCV/RISCVInstrInfoY.td
+77-0llvm/lib/Target/RISCV/RISCVInstrFormatsY.td
+64-0llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+55-0llvm/lib/Target/RISCV/AsmParser/RISCVAsmParser.cpp
+42-0llvm/test/MC/RISCV/rvy/rvy-invalid-mode-independent.s
+588-09 files not shown
+688-315 files

LLVM/project d451765clang/test/Driver print-supported-extensions-riscv.c, llvm/lib/Target/RISCV RISCVRegisterInfo.td RISCVFeatures.td

[𝘀𝗽𝗿] changes to main this commit is based on

Created using spr 1.3.8-beta.1

[skip ci]
DeltaFile
+64-0llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+22-1llvm/lib/Target/RISCV/RISCVFeatures.td
+2-2llvm/test/MC/RISCV/invalid-attribute.s
+2-0llvm/test/CodeGen/RISCV/features-info.ll
+1-0clang/test/Driver/print-supported-extensions-riscv.c
+1-0llvm/unittests/TargetParser/RISCVISAInfoTest.cpp
+92-36 files

LLVM/project 0017238clang/test/Driver print-supported-extensions-riscv.c, llvm/lib/Target/RISCV RISCVFeatures.td

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
DeltaFile
+22-1llvm/lib/Target/RISCV/RISCVFeatures.td
+2-2llvm/test/MC/RISCV/invalid-attribute.s
+2-0llvm/test/CodeGen/RISCV/features-info.ll
+1-0clang/test/Driver/print-supported-extensions-riscv.c
+1-0llvm/unittests/TargetParser/RISCVISAInfoTest.cpp
+28-35 files

LLVM/project 556ca6allvm/lib/CodeGen TargetLoweringObjectFileImpl.cpp, llvm/test/Transforms/SampleProfile pseudo-probe-emit-inline.ll pseudo-probe-emit.ll

[CodeGen] Set pseudo probe desc comdat symbol to external for COFF (#176706)

lld-link performs COMDAT section deduplication only when the COMDAT
symbol is external.
DeltaFile
+5-2llvm/test/Transforms/SampleProfile/pseudo-probe-emit-inline.ll
+4-1llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
+2-0llvm/test/Transforms/SampleProfile/pseudo-probe-emit.ll
+11-33 files

LLVM/project 2ad0585llvm/lib/Target/RISCV RISCVRegisterInfo.td

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
DeltaFile
+64-0llvm/lib/Target/RISCV/RISCVRegisterInfo.td
+64-01 files

LLVM/project 7776eaallvm/lib/Target/X86/MCTargetDesc X86AsmBackend.cpp, llvm/test/MC/X86 x86-jcxz-loop-fixup.s

[X86AsmBackend] Check fixup value overflow (#176827)

GNU Assembler has generic error checking for overflowed fixup values:
```
y.s:5: Error: value of 8000000000000000 too large for field of 4 bytes at 0000000000000004
```

In contrast, we have long had an assertion that may fail.
https://reviews.llvm.org/D70652 improved the status by adding an
overflow check for PC-relative fixups, but missed other cases (#116899).

This patch improves the overflow check to resemble GAS.

For `.long x`, GAS accepts `x` if its value is in the range `(-2**32,
2**32)`. This design allows `.long x` to work regardless of signedness.
When a symbol is involved, GAS supports both `.long sym-0xffffffff` and
`.long sym+1`, as well as `.long sym+0xffffffff` and `.long sym-1`.
However, `.long sym+0x100000000` is rejected in favor of `.long sym+0`.
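
The range rule described above for `.long x` can be sketched as a small predicate. This is only an illustration of the open interval `(-2**32, 2**32)` stated in the message; the names `LongLimit` and `fitsLongDirective` are hypothetical and are not taken from X86AsmBackend.cpp.

```cpp
#include <cstdint>

// Hypothetical illustration, not the code in X86AsmBackend.cpp.
// 2**32, the exclusive bound on the magnitude of a `.long` value.
constexpr std::int64_t LongLimit = std::int64_t{1} << 32;

// Accept a 4-byte value if it lies in the open interval (-2**32, 2**32),
// so the same check works for signed and unsigned interpretations.
constexpr bool fitsLongDirective(std::int64_t Value) {
  return Value > -LongLimit && Value < LongLimit;
}
```

Under this sketch, `0xffffffff` and `-0xffffffff` are both accepted, while `0x100000000` is rejected.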

    [13 lines not shown]
DeltaFile
+55-0llvm/test/MC/X86/Relocations/fixup-overflow.s
+29-0llvm/test/MC/X86/Relocations/jcxz-loop-overflow.s
+15-13llvm/lib/Target/X86/MCTargetDesc/X86AsmBackend.cpp
+0-26llvm/test/MC/X86/x86-jcxz-loop-fixup.s
+10-0llvm/test/MC/X86/Relocations/fixup-overflow-32.s
+109-395 files

LLVM/project 009e0ccllvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLASXInstrInfo.td, llvm/test/CodeGen/LoongArch/ir-instruction flog2.ll

Revert "[LoongArch] Lowering flog2 to flogb (#162978)"

This reverts commit d9e5e725ed33e462477d5559ffece0d08e9c8dad.

The semantics of `flog2(x)` and `logb(x)` are different.
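
As a rough illustration of why the two differ, the C++ standard library functions `std::log2` and `std::logb` can stand in for the operations named above (an assumption made for this sketch; the LoongArch instructions themselves are not modeled). `std::logb` returns only the unbiased exponent, so the two agree at exact powers of two and diverge elsewhere. The helper name `log2MinusLogb` is hypothetical.

```cpp
#include <cmath>

// Hypothetical helper: the gap between the true base-2 logarithm and the
// exponent-only result. It is zero for exact powers of two and positive
// for other finite values greater than zero.
inline double log2MinusLogb(double X) { return std::log2(X) - std::logb(X); }
```

For example, `std::logb(3.0)` is `1.0`, while `std::log2(3.0)` is roughly `1.585`.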

Fixes: https://github.com/llvm/llvm-project/issues/176818

Reviewers: zhaoqi5, SixWeining, ylzsx

Pull Request: https://github.com/llvm/llvm-project/pull/176850
DeltaFile
+244-14llvm/test/CodeGen/LoongArch/lasx/ir-instruction/flog2.ll
+142-14llvm/test/CodeGen/LoongArch/lsx/ir-instruction/flog2.ll
+2-8llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+4-4llvm/test/CodeGen/LoongArch/ir-instruction/flog2.ll
+0-3llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+0-3llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+392-462 files not shown
+392-488 files

LLVM/project 5581836llvm/utils/TableGen DAGISelMatcherEmitter.cpp

[TableGen] Remove unused argument from EmitHistogram. NFC
DeltaFile
+3-3llvm/utils/TableGen/DAGISelMatcherEmitter.cpp
+3-31 files

LLVM/project 794d6b0llvm/test/CodeGen/AMDGPU fmul-to-ldexp.ll llvm.log.ll

[AMDGPU] si-peephole-sdwa: Handle V_PACK_B32_F16_e64 (WIP)

Change si-peephole-sdwa to eliminate V_PACK_B32_F16_e64 instructions
by changing the second operand to write to the upper word of the
destination directly.
DeltaFile
+126-140llvm/test/CodeGen/AMDGPU/fmul-to-ldexp.ll
+138-98llvm/test/CodeGen/AMDGPU/llvm.log.ll
+138-98llvm/test/CodeGen/AMDGPU/llvm.log10.ll
+92-104llvm/test/CodeGen/AMDGPU/fpow.ll
+68-127llvm/test/CodeGen/AMDGPU/llvm.log2.ll
+74-118llvm/test/CodeGen/AMDGPU/mad-mix-lo.ll
+636-68529 files not shown
+1,251-1,34835 files

LLVM/project 5fec9fbllvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.sin.f16.ll llvm.cos.f16.ll

[AMDGPU] Enable ISD::{FSIN,FCOS} custom lowering to work on v2f16 (#176382)

Currently ISD::FSIN and ISD::FCOS of type MVT::v2f16 are legalized by
first expanding and then using a custom lowering on the resulting f16
instructions. This ordering prevents using packed math variants of the
instructions introduced by the legalization (e.g. the multiplication) and
makes it difficult to deal with the resulting IR in peephole
optimizations (e.g. si-peephole-sdwa).

Change the legalization action for ISD::FSIN and ISD::FCOS of type
MVT::v2f16 to Custom and change the custom trig lowering to deal
with vectors.
DeltaFile
+27-38llvm/test/CodeGen/AMDGPU/llvm.sin.f16.ll
+27-38llvm/test/CodeGen/AMDGPU/llvm.cos.f16.ll
+18-3llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+72-793 files

LLVM/project ae1ee0doffload/test/offloading strided_multiple_update_to.c strided_update_to.c

[Offload][Tests] Non-contiguous_update_to_tests (#169623)

PR #144635 enabled non-contiguous updates for both `update from` and
`update to` clauses, but tests for `update to` were missing. This PR
adds those missing tests to ensure coverage.
DeltaFile
+122-0offload/test/offloading/strided_multiple_update_to.c
+72-0offload/test/offloading/strided_update_to.c
+72-0offload/test/offloading/strided_partial_update_to.c
+0-64offload/test/offloading/strided_partial_update.c
+64-0offload/test/offloading/strided_partial_update_from.c
+0-63offload/test/offloading/strided_multiple_update.c
+330-12712 files not shown
+612-29818 files

LLVM/project b965865llvm/utils/TableGen AsmMatcherEmitter.cpp CompressInstEmitter.cpp, llvm/utils/TableGen/Common InfoByHwMode.cpp

clang-format

Created using spr 1.3.8-beta.1
DeltaFile
+2-1llvm/utils/TableGen/AsmMatcherEmitter.cpp
+2-1llvm/utils/TableGen/Common/InfoByHwMode.cpp
+1-1llvm/utils/TableGen/CompressInstEmitter.cpp
+5-33 files

LLVM/project 1201f3fmlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

remove redundant comment line
DeltaFile
+0-1mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+0-11 files

LLVM/project 30db9e2llvm/test/MC/AMDGPU gfx8_asm_vop3.s gfx7_asm_vop3.s, llvm/test/MC/Disassembler/AMDGPU gfx9_vop3.txt

emit a single function instead of inline lambdas

Created using spr 1.3.8-beta.1
DeltaFile
+42,349-42,348llvm/test/MC/AMDGPU/gfx8_asm_vop3.s
+41,419-41,418llvm/test/MC/AMDGPU/gfx7_asm_vop3.s
+36,428-36,427llvm/test/MC/AMDGPU/gfx9_asm_vop3.s
+28,175-28,174llvm/test/MC/AMDGPU/gfx9_asm_vopc.s
+22,711-22,884llvm/test/MC/Disassembler/AMDGPU/gfx9_vop3.txt
+22,276-22,275llvm/test/MC/AMDGPU/gfx8_asm_vopc.s
+193,358-193,5264,318 files not shown
+1,272,603-1,150,4184,324 files

LLVM/project 01db9bfflang/lib/Lower/OpenMP OpenMP.cpp ClauseProcessor.cpp, mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td

rename num_threads_vals to num_threads_vars
DeltaFile
+6-6mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+6-6flang/lib/Lower/OpenMP/OpenMP.cpp
+3-3mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+3-3mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
+3-3mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+1-1flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+22-226 files

LLVM/project 7b8b23allvm/test/tools/UpdateTestChecks/update_mc_test_checks riscv-show-inst.test, llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs riscv_show_inst.s.expected riscv_show_inst.s

[update_mc_test_checks] Support --show-inst output

This is useful to check that the correct registers were used in cases
where different register classes use the same name in asm input/output.

Pull Request: https://github.com/llvm/llvm-project/pull/174011
DeltaFile
+23-21llvm/utils/update_mc_test_checks.py
+8-0llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/riscv_show_inst.s.expected
+5-0llvm/test/tools/UpdateTestChecks/update_mc_test_checks/riscv-show-inst.test
+2-0llvm/test/tools/UpdateTestChecks/update_mc_test_checks/Inputs/riscv_show_inst.s
+38-214 files

LLVM/project ba4edd8llvm/lib/Target/AMDGPU GCNSubtarget.h SIISelLowering.cpp

[NFCI][AMDGPU] Use X-macro to reduce boilerplate in `GCNSubtarget.h`

`GCNSubtarget.h` contained a large amount of repetitive code following the pattern `bool HasXXX = false;` for member declarations and `bool hasXXX() const { return HasXXX; }` for getters. This boilerplate made the file unnecessarily long and harder to maintain.

This patch introduces an X-macro pattern `GCN_SUBTARGET_HAS_FEATURE` that consolidates 129 simple subtarget features into a single list. The macro is expanded twice: once in the protected section to generate member variable declarations, and once in the public section to generate the corresponding getter methods. This reduces the file by approximately 265 lines while preserving the exact same API and functionality. Features with complex getter logic or inconsistent naming conventions are left as manual implementations for future improvement.
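
A minimal sketch of the X-macro pattern described above, assuming a made-up class and feature list. `DEMO_SUBTARGET_HAS_FEATURE`, `DemoSubtarget`, `FastFMA`, and `PackedMath` are hypothetical names for illustration, not the contents of `GCNSubtarget.h`. The getters are marked `constexpr` here only so the result can be checked at compile time.

```cpp
// Hypothetical feature list: each entry is passed to the macro X.
#define DEMO_SUBTARGET_HAS_FEATURE(X)                                          \
  X(FastFMA)                                                                   \
  X(PackedMath)

class DemoSubtarget {
protected:
  // First expansion: one `bool HasXXX = false;` member per feature.
#define DEMO_DECL_MEMBER(Name) bool Has##Name = false;
  DEMO_SUBTARGET_HAS_FEATURE(DEMO_DECL_MEMBER)
#undef DEMO_DECL_MEMBER

public:
  // Second expansion: one `hasXXX()` getter per feature.
#define DEMO_DECL_GETTER(Name)                                                 \
  constexpr bool has##Name() const { return Has##Name; }
  DEMO_SUBTARGET_HAS_FEATURE(DEMO_DECL_GETTER)
#undef DEMO_DECL_GETTER
};
```

Adding a feature in this sketch means adding one `X(...)` line to the list; the member and its getter both follow from the two expansions.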

Ideally, these could be generated by TableGen using `GET_SUBTARGETINFO_MACRO`, similar to the X86 backend. However, `AMDGPU.td` has several issues that prevent direct adoption: duplicate field names (e.g., `DumpCode` is set by both `FeatureDumpCode` and `FeatureDumpCodeLower`), and inconsistent naming conventions where many features don't have the `Has` prefix (e.g., `FlatAddressSpace`, `GFX10Insts`, `FP64`). Fixing these issues would require renaming fields in `AMDGPU.td` and updating all references, which is left for future work.
DeltaFile
+256-816llvm/lib/Target/AMDGPU/GCNSubtarget.h
+3-3llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-2llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+1-1llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
+262-8224 files

LLVM/project 35e200cllvm/test/CodeGen/AMDGPU wmma-nop-hoisting.mir

[AMDGPU] Pre-commit test for WMMA NOP hoisting optimization (#176745)

Add test showing current behavior where V_NOP instructions for WMMA
coexecution hazards are inserted inside loop bodies at the use-site. A
future patch will hoist these NOPs to loop preheaders to reduce
redundant execution.

---------

Co-authored-by: Christudasan Devadasan <christudasan.devadasan at amd.com>
DeltaFile
+56-0llvm/test/CodeGen/AMDGPU/wmma-nop-hoisting.mir
+56-01 files

LLVM/project e358638mlir/lib/Dialect/Linalg/Utils Utils.cpp, mlir/test/Dialect/Linalg/convolution roundtrip-convolution.mlir

[Linalg] Support i1 data type in matchConvolutionOpOfType utility (#176704)

-- Extend bodyMatcherForConvolutionOps to recognize arith.ori/arith.andi
   for i1 element types (in addition to add/mul for integer/float types)
   for accumulation and multiplication.
-- Similarly, extend bodyMatcherForSumPoolOps to recognize arith.ori for
   i1 accumulation (in addition to add for integer/float types).

Signed-off-by: Abhishek Varma <abhvarma at amd.com>
DeltaFile
+20-8mlir/lib/Dialect/Linalg/Utils/Utils.cpp
+26-0mlir/test/Dialect/Linalg/convolution/roundtrip-convolution.mlir
+46-82 files

LLVM/project 248dc97flang/lib/Lower/OpenMP OpenMP.cpp ClauseProcessor.cpp, flang/lib/Optimizer/OpenMP LowerWorkdistribute.cpp

rename threadLimitVals to threadLimitVars
DeltaFile
+6-6flang/lib/Lower/OpenMP/OpenMP.cpp
+5-5mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+5-5mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+4-4flang/lib/Optimizer/OpenMP/LowerWorkdistribute.cpp
+3-3mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+1-1flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+24-241 files not shown
+25-257 files

LLVM/project 7dc2cd4clang/lib/AST/ByteCode Interp.h, clang/test/AST/ByteCode shifts.cpp

[clang][bytecode] Handle corner condition for sign negation (#176390)

RHS = -RHS works for most cases; however, the behaviour when RHS is
INTXX_MIN is undefined. In that case, we should use INTXX_MAX
instead.
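
The corner case can be sketched as a clamped negation. This is an illustration of the idea only; `negateClamped` is a hypothetical helper and not the code in Interp.h.

```cpp
#include <limits>

// Hypothetical helper: negate V, but map the minimum value to the maximum,
// because -min is not representable in a two's-complement signed type.
template <typename T> constexpr T negateClamped(T V) {
  return V == std::numeric_limits<T>::min() ? std::numeric_limits<T>::max()
                                            : static_cast<T>(-V);
}
```

Every other input negates as usual, so only the single unrepresentable case takes the clamped path.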

Fixes #176271.
DeltaFile
+3-1clang/lib/AST/ByteCode/Interp.h
+2-0clang/test/AST/ByteCode/shifts.cpp
+5-12 files

LLVM/project fab5e02mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td, mlir/lib/Dialect/OpenMP/IR OpenMPDialect.cpp

rename num_teams_vals to num_teams_vars
DeltaFile
+8-8mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+4-4mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+12-122 files

LLVM/project 54ecb7cllvm/lib/Target/LoongArch LoongArchISelLowering.cpp LoongArchLASXInstrInfo.td, llvm/test/CodeGen/LoongArch/ir-instruction flog2.ll

[𝘀𝗽𝗿] initial version

Created using spr 1.3.7
DeltaFile
+244-14llvm/test/CodeGen/LoongArch/lasx/ir-instruction/flog2.ll
+142-14llvm/test/CodeGen/LoongArch/lsx/ir-instruction/flog2.ll
+2-8llvm/lib/Target/LoongArch/LoongArchISelLowering.cpp
+4-4llvm/test/CodeGen/LoongArch/ir-instruction/flog2.ll
+0-3llvm/lib/Target/LoongArch/LoongArchLASXInstrInfo.td
+0-3llvm/lib/Target/LoongArch/LoongArchLSXInstrInfo.td
+392-462 files not shown
+392-488 files

LLVM/project 53ae317clang/docs ReleaseNotes.rst, clang/lib/Sema SemaOverload.cpp

[Clang] Check enable_if attribute without delayed diagnostics (#176080)

We ensure immediate access control checking when evaluating the
enable_if attribute to rule out inaccessible constructors during
potential overload resolution, treating them as SFINAE errors rather
than hard errors, which better matches the nature of the enable_if
attribute.

Compared to the last patch, we now avoid switching the DC directly
because there are cases where we're checking enable_if attribute within
a lambda and getCurLambda() requires a lambda context to distinguish
from template instantiation.

This reapplies #175899

Fixes https://github.com/llvm/llvm-project/issues/175895
DeltaFile
+48-0clang/test/SemaCXX/enable_if.cpp
+6-0clang/lib/Sema/SemaOverload.cpp
+1-0clang/docs/ReleaseNotes.rst
+55-03 files

LLVM/project be40637clang/lib/Sema SemaLambda.cpp, clang/test/SemaCXX cxx2b-consteval-propagate.cpp

[Clang] Ensure a lambda DeclContext in BuildLambdaExpr (#176319)

Since 5f9630b388, we only remove the LSI after the evaluation context is
popped. The TreeTransform of immediate functions may call getCurLambda,
which requires both the paired LSI and the lambda DeclContext. In
TransformLambdaExpr, we already switched the context, but this is not
the case when parsing a lambda expression.

No release note, as this is a regression from 22.

Fixes https://github.com/llvm/llvm-project/issues/176045
DeltaFile
+14-0clang/test/SemaCXX/cxx2b-consteval-propagate.cpp
+6-1clang/lib/Sema/SemaLambda.cpp
+20-12 files

LLVM/project 007f1afllvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU dagcombine-select.ll

[AMDGPU] Use APInt in performSetCCCombine (#176564)

Fixes #176559.
DeltaFile
+79-0llvm/test/CodeGen/AMDGPU/dagcombine-select.ll
+8-8llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+87-82 files

LLVM/project bc3066b.github/workflows release-tasks.yml

workflows/release-lit: Update workflow and enable trusted publishing with pypi (#174907)

This makes some small improvements to the workflow, including using
more modern Python packaging modules, and enables trusted publishing
for PyPI. This will allow us to publish lit packages to PyPI without
needing to use an access token.

This action also now uses the pypi environment which will only publish
files when triggered by an llvm-* tag.
DeltaFile
+57-0.github/workflows/release-tasks.yml
+57-01 files

LLVM/project 6238ac1llvm/lib/Target/AMDGPU GCNSubtarget.h

[NFCI][AMDGPU] Use X-macro to reduce boilerplate in `GCNSubtarget.h`

`GCNSubtarget.h` contained a large amount of repetitive code following the pattern `bool HasXXX = false;` for member declarations and `bool hasXXX() const { return HasXXX; }` for getters. This boilerplate made the file unnecessarily long and harder to maintain.

This patch introduces an X-macro pattern `GCN_SUBTARGET_HAS_FEATURE` that consolidates 129 simple subtarget features into a single list. The macro is expanded twice: once in the protected section to generate member variable declarations, and once in the public section to generate the corresponding getter methods. This reduces the file by approximately 265 lines while preserving the exact same API and functionality. Features with complex getter logic or inconsistent naming conventions are left as manual implementations for future improvement.

Ideally, these could be generated by TableGen using `GET_SUBTARGETINFO_MACRO`, similar to the X86 backend. However, `AMDGPU.td` has several issues that prevent direct adoption: duplicate field names (e.g., `DumpCode` is set by both `FeatureDumpCode` and `FeatureDumpCodeLower`), and inconsistent naming conventions where many features don't have the `Has` prefix (e.g., `FlatAddressSpace`, `GFX10Insts`, `FP64`). Fixing these issues would require renaming fields in `AMDGPU.td` and updating all references, which is left for future work.
DeltaFile
+250-777llvm/lib/Target/AMDGPU/GCNSubtarget.h
+250-7771 files

LLVM/project 6a6d432llvm/utils/gn/build remove_if_exists.py, llvm/utils/gn/secondary/libcxx/include BUILD.gn

[gn] port 501645cbebf78 (float.h removal)

Also add a little script to clean up incremental builds.
DeltaFile
+30-0llvm/utils/gn/build/remove_if_exists.py
+13-1llvm/utils/gn/secondary/libcxx/include/BUILD.gn
+43-12 files