LLVM/project 3c87119 llvm/include/llvm/TableGen Main.h, llvm/lib/TableGen Main.cpp

[TableGen][NFCI] Change TableGenMain() to take function_ref. (#167888)

It was switched from a function pointer to std::function in

TableGen: Make 2nd arg MainFn of TableGenMain(argv0, MainFn) optional.
f675ec6165ab6add5e57cd43a2e9fa1a9bc21d81

but there's no mention of any particular reason for that.
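
For readers unfamiliar with the distinction: a `function_ref` is a non-owning reference to a callable, while `std::function` owns a copy of it. The sketch below is a simplified, hypothetical stand-in (`FnRef`, `runMain`, and `demo` are illustrative names, not LLVM's implementation or the actual `TableGenMain` signature) showing why a non-owning reference is enough when the callback is only used for the duration of the call.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <utility>

// Hypothetical, simplified stand-in for llvm::function_ref. It stores the
// address of the callable plus a trampoline and never copies or owns it.
template <typename Fn> class FnRef;

template <typename Ret, typename... Params> class FnRef<Ret(Params...)> {
  Ret (*Trampoline)(std::intptr_t, Params...) = nullptr;
  std::intptr_t Target = 0;

  template <typename CallableT>
  static Ret invoke(std::intptr_t Address, Params... Ps) {
    return (*reinterpret_cast<CallableT *>(Address))(
        std::forward<Params>(Ps)...);
  }

public:
  // Bind to any callable except another FnRef (copies use the defaults).
  template <typename F,
            typename = std::enable_if_t<!std::is_same_v<
                std::remove_cv_t<std::remove_reference_t<F>>, FnRef>>>
  FnRef(F &&Callable)
      : Trampoline(invoke<std::remove_reference_t<F>>),
        Target(reinterpret_cast<std::intptr_t>(&Callable)) {}

  Ret operator()(Params... Ps) const {
    return Trampoline(Target, std::forward<Params>(Ps)...);
  }
};

// Hypothetical driver in the spirit of TableGenMain(argv0, MainFn): the
// callback is only invoked during the call, so no ownership is needed.
int runMain(int Input, FnRef<int(int)> MainFn) { return MainFn(Input); }

int demo() {
  int Offset = 40;
  // The lambda temporary outlives the call to runMain, so the reference
  // held by FnRef does not dangle.
  int Result = runMain(2, [&](int X) { return X + Offset; });
  assert(Result == 42);
  return Result;
}
```

The trade-off is that such a reference must not be stored beyond the lifetime of the callable it points to, which is why this style suits parameters rather than members.
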
Delta  File
+6 -8  llvm/include/llvm/TableGen/Main.h
+2 -4  llvm/lib/TableGen/Main.cpp
+1 -1  llvm/utils/TableGen/Basic/TableGen.cpp
+9 -13  3 files

LLVM/project 0be4218 llvm/cmake/modules TableGen.cmake, llvm/utils/TableGen RegisterInfoEmitter.cpp

[CMake] Declare all parts of *GenRegisterInfo.inc as outputs. (#168405)

This tells the build system to check and regenerate the
*GenRegisterInfo*.inc files, should any of them be missing for
whatever reason.

A follow-up from
<https://github.com/llvm/llvm-project/pull/167700>.
Delta  File
+11 -1  llvm/cmake/modules/TableGen.cmake
+2 -0  llvm/utils/TableGen/RegisterInfoEmitter.cpp
+13 -1  2 files

LLVM/project 22a2cae clang/include/clang/Basic Attr.td, clang/lib/Sema SemaDeclAttr.cpp SemaTemplateInstantiateDecl.cpp

[Clang] Fix cleanup attribute by delaying type checks after the type is deduced (#164440)

Previously, the handling of the `cleanup` attribute had some checks
based on the type, but we were deducing the type after handling the
attribute.
This PR fixes the way we deal with type checks for the `cleanup`
attribute by delaying these checks until after the type is deduced.

The fix is also structured so that the solution can be adapted for other
attributes that do type-based checks.
This is the list of C/C++ attributes that are doing type based checks
and will need to be fixed in additional PRs:
- CUDAShared
- MutualExclusions
- PassObjectSize
- InitPriority
- Sentinel
- AcquireCapability
- RequiresCapability

    [5 lines not shown]
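
As background, here is a minimal sketch of how the `cleanup` attribute behaves with an explicitly typed variable. The names `countCleanup`, `CleanupTotal`, and `runScope` are illustrative only. The handler takes a pointer to the annotated variable, which is why the compiler needs the variable's final type before it can check the handler's parameter.

```cpp
#include <cassert>

static int CleanupTotal = 0;

// Cleanup handlers receive a pointer to the annotated variable, so the
// parameter type must be compatible with the variable's type.
static void countCleanup(int *Value) { CleanupTotal += *Value; }

int runScope() {
  CleanupTotal = 0;
  {
    // GNU extension supported by Clang and GCC: countCleanup(&Guard) runs
    // when Guard goes out of scope.
    int Guard __attribute__((cleanup(countCleanup))) = 3;
    assert(CleanupTotal == 0); // Handler has not run yet.
    (void)Guard;
  }
  return CleanupTotal; // Handler has run once by now.
}
```

Calling `runScope()` yields 3 here. A variable whose type is deduced rather than written out is presumably the kind of case where the check has to wait, as the message describes.
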
Delta  File
+25 -10  clang/lib/Sema/SemaDeclAttr.cpp
+25 -0  clang/test/SemaCXX/attr-cleanup.cpp
+20 -0  clang/utils/TableGen/ClangAttrEmitter.cpp
+12 -0  clang/include/clang/Basic/Attr.td
+10 -0  clang/test/Sema/type-dependent-attrs.c
+9 -0  clang/lib/Sema/SemaTemplateInstantiateDecl.cpp
+101 -10  7 files not shown
+140 -10  13 files

LLVM/project 78746b7 llvm/lib/CodeGen RegisterCoalescer.cpp

Address suggestions
Delta  File
+28 -21  llvm/lib/CodeGen/RegisterCoalescer.cpp
+28 -21  1 file

LLVM/project 59ed6df llvm/lib/Target/AArch64 AArch64ISelDAGToDAG.cpp SVEInstrFormats.td, llvm/test/CodeGen/AArch64 sve-vector-splat.ll sve-lrint.ll

[LLVM][CodeGen][SVE] Use DUPM for constantfp splats. (#168391)

This helps cases where the immediate range of FDUP is not sufficient.
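
To make the "immediate range" concrete: the 8-bit floating-point immediate used by FMOV/FDUP can only represent values of the form ±n/16 × 2^r with 16 ≤ n ≤ 31 and -3 ≤ r ≤ 4. The sketch below (the helper name `isFP8Immediate` is hypothetical and this is not the LLVM implementation) brute-forces that set to show which splat constants fall outside it.

```cpp
#include <cassert>
#include <cmath>

// Returns true if Value is representable as the 8-bit floating-point
// immediate: +/- n/16 * 2^r with 16 <= n <= 31 and -3 <= r <= 4.
// Brute force keeps the sketch easy to verify by hand.
bool isFP8Immediate(double Value) {
  double Magnitude = std::fabs(Value);
  for (int N = 16; N <= 31; ++N)
    for (int R = -3; R <= 4; ++R)
      if (Magnitude == std::ldexp(N / 16.0, R))
        return true;
  return false;
}

void checkExamples() {
  assert(isFP8Immediate(1.0));    // 16/16 * 2^0
  assert(isFP8Immediate(-2.5));   // 20/16 * 2^1
  assert(!isFP8Immediate(0.0));   // Zero has no encoding in this form.
  assert(!isFP8Immediate(100.0)); // Largest magnitude is 31.0.
}
```

Constants that fail this check are the ones that need some other materialization strategy, which is where a bitmask-immediate instruction such as DUPM can help when the bit pattern happens to fit.
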
Delta  File
+290 -2  llvm/test/CodeGen/AArch64/sve-vector-splat.ll
+98 -104  llvm/test/CodeGen/AArch64/sve-lrint.ll
+98 -104  llvm/test/CodeGen/AArch64/sve-llrint.ll
+59 -66  llvm/test/CodeGen/AArch64/sve-fptosi-sat.ll
+37 -34  llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
+25 -0  llvm/lib/Target/AArch64/SVEInstrFormats.td
+607 -310  5 files not shown
+628 -347  11 files

LLVM/project 4ecfaa6 llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp, llvm/lib/Target/AArch64/GISel AArch64LegalizerInfo.cpp

[AArch64][GlobalISel] Add better basic legalization for llround. (#168427)

This adds handling for f16 and f128 lround/llround under LP64 targets,
promoting the f16 where needed and using a libcall for f128. This
codegen is now identical to the selection dag version.
Delta  File
+12 -0  llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+2 -6  llvm/test/CodeGen/AArch64/lround-conv-fp16.ll
+2 -6  llvm/test/CodeGen/AArch64/llround-conv-fp16.ll
+5 -3  llvm/lib/Target/AArch64/GISel/AArch64LegalizerInfo.cpp
+1 -4  llvm/test/CodeGen/AArch64/lround-conv.ll
+1 -4  llvm/test/CodeGen/AArch64/llround-conv.ll
+23 -23  6 files

LLVM/project 76dac58 mlir/docs/Dialects/NVVM _index.md, mlir/include/mlir/Dialect/LLVMIR NVVMOps.td

[MLIR][NVVM] Move the docs to markdown file (#168375)

Delta  File
+84 -0  mlir/docs/Dialects/NVVM/_index.md
+0 -78  mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+84 -78  2 files

LLVM/project d01926a llvm/include/llvm/TableGen Main.h, llvm/lib/TableGen Main.cpp

[TableGen][NFCI] Change TableGenMain() to take function_ref.

It was switched from a function pointer to std::function in

TableGen: Make 2nd arg MainFn of TableGenMain(argv0, MainFn) optional.
f675ec6165ab6add5e57cd43a2e9fa1a9bc21d81

but there's no mention of any particular reason for that.
Delta  File
+6 -8  llvm/include/llvm/TableGen/Main.h
+2 -4  llvm/lib/TableGen/Main.cpp
+1 -1  llvm/utils/TableGen/Basic/TableGen.cpp
+9 -13  3 files

LLVM/project 3c3dbaa llvm/cmake/modules TableGen.cmake, llvm/utils/TableGen RegisterInfoEmitter.cpp

[CMake] Declare all parts of *GenRegisterInfo.inc as outputs.

This tells the build system to check and regenerate the
*GenRegisterInfo*.inc files, should any of them be missing for
whatever reason.

A follow-up from
<https://github.com/llvm/llvm-project/pull/167700>.
Delta  File
+11 -1  llvm/cmake/modules/TableGen.cmake
+2 -0  llvm/utils/TableGen/RegisterInfoEmitter.cpp
+13 -1  2 files

LLVM/project 591c463 llvm/include/llvm/IR IntrinsicsAArch64.td, llvm/test/Assembler aarch64-intrinsics-attributes.ll

[LLVM][AArch64] Mark SVE integer intrinsics as speculatable. (#167915)

Exceptions include intrinsics that:
* take or return floating point data
* read or write FFR
* read or write memory
* read or write SME state
Delta  File
+623 -624  llvm/include/llvm/IR/IntrinsicsAArch64.td
+73 -0  llvm/test/Transforms/LICM/AArch64/speculative-intrinsic-hoisting.ll
+2 -1  llvm/test/Assembler/aarch64-intrinsics-attributes.ll
+698 -625  3 files

LLVM/project 27231bc mlir/lib/Conversion/SPIRVToLLVM SPIRVToLLVM.cpp, mlir/test/Conversion/SPIRVToLLVM gl-ops-to-llvm.mlir

[MLIR][SPIRV] Lower SPIR-V Tan/Tanh ops to LLVM intrinsics (#168419)

Fixed #148354

Lower SPIR-V Tan/Tanh ops using the corresponding LLVM intrinsics to
reduce instructions and prevent overflow caused by the previous
`exp`-based expansion.
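
The overflow mentioned above is easy to reproduce in isolation. The sketch below assumes the expansion had the textbook shape tanh(x) = (e^(2x) - 1) / (e^(2x) + 1); the exact form of the removed lowering is not shown in this message, and `tanhViaExp` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>

// Assumed shape of an exp-based expansion. For large X, exp(2X) overflows
// to +inf in single precision and the quotient becomes inf/inf, i.e. NaN.
float tanhViaExp(float X) {
  float E = std::exp(2.0f * X);
  return (E - 1.0f) / (E + 1.0f);
}

void checkExamples() {
  assert(tanhViaExp(0.0f) == 0.0f);         // Fine for small inputs.
  assert(std::isnan(tanhViaExp(100.0f)));   // Overflow poisons the result.
  assert(std::tanh(100.0f) == 1.0f);        // A direct tanh saturates.
}
```

A dedicated tanh implementation can saturate to ±1 for large inputs, which is the behavior the expansion fails to deliver once the intermediate exponential overflows.
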
Delta  File
+4 -22  mlir/lib/Conversion/SPIRVToLLVM/SPIRVToLLVM.cpp
+2 -10  mlir/test/Conversion/SPIRVToLLVM/gl-ops-to-llvm.mlir
+6 -32  2 files

LLVM/project 2432465 llvm/lib/Transforms/Vectorize VPlan.h, llvm/unittests/Transforms/Vectorize VPlanTest.cpp

[VPlan] Support isa/dyn_cast from VPRecipeBase to VPIRMetadata (NFC). (#166245)

Implement CastInfo from VPRecipeBase to VPIRMetadata to support
isa/dyn_cast. This is similar to CastInfoVPPhiAccessors, supporting
dyn_cast by down-casting to the concrete recipe types inheriting from
VPIRMetadata.

Can be used for more generalized VPIRMetadata printing following
https://github.com/llvm/llvm-project/pull/165825.

PR: https://github.com/llvm/llvm-project/pull/166245
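
The idea of casting to a mixin by first down-casting to the concrete type can be sketched without any LLVM headers. Every name below (`RecipeBase`, `Metadata`, `WidenLoadRecipe`, `BranchRecipe`, `dynCastToMetadata`) is a hypothetical stand-in, not a VPlan class, and the real change goes through LLVM's `CastInfo` machinery rather than a free function.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the VPlan classes.
struct RecipeBase {
  enum class Kind { WidenLoad, Branch };
  explicit RecipeBase(Kind K) : K(K) {}
  virtual ~RecipeBase() = default;
  Kind getKind() const { return K; }

private:
  Kind K;
};

// A mixin that does not derive from RecipeBase, so there is no direct
// cast path from RecipeBase* to Metadata*.
struct Metadata {
  int NumNodes = 0;
};

struct WidenLoadRecipe : RecipeBase, Metadata {
  WidenLoadRecipe() : RecipeBase(Kind::WidenLoad) {}
};

struct BranchRecipe : RecipeBase {
  BranchRecipe() : RecipeBase(Kind::Branch) {}
};

// Down-cast to the concrete type first, then convert to the mixin.
Metadata *dynCastToMetadata(RecipeBase *R) {
  assert(R && "expected a non-null recipe");
  switch (R->getKind()) {
  case RecipeBase::Kind::WidenLoad:
    return static_cast<WidenLoadRecipe *>(R);
  case RecipeBase::Kind::Branch:
    return nullptr; // This recipe kind carries no metadata.
  }
  return nullptr;
}
```

Here a `WidenLoadRecipe` yields a pointer to its `Metadata` sub-object, while a `BranchRecipe` yields `nullptr`.
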
Delta  File
+69 -0  llvm/lib/Transforms/Vectorize/VPlan.h
+14 -13  llvm/unittests/Transforms/Vectorize/VPlanTest.cpp
+83 -13  2 files

LLVM/project 5efce73 compiler-rt/lib/builtins/arm divsf3.S mulsf3.S, compiler-rt/lib/builtins/arm/thumb1 mulsf3.S

[compiler-rt][ARM] Optimized mulsf3 and divsf3 (#168394)

(Reland of #161546, fixing three build and test issues)

This commit adds optimized assembly versions of single-precision float
multiplication and division. Both functions are implemented in a style
that can be assembled as either Arm or Thumb2; for multiplication, a
separate implementation is provided for Thumb1. Also, extensive new
tests are added for multiplication and division.

These implementations can be removed from the build by defining the
cmake variable COMPILER_RT_ARM_OPTIMIZED_FP=OFF.

Outlying parts of the functionality which are not on the fast path, such
as NaN handling and underflow, are handled in helper functions written
in C. These can be shared between the Arm/Thumb2 and Thumb1
implementations, and also reused by other optimized assembly functions
we hope to add in future.
Delta  File
+618 -0  compiler-rt/lib/builtins/arm/divsf3.S
+616 -0  compiler-rt/test/builtins/Unit/mulsf3_test.c
+408 -95  compiler-rt/test/builtins/Unit/divsf3_test.c
+319 -0  compiler-rt/lib/builtins/arm/mulsf3.S
+251 -0  compiler-rt/lib/builtins/arm/thumb1/mulsf3.S
+78 -0  compiler-rt/lib/builtins/arm/funder.c
+2,290 -95  5 files not shown
+2,484 -95  11 files

LLVM/project a46d620 llvm/lib/Target/LoongArch LoongArchLateBranchOpt.cpp LoongArchTargetMachine.cpp, llvm/test/CodeGen/LoongArch jr-without-ra.ll opt-pipeline.ll

[LoongArch] Add late branch optimisation pass

This commit adds a new target specific optimization pass for
LoongArch to convert conditional branches into unconditional
branches when the condition can be statically evaluated.

This is similar to the existing RISC-V pass.
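
The core decision in such a pass can be sketched as follows, assuming both branch operands are known constants. `BranchCond` and `evaluateCondition` are hypothetical names covering only a subset of conditions; the real pass works on machine instructions, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical subset of branch conditions.
enum class BranchCond { EQ, NE, LT, GE };

// Returns the branch outcome if both operands are known constants, or
// std::nullopt if the condition cannot be evaluated statically.
std::optional<bool> evaluateCondition(BranchCond CC,
                                      std::optional<std::int64_t> LHS,
                                      std::optional<std::int64_t> RHS) {
  if (!LHS || !RHS)
    return std::nullopt;
  switch (CC) {
  case BranchCond::EQ:
    return *LHS == *RHS;
  case BranchCond::NE:
    return *LHS != *RHS;
  case BranchCond::LT:
    return *LHS < *RHS;
  case BranchCond::GE:
    return *LHS >= *RHS;
  }
  return std::nullopt;
}

void checkExamples() {
  // Always taken: can become an unconditional branch.
  assert(evaluateCondition(BranchCond::EQ, 0, 0).value());
  // Never taken: the branch can simply fall through.
  assert(!evaluateCondition(BranchCond::LT, 5, 3).value());
  // Unknown operand: the conditional branch must stay.
  assert(!evaluateCondition(BranchCond::NE, std::nullopt, 1).has_value());
}
```

A known-true result lets the conditional branch be rewritten as an unconditional one, and a known-false result means control always falls through.
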
Delta  File
+195 -0  llvm/lib/Target/LoongArch/LoongArchLateBranchOpt.cpp
+6 -1  llvm/lib/Target/LoongArch/LoongArchTargetMachine.cpp
+2 -0  llvm/lib/Target/LoongArch/LoongArch.h
+0 -2  llvm/test/CodeGen/LoongArch/jr-without-ra.ll
+1 -0  llvm/lib/Target/LoongArch/CMakeLists.txt
+1 -0  llvm/test/CodeGen/LoongArch/opt-pipeline.ll
+205 -3  6 files

LLVM/project 200793a llvm/include/llvm/IR Intrinsics.td, llvm/test/Assembler memory-attribute.ll

Extend MemoryEffects to Support Target-Specific Memory Locations (#148650)

This patch introduces preliminary support for additional memory
locations.
They are: target_mem0 and target_mem1 and they model memory locations
that cannot be represented with existing memory locations.

This solution was suggested in:
https://discourse.llvm.org/t/rfc-improving-fpmr-handling-for-fp8-intrinsics-in-llvm/86868/6

Currently, these locations are not yet target-specific. The goal is to
enable the compiler to express read/write effects on these resources.
Delta  File
+78 -0  llvm/test/TableGen/target-mem-intrinsic-attrs.td
+55 -0  llvm/test/Assembler/memory-attribute.ll
+19 -19  llvm/test/Transforms/FunctionAttrs/nocapture.ll
+30 -1  llvm/utils/TableGen/Basic/CodeGenIntrinsics.cpp
+11 -11  llvm/test/Transforms/FunctionAttrs/argmemonly.ll
+19 -0  llvm/include/llvm/IR/Intrinsics.td
+212 -31  20 files not shown
+279 -65  26 files

LLVM/project fb829bf mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add tcgen05.mma MLIR Ops (#164356)

This commit adds support for the tcgen05.mma family of instructions in the NVVM MLIR dialect and lowers them to LLVM intrinsics. Please refer to the [PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/#tcgen05-mma-instructions) for more information.
Delta  File
+634 -0  mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-sp-tensor.mlir
+633 -0  mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-tensor.mlir
+612 -0  mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+545 -0  mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+442 -0  mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-sp-shared.mlir
+442 -0  mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-shared.mlir
+3,308 -0  9 files not shown
+4,875 -0  15 files

LLVM/project 8592a65 llvm/lib/Target/AArch64 AArch64SystemOperands.td, llvm/lib/Target/AArch64/AsmParser AArch64AsmParser.cpp

[AArch64][llvm] GICv5 instruction `GIC CDEOI` takes no operand (#167322)

There was a minor oversight in commit 6836261ee; the AArch64 GICv5
instruction `GIC CDEOI` takes no operands, since the text of the
specification says:
```
The Rt field should be set to 0b11111. If the Rt field is not
set to 0b11111, it is CONSTRAINED UNPREDICTABLE whether:
* The instruction is UNDEFINED.
* The instruction behaves as if the Rt field is set to 0b11111.
```
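
In encoding terms, the quoted rule means the low five bits of the instruction word (the Rt field, bits [4:0]) are all ones. A small sketch of that check follows. The helper names and the `PlaceholderWord` constant are hypothetical, and the placeholder is an arbitrary value rather than the real `GIC CDEOI` encoding.

```cpp
#include <cassert>
#include <cstdint>

// Rt occupies bits [4:0] of the 32-bit instruction word.
constexpr std::uint32_t RtMask = 0x1F;

// Forces the Rt field to 0b11111, as an operand-less form would encode it.
constexpr std::uint32_t withRtAllOnes(std::uint32_t Encoding) {
  return Encoding | RtMask;
}

// True if the Rt field is 0b11111, the only value that is not
// CONSTRAINED UNPREDICTABLE according to the quoted text.
constexpr bool hasRtAllOnes(std::uint32_t Encoding) {
  return (Encoding & RtMask) == RtMask;
}

void checkExamples() {
  // Arbitrary placeholder word; NOT the real GIC CDEOI encoding.
  constexpr std::uint32_t PlaceholderWord = 0xD5080000u;
  assert(!hasRtAllOnes(PlaceholderWord));
  assert(hasRtAllOnes(withRtAllOnes(PlaceholderWord)));
  assert(withRtAllOnes(PlaceholderWord) == 0xD508001Fu);
}
```
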
Delta  File
+4 -4  llvm/test/MC/AArch64/armv9.7a-gcie.s
+4 -4  llvm/lib/Target/AArch64/AArch64SystemOperands.td
+4 -0  llvm/test/MC/AArch64/armv9.7a-gcie-diagnostics.s
+1 -1  llvm/lib/Target/AArch64/MCTargetDesc/AArch64InstPrinter.cpp
+1 -1  llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+14 -10  5 files

LLVM/project f013974 llvm/test/CodeGen/AMDGPU swdev-549940.ll

Remove undef from test (it still preserves the test behavior before and after the fix)
Delta  File
+1 -1  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+1 -1  1 file

LLVM/project cd41eb1 llvm/lib/Target/AMDGPU GCNRegPressure.cpp GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir swdev-549940.ll

[AMDGPU] Rematerialize VGPR candidates when SGPR spills to VGPR over the VGPR limit

Before, when selecting candidates to rematerialize, we would only
consider SGPR candidates when there was an excess of SGPR registers.

Failing to eliminate the excess would result in spills to VGPRs.
This is normally not an issue, unless spilling to VGPRs results in
excess VGPRs.

This patch does 2 things:
* It relaxes the GCNRPTarget success criteria: now we accept regions
  where we spill SGPRs to VGPRs, as long as this does not end up in
  excess VGPRs.
* It changes isSaveBeneficial to consider the excess VGPRs (which
  includes the SGPRs that would be spilled to VGPR).

With these changes, the compiler rematerializes VGPRs when the excess
SGPRs would result in VGPR excess.


    [4 lines not shown]
Delta  File
+30 -30  llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+15 -9  llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+3 -1  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+1 -0  llvm/lib/Target/AMDGPU/GCNRegPressure.h
+50 -41  5 files

LLVM/project 3378ea2 llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/Debugging BUILD.gn

[gn build] Port 3ce893f83450
Delta  File
+1 -0  llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/Debugging/BUILD.gn
+1 -0  1 file

LLVM/project 88465af llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc BUILD.gn

[gn build] Port
Delta  File
+0 -1  llvm/utils/gn/secondary/llvm/lib/ExecutionEngine/Orc/BUILD.gn
+0 -1  1 file

LLVM/project 49d77d8 llvm/lib/Target/X86/GISel X86CallLowering.cpp, llvm/test/CodeGen/X86 isel-arg-attrs.ll

[X86][GlobalISel] Enable nest arguments (#165173)

Nest arguments are supported by CC in X86CallingConv.td. Nothing special
is required in GlobalISel as we reuse the code.

The nest attribute is mostly generated by the Fortran frontend.
Delta  File
+23 -0  llvm/test/CodeGen/X86/isel-arg-attrs.ll
+1 -2  llvm/lib/Target/X86/GISel/X86CallLowering.cpp
+24 -2  2 files

LLVM/project 3ce893f llvm/include/llvm/ExecutionEngine/Orc DebugObjectManagerPlugin.h, llvm/include/llvm/ExecutionEngine/Orc/Debugging ELFDebugObjectPlugin.h

[ORC] Move DebugObjectManagerPlugin into Debugging/ELFDebugObjectPlugin (NFC) (#168343)

In 4 years the plugin wasn't adapted to other object formats. This patch
makes it specific to ELF, which will allow removing some abstractions
down the line. It also moves the plugin from LLVMOrcJIT into
LLVMOrcDebugging, which didn't exist back then.
Delta  File
+0 -545  llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp
+543 -0  llvm/lib/ExecutionEngine/Orc/Debugging/ELFDebugObjectPlugin.cpp
+102 -0  llvm/include/llvm/ExecutionEngine/Orc/Debugging/ELFDebugObjectPlugin.h
+0 -102  llvm/include/llvm/ExecutionEngine/Orc/DebugObjectManagerPlugin.h
+3 -3  llvm/lib/ExecutionEngine/Orc/Debugging/DebuggerSupport.cpp
+3 -3  llvm/tools/llvm-jitlink/llvm-jitlink.cpp
+651 -653  6 files not shown
+655 -659  12 files

LLVM/project e331b29 llvm/test/CodeGen/AMDGPU swdev-549940.ll

Unacceptably large test
Delta  File
+609 -0  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+609 -0  1 file

LLVM/project 672757b llvm/lib/Target/WebAssembly WebAssemblyInstrSIMD.td, llvm/test/CodeGen/WebAssembly simd-extadd.ll

[WebAssembly] Add patterns for extadd pairwise (#167960)

Add a few patterns for extadd pairwise.
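
For reference, the operation itself can be written as a scalar loop. The sketch below models the signed 8-bit to 16-bit variant (`i16x8.extadd_pairwise_i8x16_s`): each output lane is the sum of two adjacent input lanes after sign extension. The function name `extaddPairwiseS` is hypothetical, and the specific patterns added by the commit are not reproduced here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar reference for i16x8.extadd_pairwise_i8x16_s: widen each pair of
// adjacent 8-bit lanes to 16 bits and add them. The sum always fits in
// 16 bits (range -256..254), so no wrapping can occur.
std::array<std::int16_t, 8>
extaddPairwiseS(const std::array<std::int8_t, 16> &In) {
  std::array<std::int16_t, 8> Out{};
  for (std::size_t I = 0; I < Out.size(); ++I) {
    int Lo = In[2 * I];     // Sign-extended by integer promotion.
    int Hi = In[2 * I + 1];
    Out[I] = static_cast<std::int16_t>(Lo + Hi);
  }
  return Out;
}

void checkExamples() {
  std::array<std::int8_t, 16> In{};
  In[0] = 127;
  In[1] = 127;
  In[2] = -128;
  In[3] = -128;
  std::array<std::int16_t, 8> Out = extaddPairwiseS(In);
  assert(Out[0] == 254);  // Would overflow without widening.
  assert(Out[1] == -256);
  assert(Out[2] == 0);
}
```
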
Delta  File
+89 -0  llvm/test/CodeGen/WebAssembly/simd-extadd.ll
+26 -0  llvm/lib/Target/WebAssembly/WebAssemblyInstrSIMD.td
+115 -0  2 files

LLVM/project 2ea1a09 clang/lib/Headers avx512vlintrin.h avx512fintrin.h, clang/test/CodeGen/X86 avx512vl-builtins.c avx512f-builtins.c

[Headers][X86] Allow AVX512 masked arithmetic pd/ps/epi/epu intrinsics to be used in constexpr (#168496)

### Summary
This PR resolves #160559: it covers the remaining pd/ps/epi/epu part of the AVX512 masked arithmetic intrinsics.
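
To show what "masked arithmetic usable in constexpr" amounts to, here is a portable scalar model of a merge-masked add in the style of `_mm512_mask_add_epi32(src, k, a, b)`: lanes whose mask bit is set receive `a + b`, and the others keep `src`. The `maskAdd` helper is hypothetical and uses `std::array` instead of vector types so that it compiles without AVX512 support; it assumes fewer than 32 lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of a merge-masked 32-bit add. Unsigned arithmetic is used
// so that lane overflow wraps instead of being undefined behavior.
template <std::size_t N>
constexpr std::array<std::int32_t, N>
maskAdd(std::array<std::int32_t, N> Src, unsigned Mask,
        const std::array<std::int32_t, N> &A,
        const std::array<std::int32_t, N> &B) {
  for (std::size_t I = 0; I < N; ++I)
    if ((Mask >> I) & 1u)
      Src[I] = static_cast<std::int32_t>(static_cast<std::uint32_t>(A[I]) +
                                         static_cast<std::uint32_t>(B[I]));
  return Src;
}

// Evaluated entirely at compile time: lanes 0 and 2 are selected.
constexpr std::array<std::int32_t, 4> Merged =
    maskAdd<4>({10, 20, 30, 40}, 0b0101u, {1, 2, 3, 4}, {5, 6, 7, 8});
static_assert(Merged[0] == 6 && Merged[1] == 20, "lanes 0 and 1");
static_assert(Merged[2] == 10 && Merged[3] == 40, "lanes 2 and 3");

void checkExamples() {
  // The same function also works at run time.
  std::array<std::int32_t, 4> R =
      maskAdd<4>({0, 0, 0, 0}, 0b1000u, {1, 1, 1, 1}, {2, 2, 2, 2});
  assert(R[3] == 3 && R[0] == 0);
}
```
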
Delta  File
+48 -56  clang/lib/Headers/avx512vlintrin.h
+72 -1  clang/test/CodeGen/X86/avx512vl-builtins.c
+24 -28  clang/lib/Headers/avx512fintrin.h
+36 -0  clang/test/CodeGen/X86/avx512f-builtins.c
+180 -85  4 files

LLVM/project f9256ca clang/lib/AST ExprConstant.cpp, clang/lib/AST/ByteCode InterpBuiltin.cpp

[Headers][X86] Allow AVX512 masked arithmetic ss/sd intrinsics to be used in constexpr (#162816)

This PR resolves only the ss/sd part of the AVX512 masked arithmetic intrinsics from #160559.
Delta  File
+30 -45  clang/lib/Headers/avx512fp16intrin.h
+32 -32  clang/lib/Headers/avx512fintrin.h
+36 -0  clang/test/CodeGen/X86/avx512f-builtins.c
+29 -0  clang/lib/AST/ByteCode/InterpBuiltin.cpp
+25 -0  clang/lib/AST/ExprConstant.cpp
+14 -0  clang/test/CodeGen/X86/avx512fp16-builtins.c
+166 -77  3 files not shown
+178 -83  9 files

LLVM/project 754a053 bolt/lib/Passes Inliner.cpp, bolt/test/AArch64 inline-armv8.3-tailcall.s

[BOLT] Fix when inlining into a context with a tailcall

When inlining to a call site with a tailcall, the return in the inlined
block does not get removed. Because of this, we don't have to generate
the matching authentication.
Add test for this case.
Delta  File
+46 -0  bolt/test/AArch64/inline-armv8.3-tailcall.s
+4 -3  bolt/lib/Passes/Inliner.cpp
+50 -3  2 files

LLVM/project 128caa1 mlir/lib/Dialect/Bufferization/IR BufferizableOpInterface.cpp BufferizationDialect.cpp, mlir/test/Dialect/Bufferization invalid.mlir ops.mlir

[mlir][bufferization] Refine tensor-buffer compatibility checks (#167705)

Generally, to_tensor and to_buffer already perform sufficient
verification. However, there are some unnecessarily strict constraints:
* a builtin tensor requires its buffer counterpart to always be a memref
* to_buffer on a ranked tensor is required to always return a memref

These checks are assertions (i.e. preconditions), however, they actually
prevent an apparently useful bufferization where builtin tensors could
become custom buffers. Lift these assertions, maintaining the
verification procedure unchanged, to allow builtin -> custom
bufferizations at operation boundary level.
Delta  File
+60 -0  mlir/test/Dialect/Bufferization/invalid.mlir
+37 -0  mlir/test/Dialect/Bufferization/ops.mlir
+12 -6  mlir/test/lib/Dialect/Test/TestTypes.cpp
+1 -11  mlir/lib/Dialect/Bufferization/IR/BufferizableOpInterface.cpp
+0 -3  mlir/lib/Dialect/Bufferization/IR/BufferizationDialect.cpp
+110 -20  5 files

LLVM/project 8603552 llvm/include/llvm/MC/MCParser AsmLexer.h, llvm/lib/MC/MCParser AsmLexer.cpp

[MC] AsmLexer assert buffer is null-terminated at CurBuf.end() (#154972)

AsmLexer expects the buffer it's provided for lexing to be
NULL-terminated, where the NULL terminator is pointed to by
`CurBuf.end()`. However, this expectation isn't explicitly stated
anywhere.

This commit adds a couple of comments as well as an assert as a means of
documenting this expectation.
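
The expectation can be illustrated with standard-library types only. In this sketch, `isNullTerminatedAtEnd` is a hypothetical helper (not the assert added to AsmLexer) that checks the byte one past the end of a view. The caller must guarantee that byte is readable, which holds for views into a `std::string` because `data()[size()]` is always `'\0'`.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Mirrors the documented expectation: the byte at Buf.end() must be '\0'.
// Precondition (an assumption of this sketch): that byte is readable.
bool isNullTerminatedAtEnd(std::string_view Buf) {
  return *(Buf.data() + Buf.size()) == '\0';
}

void checkExamples() {
  std::string Source = "add x0, x1, x2";

  // A view over the whole string ends at the string's terminator.
  assert(isNullTerminatedAtEnd(std::string_view(Source)));

  // A prefix view ends in the middle of the text, so the byte at its
  // end() is ' ' rather than '\0'.
  assert(!isNullTerminatedAtEnd(std::string_view(Source.data(), 3)));
}
```

A lexer that peeks at the current position without a bounds check relies on exactly this property, so a sub-range of a larger buffer would violate it.
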
Delta  File
+7 -0  llvm/include/llvm/MC/MCParser/AsmLexer.h
+5 -0  llvm/lib/MC/MCParser/AsmLexer.cpp
+12 -0  2 files