LLVM/project 8024723 clang/include/clang/Basic BuiltinsAMDGPU.def, clang/lib/CodeGen/TargetBuiltins AMDGPU.cpp

[AMDGPU] Add builtins for wave reduction intrinsics
Delta File
+84-0 clang/test/CodeGenOpenCL/builtins-amdgcn.cl
+8-0 clang/lib/CodeGen/TargetBuiltins/AMDGPU.cpp
+4-0 clang/include/clang/Basic/BuiltinsAMDGPU.def
+96-0 3 files

LLVM/project 8d331d5 llvm/docs AMDGPUUsage.rst

[AMDGPU] Update documentation for wave reduction intrinsics
Delta File
+72-4 llvm/docs/AMDGPUUsage.rst
+72-4 1 file

LLVM/project 019caae llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use pseudo opcode for switch statements
Delta File
+10-10 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+10-10 1 file

LLVM/project 99ec779 llvm/lib/Target/AMDGPU SIISelLowering.cpp

Use enum values for src modifiers.
Delta File
+8-8 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+8-8 1 file

LLVM/project 28d75bc llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fadd.ll llvm.amdgcn.reduce.fsub.ll

[AMDGPU] Add wave reduce intrinsics for double types - 2

Supported Ops: `add`, `sub`
Delta File
+1,115-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fadd.ll
+1,102-0 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fsub.ll
+80-19 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+2-0 llvm/lib/Target/AMDGPU/SIInstructions.td
+2,299-19 4 files

LLVM/project d607129 llvm/lib/Target/AMDGPU SIISelLowering.cpp SIInstructions.td, llvm/test/CodeGen/AMDGPU v_sub_f64_pseudo.mir

[AMDGPU] Introduce `v_sub_f64_pseudo` instruction
Delta File
+104-0 llvm/test/CodeGen/AMDGPU/v_sub_f64_pseudo.mir
+26-0 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+4-0 llvm/lib/Target/AMDGPU/SIInstructions.td
+134-0 3 files

LLVM/project 8b2c247 llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU llvm.amdgcn.reduce.fmax.ll llvm.amdgcn.reduce.fmin.ll

Remove NAN Canonicalization
Delta File
+142-199 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmax.ll
+142-199 llvm/test/CodeGen/AMDGPU/llvm.amdgcn.reduce.fmin.ll
+2-26 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+286-424 3 files

LLVM/project ddb8385 flang/lib/Lower/OpenMP ClauseProcessor.cpp Clauses.cpp, flang/test/Lower/OpenMP num-teams-dims.f90

[FLANG] Add flang to mlir lowering for num_teams
Delta File
+52-0 flang/test/Lower/OpenMP/num-teams-dims.f90
+25-9 flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+23-4 flang/lib/Lower/OpenMP/Clauses.cpp
+14-3 flang/lib/Lower/OpenMP/OpenMP.cpp
+114-16 4 files

LLVM/project c30c2f4 llvm/lib/Target/AMDGPU GCNRegPressure.cpp GCNSchedStrategy.cpp, llvm/test/CodeGen/AMDGPU swdev-549940.ll

[AMDGPU] Rematerialize VGPR candidates when SGPR spills results in VGPR Excess (#168079)

Before, when selecting candidates to rematerialize, we would only
consider SGPR candidates when there was an excess of SGPR registers.

Failing to eliminate the excess would result in spills to VGPRs.
This is normally not an issue, unless spilling to VGPRs results in
excess VGPRs.

This patch does 2 things:
* It relaxes the GCNRPTarget success criteria: now we accept regions
  where we spill SGPRs to VGPRs, as long as this does not end up in
  excess VGPRs.
* It changes isSaveBeneficial to consider the excess VGPRs (which
  includes the SGPRs that would be spilled to VGPR).

With these changes, the compiler rematerializes VGPRs when the excess
SGPRs would result in VGPR excess.
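
The relaxed criterion described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `RegionPressure`, `Limits`, `LanesPerVGPR`, and every function name here are hypothetical, and the sketch assumes that each spill VGPR holds a fixed number of spilled SGPRs.

```cpp
// Hypothetical register-pressure snapshot for one scheduling region.
struct RegionPressure {
  unsigned NumSGPRs;
  unsigned NumVGPRs;
};

// Hypothetical per-target limits. LanesPerVGPR is the assumed number of
// SGPRs that a single spill VGPR can hold.
struct Limits {
  unsigned MaxSGPRs;
  unsigned MaxVGPRs;
  unsigned LanesPerVGPR;
};

// VGPRs needed to hold the SGPRs above the limit, rounded up.
unsigned vgprsForSGPRSpills(const RegionPressure &RP, const Limits &L) {
  unsigned ExcessSGPRs =
      RP.NumSGPRs > L.MaxSGPRs ? RP.NumSGPRs - L.MaxSGPRs : 0;
  return (ExcessSGPRs + L.LanesPerVGPR - 1) / L.LanesPerVGPR;
}

// VGPR excess, counting the VGPRs consumed by SGPR spills.
unsigned excessVGPRs(const RegionPressure &RP, const Limits &L) {
  unsigned Needed = RP.NumVGPRs + vgprsForSGPRSpills(RP, L);
  return Needed > L.MaxVGPRs ? Needed - L.MaxVGPRs : 0;
}

// Relaxed success criterion: SGPR spills are tolerated as long as they
// do not push the region into VGPR excess.
bool targetSatisfied(const RegionPressure &RP, const Limits &L) {
  return excessVGPRs(RP, L) == 0;
}

// Rematerializing a VGPR helps whenever there is VGPR excess, including
// excess caused only by SGPR spills.
bool isVGPRRematBeneficial(const RegionPressure &RP, const Limits &L) {
  return excessVGPRs(RP, L) > 0;
}
```

Under these assumptions, a region with 110 SGPRs against a limit of 100 and 60 VGPRs against a limit of 64 is accepted, because the ten spilled SGPRs fit in one extra VGPR. The same SGPR excess with 64 VGPRs already live is rejected and makes VGPR rematerialization worthwhile.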


    [4 lines not shown]
Delta File
+609-0 llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+66-54 llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+12-1 llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+6-0 llvm/lib/Target/AMDGPU/GCNRegPressure.h
+693-55 4 files

LLVM/project 3b1d7ed clang/lib/CIR/Dialect/IR CIRDialect.cpp, mlir/include/mlir/Interfaces ControlFlowInterfaces.h

[mlir][Interfaces] Split successor inputs from region successor
Delta File
+52-12 clang/lib/CIR/Dialect/IR/CIRDialect.cpp
+43-20 mlir/lib/Dialect/SCF/IR/SCF.cpp
+43-12 mlir/test/lib/Dialect/Test/TestOpDefs.cpp
+12-30 mlir/include/mlir/Interfaces/ControlFlowInterfaces.h
+28-11 mlir/lib/Dialect/Transform/IR/TransformOps.cpp
+36-3 mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
+214-88 36 files not shown
+446-207 42 files

LLVM/project ce93961 llvm/docs/CommandGuide llvm-readobj.rst llvm-readelf.rst, llvm/test/tools/llvm-readobj/ELF call-graph-info.test call-graph-info-warn-malformed.test

Revert "[llvm-readobj] Dump callgraph section info for ELF" (#176221)

Reverts llvm/llvm-project#157499

This follows up on the Discourse post. The patch is reverted, and a
reland addressing the post-merge comments will be attempted.

(cherry picked from commit 97576a86eb25696a2b57cd42370991b172c08405)
Delta File
+0-521 llvm/test/tools/llvm-readobj/ELF/call-graph-info.test
+0-366 llvm/test/tools/llvm-readobj/ELF/call-graph-info-warn-malformed.test
+1-277 llvm/tools/llvm-readobj/ELFDumper.cpp
+17-25 llvm/docs/CommandGuide/llvm-readobj.rst
+4-12 llvm/docs/CommandGuide/llvm-readelf.rst
+0-4 llvm/tools/llvm-readobj/llvm-readobj.cpp
+22-1,205 2 files not shown
+22-1,207 8 files

LLVM/project 42d7ed3 flang/include/flang/Optimizer/Dialect/MIF MIFOps.td, flang/lib/Lower Bridge.cpp MultiImageFortran.cpp

[flang] Fix crash with coarray teams #171048 (#172259)

This PR updates the `CHANGE TEAM` construct to fix the bug reported in
issue #171048.
When a construct such as `IfConstruct` was present in the `CHANGE TEAM`
region, several basic blocks were created, but outside the region.

(cherry picked from commit 1d4f9ac37c043198d823e85e3cd777dc970d8b75)
Delta File
+40-20 flang/lib/Lower/Bridge.cpp
+29-0 flang/test/Lower/MIF/change_team2.f90
+6-11 flang/lib/Lower/MultiImageFortran.cpp
+6-7 flang/lib/Optimizer/Dialect/MIF/MIFOps.cpp
+6-6 flang/include/flang/Optimizer/Dialect/MIF/MIFOps.td
+5-3 flang/lib/Optimizer/Builder/IntrinsicCall.cpp
+92-47 1 file not shown
+95-49 7 files

LLVM/project 777e29d lldb/test/API/tools/lldb-dap/attach TestDAP_attach.py, lldb/test/API/tools/lldb-dap/startDebugging TestDAP_startDebugging.py

[lldb-dap] Move targetId and debuggerId into a session property (#175930)

This makes clear which fields are required to attach to an existing
debug session.

It also makes it easier to check mutually exclusive fields required to
attach.

(cherry picked from commit 6977e6812c3e2027f0f427506ee151011f1e55bb)
Delta File
+13-11 lldb/tools/lldb-dap/Protocol/ProtocolRequests.cpp
+15-3 lldb/tools/lldb-dap/extension/package.json
+8-9 lldb/tools/lldb-dap/Handler/AttachRequestHandler.cpp
+9-8 lldb/test/API/tools/lldb-dap/startDebugging/TestDAP_startDebugging.py
+11-5 lldb/tools/lldb-dap/Protocol/ProtocolRequests.h
+7-5 lldb/test/API/tools/lldb-dap/attach/TestDAP_attach.py
+63-41 5 files not shown
+86-60 11 files

LLVM/project d16413a llvm/lib/ExecutionEngine/JITLink MachO_x86_64.cpp MachO_arm64.cpp

[JITLink][CompactUnwind] Explicitly enumerate mergeable encodings. NFCI. (#176317)

Updates CompactUnwindTraits_MachO_arm64 and
CompactUnwindTraits_MachO_x86_64 encodingCanBeMerged methods to use
switch statements that clearly list mergeable encodings, and have a
default "false" case.

Since the new scheme explicitly covers DWARF modes (always
non-mergeable), this patch removes the separate DWARF mode check from
mergeRecords in CompactUnwindSupport.h.
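
The shape of this change can be illustrated with a small sketch. It is not the JITLink code: `UnwindMode`, `Record`, the choice of which modes count as mergeable, and the adjacency check in `mergeRecords` are all assumptions made for illustration. The only properties taken from the commit message are the explicit switch with a default "false" case and the fact that DWARF modes are never mergeable.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified unwind modes.
enum class UnwindMode : uint8_t { Frame, Frameless, Dwarf, Other };

// Hypothetical compact-unwind record covering [Start, Start + Size).
struct Record {
  uint64_t Start;
  uint64_t Size;
  UnwindMode Mode;
  uint32_t Encoding;
};

// Explicitly enumerate the mergeable modes; everything else, including
// Dwarf, falls through to the default "false" case.
bool encodingCanBeMerged(UnwindMode M) {
  switch (M) {
  case UnwindMode::Frame:
  case UnwindMode::Frameless:
    return true;
  default:
    return false;
  }
}

// Merge adjacent records that share a mergeable encoding. No separate
// DWARF check is needed because encodingCanBeMerged already rejects it.
std::vector<Record> mergeRecords(const std::vector<Record> &In) {
  std::vector<Record> Out;
  for (const Record &R : In) {
    if (!Out.empty()) {
      Record &Prev = Out.back();
      bool Same = Prev.Mode == R.Mode && Prev.Encoding == R.Encoding;
      bool Adjacent = Prev.Start + Prev.Size == R.Start;
      if (Same && Adjacent && encodingCanBeMerged(R.Mode)) {
        Prev.Size += R.Size;
        continue;
      }
    }
    Out.push_back(R);
  }
  return Out;
}
```

Keeping the policy in one switch means a newly added mode is non-mergeable until someone lists it explicitly, which is the conservative default.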
Delta File
+14-3 llvm/lib/ExecutionEngine/JITLink/MachO_x86_64.cpp
+14-2 llvm/lib/ExecutionEngine/JITLink/MachO_arm64.cpp
+2-3 llvm/lib/ExecutionEngine/JITLink/CompactUnwindSupport.h
+30-8 3 files

LLVM/project 12c4749 flang/lib/Lower/OpenMP OpenMP.cpp, mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td

Remove dims(N) syntax and use list of vals for num_threads
Delta File
+13-38 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+22-27 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+0-31 mlir/test/Dialect/OpenMP/invalid.mlir
+11-15 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+6-8 flang/lib/Lower/OpenMP/OpenMP.cpp
+9-2 mlir/test/Dialect/OpenMP/ops.mlir
+61-121 3 files not shown
+74-124 9 files

LLVM/project d34ffc9 llvm/lib/Target/RISCV RISCVInstrInfoF.td RISCVInstrInfoZfh.td, llvm/test/CodeGen/RISCV fp-fcanonicalize.ll

[RISCV] Fix fcanonicalize for Z*inx (#175984)

Delta File
+2,260-0 llvm/test/CodeGen/RISCV/fp-fcanonicalize.ll
+4-2 llvm/lib/Target/RISCV/RISCVInstrInfoF.td
+4-2 llvm/lib/Target/RISCV/RISCVInstrInfoZfh.td
+4-2 llvm/lib/Target/RISCV/RISCVInstrInfoD.td
+2,272-6 4 files

LLVM/project 792631c llvm/test/CodeGen/RISCV/rvv roundeven-vp.ll nearbyint-vp.ll

[RISCV] Use NoV0 register classes for masked `VPseudoBinaryM` (#175706)

There are two constraints:

1. The same register can't have two EEWs. `V0` is already the mask
register, so other register source operands can't be `V0`.
2. The destination and source registers can't overlap. We have added an
`@earlyclobber` constraint, so we won't allocate `V0` to the destination.
Delta File
+208-208 llvm/test/CodeGen/RISCV/rvv/roundeven-vp.ll
+208-208 llvm/test/CodeGen/RISCV/rvv/nearbyint-vp.ll
+208-208 llvm/test/CodeGen/RISCV/rvv/floor-vp.ll
+208-208 llvm/test/CodeGen/RISCV/rvv/rint-vp.ll
+208-208 llvm/test/CodeGen/RISCV/rvv/round-vp.ll
+208-208 llvm/test/CodeGen/RISCV/rvv/roundtozero-vp.ll
+1,248-1,248 28 files not shown
+3,006-3,336 34 files

LLVM/project 8c2e862 bolt/include/bolt/Core MCPlusBuilder.h, bolt/lib/Passes LongJmp.cpp

[BOLT][BTI] Patch LLD-generated PLTs to contain BTI landing pad

This patch adds the patchPLTEntryForBTI to enable patching PLT entries
generated by LLD.

Context:

To keep BTI consistent, targets of stubs inserted in LongJmp need to be
patched. As PLTs are not optimized and emitted by BOLT, this patch adds
a helper for patching them in the original location.

For PLTs generated by LLD, this is safe as LLD inserts extra nops to
PLTs which don't already contain a BTI.

PLT entry before patching:

   adrp x16, Page(&(.got.plt[n]))
   ldr  x17, [x16, Offset(&(.got.plt[n]))]
   add  x16, x16, Offset(&(.got.plt[n]))

    [24 lines not shown]
Delta File
+61-0 bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+33-4 bolt/test/runtime/AArch64/long-jmp-bti-plt.c
+5-0 bolt/lib/Passes/LongJmp.cpp
+4-0 bolt/include/bolt/Core/MCPlusBuilder.h
+103-4 4 files

LLVM/project 316e46d bolt/lib/Target/AArch64 AArch64MCPlusBuilder.cpp

Add comment
Delta File
+3-0 bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+3-0 1 file

LLVM/project 9eb8ff5 bolt/lib/Target/AArch64 AArch64MCPlusBuilder.cpp

Apply suggestions from code review

Co-authored-by: Paschalis Mpeis <paschalis.mpeis at arm.com>
Delta File
+2-4 bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+2-4 1 file

LLVM/project 583f991 bolt/lib/Target/AArch64 AArch64MCPlusBuilder.cpp

Extra comment
Delta File
+2-0 bolt/lib/Target/AArch64/AArch64MCPlusBuilder.cpp
+2-0 1 file

LLVM/project 4193c40 bolt/lib/Core BinaryFunction.cpp, bolt/lib/Rewrite RewriteInstance.cpp

[BOLT][BTI] Disassemble PLT entries when processing BTI binaries (#169663)

PLT entries are PseudoFunctions, and are not disassembled or emitted.
For BTI, we need to check the first MCInst of PLT entries, to see
if indirectly calling them is safe or not.

This patch disassembles PLTs for binaries using BTI, while not changing
the behaviour for binaries without BTI.

The PLTs are only disassembled, not emitted.
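
As a rough illustration of the check this enables, the sketch below models a disassembled PLT entry and tests whether its first instruction is a landing pad that an indirect call may target. It is not BOLT code: `Op`, `PLTEntry`, and all function names are hypothetical, BOLT itself inspects `MCInst` objects, and the sketch only treats `bti c` and `bti jc` as call landing pads, ignoring any other instructions that can act as implicit landing pads.

```cpp
#include <vector>

// Hypothetical, simplified instruction kinds.
enum class Op { BtiC, BtiJ, BtiJC, Adrp, Ldr, Add, Br, Nop };

// Hypothetical model of a PLT entry that is disassembled but not emitted.
struct PLTEntry {
  std::vector<Op> Insts;
};

// An indirect call may land on `bti c` or `bti jc`.
bool isCallLandingPad(Op O) { return O == Op::BtiC || O == Op::BtiJC; }

// Indirectly calling the entry is considered safe only if its first
// instruction is a call landing pad.
bool isIndirectCallSafe(const PLTEntry &E) {
  return !E.Insts.empty() && isCallLandingPad(E.Insts.front());
}

// PLTs are disassembled only for binaries that use BTI, so behaviour for
// other binaries stays unchanged.
bool shouldDisassemblePLT(bool BinaryUsesBTI) { return BinaryUsesBTI; }
```

An entry that starts with `adrp` rather than a landing pad is reported as unsafe, which is the situation a later patching step would have to handle.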

---------

Co-authored-by: Paschalis Mpeis <paschalis.mpeis at arm.com>
Delta File
+31-0 bolt/test/runtime/AArch64/disassemble-plts.c
+7-0 bolt/lib/Rewrite/RewriteInstance.cpp
+5-0 bolt/lib/Core/BinaryFunction.cpp
+43-0 3 files

LLVM/project 038f9f4 flang/lib/Lower/OpenMP OpenMP.cpp

fix adding numThreadsNumDims to ParallelOperands apply method
Delta File
+1-0 flang/lib/Lower/OpenMP/OpenMP.cpp
+1-0 1 file

LLVM/project f07a41a flang/lib/Lower/OpenMP OpenMP.cpp, mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td

Use num_threads_dims_values only
Delta File
+26-36 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+9-7 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+7-8 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+8-7 flang/lib/Lower/OpenMP/OpenMP.cpp
+6-6 mlir/test/Dialect/OpenMP/invalid.mlir
+5-5 mlir/test/Dialect/OpenMP/ops.mlir
+61-69 2 files not shown
+66-73 8 files

LLVM/project 6028858 mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td, mlir/lib/Conversion/SCFToOpenMP SCFToOpenMP.cpp

few more fixes
Delta File
+21-23 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+14-19 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+5-5 mlir/test/Dialect/OpenMP/invalid.mlir
+3-6 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+2-2 mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
+45-55 5 files

LLVM/project 97045e6 mlir/lib/Target/LLVMIR/Dialect/OpenMP OpenMPToLLVMIRTranslation.cpp

Mark mlir->llvmir translation for num_threads with dims as NYI
Delta File
+14-1 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+14-1 1 file

LLVM/project 6093bdc mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td, mlir/lib/Conversion/SCFToOpenMP SCFToOpenMP.cpp

[OpenMP][MLIR] Add num_threads clause with dims modifier support
Delta File
+72-7 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+47-3 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+32-1 mlir/test/Dialect/OpenMP/invalid.mlir
+10-5 mlir/test/Dialect/OpenMP/ops.mlir
+2-0 mlir/lib/Conversion/SCFToOpenMP/SCFToOpenMP.cpp
+163-16 5 files

LLVM/project f726624 flang/lib/Lower/OpenMP OpenMP.cpp, mlir/include/mlir/Dialect/OpenMP OpenMPClauses.td

remove dims(N) syntax and just use list for dims vals
Delta File
+0-133 mlir/test/Dialect/OpenMP/invalid.mlir
+14-42 mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+22-26 mlir/include/mlir/Dialect/OpenMP/OpenMPClauses.td
+17-7 mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+6-9 flang/lib/Lower/OpenMP/OpenMP.cpp
+11-4 mlir/test/Dialect/OpenMP/ops.mlir
+70-221 4 files not shown
+87-231 10 files

LLVM/project 5a06a67 lld/ELF InputSection.cpp

Address maskray's comments
Delta File
+8-7 lld/ELF/InputSection.cpp
+8-7 1 file

LLVM/project 3bf7c07 lld/ELF/Arch LoongArch.cpp

Fix a typo
Delta File
+1-1 lld/ELF/Arch/LoongArch.cpp
+1-1 1 file