LLVM/project 1122581 flang/test/Lower derived-types-bindc.f90

[flang][AIX] add use of the variables (NFC) (#168073)

After
https://github.com/llvm/llvm-project/commit/bf3b704c60cc521b79ec54bd57fcf72368178a52,
the type definition is no longer generated unless the variables are used.
This patch adds uses of the derived type variables.
Delta  File
+6 -0  flang/test/Lower/derived-types-bindc.f90
+6 -0  1 files

LLVM/project 9398eaf llvm/lib/Target/AMDGPU SIInstrInfo.cpp, llvm/test/CodeGen/AMDGPU mfma-loop.ll waterfall-call-target-av-register-failure.ll

AMDGPU: Fix verifier error when waterfall call target is in AV register

This isn't an ideal fix; technically this should be an optimization path
we shouldn't need to go down. The base path where a copy will be inserted
is still broken.

The lit test changes are mostly regressions to be fixed later.
Delta  File
+862 -668  llvm/test/CodeGen/AMDGPU/mfma-loop.ll
+141 -0  llvm/test/CodeGen/AMDGPU/waterfall-call-target-av-register-failure.ll
+26 -18  llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+16 -12  llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll
+8 -9  llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
+1 -1  llvm/test/CodeGen/AMDGPU/no-fold-accvgpr-mov.ll
+1,054 -708  6 files

LLVM/project ab51862 llvm/lib/Target/AMDGPU SIFixSGPRCopies.cpp, llvm/test/CodeGen/AMDGPU si-fix-sgpr-copies-av-constrain.mir fix-sgpr-copies-readfirstlane-av-register-regression.ll

AMDGPU: Constrain readfirstlane operand when writing to m0

Fixes another verifier error after introducing AV registers.
Also fixes a failure to clear the subregister index when one was
present.
Delta  File
+18 -4  llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+19 -0  llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
+16 -1  llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
+53 -5  3 files

LLVM/project c6ee2d9 llvm/lib/Target/AMDGPU SIFixSGPRCopies.cpp, llvm/test/CodeGen/AMDGPU si-fix-sgpr-copies-av-constrain.mir fix-sgpr-copies-readfirstlane-av-register-regression.ll

AMDGPU: Constrain readfirstlane operand to vgpr_32 (#168001)

Delta  File
+92 -0  llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
+52 -0  llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
+14 -3  llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+158 -3  3 files

LLVM/project 71eaf14 llvm/include/llvm/TableGen TableGenBackend.h Main.h, llvm/lib/TableGen Main.cpp TableGenBackend.cpp

[TableGen] Split *GenRegisterInfo.inc. (#167700)

Reduces memory usage compiling backend sources, most notably for
AMDGPU by ~98 MB per source on average.

AMDGPUGenRegisterInfo.inc is tens of megabytes in size now, and
is even larger downstream. At the same time, it is included in
nearly all backend sources, typically just for a small portion of
its content, resulting in compilation being unnecessarily
memory-hungry, which in turn stresses buildbots and wastes their
resources.

Splitting .inc files also helps avoid extra ccache misses
when changes in .td files don't cause changes in all parts of
what previously was a single .inc file.

Rather than building on top of the current
single-output-file design of TableGen, e.g., using `split-file`,
it seems preferable to recognise the need for multi-file
outputs and give it proper first-class support directly in
TableGen.
Delta  File
+55 -31  llvm/utils/TableGen/RegisterInfoEmitter.cpp
+52 -24  llvm/lib/TableGen/Main.cpp
+40 -3  llvm/include/llvm/TableGen/TableGenBackend.h
+14 -5  llvm/lib/TableGen/TableGenBackend.cpp
+16 -1  llvm/include/llvm/TableGen/Main.h
+8 -1  mlir/lib/Tools/mlir-tblgen/MlirTblgenMain.cpp
+185 -65  8 files not shown
+201 -73  14 files

LLVM/project e170fb5 clang/lib/Sema SemaOpenMP.cpp, clang/test/OpenMP parallel_default_variableCategory_codegen.cpp

Revert "[Clang][OpenMP] Bug fix Default clause variable category (#165276)"

This reverts commit 39774f9cafeb8d68acae73c1bf8493343732ebdd.
Delta  File
+0 -92  clang/test/OpenMP/parallel_default_variableCategory_codegen.cpp
+3 -3  clang/lib/Sema/SemaOpenMP.cpp
+3 -95  2 files

LLVM/project 0e1152e llvm/lib/Target/AArch64 AArch64RegisterInfo.cpp AArch64RegisterInfo.h

AArch64: rewrite the CSR computation (#167967)

Rather than having a separate path for Darwin, and then a partial
handling for Windows, and then the remainder using its own path, unify
the three paths. Use a switch over the calling convention to avoid
having to check and handle the calling convention in a variety of
places. This simplifies the logic and avoids accidentally missing a
calling convention (such as we had done with PreserveMost, PreserveAll
on Windows).
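
As a rough illustration of the single-switch idea described above, the sketch below dispatches once on the calling convention and folds the OS difference into the result. Every name in it (`CallConv`, `TargetOS`, `calleeSavedListName`, and the returned list names) is hypothetical and is not taken from AArch64RegisterInfo.cpp; it only shows why one exhaustive switch makes it harder to forget a convention on one platform.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for illustration; not the real LLVM enums.
enum class CallConv { C, PreserveMost, PreserveAll };
enum class TargetOS { Generic, Darwin, Windows };

// Pick an OS suffix once, then use a single switch over the calling
// convention so every convention is handled for every OS.
std::string calleeSavedListName(CallConv CC, TargetOS OS) {
  std::string Suffix;
  switch (OS) {
  case TargetOS::Generic: Suffix = "_Generic"; break;
  case TargetOS::Darwin:  Suffix = "_Darwin";  break;
  case TargetOS::Windows: Suffix = "_Windows"; break;
  }
  switch (CC) {
  case CallConv::C:            return "CSR_C" + Suffix;
  case CallConv::PreserveMost: return "CSR_PreserveMost" + Suffix;
  case CallConv::PreserveAll:  return "CSR_PreserveAll" + Suffix;
  }
  assert(false && "unhandled calling convention");
  return {};
}
```

With a switch that has no `default` label, most compilers can warn when a new enumerator is added but not handled, which is the property the commit message is after.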
Delta  File
+86 -108  llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+0 -1  llvm/lib/Target/AArch64/AArch64RegisterInfo.h
+86 -109  2 files

LLVM/project e06fabc lldb/include/lldb/Core Disassembler.h, lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Simplify instruction iteration in UnwindAssemblyInstEmulation (#167914)
Delta  File
+7 -9  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+4 -0  lldb/include/lldb/Core/Disassembler.h
+11 -9  2 files

LLVM/project f26f27c lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp UnwindAssemblyInstEmulation.h

[lldb][nfc] Initialize m_initial_sp in ctor for UnwindAssemblyInstEmulation (#167914)

Also rename the "sp" suffix (originally intended to mean "Stack
Pointer") to "cfa", as "sp" generally means Shared Pointer.
Delta  File
+6 -12  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+6 -2  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h
+12 -14  2 files

LLVM/project 1f93400 lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Reduce indentation in UnwindAssemblyInstruction (#167914)
Delta  File
+107 -100  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+107 -100  1 files

LLVM/project 0b5543a lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Fix comment in UnwindAssemblyInstruction (#167914)
Delta  File
+1 -1  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+1 -1  1 files

LLVM/project 81a73dc lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Reduce scope of loop variable in UnwindAssemblyInstEmulation (#167914)
Delta  File
+3 -3  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+3 -3  1 files

LLVM/project b27681f lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Add helper function for logging in UnwindAssemblyInstruction (#167914)
Delta  File
+16 -13  lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+16 -13  1 files

LLVM/project a284ce8 flang-rt/include/flang-rt/runtime iostat.h format-implementation.h, flang-rt/lib/runtime iostat.cpp

[flang][runtime] Advance output record in specific case (#167786)

When a formatted WRITE takes place in a defined output subroutine called
from a context in which record advancement is allowed, such as NAMELIST,
the char-string-edit-descs in the format can trigger record advancement.

Also clean up confusing messiness lingering from the separation of
iostat.h into two headers in flang/.../Runtime. iostat.h didn't need to be
put into flang/.../Runtime since it's included only by flang-rt, and
iostat-consts.h doesn't need one of its includes.

Fixes https://github.com/llvm/llvm-project/issues/167757.
Delta  File
+23 -0  flang-rt/include/flang-rt/runtime/iostat.h
+0 -23  flang/include/flang/Runtime/iostat.h
+9 -0  flang-rt/include/flang-rt/runtime/format-implementation.h
+8 -0  flang-rt/include/flang-rt/runtime/connection.h
+1 -1  flang-rt/lib/runtime/iostat.cpp
+1 -1  flang-rt/include/flang-rt/runtime/io-error.h
+42 -25  1 files not shown
+42 -26  7 files

LLVM/project 3425f22 flang/lib/Evaluate intrinsics.cpp, flang/test/Semantics c_f_pointer.f90

[flang] Disable some warnings with ineluctable false positives (#167714)

There are a few well-meaning warnings for some cases of the FPTR=
argument to C_F_POINTER() that can be false positives, since the
restrictions in the standard are dependent on the source of the CPTR=
argument. Further, there is no way to alter a program to avoid these
warnings, so one cannot compile a correct and conforming program with
-pedantic -Werror. Disable these warnings.

Fixes https://github.com/llvm/llvm-project/issues/167470.
Delta  File
+0 -22  flang/lib/Evaluate/intrinsics.cpp
+8 -4  flang/test/Semantics/c_f_pointer.f90
+8 -26  2 files

LLVM/project 9c3955a flang/include/flang/Semantics type.h, flang/lib/Semantics expression.cpp type.cpp

[flang] Use instantiated PDT for structure constructor in default init (#167409)

A structure constructor used in (or as) the default component
initializer for a PDT derived type component needs to traverse the scope
of the right PDT instantiation.

Fixes https://github.com/llvm/llvm-project/issues/167337 and fixes
https://github.com/llvm/llvm-project/issues/167573.
Delta  File
+21 -4  flang/lib/Semantics/expression.cpp
+11 -10  flang/test/Semantics/structconst12.f90
+7 -0  flang/lib/Semantics/type.cpp
+1 -0  flang/include/flang/Semantics/type.h
+40 -14  4 files

LLVM/project b67e465 llvm/lib/Target/AMDGPU SIShrinkInstructions.cpp, llvm/test/CodeGen/AMDGPU s_cmp_0.ll

[AMDGPU] Ensure SCC is not live before shrinking to s_bitset* (#167907)

Ensure SCC is not live before shrinking s_and*/s_or* instructions to
s_bitset*.
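
The guard described above can be sketched in a toy model. The original s_and*/s_or* instruction defines SCC, and the sketch assumes the s_bitset* replacement does not, so the rewrite is only allowed when nothing reads SCC afterwards. The types and the function `shrinkToBitset` are hypothetical and are not the code in SIShrinkInstructions.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Toy opcodes and results; hypothetical names for illustration only.
enum class Opcode { SAnd, SOr };
enum class Shrunk { None, BitSet0, BitSet1 };

static bool isPowerOfTwo(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Decide whether "dst = dst <op> Imm" may become a single-bit clear/set.
// Assumption: the replacement leaves SCC untouched, so a later reader of
// SCC would observe a stale value if we rewrote anyway.
Shrunk shrinkToBitset(Opcode Op, uint32_t Imm, bool SCCLiveAfter) {
  if (SCCLiveAfter)
    return Shrunk::None;
  if (Op == Opcode::SAnd && isPowerOfTwo(~Imm))
    return Shrunk::BitSet0; // AND with a mask that clears exactly one bit
  if (Op == Opcode::SOr && isPowerOfTwo(Imm))
    return Shrunk::BitSet1; // OR with a mask that sets exactly one bit
  return Shrunk::None;
}
```

For example, `shrinkToBitset(Opcode::SOr, 0x10, /*SCCLiveAfter=*/true)` declines the rewrite even though the immediate has a single set bit.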

---------

Signed-off-by: John Lu <John.Lu at amd.com>
Delta  File
+42 -5  llvm/test/CodeGen/AMDGPU/s_cmp_0.ll
+4 -2  llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+46 -7  2 files

LLVM/project f2336d4 llvm/test/Transforms/LoopVectorize uniform_across_vf_induction2.ll induction.ll, llvm/test/Transforms/LoopVectorize/RISCV tail-folding-interleave.ll tail-folding-cond-reduction.ll

Revert "[VPlan] Expand WidenInt inductions with nuw/nsw" (#168080)

Reverts llvm/llvm-project#163538

This is causing build failures on the two-stage RVV buildbots. e.g.
https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a
reproducer and more information at
https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822

This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
Delta  File
+70 -70  llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll
+62 -62  llvm/test/Transforms/LoopVectorize/induction.ll
+48 -68  llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-constant-size.ll
+42 -42  llvm/test/Transforms/LoopVectorize/no_outside_user.ll
+28 -28  llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll
+24 -24  llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-cond-reduction.ll
+274 -294  113 files not shown
+793 -864  119 files

LLVM/project e6f868c llvm/lib/Target/Sparc SparcInstrInfo.cpp SparcInstrInfo.h, llvm/test/CodeGen/SPARC optcompare.ll umulo-128-legalisation-lowering.ll

[Sparc] Optimize compare instruction (#167140)

If we need to compare the result of a computation with 0, we can
sometimes replace the last instruction in the computation with one that
sets the integer condition codes. We can then branch immediately based
on the zero-flag instead of having to use an extra compare instruction
(a SUBcc instruction).

This is only possible if the result of the compare is not used anywhere
else and no other instruction modifies the integer condition codes
between the point where the result of the computation is defined and the
point where it is used.
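
The two conditions above can be sketched as a backward scan over a basic block. This is a toy model under stated assumptions: the `Inst` struct and `canFoldCompare` are hypothetical, each instruction has at most one def and one use, and a compare with `Def == -1` stands for one whose result register is unused. It is not the logic in SparcInstrInfo.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction model; hypothetical, for illustration only.
struct Inst {
  int Def;        // register written, or -1 if none
  int Use;        // register read, or -1 if none
  bool SetsICC;   // writes the integer condition codes
  bool IsCmpZero; // "compare Use with 0"
};

// Returns true if the compare at CmpIdx could be dropped by switching the
// instruction that defines its operand to a condition-code-setting form.
bool canFoldCompare(const std::vector<Inst> &Block, std::size_t CmpIdx) {
  assert(CmpIdx < Block.size() && Block[CmpIdx].IsCmpZero);
  // Simplification: a compare that writes a real register is treated as
  // having its result used elsewhere.
  if (Block[CmpIdx].Def != -1)
    return false;
  const int Reg = Block[CmpIdx].Use;
  // Walk backwards from the compare towards the defining instruction.
  for (std::size_t I = CmpIdx; I-- > 0;) {
    if (Block[I].Def == Reg)
      return true;  // reached the definition with no ICC clobber in between
    if (Block[I].SetsICC)
      return false; // an intervening instruction overwrites the flags
  }
  return false;     // definition is not in this block
}
```

The scan checks for the definition before checking `SetsICC`, because the defining instruction is the one that would be converted to set the flags itself.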

---------

Co-authored-by: Daniel Cederman <cederman at gaisler.com>
Delta  File
+248 -0  llvm/test/CodeGen/SPARC/optcompare.ll
+139 -0  llvm/lib/Target/Sparc/SparcInstrInfo.cpp
+7 -9  llvm/test/CodeGen/SPARC/umulo-128-legalisation-lowering.ll
+8 -0  llvm/lib/Target/Sparc/SparcInstrInfo.h
+2 -4  llvm/test/CodeGen/SPARC/ctlz.ll
+2 -4  llvm/test/CodeGen/SPARC/atomicrmw-uinc-udec-wrap.ll
+406 -17  4 files not shown
+411 -26  10 files

LLVM/project c30f93b llvm/test/Transforms/LoopVectorize uniform_across_vf_induction2.ll induction.ll, llvm/test/Transforms/LoopVectorize/RISCV tail-folding-interleave.ll tail-folding-cond-reduction.ll

Revert "[VPlan] Expand WidenInt inductions with nuw/nsw (#163538)"

This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
Delta  File
+70 -70  llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll
+62 -62  llvm/test/Transforms/LoopVectorize/induction.ll
+48 -68  llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-constant-size.ll
+42 -42  llvm/test/Transforms/LoopVectorize/no_outside_user.ll
+28 -28  llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll
+24 -24  llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-cond-reduction.ll
+274 -294  113 files not shown
+793 -864  119 files

LLVM/project 39774f9 clang/lib/Sema SemaOpenMP.cpp, clang/test/OpenMP parallel_default_variableCategory_codegen.cpp

[Clang][OpenMP] Bug fix Default clause variable category (#165276)

For the default clause, address new review comments on the previous
"Support for Default clause variable category"
[157063](https://github.com/llvm/llvm-project/pull/157063) and add a
new test case.

---------

Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
Delta  File
+92 -0  clang/test/OpenMP/parallel_default_variableCategory_codegen.cpp
+3 -3  clang/lib/Sema/SemaOpenMP.cpp
+95 -3  2 files

LLVM/project 8aa7d82 flang/lib/Lower/OpenMP Utils.cpp ClauseProcessor.cpp, flang/lib/Optimizer/OpenMP MapInfoFinalization.cpp

[OpenMP][Flang] Emit default declare mappers implicitly for derived types (#140562)

This patch adds support to emit default declare mappers for implicit
mapping of derived types when not supplied by user. This especially
helps tackle mapping of allocatables of derived types.
Delta  File
+148 -0  flang/lib/Lower/OpenMP/Utils.cpp
+83 -25  flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+65 -0  offload/test/offloading/fortran/implicit-derived-enter-exit.f90
+38 -12  flang/lib/Lower/OpenMP/OpenMP.cpp
+21 -0  flang/test/Lower/OpenMP/derived-type-map.f90
+11 -10  flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+366 -47  3 files not shown
+376 -49  9 files

LLVM/project 10f16a8 llvm/lib/Transforms/Scalar MemCpyOptimizer.cpp, llvm/test/Transforms/MemCpyOpt memset-memcpy-dbgloc.ll

[MemCpyOpt][profcheck] Set `unknown` branch weights for certain selects
Delta  File
+5 -3  llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
+5 -0  llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+10 -3  2 files

LLVM/project 22d3e3b llvm/include/llvm/IR ProfDataUtils.h, llvm/lib/IR ProfDataUtils.cpp

[MergeICmp][profcheck] Propagate profile info
Delta  File
+41 -3  llvm/lib/Transforms/Scalar/MergeICmps.cpp
+25 -12  llvm/test/Transforms/MergeICmps/X86/alias-merge-blocks.ll
+20 -7  llvm/test/Transforms/MergeICmps/X86/entry-block-shuffled.ll
+3 -0  llvm/include/llvm/IR/ProfDataUtils.h
+1 -1  llvm/lib/IR/ProfDataUtils.cpp
+90 -23  5 files

LLVM/project 282bdb4 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

[AArch64] Use isAllOnes rather than popcount() == Size (NFC) (#167884)
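
To see why the two forms in the title are interchangeable, here is a small self-contained sketch. The `Bits` struct is a hypothetical stand-in for an arbitrary-width integer type, limited to widths of 1 to 64 bits; it is not LLVM's class, only an illustration that "every bit is set" and "the population count equals the width" describe the same values.

```cpp
#include <cassert>
#include <cstdint>

// Toy fixed-width bit vector; hypothetical, for illustration only.
struct Bits {
  unsigned Width;  // number of valid low bits, 1..64
  uint64_t Value;

  uint64_t mask() const {
    return Width == 64 ? ~uint64_t{0} : (uint64_t{1} << Width) - 1;
  }
  // Count set bits among the low Width bits by clearing the lowest set
  // bit on each iteration.
  unsigned popcount() const {
    unsigned N = 0;
    for (uint64_t V = Value & mask(); V != 0; V &= V - 1)
      ++N;
    return N;
  }
  // Direct check: a single mask-and-compare, and clearer about intent.
  bool isAllOnes() const { return (Value & mask()) == mask(); }
};
```

The direct predicate states the intent and avoids counting bits, which fits the NFC (no functional change) label on the commit.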

Delta  File
+1 -1  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+1 -1  1 files

LLVM/project 1a8d0ca llvm/lib/Target/AMDGPU GCNRegPressure.cpp, llvm/test/CodeGen/AMDGPU machine-scheduler-sink-trivial-remats-attr.mir machine-scheduler-sink-trivial-remats.mir

[AMDGPU] Rematerialize VGPR candidates when SGPR spills to VGPR over the VGPR limit

Before, when selecting candidates to rematerialize, we would only
consider SGPR candidates when there was an excess of SGPR registers.

Failing to eliminate the excess would result in spills to VGPRs.
This is normally not an issue, unless spilling to VGPRs results in
excess VGPRs.

This patch does 2 things:
* It relaxes the GCNRPTarget success criteria: now we accept regions
  where we spill SGPRs to VGPRs, as long as this does not end up in
  excess VGPRs.
* It changes isSaveBeneficial to consider the excess VGPRs (which
  includes the SGPRs that would be spilled to VGPR).

With these changes, the compiler rematerializes VGPRs when the excess
SGPRs would result in VGPR excess.


    [4 lines not shown]
Delta  File
+215 -215  llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats-attr.mir
+92 -92  llvm/test/CodeGen/AMDGPU/machine-scheduler-sink-trivial-remats.mir
+11 -13  llvm/lib/Target/AMDGPU/GCNRegPressure.cpp
+1 -1  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+319 -321  4 files

LLVM/project cb22154 llvm/test/CodeGen/AMDGPU swdev-549940.ll

Unacceptably large test
Delta  File
+266 -0  llvm/test/CodeGen/AMDGPU/swdev-549940.ll
+266 -0  1 files

LLVM/project 0d11b1d flang/lib/Lower/OpenMP OpenMP.cpp, flang/lib/Semantics resolve-directives.cpp check-omp-loop.cpp

[flang][OpenMP] Store Block in OpenMPLoopConstruct, add access functions

Instead of storing a variant with specific types, store parser::Block
as the body. Add two access functions to make the traversal of the nest
simpler.

This will allow storing loop-nest sequences in the future.
Delta  File
+47 -83  flang/lib/Semantics/resolve-directives.cpp
+37 -59  flang/lib/Semantics/check-omp-loop.cpp
+44 -41  flang/test/Parser/OpenMP/loop-transformation-construct02.f90
+36 -35  flang/test/Parser/OpenMP/loop-transformation-construct03.f90
+30 -28  flang/test/Parser/OpenMP/loop-transformation-construct01.f90
+16 -21  flang/lib/Lower/OpenMP/OpenMP.cpp
+210 -267  13 files not shown
+285 -339  19 files

LLVM/project 7de9865 llvm/lib/Target/AArch64 AArch64InstrInfo.td AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 int-to-fp-no-neon.ll itofp.ll

[AArch64] Extend int-to-fp load optimization to support f16
Delta  File
+33 -15  llvm/test/CodeGen/AArch64/int-to-fp-no-neon.ll
+24 -12  llvm/test/CodeGen/AArch64/itofp.ll
+13 -0  llvm/lib/Target/AArch64/AArch64InstrInfo.td
+3 -1  llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+73 -28  4 files

LLVM/project 94c751d clang/lib/AST/ByteCode Context.h

[clang][bytecode][NFC] Check pointer types in canClassify() (#168069)

And return true. Also make those two functions const.
Delta  File
+4 -2  clang/lib/AST/ByteCode/Context.h
+4 -2  1 files