LLVM/project 48c758d llvm/lib/Transforms/Scalar MemCpyOptimizer.cpp, llvm/test/Transforms/MemCpyOpt memset-memcpy-dbgloc.ll

[MemCpyOpt][profcheck] Set `unknown` branch weights for certain selects
Delta File
+5 -3 llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
+5 -0 llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+10 -3 2 files

LLVM/project 0a5be0f llvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 minbw-node-used-twice.ll non-schedulable-parent-multi-copyables.ll

[SLP]Enable Sub as a base instruction in copyables

This patch adds support for sub instructions as the main instruction in
copyable elements. It also adds a check that treats the base instruction
as not profitable for selection if at least one instruction with the
main opcode is used as an immediate operand.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/163231
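
As a rough illustration of the idea of copyable elements with sub as the main opcode, the sketch below assumes that a lane holding a plain value `v` can be modelled as `v - 0`, so that all lanes share one opcode. The `Lane` type and the helpers `sub`, `copyable` and `vectorSub` are hypothetical and are not taken from SLPVectorizer.cpp.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One scalar lane, always expressed as lhs - rhs.
struct Lane {
  int lhs;
  int rhs;
};

// A lane that really is a subtraction.
Lane sub(int l, int r) { return {l, r}; }

// A lane that is just a value: modelled as value - 0 so it matches the
// main opcode. Zero is an identity only as the right-hand operand.
Lane copyable(int v) {
  Lane lane{v, 0};
  assert(lane.lhs - lane.rhs == v && "x - 0 must reproduce x");
  return lane;
}

// Stand-in for the single vector subtraction covering all four lanes.
std::array<int, 4> vectorSub(const std::array<Lane, 4> &lanes) {
  std::array<int, 4> out{};
  for (std::size_t i = 0; i < lanes.size(); ++i)
    out[i] = lanes[i].lhs - lanes[i].rhs;
  return out;
}
```

Because subtraction is not commutative, the identity operand can only sit on the right-hand side, which is one plausible reason sub needs its own handling compared with add.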
Delta File
+87 -27 llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+2 -9 llvm/test/Transforms/SLPVectorizer/X86/minbw-node-used-twice.ll
+6 -1 llvm/test/Transforms/SLPVectorizer/X86/non-schedulable-parent-multi-copyables.ll
+1 -1 llvm/test/Transforms/SLPVectorizer/X86/vect_copyable_in_binops.ll
+96 -38 4 files

LLVM/project 2bcb3f8 .github/workflows libcxx-build-and-test.yaml

[libcxx][Github] Move from next runner set (#168089)

This will allow us to actually bump the runner set.
Delta File
+8 -8 .github/workflows/libcxx-build-and-test.yaml
+8 -8 1 file

LLVM/project 9ac84a6 llvm/include/llvm/IR ProfDataUtils.h, llvm/lib/IR ProfDataUtils.cpp

[MergeICmp][profcheck] Propagate profile info (#167594)

Propagate branch weights in `mergeComparisons`: the probability of reaching the common "exit" BB (`bb_phi` in the description in `processPhi`) doesn't change, and is a disjunction over the probabilities of doing that from the blocks performing comparisons which are now being merged.

Issue #147390
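
The disjunction can be worked through with a small sketch. It assumes the comparison blocks form a chain in which block i is only reached if no earlier block branched to the exit, and that each input is the probability of branching to the exit once that block is reached. The function `exitProbability` is hypothetical and is not the helper added to ProfDataUtils.

```cpp
#include <cassert>
#include <vector>

// Probability of reaching the common exit block from a chain of comparison
// blocks. exitProbs[i] is the (conditional) probability that block i branches
// to the exit, given that block i is reached.
double exitProbability(const std::vector<double> &exitProbs) {
  double fallThroughAll = 1.0; // probability that no block so far exited
  for (double p : exitProbs) {
    assert(p >= 0.0 && p <= 1.0 && "probabilities must lie in [0, 1]");
    fallThroughAll *= 1.0 - p;
  }
  // Exiting from at least one block is the complement of exiting from none.
  return 1.0 - fallThroughAll;
}
```

Under these assumptions, a merged block that replaces the chain would take its exit edge with this same probability, which is why the overall probability of reaching the exit is unchanged.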
Delta File
+41 -3 llvm/lib/Transforms/Scalar/MergeICmps.cpp
+25 -12 llvm/test/Transforms/MergeICmps/X86/alias-merge-blocks.ll
+20 -7 llvm/test/Transforms/MergeICmps/X86/entry-block-shuffled.ll
+3 -0 llvm/include/llvm/IR/ProfDataUtils.h
+1 -1 llvm/lib/IR/ProfDataUtils.cpp
+90 -23 5 files

LLVM/project cfc74dd llvm/lib/Target/AMDGPU SIFixSGPRCopies.cpp, llvm/test/CodeGen/AMDGPU si-fix-sgpr-copies-av-constrain.mir fix-sgpr-copies-readfirstlane-av-register-regression.ll

AMDGPU: Constrain readfirstlane operand when writing to m0 (#168004)

Fixes another verifier error after introducing AV registers.
Also fixes not clearing the subregister index if there was
one.
Delta File
+18 -4 llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+19 -0 llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
+16 -1 llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
+53 -5 3 files

LLVM/project 1122581 flang/test/Lower derived-types-bindc.f90

[flang][AIX] add use of the variables (NFC) (#168073)

After
https://github.com/llvm/llvm-project/commit/bf3b704c60cc521b79ec54bd57fcf72368178a52,
the type definition is no longer generated unless the variables are used.
This patch adds uses of the derived type variables.
Delta File
+6 -0 flang/test/Lower/derived-types-bindc.f90
+6 -0 1 file

LLVM/project 9398eaf llvm/lib/Target/AMDGPU SIInstrInfo.cpp, llvm/test/CodeGen/AMDGPU mfma-loop.ll waterfall-call-target-av-register-failure.ll

AMDGPU: Fix verifier error when waterfall call target is in AV register

This isn't an ideal fix; technically this should be an optimization path
we shouldn't need to go down. The base path where a copy will be inserted
is still broken.

The lit test changes are mostly regressions to be fixed later.
Delta File
+862 -668 llvm/test/CodeGen/AMDGPU/mfma-loop.ll
+141 -0 llvm/test/CodeGen/AMDGPU/waterfall-call-target-av-register-failure.ll
+26 -18 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+16 -12 llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll
+8 -9 llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
+1 -1 llvm/test/CodeGen/AMDGPU/no-fold-accvgpr-mov.ll
+1,054 -708 6 files

LLVM/project ab51862 llvm/lib/Target/AMDGPU SIFixSGPRCopies.cpp, llvm/test/CodeGen/AMDGPU si-fix-sgpr-copies-av-constrain.mir fix-sgpr-copies-readfirstlane-av-register-regression.ll

AMDGPU: Constrain readfirstlane operand when writing to m0

Fixes another verifier error after introducing AV registers.
Also fixes not clearing the subregister index if there was
one.
Delta File
+18 -4 llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+19 -0 llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
+16 -1 llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
+53 -5 3 files

LLVM/project c6ee2d9 llvm/lib/Target/AMDGPU SIFixSGPRCopies.cpp, llvm/test/CodeGen/AMDGPU si-fix-sgpr-copies-av-constrain.mir fix-sgpr-copies-readfirstlane-av-register-regression.ll

AMDGPU: Constrain readfirstlane operand to vgpr_32 (#168001)

Delta File
+92 -0 llvm/test/CodeGen/AMDGPU/si-fix-sgpr-copies-av-constrain.mir
+52 -0 llvm/test/CodeGen/AMDGPU/fix-sgpr-copies-readfirstlane-av-register-regression.ll
+14 -3 llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp
+158 -3 3 files

LLVM/project 71eaf14 llvm/include/llvm/TableGen TableGenBackend.h Main.h, llvm/lib/TableGen Main.cpp TableGenBackend.cpp

[TableGen] Split *GenRegisterInfo.inc. (#167700)

Reduces memory usage compiling backend sources, most notably for
AMDGPU by ~98 MB per source on average.

AMDGPUGenRegisterInfo.inc is tens of megabytes in size now, and
is even larger downstream. At the same time, it is included in
nearly all backend sources, typically just for a small portion of
its content, resulting in compilation being unnecessarily
memory-hungry, which in turn stresses buildbots and wastes their
resources.

Splitting .inc files also helps avoid extra ccache misses
where changes in .td files don't cause changes in all parts of
what previously was a single .inc file.

Rather than building on top of the current single-output-file
design of TableGen, e.g., using `split-file`, it is thought
preferable to recognise the need for multi-file outputs and give
them proper first-class support directly in TableGen.
Delta File
+55 -31 llvm/utils/TableGen/RegisterInfoEmitter.cpp
+52 -24 llvm/lib/TableGen/Main.cpp
+40 -3 llvm/include/llvm/TableGen/TableGenBackend.h
+14 -5 llvm/lib/TableGen/TableGenBackend.cpp
+16 -1 llvm/include/llvm/TableGen/Main.h
+8 -1 mlir/lib/Tools/mlir-tblgen/MlirTblgenMain.cpp
+185 -65 8 files not shown
+201 -73 14 files

LLVM/project e170fb5 clang/lib/Sema SemaOpenMP.cpp, clang/test/OpenMP parallel_default_variableCategory_codegen.cpp

Revert "[Clang][OpenMP] Bug fix Default clause variable category (#165276)"

This reverts commit 39774f9cafeb8d68acae73c1bf8493343732ebdd.
Delta File
+0 -92 clang/test/OpenMP/parallel_default_variableCategory_codegen.cpp
+3 -3 clang/lib/Sema/SemaOpenMP.cpp
+3 -95 2 files

LLVM/project 0e1152e llvm/lib/Target/AArch64 AArch64RegisterInfo.cpp AArch64RegisterInfo.h

AArch64: rewrite the CSR computation (#167967)

Rather than having a separate path for Darwin, and then a partial
handling for Windows, and then the remainder using its own path, unify
the three paths. Use a switch over the calling convention to avoid
having to check and handle the calling convention in a variety of
places. This simplifies the logic and avoids accidentally missing a
calling convention (such as we had done with PreserveMost, PreserveAll
on Windows).
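
The shape of such a unified switch might look like the sketch below. Everything in it is hypothetical: the `CallConv` and `Platform` enums, the function names and the returned strings are stand-ins for illustration, not LLVM's calling-convention IDs or CSR lists.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins; not LLVM's enums.
enum class CallConv { C, PreserveMost, PreserveAll };
enum class Platform { Darwin, Windows, Other };

static std::string platformName(Platform p) {
  switch (p) {
  case Platform::Darwin:
    return "Darwin";
  case Platform::Windows:
    return "Windows";
  case Platform::Other:
    return "Other";
  }
  assert(false && "unhandled platform");
  return "";
}

// Single entry point: every calling convention is one case of one switch, and
// the platform is resolved inside each case rather than in separate paths.
std::string calleeSavedListName(CallConv cc, Platform p) {
  switch (cc) {
  case CallConv::C:
    return platformName(p) + "/C";
  case CallConv::PreserveMost:
    return platformName(p) + "/PreserveMost";
  case CallConv::PreserveAll:
    return platformName(p) + "/PreserveAll";
  }
  assert(false && "unhandled calling convention");
  return "";
}
```

Leaving out a `default:` label lets compilers that warn on unhandled enumerators flag a newly added calling convention, which is the kind of omission the commit message describes.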
Delta File
+86 -108 llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+0 -1 llvm/lib/Target/AArch64/AArch64RegisterInfo.h
+86 -109 2 files

LLVM/project e06fabc lldb/include/lldb/Core Disassembler.h, lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Simplify instruction iteration in UnwindAssemblyInstEmulation (#167914)
Delta File
+7 -9 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+4 -0 lldb/include/lldb/Core/Disassembler.h
+11 -9 2 files

LLVM/project f26f27c lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp UnwindAssemblyInstEmulation.h

[lldb][nfc] Initialize m_initial_sp in ctor for UnwindAssemblyInstEmulation (#167914)

Also rename the "sp" suffix (originally intended to mean "Stack
Pointer") to "cfa", as "sp" generally means Shared Pointer.
Delta File
+6 -12 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+6 -2 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.h
+12 -14 2 files

LLVM/project 1f93400 lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Reduce indentation in UnwindAssemblyInstruction (#167914)
Delta File
+107 -100 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+107 -100 1 file

LLVM/project 0b5543a lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Fix comment in UnwindAssemblyInstruction (#167914)
Delta File
+1 -1 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+1 -1 1 file

LLVM/project 81a73dc lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Reduce scope of loop variable in UnwindAssemblyInstEmulation (#167914)
Delta File
+3 -3 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+3 -3 1 file

LLVM/project b27681f lldb/source/Plugins/UnwindAssembly/InstEmulation UnwindAssemblyInstEmulation.cpp

[lldb][nfc] Add helper function for logging in UnwindAssemblyInstruction (#167914)
Delta File
+16 -13 lldb/source/Plugins/UnwindAssembly/InstEmulation/UnwindAssemblyInstEmulation.cpp
+16 -13 1 file

LLVM/project a284ce8 flang-rt/include/flang-rt/runtime iostat.h format-implementation.h, flang-rt/lib/runtime iostat.cpp

[flang][runtime] Advance output record in specific case (#167786)

When a formatted WRITE takes place in a defined output subroutine called
from a context in which record advancement is allowed, such as NAMELIST,
the char-string-edit-descs in the format can trigger record advancement.

Also clean up confusing messiness lingering from the separation of
iostat.h into two headers in flang/.../Runtime. iostat.h didn't need to
be put into flang/.../Runtime since it's included only by flang-rt, and
iostat-consts.h doesn't need one of its includes.

Fixes https://github.com/llvm/llvm-project/issues/167757.
Delta File
+23 -0 flang-rt/include/flang-rt/runtime/iostat.h
+0 -23 flang/include/flang/Runtime/iostat.h
+9 -0 flang-rt/include/flang-rt/runtime/format-implementation.h
+8 -0 flang-rt/include/flang-rt/runtime/connection.h
+1 -1 flang-rt/lib/runtime/iostat.cpp
+1 -1 flang-rt/include/flang-rt/runtime/io-error.h
+42 -25 1 file not shown
+42 -26 7 files

LLVM/project 3425f22 flang/lib/Evaluate intrinsics.cpp, flang/test/Semantics c_f_pointer.f90

[flang] Disable some warnings with ineluctable false positives (#167714)

There are a few well-meaning warnings for some cases of the FPTR=
argument to C_F_POINTER() that can be false positives, since the
restrictions in the standard are dependent on the source of the CPTR=
argument. Further, there is no way to alter a program to avoid these
warnings, so one cannot compile a correct and conforming program with
-pedantic -Werror. Disable these warnings.

Fixes https://github.com/llvm/llvm-project/issues/167470.
Delta File
+0 -22 flang/lib/Evaluate/intrinsics.cpp
+8 -4 flang/test/Semantics/c_f_pointer.f90
+8 -26 2 files

LLVM/project 9c3955a flang/include/flang/Semantics type.h, flang/lib/Semantics expression.cpp type.cpp

[flang] Use instantiated PDT for structure constructor in default init (#167409)

A structure constructor used in (or as) the default component
initializer for a PDT derived type component needs to traverse the scope
of the right PDT instantiation.

Fixes https://github.com/llvm/llvm-project/issues/167337 and fixes
https://github.com/llvm/llvm-project/issues/167573.
Delta File
+21 -4 flang/lib/Semantics/expression.cpp
+11 -10 flang/test/Semantics/structconst12.f90
+7 -0 flang/lib/Semantics/type.cpp
+1 -0 flang/include/flang/Semantics/type.h
+40 -14 4 files

LLVM/project b67e465 llvm/lib/Target/AMDGPU SIShrinkInstructions.cpp, llvm/test/CodeGen/AMDGPU s_cmp_0.ll

[AMDGPU] Ensure SCC is not live before shrinking to s_bitset* (#167907)

Ensure SCC is not live before shrinking s_and*/s_or* instructions to
s_bitset*.

---------

Signed-off-by: John Lu <John.Lu at amd.com>
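
Why SCC liveness matters here can be seen in a simplified model. The sketch assumes that `s_and_b32` writes SCC (set when the result is nonzero) while `s_bitset0_b32` clears one bit and leaves SCC untouched. The `ScalarResult` type and both helper functions are hypothetical and are not code from SIShrinkInstructions.cpp.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of a scalar instruction's effect: the destination value
// plus whether SCC is written and, if so, with what.
struct ScalarResult {
  uint32_t value;
  bool writesSCC;
  bool scc; // meaningful only when writesSCC is true
};

// Models s_and_b32: bitwise AND, with SCC set when the result is nonzero.
ScalarResult sAnd(uint32_t a, uint32_t b) {
  uint32_t r = a & b;
  return {r, true, r != 0};
}

// Models s_bitset0_b32: clears a single bit and does not write SCC.
ScalarResult sBitset0(uint32_t dst, unsigned bit) {
  assert(bit < 32 && "bit index must fit a 32-bit register");
  return {dst & ~(uint32_t{1} << bit), false, false};
}
```

In this model the two forms agree on the destination value when the AND mask has exactly one zero bit, but only the AND form produces an SCC value. If a later instruction reads SCC, the rewrite would change what it sees, so the shrink is safe only when SCC is dead.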
Delta File
+42 -5 llvm/test/CodeGen/AMDGPU/s_cmp_0.ll
+4 -2 llvm/lib/Target/AMDGPU/SIShrinkInstructions.cpp
+46 -7 2 files

LLVM/project f2336d4 llvm/test/Transforms/LoopVectorize uniform_across_vf_induction2.ll induction.ll, llvm/test/Transforms/LoopVectorize/RISCV tail-folding-interleave.ll tail-folding-cond-reduction.ll

Revert "[VPlan] Expand WidenInt inductions with nuw/nsw" (#168080)

Reverts llvm/llvm-project#163538

This is causing build failures on the two-stage RVV buildbots, e.g.
https://lab.llvm.org/buildbot/#/builders/214/builds/1363. I've shared a
reproducer and more information at
https://github.com/llvm/llvm-project/pull/163538#issuecomment-3533482822

This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
Delta File
+70 -70 llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll
+62 -62 llvm/test/Transforms/LoopVectorize/induction.ll
+48 -68 llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-constant-size.ll
+42 -42 llvm/test/Transforms/LoopVectorize/no_outside_user.ll
+28 -28 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll
+24 -24 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-cond-reduction.ll
+274 -294 113 files not shown
+793 -864 119 files

LLVM/project e6f868c llvm/lib/Target/Sparc SparcInstrInfo.cpp SparcInstrInfo.h, llvm/test/CodeGen/SPARC optcompare.ll umulo-128-legalisation-lowering.ll

[Sparc] Optimize compare instruction (#167140)

If we need to compare the result of a computation with 0, we can
sometimes replace the last instruction in the computation with one that
sets the integer condition codes. We can then branch immediately based
on the zero-flag instead of having to use an extra compare instruction
(a SUBcc instruction).

This is only possible if the result of the compare is not used anywhere
else and no other instruction modifies the integer condition codes
between the time the result of the computation is defined and the time
it is used.

---------

Co-authored-by: Daniel Cederman <cederman at gaisler.com>
Delta File
+248 -0 llvm/test/CodeGen/SPARC/optcompare.ll
+139 -0 llvm/lib/Target/Sparc/SparcInstrInfo.cpp
+7 -9 llvm/test/CodeGen/SPARC/umulo-128-legalisation-lowering.ll
+8 -0 llvm/lib/Target/Sparc/SparcInstrInfo.h
+2 -4 llvm/test/CodeGen/SPARC/ctlz.ll
+2 -4 llvm/test/CodeGen/SPARC/atomicrmw-uinc-udec-wrap.ll
+406 -17 4 files not shown
+411 -26 10 files

LLVM/project c30f93b llvm/test/Transforms/LoopVectorize uniform_across_vf_induction2.ll induction.ll, llvm/test/Transforms/LoopVectorize/RISCV tail-folding-interleave.ll tail-folding-cond-reduction.ll

Revert "[VPlan] Expand WidenInt inductions with nuw/nsw (#163538)"

This reverts commit 355e0f94af5adabe90ac57110ce1b47596afd4cd.
Delta File
+70 -70 llvm/test/Transforms/LoopVectorize/uniform_across_vf_induction2.ll
+62 -62 llvm/test/Transforms/LoopVectorize/induction.ll
+48 -68 llvm/test/Transforms/LoopVectorize/dereferenceable-info-from-assumption-constant-size.ll
+42 -42 llvm/test/Transforms/LoopVectorize/no_outside_user.ll
+28 -28 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-interleave.ll
+24 -24 llvm/test/Transforms/LoopVectorize/RISCV/tail-folding-cond-reduction.ll
+274 -294 113 files not shown
+793 -864 119 files

LLVM/project 39774f9 clang/lib/Sema SemaOpenMP.cpp, clang/test/OpenMP parallel_default_variableCategory_codegen.cpp

[Clang][OpenMP] Bug fix Default clause variable category (#165276)

Address the new comments on the previous
"Support for Default clause variable category" change
([157063](https://github.com/llvm/llvm-project/pull/157063)) in the
default clause handling, and add a new test case.

---------

Co-authored-by: Sunil Kuravinakop <kuravina at pe31.hpc.amslabs.hpecorp.net>
Delta File
+92 -0 clang/test/OpenMP/parallel_default_variableCategory_codegen.cpp
+3 -3 clang/lib/Sema/SemaOpenMP.cpp
+95 -3 2 files

LLVM/project 8aa7d82 flang/lib/Lower/OpenMP Utils.cpp ClauseProcessor.cpp, flang/lib/Optimizer/OpenMP MapInfoFinalization.cpp

[OpenMP][Flang] Emit default declare mappers implicitly for derived types (#140562)

This patch adds support for emitting default declare mappers for the
implicit mapping of derived types when none is supplied by the user. This
especially helps with mapping allocatables of derived types.
Delta File
+148 -0 flang/lib/Lower/OpenMP/Utils.cpp
+83 -25 flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+65 -0 offload/test/offloading/fortran/implicit-derived-enter-exit.f90
+38 -12 flang/lib/Lower/OpenMP/OpenMP.cpp
+21 -0 flang/test/Lower/OpenMP/derived-type-map.f90
+11 -10 flang/lib/Optimizer/OpenMP/MapInfoFinalization.cpp
+366 -47 3 files not shown
+376 -49 9 files

LLVM/project 10f16a8 llvm/lib/Transforms/Scalar MemCpyOptimizer.cpp, llvm/test/Transforms/MemCpyOpt memset-memcpy-dbgloc.ll

[MemCpyOpt][profcheck] Set `unknown` branch weights for certain selects
Delta File
+5 -3 llvm/test/Transforms/MemCpyOpt/memset-memcpy-dbgloc.ll
+5 -0 llvm/lib/Transforms/Scalar/MemCpyOptimizer.cpp
+10 -3 2 files

LLVM/project 22d3e3b llvm/include/llvm/IR ProfDataUtils.h, llvm/lib/IR ProfDataUtils.cpp

[MergeICmp][profcheck] Propagate profile info
Delta File
+41 -3 llvm/lib/Transforms/Scalar/MergeICmps.cpp
+25 -12 llvm/test/Transforms/MergeICmps/X86/alias-merge-blocks.ll
+20 -7 llvm/test/Transforms/MergeICmps/X86/entry-block-shuffled.ll
+3 -0 llvm/include/llvm/IR/ProfDataUtils.h
+1 -1 llvm/lib/IR/ProfDataUtils.cpp
+90 -23 5 files

LLVM/project 282bdb4 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp

[AArch64] Use isAllOnes rather than popcount() == Size (NFC) (#167884)
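
The equivalence behind this kind of NFC change can be sketched on a fixed-width mask. The `Mask` type and both functions below are hypothetical stand-ins for illustration, not `llvm::APInt`, and the sketch assumes that bits above `width` are always zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical fixed-width mask of 1..64 bits; bits above `width` are zero.
struct Mask {
  uint64_t bits;
  unsigned width;
};

// Counts set bits by repeatedly clearing the lowest one.
unsigned countSetBits(const Mask &m) {
  unsigned n = 0;
  for (uint64_t v = m.bits; v != 0; v &= v - 1)
    ++n;
  return n;
}

// Direct test: the mask equals the all-ones pattern of its width.
bool isAllOnes(const Mask &m) {
  assert(m.width >= 1 && m.width <= 64 && "width out of range");
  uint64_t full =
      m.width == 64 ? ~uint64_t{0} : (uint64_t{1} << m.width) - 1;
  return m.bits == full;
}
```

Under that assumption, `countSetBits(m) == m.width` and `isAllOnes(m)` always agree, and the direct query states the intent without counting bits.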

Delta File
+1 -1 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+1 -1 1 file