LLVM/project 124fa5c llvm/lib/Target/AArch64 AArch64TargetTransformInfo.cpp, llvm/test/Analysis/CostModel/AArch64 shuffle-other.ll

[AArch64] - Improve costing for Identity shuffles for SVE targets. (#165375)

Identity masks can be treated as free when scalable vectorization is
possible, making the check agnostic of the vectorization policy
(fixed/scalable). This allows for aggressive vector combines for identity
shuffle masks.
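
For readers unfamiliar with the term, here is a minimal standalone sketch of what an identity shuffle mask is. It is not the AArch64 cost-model code; `looksLikeIdentityMask` is a hypothetical helper, and using -1 for an undefined lane is an assumption of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical standalone helper, not the LLVM API: returns true when every
// defined lane I selects element I of the first source vector.
// A value of -1 stands for an undefined lane and matches anything.
bool looksLikeIdentityMask(const std::vector<int> &Mask) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I] != -1 && Mask[I] != static_cast<int>(I))
      return false;
  return true;
}
```

A shuffle with such a mask returns its first operand unchanged, which is why it can reasonably be costed as free.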
DeltaFile
+61 -0 llvm/test/Transforms/VectorCombine/AArch64/identity-shuffle-sve.ll
+9 -8 llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+12 -0 llvm/test/Analysis/CostModel/AArch64/shuffle-other.ll
+82 -8 3 files

LLVM/project 164c72f mlir/lib/Conversion/XeGPUToXeVM XeGPUToXeVM.cpp, mlir/test/Conversion/XeGPUToXeVM loadstore_matrix.mlir

using implicit type converter for memref input
DeltaFile
+3 -10 mlir/lib/Conversion/XeGPUToXeVM/XeGPUToXeVM.cpp
+2 -3 mlir/test/Conversion/XeGPUToXeVM/loadstore_matrix.mlir
+5 -13 2 files

LLVM/project 576e1af llvm/lib/Target/AMDGPU AMDGPUIGroupLP.cpp

[NFC][AMDGPU] IGLP: Fixes for unsigned int handling (#135090)

Fixes unsigned int underflows in
`MFMASmallGemmSingleWaveOpt::applyIGLPStrategy`.
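
The commit message does not show the affected code, so the following is only a generic illustration of unsigned wraparound in a reverse loop, not the code from AMDGPUIGroupLP.cpp. The function name is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Counts non-zero elements while walking from the back. Writing the loop as
//   for (unsigned I = V.size() - 1; I >= 0; --I)
// would be wrong: `I >= 0` is always true for an unsigned type, and
// `V.size() - 1` wraps around to a huge value when V is empty.
std::size_t countNonZeroFromBack(const std::vector<int> &V) {
  std::size_t Count = 0;
  for (std::size_t I = V.size(); I > 0; --I) // I - 1 is the real index
    if (V[I - 1] != 0)
      ++Count;
  return Count;
}
```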
DeltaFile
+3 -3 llvm/lib/Target/AMDGPU/AMDGPUIGroupLP.cpp
+3 -3 1 files

LLVM/project 7541a70 llvm/lib/Target/AMDGPU SIFoldOperands.cpp AMDGPU.td, llvm/test/CodeGen/AMDGPU bug-pk-f32-imm-fold.mir packed-fp32.ll

[AMDGPU] Don't fold an i64 immediate value if it can't be replicated from its lower 32 bits

On some targets, a packed f32 instruction can only read 32 bits from a scalar operand (SGPR or literal) and replicates the bits to both channels. In this case, we should not fold an immediate value if it can't be replicated from its lower 32 bits.
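
A minimal sketch of the replication condition described above, assuming it reduces to comparing the two 32-bit halves of the immediate. `isReplicatedLow32` is a hypothetical name and is not taken from SIFoldOperands.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: true when a 64-bit immediate equals its low 32 bits
// copied into both halves, i.e. the value a 32-bit read plus replication
// would produce.
bool isReplicatedLow32(std::uint64_t Imm) {
  std::uint32_t Lo = static_cast<std::uint32_t>(Imm);
  std::uint32_t Hi = static_cast<std::uint32_t>(Imm >> 32);
  return Lo == Hi;
}
```

For example, the pair of floats (1.0, 1.0) encodes as 0x3F8000003F800000 and passes the check, while (1.0, 0.0) encodes as 0x000000003F800000 and does not.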
DeltaFile
+64 -0 llvm/test/CodeGen/AMDGPU/bug-pk-f32-imm-fold.mir
+41 -0 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+5 -4 llvm/test/CodeGen/AMDGPU/packed-fp32.ll
+8 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+6 -0 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+124 -4 5 files

LLVM/project df56434 llvm/lib/Target/AMDGPU AMDGPU.td GCNSubtarget.h

remove target feature
DeltaFile
+0 -8 llvm/lib/Target/AMDGPU/AMDGPU.td
+3 -2 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+1 -1 llvm/lib/Target/AMDGPU/SIFoldOperands.cpp
+4 -11 3 files

LLVM/project 04a1fd5 llvm/test/CodeGen/RISCV cfi-multiple-locations.mir

[RISCV] Make XFAIL test UNSUPPORTED. (#168525)

Currently the test cfi-multiple-locations.mir is marked as XFAIL. This
causes failures on some build bots because the test unexpectedly passes.

Mark this test as UNSUPPORTED for now. Later I plan to merge an MR which
fixes an issue in CFIInstrInserter and this test will be enabled.
DeltaFile
+1 -1 llvm/test/CodeGen/RISCV/cfi-multiple-locations.mir
+1 -1 1 files

LLVM/project 58b8e6e llvm/lib/IR Verifier.cpp, llvm/test/Verifier diderivedtype-extradata-tuple.ll

[DebugInfo][IR] Verifier checks for the extraData (#167971)

LLVM IR verifier checks for `extraData` in debug info metadata. 

This is a follow-up PR based on discussions in #165023
DeltaFile
+55 -0 llvm/test/Verifier/diderivedtype-extradata-tuple.ll
+25 -0 llvm/lib/IR/Verifier.cpp
+80 -0 2 files

LLVM/project ac6e48d llvm/include/llvm/DWP DWP.h DWPStringPool.h, llvm/lib/DWP DWP.cpp

Modify llvm-dwp to be able to emit string tables over 4GB without losing data (#167457)

We can change llvm-dwp to emit DWARF64 version of the .debug_str_offsets
tables for .dwo files in a .dwp file. This allows the string table to
exceed 4GB without truncating string offsets into the .debug_str section
and losing data. llvm-dwp will append all strings to the .debug_str
section for a .dwo file, and if any of the new string offsets exceed
UINT32_MAX, it will upgrade the .debug_str_offsets table to a DWARF64
header and then each string offset in that table can now have a 64 bit
offset.
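
The upgrade decision described above can be sketched roughly as follows. This is an illustration only: the enum and both function names are hypothetical and do not come from llvm/lib/DWP.

```cpp
#include <cstdint>
#include <vector>

enum class OffsetFormat { Dwarf32, Dwarf64 };

// Hypothetical helper: pick DWARF64 as soon as any string offset no longer
// fits in 32 bits, otherwise stay with the more compact DWARF32 table.
OffsetFormat pickStrOffsetsFormat(const std::vector<std::uint64_t> &Offsets) {
  for (std::uint64_t Off : Offsets)
    if (Off > UINT32_MAX)
      return OffsetFormat::Dwarf64;
  return OffsetFormat::Dwarf32;
}

// Size in bytes of one offset entry in the table for the given format.
unsigned entrySize(OffsetFormat F) {
  return F == OffsetFormat::Dwarf64 ? 8 : 4;
}
```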

Fixed LLDB to be able to successfully load the 64 bit string tables in
.dwp files.

Fixed llvm-dwarfdump and LLVM DWARF parsing code to do the right thing
with DWARF64 string table headers.
DeltaFile
+78 -21 llvm/lib/DWP/DWP.cpp
+81 -0 llvm/test/tools/llvm-dwp/X86/dwarf64-str-offsets.test
+26 -1 llvm/tools/llvm-dwp/llvm-dwp.cpp
+15 -0 llvm/tools/llvm-dwp/Opts.td
+11 -2 llvm/include/llvm/DWP/DWP.h
+3 -3 llvm/include/llvm/DWP/DWPStringPool.h
+214 -27 6 files

LLVM/project 2ad93b4 llvm/lib/Target/X86 X86ISelLowering.cpp

[X86] getRoundingModeX86 - add missing "clang-format on" toggle comment (#168588)

This was preventing later code from being formatted.
DeltaFile
+6 -6 llvm/lib/Target/X86/X86ISelLowering.cpp
+6 -6 1 files

LLVM/project e93763e llvm/tools/dsymutil Options.td

[dsymutil] Specify that -flat is for testing in the help output (#168590)

Gently discourage users from relying on -flat by specifying in the help
output that it's meant for testing.
DeltaFile
+1 -1 llvm/tools/dsymutil/Options.td
+1 -1 1 files

LLVM/project 09fea8f utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] fix #168212
DeltaFile
+4 -0 utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+4 -0 1 files

LLVM/project b630721 utils/bazel/llvm-project-overlay/lldb BUILD.bazel

[bazel] Fix #164904 (#168593)

DeltaFile
+10 -0 utils/bazel/llvm-project-overlay/lldb/BUILD.bazel
+10 -0 1 files

LLVM/project 0dd3cb5 llvm/test/DebugInfo/AArch64 instr-ref-target-hooks-sp-clobber.mir

Reland instr-ref-target-hooks-sp-clobber.mir (#168136)

This test was failing on chromium builds with error:

```
/Volumes/Work/s/w/ir/x/w/llvm_build/bin/llc -o - /Volumes/Work/s/w/ir/x/w/llvm-llvm-project/llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.mir -run-pass=livedebugvalues | /Volumes/Work/s/w/ir/x/w/llvm_build/bin/FileCheck /Volumes/Work/s/w/ir/x/w/llvm-llvm-project/llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.mir # RUN: at line 8
+ /Volumes/Work/s/w/ir/x/w/llvm_build/bin/llc -o - /Volumes/Work/s/w/ir/x/w/llvm-llvm-project/llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.mir -run-pass=livedebugvalues
+ /Volumes/Work/s/w/ir/x/w/llvm_build/bin/FileCheck /Volumes/Work/s/w/ir/x/w/llvm-llvm-project/llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.mir
error: YAML:121:3: unknown key 'stackSizePPR'
  stackSizePPR:    0
  ^~~~~~~~~~~~

FileCheck error: '<stdin>' is empty.
FileCheck command line:  /Volumes/Work/s/w/ir/x/w/llvm_build/bin/FileCheck /Volumes/Work/s/w/ir/x/w/llvm-llvm-project/llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.mir
```

This is an attempt to reland the failing test
DeltaFile
+188 -0 llvm/test/DebugInfo/AArch64/instr-ref-target-hooks-sp-clobber.mir
+188 -0 1 files

LLVM/project 6ade08e llvm/lib/Target/Mips Mips16ISelLowering.cpp

Mips: Remove manual libcall name search and table

This should really check if the libcall is known to be supported.
For now mips doesn't configure its RuntimeLibcallsInfo
correctly, and does not have any of the mips16 calls in it.
For now there isn't a way to add them without triggering conflicting
cases in tablegen, so keep parsing the raw name as it was before.
DeltaFile
+32 -67 llvm/lib/Target/Mips/Mips16ISelLowering.cpp
+32 -67 1 files

LLVM/project 8138f3a mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td

change documentation
DeltaFile
+2 -1 mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+2 -1 1 files

LLVM/project 96e58b8 llvm/lib/CodeGen/SelectionDAG SelectionDAG.cpp, llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVISelLowering.h

[RISCV] Legalize misaligned unmasked vp.load/vp.store to vle8/vse8. (#167745)

If vector-unaligned-mem support is not enabled, we should not generate
loads/stores that are not aligned to their element size.

We already do this for non-VP vector loads/stores.

This code has been in our downstream for about a year and a half after
finding the vectorizer generating misaligned loads/stores. I don't think
that is unique to our downstream.

Doing this for masked vp.load/store requires widening the mask as well
which is harder to do.

NOTE: Because we have to scale the VL, this will introduce additional
vsetvli and the VL optimizer will not be effective at optimizing any
arithmetic that is consumed by the store.
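
The "scale the VL" step in the note can be pictured with a small sketch. It assumes that re-expressing an access as byte elements simply multiplies the element count by the element size in bytes. `scaleVLToBytes` is a hypothetical name and this is not the RISCVISelLowering.cpp code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: an access of VL elements, each EltBytes wide, covers
// VL * EltBytes bytes, which becomes the VL of the equivalent e8 access.
std::uint64_t scaleVLToBytes(std::uint64_t VL, unsigned EltBytes) {
  assert(EltBytes != 0 && "element size must be non-zero");
  return VL * EltBytes;
}
```

Because the byte-based VL differs from the VL of the surrounding arithmetic, the two need separate vsetvli configurations, which matches the cost mentioned in the note.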
DeltaFile
+101 -2 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+13 -0 llvm/test/CodeGen/RISCV/rvv/vpstore.ll
+13 -0 llvm/test/CodeGen/RISCV/rvv/vpload.ll
+9 -2 llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
+2 -2 llvm/test/CodeGen/RISCV/rvv/fixed-vectors-shuffle-exact-vlen.ll
+3 -0 llvm/lib/Target/RISCV/RISCVISelLowering.h
+141 -6 6 files

LLVM/project 4ab2423 llvm/lib/Analysis ConstantFolding.cpp, llvm/test/Transforms/InstSimplify/ConstProp vector-calls.ll

[ConstantFolding] Generalize constant folding for vector_interleave2 to interleave3-8. (#168473)
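
As a rough model of what folding an N-way interleave of constants computes, here is a standalone sketch over plain integer vectors. It is not the ConstantFolding.cpp implementation, and `interleave` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Interleaves N equally sized vectors: Result[I * N + J] = Inputs[J][I].
std::vector<int> interleave(const std::vector<std::vector<int>> &Inputs) {
  assert(!Inputs.empty() && "need at least one input");
  const std::size_t N = Inputs.size();
  const std::size_t Len = Inputs[0].size();
  std::vector<int> Result;
  Result.reserve(N * Len);
  for (std::size_t I = 0; I < Len; ++I)
    for (std::size_t J = 0; J < N; ++J) {
      assert(Inputs[J].size() == Len && "inputs must have equal length");
      Result.push_back(Inputs[J][I]);
    }
  return Result;
}
```

With three inputs {1, 2}, {3, 4} and {5, 6}, the result is {1, 3, 5, 2, 4, 6}.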

DeltaFile
+48 -0 llvm/test/Transforms/InstSimplify/ConstProp/vector-calls.ll
+20 -7 llvm/lib/Analysis/ConstantFolding.cpp
+68 -7 2 files

LLVM/project 8f67759 llvm/include/llvm/TableGen CodeGenHelpers.h, llvm/utils/TableGen/Basic DirectiveEmitter.cpp

[NFC][TableGen] Remove `close` member from various CodeGenHelpers (#167904)

Always rely on local scopes to enforce the lifetime of these helper
objects and by extension where the "closing" of various C++ code
constructs happens.
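
The scope-based pattern described above can be illustrated with a small RAII emitter. `NamespaceScope` and `emitExample` are hypothetical and written for this sketch; they are not the helpers from CodeGenHelpers.h.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical emitter helper: opens a namespace on construction and closes
// it in the destructor, so there is no explicit close() to call or forget.
class NamespaceScope {
  std::ostream &OS;
  std::string Name;

public:
  NamespaceScope(std::ostream &OS, std::string Name)
      : OS(OS), Name(std::move(Name)) {
    this->OS << "namespace " << this->Name << " {\n";
  }
  ~NamespaceScope() { OS << "} // namespace " << Name << "\n"; }
  NamespaceScope(const NamespaceScope &) = delete;
  NamespaceScope &operator=(const NamespaceScope &) = delete;
};

std::string emitExample() {
  std::ostringstream OS;
  {
    NamespaceScope NS(OS, "demo"); // the braces bound the emitted namespace
    OS << "int X;\n";
  }
  return OS.str();
}
```

The C++ block structure of the emitter then mirrors the structure of the emitted code, which is the property the commit relies on.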
DeltaFile
+60 -59 llvm/utils/TableGen/Basic/DirectiveEmitter.cpp
+38 -35 mlir/tools/mlir-tblgen/EnumsGen.cpp
+3 -27 llvm/include/llvm/TableGen/CodeGenHelpers.h
+101 -121 3 files

LLVM/project 281393e mlir/include/mlir/Dialect/XeGPU/IR XeGPUTypes.td XeGPUOps.td, mlir/test/Dialect/XeGPU invalid.mlir

address feedback
DeltaFile
+11 -0 mlir/include/mlir/Dialect/XeGPU/IR/XeGPUTypes.td
+1 -7 mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+1 -1 mlir/test/Dialect/XeGPU/invalid.mlir
+13 -8 3 files

LLVM/project 7fb01ac utils/bazel/llvm-project-overlay/lldb BUILD.bazel

[bazel] Fix #164904
DeltaFile
+10 -0 utils/bazel/llvm-project-overlay/lldb/BUILD.bazel
+10 -0 1 files

LLVM/project 3f61402 clang/test/ClangScanDeps module-in-stable-dir-by-name.c

[clang][DependencyScanning] Add Test Coverage of `StableDirs` during By-Name Lookups (#168143)

This PR adds some test coverage for `StableDirs` during by-name lookups.
DeltaFile
+43 -0 clang/test/ClangScanDeps/module-in-stable-dir-by-name.c
+43 -0 1 files

LLVM/project 46565f3 lldb/include/lldb/Utility LLDBLog.h, lldb/source/Plugins/InstrumentationRuntime/Utility ReportRetriever.cpp

[LLDB] Add log channel for InstrumentationRuntime plugins (#168508)

This patch adds `LLDBLog::InstrumentationRuntime` as a log channel to
provide an appropriate channel for instrumentation runtime plugins as
previously one did not exist.

A small use of the channel is added to illustrate its use. The logging
added is not intended to be comprehensive.

This is primarily motivated by an `-fbounds-safety` instrumentation
plugin (https://github.com/swiftlang/llvm-project/pull/11835).

rdar://164920875
DeltaFile
+4 -1 lldb/source/Plugins/InstrumentationRuntime/Utility/ReportRetriever.cpp
+3 -0 lldb/source/Utility/LLDBLog.cpp
+2 -1 lldb/include/lldb/Utility/LLDBLog.h
+9 -2 3 files

LLVM/project 523bd2d llvm/lib/CodeGen/GlobalISel LegalizerHelper.cpp, llvm/lib/Target/RISCV/GISel RISCVLegalizerInfo.cpp

[GISel][RISCV] Compute CTPOP of small odd-sized integer correctly (#168559)

Fixes the assertion in #168523
This patch lifts the small, odd-sized integer to 8 bits, ensuring that
the following lowering code behaves correctly.
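
A small model of why widening helps: the bit count of a sub-byte value is well defined once the value is zero-extended to 8 bits. Zero-extension is an assumption of this sketch (it is the widening that preserves the bit count), and `ctpopNarrow` is a hypothetical name, not code from LegalizerHelper.cpp.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical model: popcount of a Width-bit value (Width < 8) computed by
// zero-extending to 8 bits first, so stray high bits are never counted.
unsigned ctpopNarrow(std::uint8_t Bits, unsigned Width) {
  assert(Width > 0 && Width < 8 && "expects a small sub-byte width");
  std::uint8_t Mask = static_cast<std::uint8_t>((1u << Width) - 1);
  std::uint8_t Ext = Bits & Mask; // zero-extension to 8 bits
  return static_cast<unsigned>(std::bitset<8>(Ext).count());
}
```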
DeltaFile
+140 -0 llvm/test/CodeGen/RISCV/GlobalISel/bitmanip.ll
+112 -0 llvm/test/CodeGen/RISCV/GlobalISel/legalizer/legalize-ctpop-rv64.mir
+4 -1 llvm/lib/Target/RISCV/GISel/RISCVLegalizerInfo.cpp
+4 -0 llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
+260 -1 4 files

LLVM/project 0ae2bcc llvm/lib/Target/ARM ARMISelLowering.h ARMISelLowering.cpp

[ARM] TableGen-erate node descriptions (#168212)

This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.

Some nodes fail validation; those are enumerated in
`ARMSelectionDAGInfo::verifyTargetNode()`. Some of the bugs are easy to
fix, but they should probably be fixed separately, as this patch is already big.

Part of #119709.

Pull Request: https://github.com/llvm/llvm-project/pull/168212
DeltaFile
+0 -315 llvm/lib/Target/ARM/ARMISelLowering.h
+3 -217 llvm/lib/Target/ARM/ARMISelLowering.cpp
+138 -4 llvm/lib/Target/ARM/ARMInstrInfo.td
+81 -2 llvm/lib/Target/ARM/ARMSelectionDAGInfo.cpp
+64 -2 llvm/lib/Target/ARM/ARMSelectionDAGInfo.h
+55 -0 llvm/lib/Target/ARM/ARMInstrMVE.td
+341 -540 6 files not shown
+382 -541 12 files

LLVM/project 40ed57c .ci utils.sh monolithic-linux.sh

[CI] Prefer Bash Tests over Empty String Comparisons (#168575)

These are more idiomatic in bash.
DeltaFile
+5 -5 .ci/utils.sh
+3 -3 .ci/monolithic-linux.sh
+2 -2 .ci/monolithic-windows.sh
+10 -10 3 files

LLVM/project 8bdd82c .ci premerge_advisor_explain.py premerge_advisor_upload.py

[CI] Skip Running Premerge Advisor on AArch64 (#168404)

They were still running because the conditional was not correct. This
patch fixes that so they do not interfere with the results of the job.
DeltaFile
+1 -1 .ci/premerge_advisor_explain.py
+1 -1 .ci/premerge_advisor_upload.py
+2 -2 2 files

LLVM/project 5407e62 mlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

Revert "[MLIR][NVVM] Add tcgen05.mma MLIR Ops" (#168583)

Reverts llvm/llvm-project#164356

The bots are broken.
DeltaFile
+0 -634 mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-sp-tensor.mlir
+0 -633 mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-tensor.mlir
+0 -612 mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+0 -545 mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+0 -442 mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-shared.mlir
+0 -442 mlir/test/Target/LLVMIR/nvvm/tcgen05-mma-sp-shared.mlir
+0 -3,308 9 files not shown
+0 -4,875 15 files

LLVM/project 158edd6 llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll amdgcn.bitcast.512bit.ll, llvm/test/MC/Disassembler/AMDGPU gfx11_dasm_vop3_dpp16.txt

Merge branch 'main' into create-mem-desc-from-2d-memref
DeltaFile
+36,400 -36,393 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+11,724 -10,707 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.512bit.ll
+4,719 -5,242 llvm/test/MC/Disassembler/AMDGPU/gfx11_dasm_vop3_dpp16.txt
+3,820 -3,075 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+3,688 -2,998 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.896bit.ll
+3,337 -2,705 llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.832bit.ll
+63,688 -61,120 5,203 files not shown
+247,683 -166,866 5,209 files

LLVM/project 1e13a5c flang/test/Semantics/OpenMP loop-transformation-construct01.f90

[flang][OpenMP] Fix some typo-like things in test case
DeltaFile
+12 -12 flang/test/Semantics/OpenMP/loop-transformation-construct01.f90
+12 -12 1 files

LLVM/project 5af0398 llvm/test/CodeGen/X86 merge-consecutive-loads-128.ll merge-consecutive-loads-256.ll

[X86] Add test examples of build vectors of reversed scalar loads that could be converted to vector loads plus shuffles (#168571)

This is turning up in some legalisation code when shuffling vectors bitcast from illegal loads.

Ideally we'd handle more complex shuffles, but reverse is a start.
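
The equivalence these tests target can be modelled in plain C++. This is only an illustration of the idea with hypothetical function names; it says nothing about how the X86 backend would implement the combine.

```cpp
#include <algorithm>
#include <array>

// Build vector assembled from four scalar loads in reverse order.
std::array<int, 4> buildReversedScalar(const int *P) {
  return {P[3], P[2], P[1], P[0]};
}

// The same result as one contiguous load followed by a reverse shuffle.
std::array<int, 4> loadThenReverse(const int *P) {
  std::array<int, 4> V{P[0], P[1], P[2], P[3]};
  std::reverse(V.begin(), V.end());
  return V;
}
```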
DeltaFile
+520 -0 llvm/test/CodeGen/X86/merge-consecutive-loads-128.ll
+352 -0 llvm/test/CodeGen/X86/merge-consecutive-loads-256.ll
+324 -0 llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
+1,196 -0 3 files