LLVM/project 3cba379 llvm/lib/Transforms/Vectorize VPlan.h LoopVectorizationPlanner.h

[VPlan] Populate and use VPIRMetadata from VPInstructions (NFC) (#167253)

Update VPlan to populate VPIRMetadata during VPInstruction construction
and use it when creating widened recipes, instead of constructing
VPIRMetadata from the underlying IR instruction each time.

This centralizes VPIRMetadata in VPInstructions and ensures metadata is
consistently available throughout VPlan transformations.

PR: https://github.com/llvm/llvm-project/pull/167253
Delta File
+28 -34 llvm/lib/Transforms/Vectorize/VPlan.h
+21 -20 llvm/lib/Transforms/Vectorize/LoopVectorizationPlanner.h
+20 -19 llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+29 -8 llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+19 -16 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+0 -12 llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+117 -109 4 files not shown
+127 -123 10 files

LLVM/project 24fa2ad llvm/utils/TableGen/Common CodeGenInstAlias.cpp

[𝘀𝗽𝗿] initial version

Created using spr 1.3.8-beta.1
Delta File
+3 -2 llvm/utils/TableGen/Common/CodeGenInstAlias.cpp
+3 -2 1 files

LLVM/project ed617bd utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel][buildifier] reformat changes in #168434 (#168443)

Delta File
+1 -1 utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+1 -1 1 files

LLVM/project 0d8c294 llvm/tools/llvm-objdump OtoolOpts.td

Fixed typo in llvm-otool (#168395)

Delta File
+1 -1 llvm/tools/llvm-objdump/OtoolOpts.td
+1 -1 1 files

LLVM/project becf0f0 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel][buildifier] reformat changes in #168434
Delta File
+1 -1 utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+1 -1 1 files

LLVM/project 7693f12 utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[mlir][bazel] Fix #167957 (#168441)

Delta File
+3 -0 utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+3 -0 1 files

LLVM/project aae2b89 libcxx/test/std/input.output/file.streams/c.files gets.compile.fail.cpp, libcxx/test/std/re/re.iter/re.tokiter/re.tokiter.cnstr temporary-objects.verify.cpp vector.compile.fail.cpp

[libc++] Replace a few .compile.fail.cpp tests by proper clang-verify tests (#167346)

We want to eliminate all .compile.fail.cpp tests since they are brittle:
these tests pass regardless of the specific compilation error, which
means that e.g. a missing include will render the test meaningless.

This is not an exhaustive pass, just a few tests I stumbled upon.
Delta File
+72 -0 libcxx/test/std/re/re.iter/re.tokiter/re.tokiter.cnstr/temporary-objects.verify.cpp
+0 -41 libcxx/test/std/re/re.iter/re.tokiter/re.tokiter.cnstr/vector.compile.fail.cpp
+0 -40 libcxx/test/std/re/re.iter/re.tokiter/re.tokiter.cnstr/array.compile.fail.cpp
+0 -37 libcxx/test/std/re/re.iter/re.tokiter/re.tokiter.cnstr/init.compile.fail.cpp
+0 -36 libcxx/test/std/re/re.iter/re.tokiter/re.tokiter.cnstr/int.compile.fail.cpp
+0 -21 libcxx/test/std/input.output/file.streams/c.files/gets.compile.fail.cpp
+72 -175 1 files not shown
+89 -175 7 files

LLVM/project 24c524d libcxx/docs VendorDocumentation.rst, libcxx/utils/ci run-buildbot

[libc++] Enable compiler-rt when performing a bootstrapping build (#167065)

Otherwise, we end up using whatever system-provided compiler runtime is
available, which doesn't work on macOS since compiler-rt is located
inside the toolchain path, which can't be found by default.

However, disable the tests for compiler-rt since those are linking
against the system C++ standard library while using the just-built
libc++ headers, which is nonsensical and leads to undefined references
on macOS.
Delta File
+8 -6 libcxx/docs/VendorDocumentation.rst
+2 -1 libcxx/utils/ci/run-buildbot
+10 -7 2 files

LLVM/project 54c2c7c lldb/packages/Python/lldbsuite/test/make Makefile.rules, lldb/test/API/commands/target/auto-install-main-executable Makefile

[LLDB] Fix test compilation errors under asan (NFC) (#168408)

https://green.lab.llvm.org/job/llvm.org/view/LLDB/job/lldb-cmake-sanitized/2744/consoleText
Delta File
+3 -3 lldb/test/API/macosx/posix_spawn/Makefile
+2 -2 lldb/test/API/macosx/find-dsym/bundle-with-dot-in-filename/Makefile
+2 -2 lldb/test/API/macosx/find-dsym/deep-bundle/Makefile
+3 -1 lldb/packages/Python/lldbsuite/test/make/Makefile.rules
+1 -1 lldb/test/API/commands/target/auto-install-main-executable/Makefile
+11 -9 5 files

LLVM/project 321b9d1 llvm/lib/Transforms/Vectorize VPlan.h VPlanConstruction.cpp

[VPlan] Replace VPIRMetadata::addMetadata with setMetadata. (NFC)

Replace addMetadata with setMetadata, which sets metadata, updating
existing entries or adding a new entry otherwise.

This isn't strictly needed at the moment, but will be needed for
follow-up patches.
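
The "update existing entry or add a new one" behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `MetadataList`, its string-valued nodes, and `exampleUsage` are hypothetical stand-ins, not the real `VPIRMetadata` class or its storage.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a metadata holder; NOT the real VPIRMetadata.
// Metadata kinds are modeled as unsigned IDs and nodes as plain strings.
class MetadataList {
  std::vector<std::pair<unsigned, std::string>> Entries;

public:
  // Update the entry for Kind if one exists; otherwise append a new entry.
  void setMetadata(unsigned Kind, std::string Node) {
    for (auto &Entry : Entries) {
      if (Entry.first == Kind) {
        Entry.second = std::move(Node);
        return;
      }
    }
    Entries.emplace_back(Kind, std::move(Node));
  }

  // Return the node stored for Kind, or nullptr if there is none.
  const std::string *getMetadata(unsigned Kind) const {
    for (const auto &Entry : Entries)
      if (Entry.first == Kind)
        return &Entry.second;
    return nullptr;
  }

  std::size_t size() const { return Entries.size(); }
};

// Setting the same kind twice keeps a single entry holding the later value.
inline void exampleUsage() {
  MetadataList MD;
  MD.setMetadata(1, "scope A");
  MD.setMetadata(1, "scope B");
  assert(MD.size() == 1);
  assert(*MD.getMetadata(1) == "scope B");
}
```

With a purely additive interface, the second call in `exampleUsage` would presumably leave two entries of the same kind, which is the situation a set-style interface avoids.
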
Delta File
+11 -8 llvm/lib/Transforms/Vectorize/VPlan.h
+3 -3 llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+1 -1 llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+15 -12 3 files

LLVM/project b00588f

Fix bazel dep caused by f5b73760 (#168436)

Delta File
+0 -0 0 files

LLVM/project 4bec74a utils/bazel/llvm-project-overlay/mlir BUILD.bazel

[mlir][bazel] Fix #168066 (#168435)

Delta File
+1 -0 utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+1 -0 1 files

LLVM/project 3fb3742 utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Fix #168113 (#168434)

Delta File
+4 -0 utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+4 -0 1 files

LLVM/project bac8d01 utils/bazel/llvm-project-overlay/libc BUILD.bazel

[bazel][libc] Fixes #165219 (#168429)

Delta File
+1 -0 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+1 -0 1 files

LLVM/project 557a6b8 lldb/source/Symbol Symtab.cpp

[lldb][NFC] use llvm::erase_if to remove non matching types (#168279)
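
As a rough illustration of the pattern named in the title, the sketch below filters a container in place with an `erase_if`-style helper. `eraseIf` is a local stand-in with the same shape as `llvm::erase_if(Container, Pred)`; the `Symbol` record and `keepMatching` are hypothetical and do not reflect the actual code in Symtab.cpp.

```cpp
#include <algorithm>
#include <vector>

// Local stand-in shaped like llvm::erase_if(Container, Pred): removes every
// element for which the predicate returns true.
template <typename Container, typename Pred>
void eraseIf(Container &C, Pred P) {
  C.erase(std::remove_if(C.begin(), C.end(), P), C.end());
}

// Hypothetical symbol record, used only for this illustration.
enum class SymbolType { Code, Data };
struct Symbol {
  int Index;
  SymbolType Type;
};

// Keep only the symbols whose type matches Wanted; relative order is kept.
inline void keepMatching(std::vector<Symbol> &Symbols, SymbolType Wanted) {
  eraseIf(Symbols, [Wanted](const Symbol &S) { return S.Type != Wanted; });
}
```

A single call like this typically replaces a hand-written loop that copies the matching elements into a second container or erases through iterators.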

Delta File
+10 -18 lldb/source/Symbol/Symtab.cpp
+10 -18 1 files

LLVM/project 41b52fb utils/bazel/llvm-project-overlay/libc BUILD.bazel

[bazel][libc] Fixes #165219
Delta File
+1 -0 utils/bazel/llvm-project-overlay/libc/BUILD.bazel
+1 -0 1 files

LLVM/project b32c434 .github/workflows libc-fullbuild-tests.yml, libc/cmake/caches baremetal_common.cmake armv8.1m.main-none-eabi.cmake

[libc][Github] Perform baremetal libc builds (#167583)

Currently there are no 32-bit presubmit builds for libc. This PR adds
build-only (no tests) 32-bit jobs to check whether changes that land in
libc break 32-bit builds.

Co-authored-by: Aiden Grossman <aidengrossman at google.com>
Delta File
+67 -16 .github/workflows/libc-fullbuild-tests.yml
+21 -0 libc/cmake/caches/baremetal_common.cmake
+8 -0 libc/cmake/caches/armv8.1m.main-none-eabi.cmake
+8 -0 libc/cmake/caches/armv8m.main-none-eabi.cmake
+8 -0 libc/cmake/caches/armv7m-none-eabi.cmake
+8 -0 libc/cmake/caches/armv6m-none-eabi.cmake
+120 -16 2 files not shown
+132 -16 8 files

LLVM/project 320c18a llvm/lib/Target/SystemZ SystemZISelLowering.h SystemZOperators.td

[SystemZ] TableGen-erate node descriptions (#168113)

This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.

Only one node is missing a description, `GET_CCMASK`; the others were
imported successfully.

Part of #119709.

Pull Request: https://github.com/llvm/llvm-project/pull/168113
Delta File
+0 -386 llvm/lib/Target/SystemZ/SystemZISelLowering.h
+257 -22 llvm/lib/Target/SystemZ/SystemZOperators.td
+0 -147 llvm/lib/Target/SystemZ/SystemZISelLowering.cpp
+24 -5 llvm/lib/Target/SystemZ/SystemZSelectionDAGInfo.h
+13 -7 llvm/lib/Target/SystemZ/SystemZSelectionDAGInfo.cpp
+1 -0 llvm/lib/Target/SystemZ/CMakeLists.txt
+295 -567 6 files

LLVM/project 0d2a996 llvm/test/CodeGen/AMDGPU regalloc-spill-wmma-scale.ll

[AMDGPU] Add baseline test to show spilling of wmma scale. NFC

This shows the spilling of WMMA scale values, which are limited to the
low 256 VGPRs. Free registers are available, but the register allocator
hands out the low 256 first.
Delta File
+131 -0 llvm/test/CodeGen/AMDGPU/regalloc-spill-wmma-scale.ll
+131 -0 1 files

LLVM/project 5c722a6 llvm/lib/Target/AMDGPU SIRegisterInfo.h SIRegisterInfo.td, llvm/test/CodeGen/AMDGPU regalloc-spill-wmma-scale.ll

[AMDGPU] Prioritize allocation of low 256 VGPR classes

If we have 1024 VGPRs available, we need to prioritize the allocation
of registers for operands that can only use the low 256. Notably, these
are the scale operands of V_WMMA_SCALE instructions. Otherwise large
tuples are allocated first and take all of the low registers, so we
would have to spill to make room for these scale registers.

Allocation priority by itself does not eliminate spilling completely
in large kernels, although it helps to some degree. Increasing the spill
weight of a restricted class on top of it helps further.
Delta File
+130 -0 llvm/test/CodeGen/AMDGPU/regalloc-spill-wmma-scale.ll
+11 -0 llvm/lib/Target/AMDGPU/SIRegisterInfo.h
+1 -1 llvm/lib/Target/AMDGPU/SIRegisterInfo.td
+142 -1 3 files

LLVM/project 21e0b56 llvm/test/CodeGen/AArch64 llround-conv.ll lround-conv.ll

[AArch64][GlobalISel] Add basic GISel test coverage for lround and llround. NFC
Delta File
+41 -26 llvm/test/CodeGen/AArch64/llround-conv.ll
+35 -20 llvm/test/CodeGen/AArch64/lround-conv.ll
+6 -2 llvm/test/CodeGen/AArch64/llround-conv-fp16.ll
+6 -2 llvm/test/CodeGen/AArch64/lround-conv-fp16.ll
+88 -50 4 files

LLVM/project 24b8b47 llvm/lib/Target/AArch64 AArch64InstrInfo.cpp, llvm/test/CodeGen/AArch64 licm-regclass-copy.mir

[AArch64] Treat COPY between cross-register banks as expensive

The motivation is to allow passes such as MachineLICM to hoist trivial
FMOV instructions out of loops, where previously they were not hoisted
even when the RHS is a constant.
On most architectures, these expensive move instructions have a latency
of 2-6 cycles, so they are certainly not as cheap as a 0-1 cycle move.
Delta File
+197 -0 llvm/test/CodeGen/AArch64/licm-regclass-copy.mir
+25 -0 llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+222 -0 2 files

LLVM/project 69b4190 llvm/lib/Target/AArch64 AArch64ISelLowering.cpp, llvm/test/CodeGen/AArch64 aarch64-load-ext.ll andorxor.ll

[AArch64] Optimize extending loads of small vectors (#163064)

Reduces the total number of loads and the number of moves between SIMD
registers and general-purpose registers.
Delta File
+236 -28 llvm/test/CodeGen/AArch64/aarch64-load-ext.ll
+115 -33 llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+36 -45 llvm/test/CodeGen/AArch64/andorxor.ll
+33 -34 llvm/test/CodeGen/AArch64/vec3-loads-ext-trunc-stores.ll
+10 -17 llvm/test/CodeGen/AArch64/sub.ll
+10 -17 llvm/test/CodeGen/AArch64/mul.ll
+440 -174 19 files not shown
+567 -348 25 files

LLVM/project 44e81c6 llvm/lib/Target/AMDGPU SIInstrInfo.cpp, llvm/test/CodeGen/AMDGPU twoaddr-wmma.mir

AMDGPU: Don't duplicate implicit operands in 3-address conversion

We previously got a duplicate implicit $exec operand. It didn't really
hurt anything (other than being a slight drag on compile-time
performance). Still, let's keep things clean.

commit-id:203b6f66
Delta File
+12 -12 llvm/test/CodeGen/AMDGPU/twoaddr-wmma.mir
+2 -2 llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+14 -14 2 files

LLVM/project c1c22cd clang/lib/Headers __clang_hip_libdevice_declares.h

[ASan][HIP] Add ASan declarations and macros. (#167522)

This patch adds the following device ASan hooks and guarded macros in
__clang_hip_libdevice_declares.h

  - Function Declarations
    - __asan_poison_memory_region
    - __asan_unpoison_memory_region
    - __asan_address_is_poisoned
    - __asan_region_is_poisoned

  - Macros
    - ASAN_POISON_MEMORY_REGION
    - ASAN_UNPOISON_MEMORY_REGION
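
For readers unfamiliar with this kind of guard, here is a minimal host-side sketch of what "guarded macros" usually means: the macros forward to the ASan hooks when the sanitizer is enabled and collapse to no-ops otherwise. The detection logic, the `SKETCH_HAS_ASAN` macro, the `useScratch` function, and the exact declarations are assumptions modeled on the host ASan interface; the actual device-side contents of `__clang_hip_libdevice_declares.h` may differ.

```cpp
#include <cstddef>

// Detect AddressSanitizer (Clang via __has_feature, GCC via a predefined
// macro). SKETCH_HAS_ASAN is a name invented for this sketch.
#if defined(__has_feature)
#if __has_feature(address_sanitizer)
#define SKETCH_HAS_ASAN 1
#endif
#endif
#if !defined(SKETCH_HAS_ASAN) && defined(__SANITIZE_ADDRESS__)
#define SKETCH_HAS_ASAN 1
#endif

#if defined(SKETCH_HAS_ASAN)
// Hooks provided by the ASan runtime (host-side signatures assumed here).
extern "C" void __asan_poison_memory_region(void const volatile *addr,
                                            std::size_t size);
extern "C" void __asan_unpoison_memory_region(void const volatile *addr,
                                              std::size_t size);
#define ASAN_POISON_MEMORY_REGION(addr, size)                                  \
  __asan_poison_memory_region((addr), (size))
#define ASAN_UNPOISON_MEMORY_REGION(addr, size)                                \
  __asan_unpoison_memory_region((addr), (size))
#else
// Without ASan the macros evaluate their arguments and do nothing else.
#define ASAN_POISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#define ASAN_UNPOISON_MEMORY_REGION(addr, size) ((void)(addr), (void)(size))
#endif

// Poison a scratch buffer while it is unused, then unpoison it before reuse.
inline int useScratch() {
  static char Scratch[64];
  ASAN_POISON_MEMORY_REGION(Scratch, sizeof(Scratch));
  ASAN_UNPOISON_MEMORY_REGION(Scratch, sizeof(Scratch));
  Scratch[0] = 42;
  return Scratch[0];
}
```

Because the fallback definitions are no-ops, code such as `useScratch` builds and behaves the same with or without the sanitizer enabled.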
Delta File
+17 -0 clang/lib/Headers/__clang_hip_libdevice_declares.h
+17 -0 1 files

LLVM/project c555522 lldb/test/API/tools/lldb-dap/evaluate TestDAP_evaluate.py, lldb/tools/lldb-dap LLDBUtils.cpp

[lldb-dap] Migrating 'evaluate' to structured types. (#167720)

Adding structured types for the evaluate request handler.

This should be mostly a non-functional change. I did catch some spelling
mistakes in our tests ('variable' vs 'variables').
Delta File
+70 -197 lldb/tools/lldb-dap/Handler/EvaluateRequestHandler.cpp
+152 -44 lldb/test/API/tools/lldb-dap/evaluate/TestDAP_evaluate.py
+117 -0 lldb/tools/lldb-dap/Protocol/ProtocolRequests.h
+51 -0 lldb/unittests/DAP/ProtocolRequestsTest.cpp
+49 -0 lldb/tools/lldb-dap/Protocol/ProtocolRequests.cpp
+7 -4 lldb/tools/lldb-dap/LLDBUtils.cpp
+446 -245 6 files not shown
+463 -256 12 files

LLVM/project af6af8e utils/bazel/llvm-project-overlay/llvm BUILD.bazel

[bazel] Port 0a58e49c44ae7cca39b3eb219efed9f0581b8b0f (#168424)

Delta File
+4 -0 utils/bazel/llvm-project-overlay/llvm/BUILD.bazel
+4 -0 1 files

LLVM/project bafb3f6 llvm/test/Transforms/LoopVectorize metadata.ll

[LV] Add test with existing noalias metadata and runtime checks.

Add a test where loads already carry noalias metadata and loop
versioning adds further noalias metadata.
Delta File
+148 -0 llvm/test/Transforms/LoopVectorize/metadata.ll
+148 -0 1 files

LLVM/project cd5d5b3 mlir/include/mlir/Dialect/XeGPU/IR XeGPUOps.td, mlir/lib/Dialect/XeGPU/IR XeGPUOps.cpp

[mlir][XeGPU] Use DistributeLayoutAttr instead of LayoutAttr for load gather/scatter ops (#167850)

The PR changes the layout attribute type for
`xegpu::LoadGatherOp/StoreScatterOp` from `LayoutAttr` to
`DistributeLayoutAttr` to also support `xegpu.slice` layouts.

Initially we [wanted to restrict slice
layouts](https://github.com/llvm/llvm-project/pull/163414#discussion_r2478978798)
from the attribute, but now it turns out there are actually valid use
cases for that:
```mlir
gpu.func @distribute_load_slice_attr() {
  %2 = memref.alloca() {alignment = 1024} : memref<4096xf32>
  %offset =  arith.constant {layout_result_0 = #xegpu.layout<sg_layout = [8], sg_data = [32], inst_data = [16]> } dense<0> : vector<256xindex>
  %mask = arith.constant {layout_result_0 = #xegpu.layout<sg_layout = [8], sg_data = [32], inst_data = [16]> } dense<1> : vector<256xi1>

  %3 = xegpu.load %2[%offset], %mask <{chunk_size = 1, layout = #xegpu.slice<#xegpu.layout<sg_layout = [8, 8], sg_data = [32, 32], inst_data = [8, 16]>, dims = [0]>>} {
      layout_result_0 = #xegpu.slice<#xegpu.layout<sg_layout = [8, 8], sg_data = [32, 32], inst_data = [8, 16]>, dims = [0]> 
  } : memref<4096xf32>, vector<256xindex>, vector<256xi1> -> vector<256xf32>

    [7 lines not shown]
Delta File
+17 -0 mlir/test/Dialect/XeGPU/xegpu-wg-to-sg-unify-ops.mlir
+7 -5 mlir/lib/Dialect/XeGPU/Transforms/XeGPUWgToSgDistribute.cpp
+4 -4 mlir/include/mlir/Dialect/XeGPU/IR/XeGPUOps.td
+2 -2 mlir/lib/Dialect/XeGPU/IR/XeGPUOps.cpp
+2 -2 mlir/lib/Dialect/XeGPU/Transforms/XeGPUUnroll.cpp
+32 -13 5 files

LLVM/project 7672a5c compiler-rt/lib/scudo/standalone primary64.h

[scudo] Fix wrong return type. (#168157)

Delta File
+1 -1 compiler-rt/lib/scudo/standalone/primary64.h
+1 -1 1 files