LLVM/project 925fb05 llvm/lib/Target/RISCV RISCVISelLowering.cpp, llvm/test/CodeGen/RISCV/GlobalISel/irtranslator vec-vleff.ll

[RISCV][GISel] Fallback to SelectionDAG for vleff intrinsics. (#167776)

Supporting this in GISel requires multiple changes to IRTranslator to
support aggregate returns containing scalable vectors and non-scalable
types. Falling back is the quickest way to fix the crash.

Fixes #167618
Delta File
+14 -0 llvm/test/CodeGen/RISCV/GlobalISel/irtranslator/vec-vleff.ll
+4 -2 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+18 -2 2 files

LLVM/project 5feddcf llvm/test/CodeGen/AMDGPU a-v-flat-atomicrmw.ll a-v-global-atomicrmw.ll

Update regressed tests
Delta File
+1,570 -1,557 llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll
+1,136 -1,130 llvm/test/CodeGen/AMDGPU/a-v-global-atomicrmw.ll
+186 -171 llvm/test/CodeGen/AMDGPU/vni8-across-blocks.ll
+9 -8 llvm/test/CodeGen/AMDGPU/copy-to-reg-frameindex.ll
+2,901 -2,866 4 files

LLVM/project 9c3e46b llvm/test/CodeGen/AMDGPU global-atomicrmw-fmin.ll global-atomicrmw-fmax.ll

AMDGPU: Really use AV classes by default for vector classes

Update getRegClassFor to use AV classes in place of VGPRs for
gfx90a-gfx950. There are a handful of regressions. Most enable
unprofitable rematerialization, which reduces the register
count by 1 but adds an unnecessary instruction.
Delta File
+524 -524 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
+524 -524 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
+520 -524 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
+520 -524 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
+436 -440 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
+432 -432 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fsub.ll
+2,956 -2,968 24 files not shown
+4,692 -4,713 30 files

LLVM/project 7f9c3bb llvm/test/CodeGen/AMDGPU a-v-global-atomicrmw.ll global-atomicrmw-fmin.ll

32-bit case

Note this does very little because we only use VGPR classes
for FP types (though this doesn't particularly make any sense),
and we legalize normal loads and stores to integer.
Delta File
+190 -190 llvm/test/CodeGen/AMDGPU/a-v-global-atomicrmw.ll
+140 -140 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmin.ll
+140 -140 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fmax.ll
+136 -140 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
+136 -140 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
+78 -82 llvm/test/CodeGen/AMDGPU/mfma-loop.ll
+820 -832 10 files not shown
+997 -1,006 16 files

LLVM/project f111afa llvm/test/CodeGen/AMDGPU a-v-flat-atomicrmw.ll no-fold-accvgpr-mov.ll

Regression with 32-bit case
Delta File
+246 -238 llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll
+7 -5 llvm/test/CodeGen/AMDGPU/no-fold-accvgpr-mov.ll
+253 -243 2 files

LLVM/project ccde342 llvm/lib/Target/AMDGPU AMDGPU.td GCNSubtarget.h, llvm/test/CodeGen/AMDGPU flat-saddr-atomics.ll global-atomicrmw-fadd.ll

[AMDGPU] Insert `s_wait_xcnt(0)` before atomics to work around write-combining miss hazard

This patch adds a workaround for a hazard on GFX1250: it inserts an `s_wait_xcnt(0)` instruction before any atomic operation that might write to memory.

Fixes SWDEV-543703.
Delta File
+188 -0 llvm/test/CodeGen/AMDGPU/flat-saddr-atomics.ll
+72 -0 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
+56 -0 llvm/test/CodeGen/AMDGPU/atomics-system-scope.ll
+9 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+6 -0 llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll
+5 -1 llvm/lib/Target/AMDGPU/GCNSubtarget.h
+336 -1 7 files not shown
+356 -2 13 files
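
The rule this commit message states (put an `s_wait_xcnt(0)` in front of every atomic that might write to memory) can be sketched as a toy model. This is not the LLVM implementation: the `Instr` class, the `insert_xcnt_waits` helper, and the opcode strings used with them are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: an opcode string plus the two properties the rule looks at."""
    opcode: str
    is_atomic: bool = False
    may_store: bool = False


# Hypothetical stand-in for the wait instruction named in the commit message.
WAIT = Instr("s_wait_xcnt 0")


def insert_xcnt_waits(block):
    """Return a new instruction list with a wait before each atomic that may write memory."""
    out = []
    for instr in block:
        if instr.is_atomic and instr.may_store:
            out.append(WAIT)
        out.append(instr)
    return out
```

Given a plain load followed by an atomic add, the sketch leaves the load alone and places the wait immediately before the atomic.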

LLVM/project 4d47649 llvm/lib/Target/AMDGPU SIInsertWaitcnts.cpp

remove unnecessary `mayStore` check
Delta File
+1 -1 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+1 -1 1 files

LLVM/project 1337723 llvm/lib/Target/AMDGPU SIISelLowering.cpp, llvm/test/CodeGen/AMDGPU a-v-flat-atomicrmw.ll a-v-global-atomicrmw.ll

AMDGPU: Start to use AV classes for unknown vector class

Use AGPR+VGPR superclasses for gfx90a+. The type used
for the class should be the broadest possible class, to
be contextually restricted later. InstrEmitter clamps these
to the common subclass of the context use instructions, so we're
best off using the broadest possible class for all types.

Note this does very little because we only use VGPR classes
for FP types (though this doesn't particularly make any sense),
and we legalize normal loads and stores to integer.
Delta File
+280 -280 llvm/test/CodeGen/AMDGPU/a-v-flat-atomicrmw.ll
+140 -140 llvm/test/CodeGen/AMDGPU/a-v-global-atomicrmw.ll
+70 -74 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fsub.ll
+30 -34 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmax.ll
+30 -34 llvm/test/CodeGen/AMDGPU/flat-atomicrmw-fmin.ll
+24 -17 llvm/lib/Target/AMDGPU/SIISelLowering.cpp
+574 -579 10 files not shown
+662 -655 16 files

LLVM/project dfe9838 clang/lib/Format QualifierAlignmentFixer.cpp, clang/unittests/Format QualifierFixerTest.cpp

[clang-format] Don't swap `(const override)` with QAS_Right (#167191)

Fixes #154846
Delta File
+13 -4 clang/lib/Format/QualifierAlignmentFixer.cpp
+2 -0 clang/unittests/Format/QualifierFixerTest.cpp
+15 -4 2 files

LLVM/project e58e799 llvm/lib/Target/AMDGPU AMDGPU.td, llvm/test/CodeGen/AMDGPU flat-saddr-atomics.ll global-atomicrmw-fadd.ll

[AMDGPU] Insert `s_wait_xcnt(0)` before atomics to work around write-combining miss hazard

This patch adds a workaround for a hazard on GFX1250: it inserts an `s_wait_xcnt(0)` instruction before any atomic operation that might write to memory.

Fixes SWDEV-543703.
Delta File
+188 -0 llvm/test/CodeGen/AMDGPU/flat-saddr-atomics.ll
+72 -0 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
+56 -0 llvm/test/CodeGen/AMDGPU/atomics-system-scope.ll
+9 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+6 -0 llvm/test/CodeGen/AMDGPU/fp64-atomics-gfx90a.ll
+6 -0 llvm/test/CodeGen/AMDGPU/GlobalISel/fp64-atomics-gfx90a.ll
+337 -0 7 files not shown
+356 -2 13 files

LLVM/project 7aaa6b7 clang-tools-extra/clang-doc HTMLMustacheGenerator.cpp Generators.cpp

[clang-doc] lift Mustache template generation from HTML

To prepare for more backends to use Mustache templates, this patch lifts
the Mustache functionality from HTMLMustacheGenerator.cpp to
Generators.h. A MustacheGenerator interface is created to share code for
template creation.
Delta File
+28 -174 clang-tools-extra/clang-doc/HTMLMustacheGenerator.cpp
+130 -0 clang-tools-extra/clang-doc/Generators.cpp
+83 -0 clang-tools-extra/clang-doc/Generators.h
+241 -174 3 files

LLVM/project 73e70e0 mlir/test/Integration/Dialect/Linalg/CPU runtime-verification.mlir

[mlir][linalg] Fix Linalg runtime verification test (#167814)

This integration test has been broken for a while. This commit partially
fixes it.

- Use `CHECK` + `CHECK-NEXT` to ensure that the correct error lines are
matched together.
- Move all `CHECK-NOT` to the end. Having a `CHECK` with the same string
does not make sense after a `CHECK-NOT`.
- Add a missing `CHECK: ERROR` for one of the test cases.
- Deactivate `reverse_from_3`, which is broken, and put a TODO.
Delta File
+71 -64 mlir/test/Integration/Dialect/Linalg/CPU/runtime-verification.mlir
+71 -64 1 files
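
Why pairing `CHECK` with `CHECK-NEXT` ties error lines together can be seen in a deliberately simplified model of the two directives. This is not FileCheck itself: the `run_checks` helper is hypothetical, it matches plain substrings at whole-line granularity, and the sample output lines are made up.

```python
def run_checks(lines, directives):
    """Toy matcher for ("CHECK", text) and ("CHECK-NEXT", text) directives.

    CHECK finds the first later line containing text; CHECK-NEXT requires the
    line right after the previous match to contain text. Returns True if every
    directive matches in order.
    """
    pos = -1  # index of the line matched by the previous directive
    for kind, text in directives:
        if kind == "CHECK":
            for i in range(pos + 1, len(lines)):
                if text in lines[i]:
                    pos = i
                    break
            else:
                return False  # no later line contains the text
        elif kind == "CHECK-NEXT":
            if pos == -1:
                raise ValueError("CHECK-NEXT needs a preceding match")
            if pos + 1 >= len(lines) or text not in lines[pos + 1]:
                return False
            pos += 1
        else:
            raise ValueError(f"unsupported directive: {kind}")
    return True


# Made-up output: two errors, each followed by its own detail line.
output = ["ERROR: first", "detail A", "ERROR: second", "detail B"]
```

With two plain `CHECK` directives, `"ERROR"` followed by `"detail B"` passes on `output` even though the detail belongs to a different error. Making the second directive `CHECK-NEXT` rejects that pairing.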

LLVM/project d4e9982 llvm/docs AMDGPUUsage.rst

[AMDGPU] Document meaning of alignment of buffer fat pointers, intrinsics (#167553)

This commit adds documentation clarifying the meaning of `align` on ptr
addrspace(7) (buffer fat pointer) and ptr addrspace(9) (buffer
structured pointer) operations (specifying that both the base and the
offset need to be aligned) and documents the meaning of the `align`
attribute when used as an argument on *.buffer.ptr.* intrinsics.
Delta File
+34 -0 llvm/docs/AMDGPUUsage.rst
+34 -0 1 files

LLVM/project c764ee6 llvm/lib/Target/RISCV RISCVISelLowering.cpp RISCVISelDAGToDAG.cpp

[RISCV] Remove custom legalization of v2i16/v4i8 loads for P extension. (#167651)

We can use the default legalization which will create an i32 load
followed by a v2i32 scalar_to_vector followed by a bitcast. We can isel
the scalar_to_vector like a bitcast and not generate any instructions
for it.
Delta File
+0 -17 llvm/lib/Target/RISCV/RISCVISelLowering.cpp
+10 -0 llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
+10 -17 2 files

LLVM/project 329dec9 mlir/include/mlir/Reducer ReductionPatternInterface.h Tester.h, mlir/lib/Reducer ReductionTreePass.cpp

[MLIR] Add reduction interface with tester to mlir-reduce (#166096)

Currently, we don't have support for patterns that need access to a
`Tester` instance in `mlir-reduce`. This PR adds
`DialectReductionPatternWithTesterInterface` to the set of supported
interfaces. Dialects can implement this interface to inject the tester
into their pattern classes.
Delta File
+13 -5 mlir/lib/Reducer/ReductionTreePass.cpp
+9 -1 mlir/include/mlir/Reducer/ReductionPatternInterface.h
+6 -0 mlir/include/mlir/Reducer/Tester.h
+28 -6 3 files

LLVM/project 0033198 llvm/lib/Target/AMDGPU AMDGPU.td SIInsertWaitcnts.cpp, llvm/test/CodeGen/AMDGPU flat-saddr-atomics.ll global-atomicrmw-fadd.ll

[AMDGPU] Insert `s_wait_xcnt(0)` before atomics to work around write-combining miss hazard

This patch adds a workaround for a hazard on GFX1250: it inserts an `s_wait_xcnt(0)` instruction before any atomic operation that might write to memory.

Fixes SWDEV-543703.
Delta File
+188 -0 llvm/test/CodeGen/AMDGPU/flat-saddr-atomics.ll
+72 -0 llvm/test/CodeGen/AMDGPU/global-atomicrmw-fadd.ll
+56 -0 llvm/test/CodeGen/AMDGPU/atomics-system-scope.ll
+9 -0 llvm/lib/Target/AMDGPU/AMDGPU.td
+6 -0 llvm/test/CodeGen/AMDGPU/GlobalISel/fp64-atomics-gfx90a.ll
+6 -0 llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+337 -0 7 files not shown
+356 -2 13 files

LLVM/project 622d52d clang/test/Driver hip-temps-linux.hip

clang: Only prevent hip driver test from running on windows (#167623)

Delta File
+4 -4 clang/test/Driver/hip-temps-linux.hip
+4 -4 1 files

LLVM/project 1086fe7 clang-tools-extra/clang-doc HTMLMustacheGenerator.cpp Generators.cpp

[clang-doc] lift Mustache template generation from HTML

To prepare for more backends to use Mustache templates, this patch lifts
the Mustache functionality from HTMLMustacheGenerator.cpp to
Generators.h. A MustacheGenerator interface is created to share code for
template creation.
Delta File
+28 -175 clang-tools-extra/clang-doc/HTMLMustacheGenerator.cpp
+131 -0 clang-tools-extra/clang-doc/Generators.cpp
+83 -0 clang-tools-extra/clang-doc/Generators.h
+242 -175 3 files

LLVM/project acb798e llvm/lib/Target/X86 X86FloatingPoint.cpp

Revert "[X86] Remove Redundant memset Calls"

This reverts commit 4b805e18a50cbe809724c01f32ae203f993820d1.

It turns out the original commit was wrong and these were not just
quieting valgrind down, but actually solving an issue. We now get MSan
failures. Reverting to have some time to investigate.

https://lab.llvm.org/buildbot/#/builders/164/builds/15562
Delta File
+6 -1 llvm/lib/Target/X86/X86FloatingPoint.cpp
+6 -1 1 files

LLVM/project 67d89fc clang/test/Driver hip-temps-windows.hip

Give up on windows test
Delta File
+4 -3 clang/test/Driver/hip-temps-windows.hip
+4 -3 1 files

LLVM/project 4c82cbc clang/test/Driver hip-temps-windows.hip

try fs-sep in env vars
Delta File
+1 -1 clang/test/Driver/hip-temps-windows.hip
+1 -1 1 files

LLVM/project 769c1ef compiler-rt/lib/interception/tests interception_win_test.cpp

[ASan] Fix forward 141c2b

When landing 141c2b I didn't realize that none of these files actually
got built either locally or by premerge. I had some minor syntax
mistakes that caused the build to fail. This patch fixes those issues
and has been verified on a Windows machine.
Delta File
+2 -2 compiler-rt/lib/interception/tests/interception_win_test.cpp
+2 -2 1 files

LLVM/project 0bba1e7 mlir/include/mlir/Conversion/ArithToAPFloat ArithToAPFloat.h, mlir/lib/Conversion/ArithToAPFloat ArithToAPFloat.cpp

Reland yet again: [mlir] Add FP software implementation lowering pass: `arith-to-apfloat` (#167608)

Fix both the symbol visibility issue in the mlir_apfloat_wrappers lib and the linkage issue in ArithToAPFloat.
Delta File
+163 -0 mlir/lib/Conversion/ArithToAPFloat/ArithToAPFloat.cpp
+128 -0 mlir/test/Conversion/ArithToApfloat/arith-to-apfloat.mlir
+89 -0 mlir/lib/ExecutionEngine/APFloatWrappers.cpp
+36 -0 mlir/test/Integration/Dialect/Arith/CPU/test-apfloat-emulation.mlir
+25 -0 mlir/lib/Dialect/Func/Utils/Utils.cpp
+21 -0 mlir/include/mlir/Conversion/ArithToAPFloat/ArithToAPFloat.h
+462 -0 11 files not shown
+552 -0 17 files

LLVM/project 4fc84ff clang/test/Driver hip-temps-windows.hip

Update clang/test/Driver/hip-temps-windows.hip
Delta File
+0 -1 clang/test/Driver/hip-temps-windows.hip
+0 -1 1 files

LLVM/project 645143e clang/test/Driver hip-temps-linux.hip hip-temps-windows.hip

Revert "try not using \ to break long run lines"

This reverts commit ac138ba8e5a9861e7dda74ee102575f1c5cfeb3b.
Delta File
+4 -1 clang/test/Driver/hip-temps-linux.hip
+4 -1 clang/test/Driver/hip-temps-windows.hip
+8 -2 2 files

LLVM/project 1d3d1f7 clang/test/Driver hip-temps-linux.hip hip-temps-windows.hip

UNSUPPORTED: system-windows for now
Delta File
+1 -0 clang/test/Driver/hip-temps-linux.hip
+1 -0 clang/test/Driver/hip-temps-windows.hip
+2 -0 2 files

LLVM/project 3ea2f06 clang/test/Driver hip-temps-linux.hip hip-temps-windows.hip

Revert "try unquoting and repeat env"

This reverts commit 9aa8b8e20f9b0912b10194d5aae89c42e4437fae.
Delta File
+1 -1 clang/test/Driver/hip-temps-linux.hip
+1 -1 clang/test/Driver/hip-temps-windows.hip
+2 -2 2 files

LLVM/project f90ea57 clang/test/Driver hip-temps-linux.hip hip-temps-windows.hip

try not using \ to break long run lines
Delta File
+1 -4 clang/test/Driver/hip-temps-linux.hip
+1 -4 clang/test/Driver/hip-temps-windows.hip
+2 -8 2 files

LLVM/project 7bd658a clang/test/Driver hip-temps-linux.hip hip-temps-windows.hip

try unquoting and repeat env
Delta File
+1 -1 clang/test/Driver/hip-temps-linux.hip
+1 -1 clang/test/Driver/hip-temps-windows.hip
+2 -2 2 files

LLVM/project af517a6 clang/test/Driver hip-temps-linux.hip hip-temps-windows.hip

clang: Remove requires system from hip driver tests
Delta File
+3 -4 clang/test/Driver/hip-temps-linux.hip
+3 -4 clang/test/Driver/hip-temps-windows.hip
+6 -8 2 files