LLVM/project 5af27f8clang/lib/CodeGen CodeGenPGO.cpp, clang/test/Profile c-collision.c

[InstrProf] Fix frontend generated function hash (#165358)

DeltaFile
+6-6clang/test/Profile/Inputs/c-general.proftext
+7-3llvm/include/llvm/ProfileData/InstrProf.h
+7-2clang/lib/CodeGen/CodeGenPGO.cpp
+3-2llvm/lib/ProfileData/InstrProf.cpp
+2-2clang/test/Profile/c-collision.c
+2-2clang/test/Profile/Inputs/c-unprofiled-blocks.proftext
+27-1712 files not shown
+38-2718 files

LLVM/project 856bcd3llvm/lib/CodeGen TwoAddressInstructionPass.cpp, llvm/lib/Target/AMDGPU SIInstrInfo.cpp

CodeGen/AMDGPU: Allow 3-address conversion of bundled instructions

This is in preparation for future changes in AMDGPU that will make more
substantial use of bundles pre-RA. For now, simply test this with
degenerate (single-instruction) bundles.

commit-id:4a30cb78
DeltaFile
+53-5llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+31-23llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+4-5llvm/test/CodeGen/AMDGPU/twoaddr-bundle.mir
+88-333 files

LLVM/project 3eb0dc1llvm/lib/CodeGen TwoAddressInstructionPass.cpp, llvm/test/CodeGen/AMDGPU twoaddr-bundle.mir

CodeGen: Handle bundled instructions in two-address-instructions pass

If the instruction with tied operands is a BUNDLE instruction and we
handle it by replacing an operand, then we need to update the
corresponding internal operands as well. Otherwise, the resulting MIR is
invalid.

The test case is degenerate in the sense that the bundle only contains a
single instruction, but it is sufficient to exercise this issue.

commit-id:6760a9b7
DeltaFile
+57-0llvm/test/CodeGen/AMDGPU/twoaddr-bundle.mir
+11-0llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+68-02 files
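
The requirement described above, that a rewritten operand on a bundle header must also be rewritten on the instructions inside the bundle, can be sketched with a toy model. This is only an illustration under stated assumptions: the `Instr` class, the `rewrite_use` helper, and the `V_FOO` opcode are hypothetical and are not LLVM APIs or the code from this commit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instr:
    """Toy machine instruction; a BUNDLE header carries its internal instructions."""
    opcode: str
    defs: List[str]
    uses: List[str]
    bundled: List["Instr"] = field(default_factory=list)


def rewrite_use(instr: Instr, old_reg: str, new_reg: str) -> None:
    """Rewrite uses of old_reg on instr and, for a bundle, on every internal instruction."""
    instr.uses = [new_reg if reg == old_reg else reg for reg in instr.uses]
    for inner in instr.bundled:
        rewrite_use(inner, old_reg, new_reg)


# A degenerate bundle with a single internal instruction (hypothetical opcode).
inner = Instr("V_FOO", defs=["%dst"], uses=["%src", "%other"])
bundle = Instr("BUNDLE", defs=["%dst"], uses=["%src", "%other"], bundled=[inner])

# Satisfy the tied-operand constraint by making the use match the def.
rewrite_use(bundle, "%src", "%dst")
print(bundle.uses, inner.uses)
```

If only the header were rewritten, the header and the internal instruction would disagree about which register is read, which is the kind of inconsistency the commit message calls invalid MIR.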

LLVM/project d1387edllvm/lib/CodeGen MachineInstr.cpp, llvm/test/CodeGen/SystemZ vec-load-element.ll

CodeGen: More accurate mayAlias for instructions with multiple MMOs (#166211)

There can only be meaningful aliasing between the memory accesses of
different instructions if at least one of the accesses modifies memory.

This check is applied at the instruction-level earlier in the method.
This change merely extends the check on a per-MMO basis.

This affects a SystemZ test because PFD instructions are both mayLoad
and mayStore but may carry a load-only MMO which is now no longer
treated as aliasing loads. The PFD instructions are from llvm.prefetch
generated by loop-data-prefetch.
DeltaFile
+6-2llvm/lib/CodeGen/MachineInstr.cpp
+2-2llvm/test/CodeGen/SystemZ/vec-load-element.ll
+8-42 files
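
The per-MMO rule described above can be sketched as follows. This is a simplified model, not the code in `MachineInstr.cpp`: the `MemOperand` class is hypothetical, and a plain byte-range overlap test stands in for the real alias query.

```python
from dataclasses import dataclass
from itertools import product
from typing import List


@dataclass(frozen=True)
class MemOperand:
    """Toy stand-in for a machine memory operand (MMO)."""
    offset: int
    size: int
    is_store: bool


def ranges_overlap(a: MemOperand, b: MemOperand) -> bool:
    """Assumed placeholder for an alias query: do the byte ranges intersect?"""
    return a.offset < b.offset + b.size and b.offset < a.offset + a.size


def may_alias(mmos_a: List[MemOperand], mmos_b: List[MemOperand]) -> bool:
    """Check every MMO pair, skipping pairs in which neither access writes memory."""
    for a, b in product(mmos_a, mmos_b):
        if not a.is_store and not b.is_store:
            continue  # two loads cannot meaningfully alias
        if ranges_overlap(a, b):
            return True
    return False


# A prefetch-like instruction carrying a load-only MMO.
prefetch = [MemOperand(0, 8, is_store=False)]
load = [MemOperand(0, 8, is_store=False)]
store = [MemOperand(4, 8, is_store=True)]

print(may_alias(prefetch, load))
print(may_alias(prefetch, store))
```

In this model the load-only MMO no longer conflicts with another load at the same address, while an overlapping store is still reported as a possible alias.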

LLVM/project 70f5fd4flang/lib/Optimizer/Transforms DebugTypeGenerator.cpp, flang/test/Integration debug-proc-ptr-e2e.f90

[flang][debug] Add debug type support for procedure pointers (#166764)

Fixes #161223

Procedure pointers in Fortran were generating incorrect debug type
information, showing as 'integer' in GDB instead of the actual procedure
signature.
DeltaFile
+41-0flang/test/Transforms/debug-proc-ptr.fir
+26-0flang/test/Integration/debug-proc-ptr-e2e.f90
+25-0flang/lib/Optimizer/Transforms/DebugTypeGenerator.cpp
+92-03 files

LLVM/project 210b9a5.github/workflows llvm-abi-tests.yml libcxx-build-and-test.yaml

[Github] Update GitHub Artifact Actions (major) (#166112)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [actions/download-artifact](https://redirect.github.com/actions/download-artifact) | action | major | `v5.0.0` -> `v6.0.0` |
| [actions/upload-artifact](https://redirect.github.com/actions/upload-artifact) | action | major | `v4.6.2` -> `v5.0.0` |
| [actions/upload-artifact](https://redirect.github.com/actions/upload-artifact) | action | major | `4.6.2` -> `5.0.0` |
DeltaFile
+6-6.github/workflows/llvm-abi-tests.yml
+4-4.github/workflows/libcxx-build-and-test.yaml
+4-4.github/workflows/libclang-abi-tests.yml
+3-3.github/workflows/release-binaries.yml
+2-2.github/workflows/premerge.yaml
+2-2.github/workflows/build-ci-container-tooling.yml
+21-2113 files not shown
+37-3719 files

LLVM/project 50daf4dllvm/docs ReleaseNotes.md

Add @llvm.reloc.none intrinsic to LLVM release notes (#166805)

This documents PR #147427.
DeltaFile
+3-0llvm/docs/ReleaseNotes.md
+3-01 files

LLVM/project 75c09b7llvm/lib/Target/DirectX DXILDataScalarization.cpp, llvm/test/CodeGen/DirectX scalarize-global.ll scalarize-alloca.ll

[DirectX] Let data scalarizer pass account for sub-types when updating GEP type (#166200)

This PR lets the `dxil-data-scalarization` pass account for a GEP with a
source type that is a sub-type of the pointer operand type.

The pass is updated so that the replaced GEP introduces zero indices
such that the result type remains the same (with the vector -> array
transform).

Please see the resolved issue for an annotated example.

Resolves: https://github.com/llvm/llvm-project/issues/165473
DeltaFile
+70-0llvm/test/CodeGen/DirectX/scalarize-global.ll
+52-16llvm/lib/Target/DirectX/DXILDataScalarization.cpp
+65-0llvm/test/CodeGen/DirectX/scalarize-alloca.ll
+187-163 files

LLVM/project 69ff9c0llvm/lib/CodeGen TwoAddressInstructionPass.cpp, llvm/lib/Target/AMDGPU SIInstrInfo.cpp

CodeGen/AMDGPU: Allow 3-address conversion of bundled instructions

This is in preparation for future changes in AMDGPU that will make more
substantial use of bundles pre-RA. For now, simply test this with
degenerate (single-instruction) bundles.

commit-id:4a30cb78
DeltaFile
+53-5llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+31-23llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+4-5llvm/test/CodeGen/AMDGPU/twoaddr-bundle.mir
+88-333 files

LLVM/project b65555bllvm/lib/CodeGen TwoAddressInstructionPass.cpp, llvm/test/CodeGen/AMDGPU twoaddr-bundle.mir

CodeGen: Handle bundled instructions in two-address-instructions pass

If the instruction with tied operands is a BUNDLE instruction and we
handle it by replacing an operand, then we need to update the
corresponding internal operands as well. Otherwise, the resulting MIR is
invalid.

The test case is degenerate in the sense that the bundle only contains a
single instruction, but it is sufficient to exercise this issue.

commit-id:6760a9b7
DeltaFile
+57-0llvm/test/CodeGen/AMDGPU/twoaddr-bundle.mir
+12-0llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
+69-02 files

LLVM/project 83930be.ci generate_test_report_lib.py

[CI] Ensure compatibility with Python 3.8

55436aeb2e8275d803a0e1bdff432717a1cf86b5 broke this on Windows as we
only use Python 3.9 there, but the construct is only supported from 3.10
onwards. Use the old Optional type to ensure compatibility.
DeltaFile
+4-2.ci/generate_test_report_lib.py
+4-21 files
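
The compatibility issue above comes down to how an optional type is spelled in an annotation. A minimal sketch, assuming a hypothetical helper `format_title` that is not taken from `generate_test_report_lib.py`: the `X | None` spelling is evaluated when the function is defined and fails on interpreters older than 3.10 (unless annotation evaluation is deferred), while `typing.Optional[X]` works on older versions too.

```python
from typing import Optional


# Hypothetical function for illustration only.
def format_title(title: str, suffix: Optional[str] = None) -> str:
    """Return the title, with the suffix appended when one is given."""
    if suffix is None:
        return title
    return f"{title} ({suffix})"


# On Python < 3.10 the equivalent newer spelling raises TypeError at
# definition time, because `str | None` is evaluated eagerly:
#
#     def format_title(title: str, suffix: str | None = None) -> str: ...

print(format_title("Test Report"))
print(format_title("Test Report", "Windows"))
```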

LLVM/project 6ac4585clang/include/clang/AST JSONNodeDumper.h, clang/lib/AST JSONNodeDumper.cpp

[clang][AST] Do not try to handle irrelevant cases in writeBareSourceLocation (#166588)

`writeBareSourceLocation` is always called on either an `Expanded` or a
`Spelling` location; in any of those cases,
`SM.getSpellingLineNumber(Loc) == SM.getExpansionLineNumber(Loc) ==
SM.getLineNumber(Loc)`.
DeltaFile
+9-12clang/lib/AST/JSONNodeDumper.cpp
+1-1clang/include/clang/AST/JSONNodeDumper.h
+10-132 files

LLVM/project 7e44989.ci premerge_advisor_explain.py

docs

Created using spr 1.3.7
DeltaFile
+25-0.ci/premerge_advisor_explain.py
+25-01 files

LLVM/project ba4abc6llvm/include/llvm/Support Casting.h

[Support] Fix up cast function object definitions. NFC. (#166789)

The template arguments specify the *target* type, not the *source* type.
DeltaFile
+8-8llvm/include/llvm/Support/Casting.h
+8-81 files

LLVM/project 5f08fb4llvm/docs LangRef.rst, llvm/lib/CodeGen/SelectionDAG SelectionDAGBuilder.cpp

[IR] llvm.reloc.none intrinsic for no-op symbol references (#147427)

This intrinsic emits a BFD_RELOC_NONE relocation at the point of call,
which allows optimizations and languages to explicitly pull in symbols
from static libraries without there being any code or data that has an
effectual relocation against such a symbol.

See issue #146159 for context.
DeltaFile
+5,620-5,619llvm/test/tools/llvm-ir2vec/output/reference_x86_entities.txt
+26-26llvm/test/tools/llvm-ir2vec/output/reference_triplets.txt
+31-0llvm/docs/LangRef.rst
+14-0llvm/test/CodeGen/X86/GlobalISel/reloc-none.ll
+13-0llvm/test/Verifier/reloc-none.ll
+11-0llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+5,715-5,64515 files not shown
+5,779-5,64721 files

LLVM/project 71927ddllvm/include/llvm/CodeGen Analysis.h, llvm/lib/CodeGen/GlobalISel CallLowering.cpp

[CodeGen] Delete two ComputeValueVTs overloads (NFC) (#166758)

Those have only a few uses.
DeltaFile
+1-15llvm/include/llvm/CodeGen/Analysis.h
+4-3llvm/lib/CodeGen/GlobalISel/CallLowering.cpp
+2-2llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
+2-1llvm/lib/Target/AMDGPU/AMDGPUISelLowering.cpp
+2-1llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp
+11-225 files

LLVM/project bda7289lldb/source/Plugins/ExpressionParser/Clang ClangModulesDeclVendor.h

[lldb][docs][NFC] Fix ClangModulesDeclVendor::AddModule parameter docs
DeltaFile
+4-4lldb/source/Plugins/ExpressionParser/Clang/ClangModulesDeclVendor.h
+4-41 files

LLVM/project 533acb2clang/test/CodeGen/AArch64 v9.6a-neon-f16-intrinsics.c v9.6a-neon-f32-intrinsics.c, llvm/lib/Target/AArch64 AArch64InstrFormats.td AArch64InstrInfo.td

fixup! [AArch64][llvm] Add support for vmmlaq_[f16,f32]_mf8 intrinsics

Add extra lowering
DeltaFile
+20-0llvm/lib/Target/AArch64/AArch64InstrFormats.td
+14-0llvm/test/CodeGen/AArch64/aarch64-matmul-fp16.ll
+13-0llvm/test/CodeGen/AArch64/aarch64-matmul-fp32.ll
+11-0llvm/lib/Target/AArch64/AArch64InstrInfo.td
+2-4clang/test/CodeGen/AArch64/v9.6a-neon-f16-intrinsics.c
+2-4clang/test/CodeGen/AArch64/v9.6a-neon-f32-intrinsics.c
+62-81 files not shown
+64-107 files

LLVM/project f75e04eclang/include/clang/Basic arm_neon.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

[AArch64][llvm] Add support for vmmlaq_[f16,f32]_mf8 intrinsics

Add support for the following new intrinsics:
```
float16x8_t vmmlaq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t, fpm_t);
float32x4_t vmmlaq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t);
```
DeltaFile
+26-1clang/test/CodeGen/AArch64/v8.6a-neon-intrinsics.c
+8-0clang/include/clang/Basic/arm_neon.td
+8-0clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+6-0llvm/include/llvm/IR/IntrinsicsAArch64.td
+48-14 files

LLVM/project 846648dclang/test/CodeGen/AArch64 v9.6a-neon-intrinsics.c v9.6a-neon-f16-intrinsics.c

fixup! [AArch64][llvm] Add support for vmmlaq_[f16,f32]_mf8 intrinsics

Split testcase files
DeltaFile
+0-39clang/test/CodeGen/AArch64/v9.6a-neon-intrinsics.c
+25-0clang/test/CodeGen/AArch64/v9.6a-neon-f16-intrinsics.c
+23-0clang/test/CodeGen/AArch64/v9.6a-neon-f32-intrinsics.c
+48-393 files

LLVM/project 4b5fa3aclang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/test/CodeGen/AArch64 v9.6a-neon-intrinsics.c v8.6a-neon-intrinsics.c

fixup! [AArch64][llvm] Add support for vmmlaq_[f16,f32]_mf8 intrinsics

Fix CR comments; don't create a new intrinsic, and split test files
DeltaFile
+39-0clang/test/CodeGen/AArch64/v9.6a-neon-intrinsics.c
+1-26clang/test/CodeGen/AArch64/v8.6a-neon-intrinsics.c
+6-4clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-6llvm/include/llvm/IR/IntrinsicsAArch64.td
+47-364 files

LLVM/project faae161mlir/include/mlir/Dialect/OpenACC OpenACCOps.td, mlir/lib/Dialect/OpenACC/IR OpenACC.cpp

[mlir][acc] Erase empty kernel_environment ops during canonicalization (#166633)

This change removes empty `acc.kernel_environment` operations during
canonicalization. This could happen when the acc compute construct
inside the `acc.kernel_environment` is optimized away in cases such as
when only private variables are being written to in the loop.

In cases of empty `acc.kernel_environment` ops with waitOperands, we
still remove the empty `acc.kernel_environment`, but also create an
`acc.wait` operation to take those wait operands to preserve
synchronization behavior.
DeltaFile
+68-0mlir/lib/Dialect/OpenACC/IR/OpenACC.cpp
+27-0mlir/test/Dialect/OpenACC/canonicalize.mlir
+2-0mlir/include/mlir/Dialect/OpenACC/OpenACCOps.td
+97-03 files
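
The canonicalization rule described above can be sketched with a toy operation list. This is an illustration only: the `Op` class and the `canonicalize` function are hypothetical and do not reflect the MLIR pattern-rewriter API used in `OpenACC.cpp`.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Op:
    """Toy operation: a name, optional wait operands, and a nested body."""
    name: str
    wait_operands: List[str] = field(default_factory=list)
    body: List["Op"] = field(default_factory=list)


def canonicalize(ops: List[Op]) -> List[Op]:
    """Drop empty kernel_environment ops, keeping their wait operands on an acc.wait."""
    result = []
    for op in ops:
        if op.name == "acc.kernel_environment" and not op.body:
            if op.wait_operands:
                # Preserve synchronization by moving the operands to a wait op.
                result.append(Op("acc.wait", wait_operands=list(op.wait_operands)))
            continue
        result.append(op)
    return result


ops = [
    Op("acc.kernel_environment"),                         # empty: erased
    Op("acc.kernel_environment", wait_operands=["%q1"]),  # empty with waits: becomes acc.wait
    Op("acc.kernel_environment", body=[Op("acc.parallel")]),  # non-empty: kept
]
print([(op.name, op.wait_operands) for op in canonicalize(ops)])
```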

LLVM/project 429dfd7llvm/lib/CodeGen AtomicExpandPass.cpp, llvm/test/CodeGen/ARM atomic-load-store.ll

[AtomicExpand] Add bitcasts when expanding load atomic vector

AtomicExpand fails for aligned `load atomic <n x T>` because it
does not find a compatible library call. This change adds appropriate
bitcasts so that the call can be lowered. It also adds support for
128 bit lowering in tablegen to support SSE/AVX.
DeltaFile
+90-1llvm/test/CodeGen/X86/atomic-load-store.ll
+66-0llvm/test/Transforms/AtomicExpand/X86/expand-atomic-non-integer.ll
+51-0llvm/test/CodeGen/ARM/atomic-load-store.ll
+18-4llvm/lib/CodeGen/AtomicExpandPass.cpp
+225-54 files
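
The idea behind the added bitcasts is that a vector can be reinterpreted as an integer of the same width, handled through an integer path, and then reinterpreted back without changing any bits. A rough sketch of that round trip for a `<2 x float>` value, using hypothetical helper names; it models only the bit-preserving cast, not atomicity or the library call selected by AtomicExpand.

```python
import struct
from typing import List


def vector_to_int(values: List[float]) -> int:
    """Reinterpret two 32-bit floats as one 64-bit integer (no value conversion)."""
    raw = struct.pack("<2f", *values)
    return int.from_bytes(raw, "little")


def int_to_vector(bits: int) -> List[float]:
    """Reinterpret a 64-bit integer as two 32-bit floats."""
    raw = bits.to_bytes(8, "little")
    return list(struct.unpack("<2f", raw))


bits = vector_to_int([1.5, -2.0])
print(hex(bits))
print(int_to_vector(bits))
```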

LLVM/project 4d1cdadllvm/lib/Target/X86 X86InstrCompiler.td X86ISelLowering.cpp, llvm/test/CodeGen/X86 atomic-load-store.ll

[X86] Cast atomic vectors in IR to support floats

This commit casts floats to ints in an atomic load during AtomicExpand to support
floating point types. It is also required to support 128 bit vectors in SSE/AVX.
DeltaFile
+98-287llvm/test/CodeGen/X86/atomic-load-store.ll
+15-0llvm/lib/Target/X86/X86InstrCompiler.td
+7-0llvm/lib/Target/X86/X86ISelLowering.cpp
+2-0llvm/lib/Target/X86/X86ISelLowering.h
+122-2874 files

LLVM/project 670c453offload/plugins-nextgen/common/include PluginInterface.h, offload/plugins-nextgen/common/src PluginInterface.cpp

[Offload] Remove handling for device memory pool (#163629)

Summary:
This was a lot of code that was only used for upstream LLVM builds of
AMDGPU offloading. We have a generic and fast `malloc` in `libc` now so
just use that. Simplifies code, can be added back if we start providing
alternate forms but I don't think there's a single use-case that would
justify it yet.
DeltaFile
+0-86offload/plugins-nextgen/common/src/PluginInterface.cpp
+34-33openmp/device/src/Allocator.cpp
+42-0offload/test/libc/malloc_parallel.c
+0-42offload/test/offloading/malloc_parallel.c
+2-22openmp/device/src/State.cpp
+9-14offload/plugins-nextgen/common/include/PluginInterface.h
+87-19710 files not shown
+92-25716 files

LLVM/project 96c7644flang/test/Lower/OpenACC acc-reduction.f90, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll memintrinsic-unroll.ll

rebase

Created using spr 1.3.7
DeltaFile
+3,918-4,188llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,174-719flang/test/Lower/OpenACC/acc-reduction.f90
+831-980llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+886-884llvm/test/CodeGen/AMDGPU/scratch-simple.ll
+464-642llvm/test/CodeGen/WebAssembly/memory-interleave.ll
+520-567llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+7,793-7,980566 files not shown
+24,444-14,397572 files

LLVM/project aa55c55flang/test/Lower/OpenACC acc-reduction.f90, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll memintrinsic-unroll.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
DeltaFile
+3,918-4,188llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,174-719flang/test/Lower/OpenACC/acc-reduction.f90
+831-980llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+886-884llvm/test/CodeGen/AMDGPU/scratch-simple.ll
+464-642llvm/test/CodeGen/WebAssembly/memory-interleave.ll
+520-567llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+7,793-7,980566 files not shown
+24,444-14,397572 files

LLVM/project 7e90d65flang/test/Lower/OpenACC acc-reduction.f90, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll memintrinsic-unroll.ll

rebase

Created using spr 1.3.7
DeltaFile
+3,918-4,188llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,174-719flang/test/Lower/OpenACC/acc-reduction.f90
+831-980llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+886-884llvm/test/CodeGen/AMDGPU/scratch-simple.ll
+464-642llvm/test/CodeGen/WebAssembly/memory-interleave.ll
+520-567llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+7,793-7,980566 files not shown
+24,444-14,397572 files

LLVM/project 0d66b7eflang/test/Lower/OpenACC acc-reduction.f90, llvm/test/CodeGen/AMDGPU amdgcn.bitcast.1024bit.ll memintrinsic-unroll.ll

[𝘀𝗽𝗿] changes introduced through rebase

Created using spr 1.3.7

[skip ci]
DeltaFile
+3,918-4,188llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.1024bit.ll
+1,174-719flang/test/Lower/OpenACC/acc-reduction.f90
+831-980llvm/test/CodeGen/AMDGPU/memintrinsic-unroll.ll
+886-884llvm/test/CodeGen/AMDGPU/scratch-simple.ll
+464-642llvm/test/CodeGen/WebAssembly/memory-interleave.ll
+520-567llvm/test/CodeGen/AMDGPU/amdgcn.bitcast.960bit.ll
+7,793-7,980566 files not shown
+24,444-14,397572 files

LLVM/project 527b7a4.ci generate_test_report_github.py generate_test_report_lib.py

[CI][NFC] Refactor compute_platform_title into generate_test_report_lib

This enables reuse in other CI components, like
premerge_advisor_explain.py.

Reviewers: DavidSpickett, gburgessiv, Keenuts, dschuff, lnihlen

Reviewed By: Keenuts, DavidSpickett

Pull Request: https://github.com/llvm/llvm-project/pull/166604
DeltaFile
+3-12.ci/generate_test_report_github.py
+11-0.ci/generate_test_report_lib.py
+14-122 files