LLVM/project 14375d4clang/test/Driver darwin-header-search-libcxx-2.cpp rocm-detect-lib-llvm.hip

[clang] Clean up large clang binaries copied into test temp directories (#182304)

I noticed a couple of tests leave behind copies of clang binaries they
copy into their temp directories. Replicate the cleanup from another
test (clang/test/Driver/clang_f_opts_withspaces.c) to remove these.
DeltaFile
+3-0clang/test/Driver/darwin-header-search-libcxx-2.cpp
+3-0clang/test/Driver/rocm-detect-lib-llvm.hip
+6-02 files

LLVM/project 8844402llvm/test/CodeGen/Hexagon/autohvx isel-hvx-rescale-predicate.ll

Add REQUIRES to test
DeltaFile
+1-0llvm/test/CodeGen/Hexagon/autohvx/isel-hvx-rescale-predicate.ll
+1-01 files

LLVM/project 6e5cc82llvm/lib/Analysis LoopAccessAnalysis.cpp, llvm/test/Transforms/LoopVectorize pointer-induction.ll

[LAA][LV] Allow recognition of strided pointers with constant stride (#171151)

This patch fixes an issue in LoopAccessAnalysis with recognizing strided
pointers that use runtime constants. Loop accesses of the form
`p[base + offset * const]`, where `const` is a runtime constant, should be
considered for vectorization. However, some of these access patterns were
not recognized. This patch resolves this by adding an explicit pattern
match within LAA.
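
For illustration, here is a minimal C++ sketch of the kind of source loop the message describes, assuming a stride that is only known at run time but does not change inside the loop. The function name `sum_strided` and its parameters are hypothetical and do not appear in the patch. Whether such a loop is actually vectorized depends on further legality and cost checks.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical example: `stride` is a runtime constant (loop-invariant, but
// not known at compile time), so every access has the form
// p[base + i * stride].
long sum_strided(const int *p, std::size_t base, std::size_t n,
                 std::size_t stride) {
  assert(p != nullptr && "caller must pass a valid buffer");
  long sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += p[base + i * stride];
  return sum;
}
```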

---------

Co-authored-by: Florian Hahn <flo at fhahn.com>
DeltaFile
+96-0llvm/test/Transforms/LoopVectorize/pointer-induction.ll
+2-0llvm/lib/Analysis/LoopAccessAnalysis.cpp
+98-02 files

LLVM/project 6d53bc3llvm/lib/Target/Hexagon HexagonISelLoweringHVX.cpp

format
DeltaFile
+2-3llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
+2-31 files

LLVM/project 251e9c4libsycl/src/detail global_objects.cpp

[libsycl] Fix build after global dtor change (#183077)

Fixes build after https://github.com/llvm/llvm-project/pull/181366

---------

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+10-16libsycl/src/detail/global_objects.cpp
+10-161 files

LLVM/project 76699fbllvm/lib/Target/NVPTX NVPTXIntrinsics.td, llvm/test/CodeGen/NVPTX wmma.py wmma-ptx91-sm120a.py

[MLIR][NVVM][NVPTX] Support for new mma/mma.sp variants from PTX 9.1 (#182325)

This change adds support for `.scale_vec::4X` with `.ue8m0` as `.stype`
with `.kind::mxf4nvf4` for `mma/mma.sp` instructions introduced in
[PTX ISA 9.1](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html?highlight=mma%2520sp#ptx-isa-version-9-1).

It also updates the MLIR mma/mma.sp block scale tests to use structs
instead of vectors.
DeltaFile
+306-168mlir/test/Dialect/LLVMIR/nvvm-mma-sparse-blockscale.mlir
+245-140mlir/test/Dialect/LLVMIR/nvvm-mma-blockscale.mlir
+30-9llvm/lib/Target/NVPTX/NVPTXIntrinsics.td
+24-2llvm/test/CodeGen/NVPTX/wmma.py
+12-0llvm/test/CodeGen/NVPTX/wmma-ptx91-sm120a.py
+12-0llvm/test/CodeGen/NVPTX/wmma-ptx91-sm120f.py
+629-3194 files not shown
+649-32510 files

LLVM/project 25dd013clang/lib/CIR/CodeGen CIRGenStmt.cpp CIRGenFunction.h, clang/test/CIR/CodeGen assume-attr.cpp

[CIR] Implement 'assume' attribute lowering (#182960)

This attribute applies to null statements and emits an assume-op
sometimes. This patch adds this for statements, which includes the
infrastructure for AttributedStmt lowering.
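
As a rough illustration of the source construct involved, the sketch below attaches the C++23 `assume` attribute to a null statement. It assumes a compiler that accepts the attribute. The function `divide_by_positive` is hypothetical and is not taken from the patch or its tests.

```cpp
#include <cassert>

// Hypothetical helper, not taken from the patch. The `assume` attribute is
// attached to a null statement (the lone `;`). If the expression were false
// at run time the behavior would be undefined, so callers must uphold it.
int divide_by_positive(int value, int divisor) {
  assert(divisor > 0); // debug-build check of the same condition
  [[assume(divisor > 0)]];
  return value / divisor;
}
```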

---------

Co-authored-by: Sirui Mu <msrlancern at gmail.com>
DeltaFile
+91-0clang/test/CIR/CodeGen/assume-attr.cpp
+33-0clang/lib/CIR/CodeGen/CIRGenStmt.cpp
+2-0clang/lib/CIR/CodeGen/CIRGenFunction.h
+126-03 files

LLVM/project 8854dd0llvm/docs LangRef.rst

[IR] Specify alloca with poison element count (#183072)

An alloca with a poison element count is undefined behavior.

This matches existing behavior of optimizations. This also matches the
behavior of llubi for `poison`, but llubi currently does not report
immediate UB for `undef` element counts. A future patch will fix that.
DeltaFile
+3-0llvm/docs/LangRef.rst
+3-01 files

LLVM/project 0e93dd4llvm/lib/Target/Hexagon HexagonISelLoweringHVX.cpp HexagonISelLowering.h, llvm/test/CodeGen/Hexagon/autohvx isel-hvx-rescale-predicate.ll

[Hexagon] Avoid contracting predicates in createHvxPrefixPred

The function createHvxPrefixPred should only need to expand a predicate
to match the result's bytes-per-bit. Otherwise, contracting of the
predicate may lead to an input that is shorter than 4 bytes, making it
unsuitable for VINSERTW0.

When calling createHvxPrefixPred for vector concatenation, re-group the
inputs to the concat to make sure that the resulting inputs to
createHvxPrefixPred would not need contraction.

Fixes https://github.com/llvm/llvm-project/issues/181362
DeltaFile
+60-30llvm/lib/Target/Hexagon/HexagonISelLoweringHVX.cpp
+13-0llvm/test/CodeGen/Hexagon/autohvx/isel-hvx-rescale-predicate.ll
+4-0llvm/lib/Target/Hexagon/HexagonISelLowering.h
+77-303 files

LLVM/project 7a83886llvm/lib/Target/X86 X86ISelLowering.cpp

[X86] lowerShuffleAsLanePermuteAndRepeatedMask - consistently move V2.isUndef handling inside the function. NFC. (#183069)

Matches other "lane permute + shuffle" functions.
DeltaFile
+8-11llvm/lib/Target/X86/X86ISelLowering.cpp
+8-111 files

LLVM/project 3e2fb2elibsycl/src/detail global_objects.cpp global_objects.hpp

[libsycl] Fix for static vars deinit order (libsycl vs liboffload) (#181366)

Both libsycl and liboffload use static variables.
On Linux, a static variable's destructor is called earlier than a function
marked with `__attribute__((destructor(...)))`.
This fix helps to avoid a crash caused by the early destruction of a
liboffload static variable.

The approach relies on the following rule (from `std::exit`):
"For each local object obj with static storage duration, obj is
destroyed as if a function calling the destructor of obj were registered
with [std::atexit](https://en.cppreference.com/w/cpp/utility/program/atexit.html)
at the completion of the constructor of obj."
On the first call of get_platforms we call liboffload's iterateDevices,
which leads to liboffload static storage initialization. Then we
initialize our own local static var after this to be able to call our
shutdown methods earlier and before the liboffload objects are
    [8 lines not shown]
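
The quoted rule can be sketched in standalone C++: a function-local static whose constructor completes later is destroyed earlier at exit. The names `Tracker`, `dependency`, `owner`, and `construction_log` below are hypothetical stand-ins and are not taken from libsycl or liboffload.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Records the order in which the hypothetical objects finish construction.
std::vector<std::string> &construction_log() {
  static std::vector<std::string> log;
  return log;
}

struct Tracker {
  const char *name;
  explicit Tracker(const char *n) : name(n) { construction_log().push_back(n); }
  ~Tracker() { std::printf("destroying %s\n", name); }
};

// Stand-in for another library's static state.
Tracker &dependency() {
  static Tracker t("dependency");
  return t;
}

// Stand-in for our own state, which must shut down before the dependency.
Tracker &owner() {
  dependency(); // force the dependency to finish constructing first
  static Tracker t("owner");
  assert(construction_log().front() == "dependency");
  return t;
}
```

After a single call to `owner()`, the destructors run at normal program exit in reverse order of construction, so `owner` is destroyed before `dependency`.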
DeltaFile
+28-38libsycl/src/detail/global_objects.cpp
+4-0libsycl/src/detail/global_objects.hpp
+2-0libsycl/src/detail/platform_impl.cpp
+34-383 files

LLVM/project 9f2351cclang/lib/CodeGen CodeGenModule.cpp, clang/test/CodeGenCUDA constexpr-variables.cu const-var.cu

[HIP] Do not apply 'externally_initialized' to constant device variables (#182157)

Summary:
From the Language reference:
> By default, global initializers are optimized by assuming that global
variables defined within the module are not modified from their initial
values before the start of the global initializer. This is true even for
variables potentially accessible from outside the module, including
those with external linkage or appearing in @llvm.used or dllexported
variables. This assumption may be suppressed by marking the variable
with externally_initialized.

This is intended because device programs can be modified beyond the
normal lifetime expected by the optimization pipeline. However, for
constant variables we should be able to safely assume that these are
truly constant within the module. In the vast majority of cases these
will not get externally visible symbols, but even for `extern const` uses
we should assert that the user should not be writing to them if they are
marked const.
DeltaFile
+4-4clang/test/CodeGenCUDA/constexpr-variables.cu
+2-2clang/test/CodeGenCUDA/const-var.cu
+2-2clang/test/CodeGenCUDA/host-used-device-var.cu
+2-1clang/lib/CodeGen/CodeGenModule.cpp
+10-94 files

LLVM/project 600919aclang/lib/Driver/ToolChains Clang.cpp, clang/test/Driver spirv-openmp-toolchain.c

[Offload][clang-linker-wrapper][SPIRV] Tell spirv-link to not optimize out exported symbols (#182930)

`spirv-link` seems to internalize all symbols, so the OpenMP Device
Environment global generated by the OMP FE gets optimized out. That causes
`liboffload` to run in the wrong parallelization mode, which breaks at
least one liboffload lit test.

Pass `--create-library` to tell it not to do that.

```
  --create-library
               Link the binaries into a library, keeping all exported symbols.
```

This fixes the test.

Closes: https://github.com/llvm/llvm-project/issues/182901

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+9-4clang/lib/Driver/ToolChains/Clang.cpp
+2-1clang/test/Driver/spirv-openmp-toolchain.c
+0-1offload/test/offloading/info.c
+11-63 files

LLVM/project 1ecc153clang/test/CIR/CodeGenCXX global-refs.cpp

[CIR] Fix global-refs test that got committed in github 'race' (#183068)

Despite my best efforts, this crossed in the air with the attributes on
arguments patch, and thus had a problem with the test. This patch
updates the test to be tolerant of the attributes.
DeltaFile
+5-5clang/test/CIR/CodeGenCXX/global-refs.cpp
+5-51 files

LLVM/project d71c80ellvm/lib/ProfileData/Coverage CoverageMapping.cpp

Fixup for #125407, suppress CounterExpr errors
DeltaFile
+5-2llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+5-21 files

LLVM/project 2f09fe7mlir/lib/Dialect/Linalg/Transforms BufferizableOpInterfaceImpl.cpp, mlir/test/Dialect/Linalg bufferize.mlir

[mlir][linalg] Implement bufferization for UnPackOp. (#182837)

Signed-off-by: hanhanW <hanhan0912 at gmail.com>
DeltaFile
+39-0mlir/lib/Dialect/Linalg/Transforms/BufferizableOpInterfaceImpl.cpp
+16-0mlir/test/Dialect/Linalg/bufferize.mlir
+55-02 files

LLVM/project 0b6e122llvm/lib/ProfileData/Coverage CoverageMapping.cpp, llvm/test/tools/llvm-cov mcdc-macro.test

[MC/DC] Make covmap tolerant of nested Decisions (#125407)

CoverageMappingWriter reorders `Region`s by `endLoc DESC` to prioritize
the wider `Decision` among those with the same `startLoc`.

In `llvm-cov`, search Decisions in reverse order so that the smaller
Decision is found first.
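
The ordering idea can be sketched in a few lines of C++. This is a simplified illustration under stated assumptions, not the code from the patch: `Region` here holds only two integer offsets, and `wider_first`, `sort_wider_first`, and `find_last_containing` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical simplified region with half-open bounds [start, end).
struct Region {
  int start;
  int end;
};

// Order by start ascending, then by end descending, so that among regions
// sharing a start the widest one comes first.
bool wider_first(const Region &a, const Region &b) {
  if (a.start != b.start)
    return a.start < b.start;
  return a.end > b.end;
}

void sort_wider_first(std::vector<Region> &regions) {
  std::stable_sort(regions.begin(), regions.end(), wider_first);
}

// Walk the sorted list backwards and return the index of the first region
// that contains `pos`, or -1 if none does. For properly nested regions this
// is the narrowest enclosing one.
int find_last_containing(const std::vector<Region> &sorted, int pos) {
  assert(std::is_sorted(sorted.begin(), sorted.end(), wider_first));
  for (int i = static_cast<int>(sorted.size()) - 1; i >= 0; --i)
    if (sorted[i].start <= pos && pos < sorted[i].end)
      return i;
  return -1;
}
```

With two regions that share a start, the forward order lists the wider one first, while the reverse scan reaches the narrower one first.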

llvmorg-23-init-2321-g8f690ec7ffd8
DeltaFile
+142-175llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
+7-7llvm/test/tools/llvm-cov/mcdc-macro.test
+149-1822 files

LLVM/project 170a99dllvm/tools/llvm-gpu-loader llvm-gpu-loader.cpp

[LLVM] Fix accidentally included POSIX header on Windows

Summary:
This used to build only on Linux; I forgot that these changes would
cause it to be built on Windows.
DeltaFile
+0-1llvm/tools/llvm-gpu-loader/llvm-gpu-loader.cpp
+0-11 files

LLVM/project 1eb8ab7llvm/lib/Target/AMDGPU VOPCInstructions.td, llvm/test/MC/AMDGPU gfx12_asm_vopcx.s gfx12_asm_vopc.s

[AMDGPU] Add VOPC to gfx13 (#182293)

Co-authored-by: Ivan Kosarev <ivan.kosarev at amd.com>
DeltaFile
+1,246-1,232llvm/test/MC/AMDGPU/gfx12_asm_vopcx.s
+737-721llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vopcx.txt
+186-181llvm/lib/Target/AMDGPU/VOPCInstructions.td
+78-50llvm/test/MC/AMDGPU/gfx12_asm_vopc.s
+68-28llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vopc.txt
+4-0llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vopc_dpp16.txt
+2,319-2,21223 files not shown
+2,391-2,21229 files

LLVM/project 0781e65clang/test/CodeGen arm_acle.c builtins-arm64.c, clang/test/Sema/AArch64 pcdphint-atomic-store.c

fixup! Fix more PR comments
DeltaFile
+19-9clang/test/Sema/AArch64/pcdphint-atomic-store.c
+8-6llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+10-0clang/test/CodeGen/arm_acle.c
+0-9llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+5-0clang/test/CodeGen/builtins-arm64.c
+0-4llvm/include/llvm/IR/IntrinsicsAArch64.td
+42-282 files not shown
+44-328 files

LLVM/project 4e7d8e2llvm/include/llvm/IR IntrinsicsAArch64.td, llvm/lib/Target/AArch64 AArch64InstrFormats.td

fixup! remove mayLoad/mayStore as suggested by Kerry
DeltaFile
+0-5llvm/lib/Target/AArch64/AArch64InstrFormats.td
+1-1llvm/include/llvm/IR/IntrinsicsAArch64.td
+1-62 files

LLVM/project 6bb16fcclang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup! Fix issues Kerry raised in PR
DeltaFile
+10-23clang/lib/Sema/SemaARM.cpp
+16-11clang/test/Sema/AArch64/pcdphint-atomic-store.c
+5-12clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-5clang/include/clang/Basic/DiagnosticSemaKinds.td
+32-514 files

LLVM/project 3d65e32clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/test/CodeGen/AArch64 pcdphint-atomic-store.c

fixup! Ensure stshh always immediately precedes a store instruction
DeltaFile
+82-0llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+50-13clang/test/CodeGen/AArch64/pcdphint-atomic-store.c
+62-0llvm/lib/Target/AArch64/AArch64ExpandPseudoInsts.cpp
+20-26clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+15-0llvm/lib/Target/AArch64/AArch64InstrInfo.td
+8-3llvm/lib/Target/AArch64/AArch64InstrFormats.td
+237-423 files not shown
+248-499 files

LLVM/project 1409926clang/include/clang/Basic BuiltinsAArch64.def, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup!

More small issues tidied, and remove gating.
DeltaFile
+6-2clang/test/Sema/AArch64/pcdphint-atomic-store.c
+2-2clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+0-2clang/lib/Headers/arm_acle.h
+1-1clang/lib/Sema/SemaARM.cpp
+1-1clang/test/CodeGen/AArch64/pcdphint-atomic-store.c
+1-1clang/include/clang/Basic/BuiltinsAArch64.def
+11-96 files

LLVM/project b986581clang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup! Fix Kerry's CR comments and add negative test for "must be an integer type"
DeltaFile
+16-6llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+3-7clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+3-3clang/lib/Sema/SemaARM.cpp
+5-0clang/test/Sema/AArch64/pcdphint-atomic-store.c
+3-0clang/include/clang/Basic/DiagnosticSemaKinds.td
+1-1clang/lib/Headers/arm_acle.h
+31-176 files

LLVM/project 29b38cfclang/include/clang/Basic DiagnosticSemaKinds.td, clang/lib/CodeGen/TargetBuiltins ARM.cpp

fixup! Improve error diagnostics, and other cleanups
DeltaFile
+12-0llvm/test/CodeGen/AArch64/pcdphint-atomic-store.ll
+4-2clang/lib/Sema/SemaARM.cpp
+2-1clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+1-1clang/include/clang/Basic/DiagnosticSemaKinds.td
+2-0clang/lib/Headers/arm_acle.h
+1-1clang/test/Sema/AArch64/pcdphint-atomic-store.c
+22-56 files

LLVM/project cd6426dclang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/test/Sema/AArch64 pcdphint-atomic-store.c

fixup!

A few small tidyups
DeltaFile
+7-6clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+4-4llvm/lib/Target/AArch64/AArch64InstrFormats.td
+4-0clang/test/Sema/AArch64/pcdphint-atomic-store.c
+15-103 files

LLVM/project 28d3ae5clang/lib/CodeGen/TargetBuiltins ARM.cpp, clang/lib/Sema SemaARM.cpp

[AArch64][clang][llvm] Add ACLE `stshh` atomic store builtin

Add `__arm_atomic_store_with_stshh` implementation as defined
in the ACLE. Validate that the arguments passed are correct, and
lower it to the stshh intrinsic plus an atomic store with the
allowed orderings.

Gate this on FEAT_PCDPHINT so that availability matches
hardware support for the `STSHH` instruction. Use an i64
immediate and side-effect modeling to satisfy tablegen and decoding.
DeltaFile
+140-0clang/lib/Sema/SemaARM.cpp
+48-0clang/lib/CodeGen/TargetBuiltins/ARM.cpp
+31-0clang/test/CodeGen/AArch64/pcdphint-atomic-store.c
+29-0clang/test/Sema/AArch64/pcdphint-atomic-store.c
+13-0llvm/lib/Target/AArch64/Disassembler/AArch64Disassembler.cpp
+10-2llvm/lib/Target/AArch64/AArch64InstrFormats.td
+271-25 files not shown
+298-211 files

LLVM/project 6738c46llvm/tools/llvm-gpu-loader CMakeLists.txt

[LLVM] Add missing binary format dependency to llvm-gpu-loader
DeltaFile
+2-1llvm/tools/llvm-gpu-loader/CMakeLists.txt
+2-11 files

LLVM/project 3c20913llvm/lib/Analysis IVDescriptors.cpp, llvm/test/Transforms/LoopVectorize reduction-with-invariant-store.ll

[IVDesc] Reject invariant stores in different blocks.

We cannot compare stores in different blocks.

Fixes https://github.com/llvm/llvm-project/issues/182879
DeltaFile
+55-0llvm/test/Transforms/LoopVectorize/reduction-with-invariant-store.ll
+6-2llvm/lib/Analysis/IVDescriptors.cpp
+61-22 files