LLVM/project b629f86llvm/lib/Target/ARM ARMISelLowering.cpp ARMISelLowering.h, llvm/test/CodeGen/ARM vbits.ll

[ARM] hasAndNot in ARM supports vectors (#193614)

NEON and MVE have vector bic.
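
For readers unfamiliar with the pattern, an "and-not" computes `x & ~mask`, which is what a `bic` (bit clear) instruction does in one step. The following is a minimal sketch of the scalar and lane-wise forms; the function names `andNot` and `andNot4` are illustrative and not taken from the patch, and whether a particular compiler build selects a vector `bic` for this loop is an assumption, not something shown here.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar and-not: clears in `x` every bit that is set in `mask`.
constexpr std::uint32_t andNot(std::uint32_t x, std::uint32_t mask) {
  return x & ~mask;
}

// Lane-wise and-not over four 32-bit lanes. This is the shape of
// operation that a vector bic can cover (illustrative only).
std::array<std::uint32_t, 4> andNot4(const std::array<std::uint32_t, 4> &x,
                                     const std::array<std::uint32_t, 4> &mask) {
  std::array<std::uint32_t, 4> out{};
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = andNot(x[i], mask[i]);
  return out;
}
```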
DeltaFile
+6-12llvm/test/CodeGen/Thumb2/mve-vselect-constants.ll
+11-0llvm/lib/Target/ARM/ARMISelLowering.cpp
+11-0llvm/test/CodeGen/ARM/vbits.ll
+2-0llvm/lib/Target/ARM/ARMISelLowering.h
+30-124 files

LLVM/project 6b81cdbllvm/lib/Transforms/Vectorize SLPVectorizer.cpp, llvm/test/Transforms/SLPVectorizer/X86 identity-reuses-with-poisons.ll

[SLP]Fix crash in getReorderingData on all-poison reuse-mask slice

When the reuse-shuffle mask is iterated in Sz-sized parts and a part is
entirely PoisonMaskElem, `Val` stays at PoisonMaskElem (-1) and the
subsequent `UsedVals.test(Val)` trips the SmallBitVector OOB assertion.
Bail out of reordering in that case.
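
The failure mode described above can be sketched in isolation. This is a simplified stand-in, not the code in SLPVectorizer.cpp: `canReorder` is a hypothetical name, `std::vector<bool>` replaces SmallBitVector, and the per-part logic is reduced to "take the first non-poison element". It only illustrates why an all-poison part needs an explicit bail-out before the value is used as a bit index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int PoisonMaskElem = -1; // sentinel value cited in the commit message

// Returns false ("bail out of reordering") if any Sz-sized part of the mask
// is entirely poison or reuses a value already claimed by an earlier part.
bool canReorder(const std::vector<int> &Mask, std::size_t Sz) {
  assert(Sz > 0 && Mask.size() % Sz == 0 && "mask must split into whole parts");
  std::vector<bool> UsedVals(Mask.size(), false);
  for (std::size_t I = 0; I < Mask.size(); I += Sz) {
    int Val = PoisonMaskElem;
    for (std::size_t K = 0; K < Sz; ++K) {
      if (Mask[I + K] != PoisonMaskElem) {
        Val = Mask[I + K];
        break;
      }
    }
    // Without this check, Val == -1 would be used as a bit index below.
    if (Val == PoisonMaskElem)
      return false;
    const std::size_t Idx = static_cast<std::size_t>(Val);
    if (Idx >= UsedVals.size() || UsedVals[Idx])
      return false;
    UsedVals[Idx] = true;
  }
  return true;
}
```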

Fixes #194315

Pull Request: https://github.com/llvm/llvm-project/pull/194392
DeltaFile
+114-0llvm/test/Transforms/SLPVectorizer/X86/identity-reuses-with-poisons.ll
+3-3llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+117-32 files

LLVM/project c183492clang/lib/Format UnwrappedLineParser.cpp, clang/unittests/Format FormatTestComments.cpp TokenAnnotatorTest.cpp

[clang-format] Recognize more braced initializers (#192299)

new

```C++
a = {x * x, x * x};
```

old

```C++
a = {x * x, x *x};
```

Fixes #57442.

The patch makes clang-format treat a brace following an equal sign as the
start of a braced initializer.


    [30 lines not shown]
DeltaFile
+12-15clang/unittests/Format/FormatTestComments.cpp
+3-3clang/lib/Format/UnwrappedLineParser.cpp
+6-0clang/unittests/Format/TokenAnnotatorTest.cpp
+5-1clang/unittests/Format/FormatTest.cpp
+26-194 files

LLVM/project 5e45150offload/test/offloading ctor_dtor.cpp

[offload][lit] Enable ctor_dtor.cpp on Intel GPUs (#194389)

It was fixed with https://github.com/llvm/llvm-project/pull/192725 and
https://github.com/llvm/llvm-project/pull/192730.

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+0-1offload/test/offloading/ctor_dtor.cpp
+0-11 files

LLVM/project 6cbbea7lldb/include/lldb/Utility StringExtractorGDBRemote.h, lldb/packages/Python/lldbsuite/test/tools/lldb-server gdbremote_testcase.py

[lldb-server] Implement support for MultiBreakpoint packet

This is fairly straightforward, thanks to the helper functions created
in the previous commit.

https://github.com/llvm/llvm-project/pull/192910
DeltaFile
+66-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
+2-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
+2-0lldb/source/Utility/StringExtractorGDBRemote.cpp
+0-1lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+1-0lldb/include/lldb/Utility/StringExtractorGDBRemote.h
+1-0lldb/packages/Python/lldbsuite/test/tools/lldb-server/gdbremote_testcase.py
+72-16 files

LLVM/project 8593524flang/lib/Lower OpenACC.cpp, flang/lib/Semantics resolve-directives.cpp canonicalize-acc.cpp

[flang][semantics][openacc] Allow collapse clauses on do concurrent (#192488)

This PR generalizes the semantic checking for collapse clauses to work
on `do concurrent` and fixes two bugs exposed along the way:
- `collapse (n)` with n less than the number of nested loops triggered
an assertion failure.
- `do concurrent` index variables triggered an assertion failure
because they were looked up before they had been declared.

Lowering currently emits a TODO; the implementation will follow in a
later diff.
DeltaFile
+91-0flang/test/Semantics/OpenACC/acc-loop.f90
+55-21flang/lib/Semantics/resolve-directives.cpp
+33-0flang/test/Lower/OpenACC/Todo/do-loops-to-acc-loops-todo.f90
+9-2flang/lib/Lower/OpenACC.cpp
+0-5flang/lib/Semantics/canonicalize-acc.cpp
+0-2flang/test/Semantics/OpenACC/acc-canonicalization-validity.f90
+188-306 files

LLVM/project 57754e0clang/test/CodeGen/AArch64 v9.7a-neon-mmla-intrinsics.c, clang/test/CodeGen/AArch64/sve-intrinsics acle_sve_mmla-f16.c acle_sve_mmla-bf16.c

[AArch64][clang][llvm] Add ACLE Armv9.7 matrix multiply-accumulate intrinsics (#193017)

Implement new ACLE matrix multiply-accumulate intrinsics for Armv9.7:

```c
  // 16-bit floating-point matrix multiply-accumulate.
  // Only if __ARM_FEATURE_SVE_B16MM
  // Variant also available for _f16 if (__ARM_FEATURE_SVE2p2 && __ARM_FEATURE_F16MM).
  svbfloat16_t svmmla[_bf16](svbfloat16_t zda, svbfloat16_t zn, svbfloat16_t zm);

  // Half-precision matrix multiply accumulating to single-precision instruction.
  // Requires the +f16f32mm architecture extension.
  float32x4_t vmmlaq_f32_f16(float32x4_t r, float16x8_t a, float16x8_t b);

  // Non-widening half-precision matrix multiply instruction.
  // Requires the +f16mm architecture extension.
  float16x8_t vmmlaq_f16_f16(float16x8_t r, float16x8_t a, float16x8_t b);
```
DeltaFile
+45-0clang/test/CodeGen/AArch64/v9.7a-neon-mmla-intrinsics.c
+32-0clang/test/CodeGen/AArch64/sve-intrinsics/acle_sve_mmla-f16.c
+32-0clang/test/Sema/AArch64/arm_sve_non_streaming_only_sve_AND_sve2p2_AND_f16mm.c
+32-0clang/test/Sema/AArch64/arm_sve_non_streaming_only_sve_AND_sve-b16mm.c
+32-0clang/test/CodeGen/AArch64/sve-intrinsics/acle_sve_mmla-bf16.c
+14-1clang/test/Sema/aarch64-neon-target.c
+187-112 files not shown
+275-1218 files

LLVM/project 86be9bcllvm/lib/Target/AMDGPU AMDGPUMCInstLower.cpp SIInstrInfo.cpp

Reapply "AMDGPU: Implement getInstSizeVerifyMode" (#194026)

This reverts commit 72ca372fa7c9029d2b7a77c59a4cc24530e99e43.
DeltaFile
+0-22llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp
+7-0llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
+3-0llvm/lib/Target/AMDGPU/SIInstrInfo.h
+10-223 files

LLVM/project c2ab7f2lldb/source/Plugins/Process/gdb-remote GDBRemoteCommunicationServerLLGS.cpp GDBRemoteCommunicationServerLLGS.h

[lldbremote][NFC] Factor out code handling breakpoint packets (#192915)

This commit extracts the code handling breakpoint packets into a helper
function that can be used by a future implementation of the
MultiBreakpointPacket.

It is meant to be purely NFC.

There are two functions handling breakpoint packets (`handle_Z` and
`handle_z`) with a lot of repeated code. This commit does not attempt to
merge the two, as that would make the diff much larger due to subtle
differences in the error messages produced by the two. The only
deduplication done is in the code processing a GDBStoppointType, where a
helper struct (`BreakpointKind`) and function
(`std::optional<BreakpointKind> getBreakpointKind(GDBStoppointType
stoppoint_type)`) were created.
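
As a rough illustration of the helper's shape, the sketch below maps a stoppoint type to an optional descriptor. Only the names `BreakpointKind`, `getBreakpointKind`, and `GDBStoppointType` come from the commit message. The enum here is a local stand-in, and the two struct fields are assumptions made for the example; the real struct may carry different information.

```cpp
#include <optional>

// Local stand-in for the LLDB enum; declared here so the sketch is
// self-contained.
enum GDBStoppointType {
  eStoppointInvalid = -1,
  eBreakpointSoftware = 0,
  eBreakpointHardware,
  eWatchpointWrite,
  eWatchpointRead,
  eWatchpointReadWrite
};

// Hypothetical fields: the commit message does not list the members.
struct BreakpointKind {
  bool IsWatchpoint;
  bool IsHardware;
};

// Returns std::nullopt for types that are not valid stoppoints, so both
// packet handlers can share one validation path.
std::optional<BreakpointKind> getBreakpointKind(GDBStoppointType stoppoint_type) {
  switch (stoppoint_type) {
  case eBreakpointSoftware:
    return BreakpointKind{false, false};
  case eBreakpointHardware:
    return BreakpointKind{false, true};
  case eWatchpointWrite:
  case eWatchpointRead:
  case eWatchpointReadWrite:
    return BreakpointKind{true, true};
  default:
    return std::nullopt;
  }
}
```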

The following PRs are related to the MultiBreakpoint feature:


    [7 lines not shown]
DeltaFile
+128-107lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.cpp
+22-0lldb/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationServerLLGS.h
+150-1072 files

LLVM/project bc7e916llvm/lib/Target/AMDGPU VOP2Instructions.td, llvm/test/CodeGen/AMDGPU v_mac_f16-fpdp-rounding-mode.ll

AMDGPU: Address fixme for v_mac_f16 rounding mode (#194360)

This should use the f16/f64 rounding mode.
DeltaFile
+27-0llvm/test/CodeGen/AMDGPU/v_mac_f16-fpdp-rounding-mode.ll
+1-1llvm/lib/Target/AMDGPU/VOP2Instructions.td
+28-12 files

LLVM/project 78eccecmlir/include/mlir/Dialect/LLVMIR NVVMOps.td, mlir/lib/Dialect/LLVMIR/IR NVVMDialect.cpp

[MLIR][NVVM] Add `nvvm.log2` OP (#193789)

Implement `nvvm.log2` with ftz flag
DeltaFile
+16-2mlir/include/mlir/Dialect/LLVMIR/NVVMOps.td
+14-0mlir/test/Dialect/LLVMIR/nvvm-transcendentals.mlir
+14-0mlir/test/Target/LLVMIR/nvvm/transcendentals.mlir
+10-0mlir/lib/Dialect/LLVMIR/IR/NVVMDialect.cpp
+54-24 files

LLVM/project 5ee1495clang/docs ReleaseNotes.rst, clang/lib/Parse ParseDeclCXX.cpp

[Clang] fix parser recovery for invalid static_assert string messages (#187859)

Fixes #187690

--- 

This PR fixes parser recovery for invalid `static_assert` declarations
with string literal messages. The parser now stops the message lookahead
on `;` and `eof`, so invalid inputs are diagnosed as parse errors.
DeltaFile
+12-8clang/lib/Parse/ParseDeclCXX.cpp
+11-5clang/test/Parser/static_assert.cpp
+1-0clang/docs/ReleaseNotes.rst
+24-133 files

LLVM/project a94c116llvm/test/Transforms/GlobalOpt ctor-memset.ll pr54572.ll

[GlobalOpt] Regenerate test checks (NFC) (#194385)
DeltaFile
+12-12llvm/test/Transforms/GlobalOpt/ctor-memset.ll
+2-2llvm/test/Transforms/GlobalOpt/pr54572.ll
+14-142 files

LLVM/project 8e0011allvm/test/Transforms/FunctionAttrs nosync.ll

[FunctionAttrs] Remove declaration check lines (NFC) (#194384)

These are annoying, because they get dropped by UTC. We're not
inferring attributes on declarations anyway.
DeltaFile
+0-6llvm/test/Transforms/FunctionAttrs/nosync.ll
+0-61 files

LLVM/project 314c655lldb/test/API/functionalities/multi-breakpoint TestMultiBreakpoint.py main.c, lldb/tools/debugserver/source RNBRemote.cpp JSON.h

[debugserver] Implement MultiBreakpoint (#192914)

This implements the packet as described in
https://github.com/llvm/llvm-project/pull/192910

The following PRs are related to the MultiBreakpoint feature:

* https://github.com/llvm/llvm-project/pull/192910
* https://github.com/llvm/llvm-project/pull/192914
* https://github.com/llvm/llvm-project/pull/192915
* https://github.com/llvm/llvm-project/pull/192919
* https://github.com/llvm/llvm-project/pull/192962
* https://github.com/llvm/llvm-project/pull/192964
* https://github.com/llvm/llvm-project/pull/192971
* https://github.com/llvm/llvm-project/pull/192988
DeltaFile
+204-0lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+71-0lldb/tools/debugserver/source/RNBRemote.cpp
+7-0lldb/test/API/functionalities/multi-breakpoint/main.c
+3-0lldb/test/API/functionalities/multi-breakpoint/Makefile
+2-0lldb/tools/debugserver/source/JSON.h
+2-0lldb/tools/debugserver/source/RNBRemote.h
+289-06 files

LLVM/project dd383b4llvm/utils/lit/lit TestingConfig.py

[z/OS] Add passing env vars in lit on z/OS. (#194017)

This PR adds passing environment variables in lit/TestingConfig.py on
z/OS.
DeltaFile
+8-0llvm/utils/lit/lit/TestingConfig.py
+8-01 files

LLVM/project 555140fllvm/utils/lit/tests shtest-ulimit-nondarwin.py

[z/OS] Mark shtest-ulimit-nondarwin.py unsupported on zos. (#194016)

This PR marks llvm/utils/lit/tests/shtest-ulimit-nondarwin.py
unsupported on z/OS.
DeltaFile
+1-1llvm/utils/lit/tests/shtest-ulimit-nondarwin.py
+1-11 files

LLVM/project 2abeafclldb/test/API/functionalities/multi-breakpoint TestMultiBreakpoint.py

fixup! darker formatting
DeltaFile
+1-1lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+1-11 files

LLVM/project 97dc0fcllvm/include/llvm/Transforms/IPO Attributor.h, llvm/lib/Transforms/IPO Attributor.cpp AttributorAttributes.cpp

[Attributor] Support SPIR-V address spaces (#192725)

Right now Attributor assumes that if the target is a GPU it can use
a single set of address space numerical values to determine the local
address space, but that's not true in general, so add SPIR-V support,
which uses different values.

This fixes an instruction incorrectly being marked as dead and optimized
out for an OpenMP SPIR-V offloading example.

---------

Signed-off-by: Nick Sarnie <nick.sarnie at intel.com>
DeltaFile
+77-6llvm/lib/Transforms/IPO/Attributor.cpp
+32-19llvm/include/llvm/Transforms/IPO/Attributor.h
+43-0llvm/test/Transforms/OpenMP/spirv_ctor.ll
+10-13llvm/lib/Transforms/IPO/AttributorAttributes.cpp
+162-384 files

LLVM/project 77e1275lldb/test/API/functionalities/multi-breakpoint TestMultiBreakpoint.py

fixup! add comment on test
DeltaFile
+1-0lldb/test/API/functionalities/multi-breakpoint/TestMultiBreakpoint.py
+1-01 files

LLVM/project 0a58d0cclang/test/AST/ByteCode cxx17.cpp, clang/test/SemaCXX cxx17-compat.cpp

[SystemZ] z/OS only accept C initialization (#194023)

TLS support on z/OS only accepts compile-time constant expressions (in
both C and C++). Add #if to skip these tests on z/OS.
DeltaFile
+2-0clang/test/AST/ByteCode/cxx17.cpp
+2-0clang/test/SemaCXX/cxx17-compat.cpp
+4-02 files

LLVM/project 4e6d372llvm/lib/CodeGen/LiveDebugValues VarLocBasedImpl.cpp

[LiveDebugValues] Use std::sort for register sorting in collectIDsForRegs (#194339)

VarLocBasedLDV::collectIDsForRegs sorts a SmallVector<Register> using
array_pod_sort which is a thin wrapper around qsort. That shows up as a hotspot
in compile-time profiles under __GI___qsort_r.

Switching this to an explicit-comparator llvm::sort call, which takes the
std::sort path instead, improves compile-time with no change to code-size.
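
The difference between the two paths can be sketched with the standard library alone. This is an illustration, not the patched code: plain `unsigned` stands in for `Register`, and `std::qsort` and `std::sort` stand in for `array_pod_sort` and `llvm::sort`. The qsort-style call compares through a function pointer on `void *`, while the `std::sort` call takes a comparator the compiler can inline.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// qsort-style comparator: invoked through a function pointer per comparison.
static int compareUnsigned(const void *L, const void *R) {
  const unsigned A = *static_cast<const unsigned *>(L);
  const unsigned B = *static_cast<const unsigned *>(R);
  return (A > B) - (A < B);
}

void sortWithQsort(std::vector<unsigned> &Regs) {
  if (!Regs.empty())
    std::qsort(Regs.data(), Regs.size(), sizeof(unsigned), compareUnsigned);
}

// std::sort with an explicit comparator: the lambda is visible to the
// optimizer at the call site.
void sortWithStdSort(std::vector<unsigned> &Regs) {
  std::sort(Regs.begin(), Regs.end(),
            [](unsigned A, unsigned B) { return A < B; });
}
```

Both functions produce the same ordering; only the cost of each comparison differs.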

CTMark geomean:
- stage1-O0-g: -0.41%
- stage1-aarch64-O0-g: -0.58%
- stage2-O0-g: -0.40%

http://llvm-compile-time-tracker.com/compare.php?from=347aa3f6fbcc48cd752d02aa581b74c33d18dd41&to=cca8df56a576682510733c4c1b6fc12556e2dd7c&stat=instructions%3Au
DeltaFile
+1-1llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
+1-11 files

LLVM/project c6de992clang/test/CodeGen/AArch64/sve2p3-intrinsics acle_sve2p3_subp.c acle_sve2p3_addqp.c, clang/test/Sema/AArch64 arm_sve_feature_dependent_sve_AND_LP_sve2p3_OR_sme2p3_RP___sme_AND_LP_sve2p3_OR_sme2p3_RP.c

[Clang][AArch64][SVE2p3][SME2p3] Add intrinsics for v9.7a add/add-and-subtract/subtract pairwise operations (#187527)

Add the following new clang intrinsics based on the ACLE specification
https://github.com/ARM-software/acle/pull/428 (Add alpha support for 9.7
data processing intrinsics)

- ADDQP (Add pairwise within quadword vector segments)
- svint8_t svaddqp_s8(svint8_t, svint8_t) / svint8_t svaddqp(svint8_t,
svint8_t)
- svuint8_t svaddqp_u8(svuint8_t, svuint8_t) / svuint8_t
svaddqp(svuint8_t, svuint8_t)
- svint16_t svaddqp_s16(svint16_t, svint16_t) / svint16_t
svaddqp(svint16_t, svint16_t)
- svuint16_t svaddqp_u16(svuint16_t, svuint16_t) / svuint16_t
svaddqp(svuint16_t, svuint16_t)
- svint32_t svaddqp_s32(svint32_t, svint32_t) / svint32_t
svaddqp(svint32_t, svint32_t)
- svuint32_t svaddqp_u32(svuint32_t, svuint32_t) / svuint32_t
svaddqp(svuint32_t, svuint32_t)

    [39 lines not shown]
DeltaFile
+928-0clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_subp.c
+265-0clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_addqp.c
+265-0clang/test/CodeGen/AArch64/sve2p3-intrinsics/acle_sve2p3_addsubp.c
+241-0clang/test/Sema/AArch64/arm_sve_feature_dependent_sve_AND_LP_sve2p3_OR_sme2p3_RP___sme_AND_LP_sve2p3_OR_sme2p3_RP.c
+40-0llvm/test/CodeGen/AArch64/sve2p3-intrinsics/sve2p3-intrinsics-addqp.ll
+40-0llvm/test/CodeGen/AArch64/sve2p3-intrinsics/sve2p3-intrinsics-addsubp.ll
+1,779-04 files not shown
+1,839-310 files

LLVM/project 7473478clang/lib/CodeGen CGHLSLRuntime.cpp

[Clang][HLSL] Fix -Wunused-variable (#194374)

Inline the variable definition into the assert, given that it is free of
side effects and the variable name does not make the code much clearer.
DeltaFile
+2-3clang/lib/CodeGen/CGHLSLRuntime.cpp
+2-31 files

LLVM/project 0193af4offload/plugins-nextgen/amdgpu/src rtl.cpp, offload/plugins-nextgen/cuda/src rtl.cpp

[offload] Fix use of AsyncInfoWrapper's finalize function (#194098)

The expected use is to forward the error from the asynchronous
operation's issuing (e.g., launchImpl) directly into the
AsyncInfoWrapper::finalize(). The check of the error is already
performed inside that function. No need to forward a dummy success error
code.
DeltaFile
+3-6offload/plugins-nextgen/amdgpu/src/rtl.cpp
+3-5offload/plugins-nextgen/cuda/src/rtl.cpp
+1-4offload/plugins-nextgen/level_zero/src/L0Device.cpp
+7-153 files

LLVM/project 40a303dllvm/lib/ExecutionEngine/Interpreter Execution.cpp, llvm/test/ExecutionEngine/Interpreter test-interp-variable-arguments.ll

[llvm][lli] fix lli crash when run variable arguments function as a interpret (#173719)

When `lli` is run with the flag `-force-interpreter=true` to execute LLVM
bitcode, it crashes if the bitcode runs a variadic (variable-argument)
function.

Fix #173718
DeltaFile
+24-0llvm/test/ExecutionEngine/Interpreter/test-interp-variable-arguments.ll
+5-3llvm/lib/ExecutionEngine/Interpreter/Execution.cpp
+29-32 files

LLVM/project d9fd915libsycl/include/sycl/__impl queue.hpp

fix comment

Signed-off-by: Tikhomirova, Kseniya <kseniya.tikhomirova at intel.com>
DeltaFile
+2-2libsycl/include/sycl/__impl/queue.hpp
+2-21 files

LLVM/project 4f50fe9llvm/lib/Target/AMDGPU VOPDInstructions.td, llvm/test/MC/Disassembler/AMDGPU gfx12_dasm_vopd_unused_operands.txt

[AMDGPU][MC] Permit unneeded VOPD mov operands to be non-zero (#194060)

Use ? instead of 0 in the tablegen definitions for VOPD containing
v_mov. This enables the instruction to be disassembled regardless of
what bits are in those fields, which helps diagnose broken code.
Previously, the disassembler would reject these.
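
The effect of replacing a fixed 0 with a don't-care bit can be modelled as a mask-and-compare. The sketch below uses a made-up 8-bit encoding (opcode in bits [7:4], an unused operand field in bits [3:0]); the layout, the `Pattern` type, and the constants are all hypothetical and do not reflect the real VOPD encoding or the generated decoder tables.

```cpp
#include <cstdint>

// A decode pattern: Mask selects the bits that must match, Value gives
// the required contents of those bits.
struct Pattern {
  std::uint8_t Mask;
  std::uint8_t Value;
};

constexpr bool matches(Pattern P, std::uint8_t Enc) {
  return (Enc & P.Mask) == P.Value;
}

// Unused field fixed to 0: every bit participates in the match.
constexpr Pattern StrictMov{0xFF, 0xA0};
// Unused field left as don't-care: only the opcode bits participate.
constexpr Pattern RelaxedMov{0xF0, 0xA0};
```

With the strict pattern, an encoding whose unused field holds stray bits is rejected; with the relaxed one it still decodes, which is the behaviour the commit describes as useful for diagnosing broken code.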
DeltaFile
+25-0llvm/test/MC/Disassembler/AMDGPU/gfx12_dasm_vopd_unused_operands.txt
+4-4llvm/lib/Target/AMDGPU/VOPDInstructions.td
+29-42 files

LLVM/project cc2b2f5libc/docs CMakeLists.txt, libc/docs/headers index.rst

[libc][docs] Add sys/sem.h POSIX header documentation (#122006) (#194358)

Add sys/sem.h implementation-status docs to llvm-libc.
DeltaFile
+25-0libc/utils/docgen/sys/sem.yaml
+1-0libc/docs/CMakeLists.txt
+1-0libc/docs/headers/index.rst
+27-03 files

LLVM/project 06ddfcflibunwind/src AddressSpace.hpp DwarfParser.hpp

[libunwind] fix build errors on x32 and mips n32 (#194310)
DeltaFile
+1-1libunwind/src/AddressSpace.hpp
+1-1libunwind/src/DwarfParser.hpp
+2-22 files