[AArch64] Decompose FADD reductions with known zero elements (#167313)
FADDV is matched into FADDPv4f32 + FADDPv2i32p, but this can be relaxed
when one or more elements (usually the 4th) are known to be zero.
Before:
```
movi d1, #0000000000000000
mov v0.s[3], v1.s[0]
faddp v0.4s, v0.4s, v0.4s
faddp s0, v0.2s
```
After:
```
mov s1, v0.s[2]
faddp s0, v0.2s
fadd s0, s0, s1
```
[2 lines not shown]
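The arithmetic behind the two sequences can be sketched in scalar C++. This is
only an illustration of the reduction shapes, not the AArch64 lowering code; the
function names are made up for this sketch, and it only considers ordinary
finite inputs.

```cpp
#include <array>
#include <cassert>

// Shape of the "Before" sequence: two pairwise adds,
// i.e. (v0 + v1) + (v2 + v3).
float reducePairwise(const std::array<float, 4> &v) {
  return (v[0] + v[1]) + (v[2] + v[3]);
}

// Shape of the "After" sequence when lane 3 is known to be zero:
// one pairwise add of lanes 0 and 1, then a scalar add of lane 2.
float reduceKnownZeroLane3(const std::array<float, 4> &v) {
  assert(v[3] == 0.0f && "precondition: lane 3 is known to be zero");
  return (v[0] + v[1]) + v[2];
}
```

For an input such as {1.5f, 2.25f, -0.75f, 0.0f}, both functions return the
same value, since adding the zero lane does not change v2.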
Initial import of devel/zycore-c version 1.5.1.
Internal library for zydis disassembler providing platform independent
types, macros and a fallback for environments without LibC.
[mlir][xegpu] Retain order attribute during load + transpose optimization. (#183608)
As described in the title, the `order` attribute was ignored in this
transformation, causing downstream test failures.
[VPlan] Process instructions in reverse order when widening
It doesn't matter right now because we're using CM's decision, but
https://github.com/llvm/llvm-project/pull/182595 introduces some
scalarization (first-lane-only) opportunities that aren't known in CM.
Supporting those requires reverse iteration order, because they are
determined by VPUsers and not by operands.
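Why user-driven decisions favor reverse order can be shown on a toy model. The
sketch below is not VPlan code: ToyInst, readsOnlyFirstLane and firstLaneOnly
are hypothetical names, and operands are simply indices of earlier
instructions. Visiting instructions from last to first means every user of an
instruction has already been seen when the instruction itself is reached.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction. Operands are indices of earlier instructions.
// readsOnlyFirstLane: this instruction only reads lane 0 of its operands.
struct ToyInst {
  std::vector<std::size_t> operands;
  bool readsOnlyFirstLane;
};

// For each instruction, reports whether all of its users need only lane 0.
// Instructions without users are conservatively reported as false.
std::vector<bool> firstLaneOnly(const std::vector<ToyInst> &insts) {
  std::vector<bool> hasUser(insts.size(), false);
  std::vector<bool> allLanesNeeded(insts.size(), false);
  std::vector<bool> result(insts.size(), false);
  for (std::size_t i = insts.size(); i-- > 0;) {
    // All users of i come later, so they were already visited.
    result[i] = hasUser[i] && !allLanesNeeded[i];
    // i demands all lanes of its operands unless it only reads lane 0
    // or was itself found to be first-lane-only.
    bool demandsAllLanes = !(insts[i].readsOnlyFirstLane || result[i]);
    for (std::size_t op : insts[i].operands) {
      assert(op < i && "operands must be defined earlier");
      hasUser[op] = true;
      if (demandsAllLanes)
        allLanesNeeded[op] = true;
    }
  }
  return result;
}
```

In a forward walk the same information would not be available when an
instruction is first visited, because its users have not been examined yet.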
[Hexagon] Fix memory type for vgather intrinsics (#183563)
Some of the Hexagon vgather intrinsics were picking the memory type
(memVT) from a fixed argument position, but for several variants (e.g.
the predicated ones), that argument isn’t actually the data vector being
gathered. As a result, LLVM could end up recording the wrong memory type
or size (e.g. i32 or mask instead of the vector arg). This patch fixes
that by always taking memVT from the last intrinsic argument, which is
the actual data vector.
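The difference between the two selection rules can be sketched as follows. The
argument type lists here are invented for illustration and are not the real
Hexagon intrinsic signatures; ArgTypes, memTypeFromFixedIndex and
memTypeFromLastArg are hypothetical helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of an intrinsic call: argument type names in order.
using ArgTypes = std::vector<std::string>;

// Old rule (sketch): read the memory type from a fixed argument position.
std::string memTypeFromFixedIndex(const ArgTypes &args, std::size_t idx) {
  assert(idx < args.size());
  return args[idx];
}

// New rule (sketch): always read the memory type from the last argument,
// which the commit message identifies as the data vector.
std::string memTypeFromLastArg(const ArgTypes &args) {
  assert(!args.empty());
  return args.back();
}
```

With made-up lists such as {"ptr", "i32", "i32", "v32i32"} and a predicated
variant {"ptr", "v128i1", "i32", "i32", "v32i32"}, a fixed index of 3 yields the
vector type for the first but "i32" for the second, while the last-argument
rule yields the vector type for both.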
LinuxKPI: remove dummy header now in common
page_pool/helpers.h now exists as common/include/net/page_pool/helpers.h,
so we can remove the dummy header file.
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Implement lowering for omp.iterator in affinity
Create IteratorLoopNestScope for building the nested loops for an iterator.
Take advantage of RAII so that each level of the loop nest gets a correct
exit.
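The RAII idea can be sketched independently of the real class. The code below
is not IteratorLoopNestScope; Emitter, LoopScope and buildNest are hypothetical
names, and "emitting" is modeled as appending strings to a log. Each scope
records its loop entry on construction and its exit on destruction, so nested
scopes close in the reverse order of opening.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an IR builder: it just records events.
struct Emitter {
  std::vector<std::string> log;
};

// Opens one loop level on construction and closes it on destruction.
class LoopScope {
public:
  LoopScope(Emitter &e, std::string name) : e_(e), name_(std::move(name)) {
    e_.log.push_back("enter " + name_);
  }
  ~LoopScope() { e_.log.push_back("exit " + name_); }
  LoopScope(const LoopScope &) = delete;
  LoopScope &operator=(const LoopScope &) = delete;

private:
  Emitter &e_;
  std::string name_;
};

// Builds a two-level nest; exits are emitted automatically, innermost first.
std::vector<std::string> buildNest() {
  Emitter e;
  {
    LoopScope outer(e, "i");
    {
      LoopScope inner(e, "j");
      e.log.push_back("body");
    }
  }
  return e.log;
}
```

Calling buildNest() yields the events enter i, enter j, body, exit j, exit i,
without any explicit exit calls at the use site.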
Refactor and support multiple affinity register for a task
- Support multiple affinity register for a task
- Move iterator loop generation logic to OMPIRBuilder
- Extract iterator loop body conversion logic
- Refactor buildAffinityData by hoisting the creation of affinity_list
- IteratorsOp -> IteratorOp
- Add mlir to llvmir test
[mlir][llvmir][OpenMP] Translate affinity clause in task construct to llvmir
Translate affinity entries to LLVMIR by passing affinity information to
createTask (__kmpc_omp_reg_task_with_affinity is created inside PostOutlineCB).
Revert "[VPlan] Use VPInstructionWithType for Load in VPlan0 (NFC)"
This reverts commit 2576ee1fd93fb87699650734ffafdb8092062d59.
This was causing test failures when running check-llvm-unit.
[NVPTX] Support intrinsics for reserved shared memory special registers (#182354)
Added reserved_smem_offset_{begin|end|cap|0} intrinsics to expose the
reserved shared memory special registers, along with NVPTX TableGen
support for these intrinsics.