[CIR][NEON] Add lowering for `vnegd_s64` and `vnegh_f16` (#180597)
Add CIR lowering support for the non-overloaded NEON intrinsics
`vnegd_s64` and `vnegh_f16`.
The associated tests are shared with the existing default codegen tests:
* `neon-intrinsics.c` → `neon/intrinsics.c`
* `v8.2a-fp16-intrinsics.c` → `neon/fullfp16.c`
A new test file,
* `clang/test/CodeGen/AArch64/neon/fullfp16.c`
is introduced and is intended to eventually replace:
* `clang/test/CodeGen/AArch64/v8.2a-fp16-intrinsics.c`
Since both intrinsics are non-overloaded, the CIR and default codegen
handling is moved to the appropriate switch statements. The previous
placement was incorrect.
This change also includes minor refactoring in `CIRGenBuilder.h` to
better group related hooks.
NAS-139714 / 26.0.0-BETA.1 / Validate capabilities_state keys in container create/update (#18169)
## Context
Capabilities state was not validated, so any invalid value supplied by
a consumer was stored in the database. Such values have no effect when
used with `nsenter`, but they should not be accepted in the first
place.
[RISCV][CodeGen] Combine vwaddu+vabd(u) to vwabda(u)
Note that we only support SEW=8/16 for `vwabda(u)`.
Reviewers: topperc, lukel97, preames
Reviewed By: topperc, lukel97
Pull Request: https://github.com/llvm/llvm-project/pull/180162
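As a rough scalar model of the fold (a sketch only, not the RISC-V backend
code): for unsigned SEW=8 inputs, taking the absolute difference and then
widening-adding it into a 16-bit accumulator gives the same result as one
fused widening absolute-difference-accumulate. The function names are
hypothetical, and the accumulate form of the widening add is an assumption.

```cpp
#include <cstdint>

// Per-element model of an unsigned absolute difference at SEW=8.
constexpr std::uint8_t abd_u8(std::uint8_t a, std::uint8_t b) {
  return a > b ? static_cast<std::uint8_t>(a - b)
               : static_cast<std::uint8_t>(b - a);
}

// Two-step form: absolute difference, then a widening add into a
// 16-bit accumulator (wraps modulo 2^16).
constexpr std::uint16_t abd_then_widening_add(std::uint16_t acc,
                                              std::uint8_t a,
                                              std::uint8_t b) {
  return static_cast<std::uint16_t>(acc + abd_u8(a, b));
}

// Fused form: widen first, take |a - b|, accumulate in one step.
constexpr std::uint16_t wabda_u8(std::uint16_t acc, std::uint8_t a,
                                 std::uint8_t b) {
  const int diff = static_cast<int>(a) - static_cast<int>(b);
  return static_cast<std::uint16_t>(acc + (diff < 0 ? -diff : diff));
}
```

Both forms agree for every input in this model, which is what makes the
single fused instruction a valid replacement.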
[MLIR][NVVM][NFC] Fix PTX builder class api (#180787)
Previously, `NVVM_PTXBuilder_Op` included `BasicPtxBuilderOpInterface`
as part of the default value of the `traits` parameter. This meant any
subclass that provided an explicit traits list would silently replace
the default and lose the interface, defeating the purpose of the base
class. Callers had to redundantly re-specify the interface.
[SLP] Use the correct calling convention for vector math routines (#180759)
When vectorising calls to math intrinsics such as llvm.pow we
correctly detect and generate calls to the corresponding vector
math variant. However, we don't pick up and use the calling
convention for the vector math function. This matters for veclibs
such as ArmPL where the aarch64_vector_pcs calling convention
can improve codegen by reducing the number of registers that
need saving across calls.
[AArch64] Eliminate XTN/SSHLL for vector splats
Combine:
sext(duplane(insert_subvector(undef, trunc(X), 0), idx))
Into:
duplane(X, idx)
This avoids XTN/SSHLL instruction sequences that occur when splatting
elements from boolean vectors after type legalization, which is common
when using shufflevector with comparison results.
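A scalar sketch of why the fold is safe for comparison results, using
hypothetical helper names and plain arrays in place of NEON registers: when
every lane is 0 or -1, narrowing to 8 bits and sign-extending back to 16
bits reproduces the original lane, so the splat can read X directly.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V8x16 = std::array<std::int16_t, 8>;

// Models the original sequence: narrow each lane (XTN), splat one lane,
// then sign-extend it back to 16 bits (SSHLL).
V8x16 splat_via_trunc_sext(const V8x16 &x, std::size_t idx) {
  std::array<std::int8_t, 8> narrow{};
  for (std::size_t i = 0; i < x.size(); ++i)
    narrow[i] = static_cast<std::int8_t>(x[i]);
  V8x16 out{};
  out.fill(static_cast<std::int16_t>(narrow[idx]));
  return out;
}

// Models the combined form: splat the lane straight from X.
V8x16 splat_direct(const V8x16 &x, std::size_t idx) {
  V8x16 out{};
  out.fill(x[idx]);
  return out;
}
```

In this model the two functions agree only when the selected lane fits in 8
bits, which holds for the all-zeros or all-ones lanes a comparison produces.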
[lldb] Step over non-lldb breakpoints (#174348)
Several languages support some sort of "breakpoint" function, which adds
ISA-specific instructions to generate an interrupt at runtime. However,
on some platforms, these instructions don't increment the program
counter. When LLDB sets these instructions it isn't a problem, as we
remove them before continuing, then re-add them after stepping over the
location. However, for breakpoint sequences that are part of the
inferior process, this doesn't happen - and so users might be left
unable to continue past the breakpoint without manually interfering with
the program counter.
This patch adds logic to LLDB to intercept SIGTRAPs, inspect the bytes
of the inferior at the program counter, and if the instruction looks
like a BRK or BKPT or similar, increment the pc by the size of the
instruction we found. This unifies platform behaviour (e.g. on x86_64,
LLDB debug sessions already look like this) and improves UX (in my
opinion, but I think it beats messing with stuff every break).
[21 lines not shown]
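The check described above can be sketched for a single architecture as
follows. This is not LLDB's code: the function names are hypothetical, and
only the AArch64 `BRK #imm16` encoding (0xD4200000 with the immediate in
bits 20:5) is handled.

```cpp
#include <cstddef>
#include <cstdint>

// BRK #imm16 is encoded as 0xD4200000 | (imm16 << 5), so masking out the
// immediate field leaves a fixed pattern.
constexpr std::uint32_t kBrkMask = 0xFFE0001Fu;
constexpr std::uint32_t kBrkValue = 0xD4200000u;

// Returns the number of bytes to skip if the instruction word at the pc
// is a BRK, or 0 if it is something else.
constexpr std::size_t trap_size_at_pc(std::uint32_t insn) {
  return (insn & kBrkMask) == kBrkValue ? 4 : 0;
}

// Computes the pc to resume at after the trap has been reported.
constexpr std::uint64_t pc_after_trap(std::uint64_t pc, std::uint32_t insn) {
  return pc + trap_size_at_pc(insn);
}
```

A real implementation would need one such pattern per supported
architecture and would read the instruction bytes from the inferior's memory.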
[libc++] Only make comparators transparent in __tree if they don't cause a conversion (#179453)
We're currently unwrapping `less<T>` even if the `key_type` isn't `T`.
This causes the removal of an implicit conversion to `const T&` if the
types mismatch. Making `less<T>` transparent in that case changes
overload resolution and can make it fail.
Fixes #179319
(cherry picked from commit 9d2303103288f6110622644f78dbd26c8bcf28d5)
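A small illustration of the failure mode, using a hypothetical `Name` type
that is not taken from the bug report: `std::less<std::string>` converts its
arguments to `const std::string&` first, while `std::less<>` forwards them
to `operator<`, where template argument deduction cannot see through the
conversion.

```cpp
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical type that is implicitly convertible to const std::string&.
struct Name {
  std::string value;
  operator const std::string &() const { return value; }
};

// less<std::string> takes const std::string& parameters, so Name converts
// before the comparison happens.
constexpr bool kTypedLessWorks =
    std::is_invocable_v<std::less<std::string>, const Name &,
                        const std::string &>;

// less<> compares the operands as given. basic_string's comparison
// operators are templates, and deduction fails for a Name operand.
constexpr bool kTransparentLessWorks =
    std::is_invocable_v<std::less<>, const Name &, const std::string &>;

static_assert(kTypedLessWorks, "conversion applies");
static_assert(!kTransparentLessWorks, "no viable operator<");
```

This is why unwrapping `less<T>` into a transparent comparator is only safe
when the lookup type already matches `key_type`.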
[IndVarSimplify] Add safety check for getTruncateExpr in genLoopLimit (#172234)
getTruncateExpr may not always return a SCEVAddRecExpr when truncating
loop bounds. Add a check to verify the result type before casting, and
bail out of the transformation if the cast would be invalid.
This prevents potential crashes from invalid casts when dealing with
complex loop bounds.
Co-authored by Michael Rowan
Resolves [#153090](https://github.com/llvm/llvm-project/issues/153090)
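The guarded-cast pattern can be sketched with stand-in types. The classes
below are hypothetical and only mimic the shape of the check; they are not
LLVM's SCEV classes, and `dynamic_cast` stands in for LLVM's `dyn_cast`.

```cpp
#include <optional>

// Hypothetical stand-ins for a SCEV expression hierarchy.
struct Expr {
  virtual ~Expr() = default;
};
struct ConstantExpr : Expr {
  long value = 0;
};
struct AddRecExpr : Expr {
  long start = 0;
  long step = 1;
};

// Checks the dynamic type before using the result as an add-recurrence,
// and bails out (empty optional) when the truncated expression is
// something else.
std::optional<long> loop_limit_start(const Expr &truncated) {
  const auto *ar = dynamic_cast<const AddRecExpr *>(&truncated);
  if (!ar)
    return std::nullopt;  // skip the transformation
  return ar->start;
}
```

An unchecked cast in the same position would assume the add-recurrence type
unconditionally, which is the invalid cast the commit guards against.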
[lld][Hexagon] Fix R_HEX_TPREL_11_X relocation on duplex instructions (#179860)
findMaskR11() was missing handling for duplex instructions. This caused
incorrect encoding when R_HEX_TPREL_11_X relocations were applied to
duplex instructions with large TLS offsets.
For duplex instructions, the immediate bits are located at positions
20-25 (mask 0x03f00000), not in the standard positions used for
non-duplex instructions.
This fix adds the isDuplex() check to findMaskR11() to return the
correct mask for duplex instruction encodings.
(cherry picked from commit 62d018b87a161bb2797c1ed03a482ffcdc8b162c)
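A minimal sketch of the mask selection and how such a mask scatters
relocation bits into an instruction word. The helper names are hypothetical
and written in a different style from lld's. The duplex test (parse bits
15:14 equal to zero) is an assumption about the encoding, and the non-duplex
mask is passed in as a parameter rather than guessed.

```cpp
#include <cstdint>

// Assumption: a duplex instruction has parse bits 15:14 equal to zero.
constexpr bool is_duplex(std::uint32_t insn) {
  return (insn & 0x0000C000u) == 0;
}

// Immediate bits for duplex instructions sit at positions 20-25.
constexpr std::uint32_t kDuplexR11Mask = 0x03F00000u;

// Picks the duplex mask when appropriate, otherwise the caller's mask.
constexpr std::uint32_t find_mask_r11(std::uint32_t insn,
                                      std::uint32_t non_duplex_mask) {
  return is_duplex(insn) ? kDuplexR11Mask : non_duplex_mask;
}

// Deposits the low bits of data into the set bit positions of mask,
// lowest position first.
constexpr std::uint32_t apply_mask(std::uint32_t mask, std::uint32_t data) {
  std::uint32_t result = 0;
  unsigned next = 0;  // index of the next unused bit of data
  for (unsigned bit = 0; bit < 32; ++bit) {
    if ((mask >> bit) & 1u) {
      result |= ((data >> next) & 1u) << bit;
      ++next;
    }
  }
  return result;
}
```

With the duplex mask, a 6-bit value lands in bits 20-25, so the value 0x15
becomes 0x01500000 in the instruction word.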
[lldb][Process/FreeBSDKernel] Add links to pcb.h (#180267)
We had consensus in #178556 to use cgit links for this kind of use
case.
Signed-off-by: Minsoo Choo <minsoochoo0122 at proton.me>
lagg: Remove the member pr_num from struct lagg_proto
It is set but never used. Remove it to avoid confusion and save a
little space.
While here, use designated initializers to initialize the LAGG protocol
table. That improves readability, and it will be safer to initialize the
table if we introduce new protocols in the future.
No functional change intended.
Reviewed by: glebius
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D55124
(cherry picked from commit 5ba503fc2cabc1a614997f102ace671d996bcc53)
qlnxe: Refactor setting the promiscuous and allmulti mode
There are two entry points to set the promiscuous and allmulti mode.
One is ioctl, and the other is the init routine. Since they share almost
identical logic, refactor a little to make the code clearer.
While here, for the ioctl, translate the error to EINVAL to avoid
confusing the net stack.
Reviewed by: kbowling
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D54890
(cherry picked from commit 45b1718fadae7d56051ba04ef9d7a175a602a226)
lagg: Make the none protocol a first-class citizen
All the other protocols have corresponding start and input routines,
which are used in the fast path. Currently the none protocol is
treated specially. In the fast path it is checked to indicate whether
a working protocol is configured. There are two issues raised by this
design:
1. In production, other protocols are commonly used, but not the
none protocol. Always checking for it in the fast path is overkill
and penalizes the commonly used protocols.
2. PR 289017 reveals that there's a small window between checking the
protocol and calling lagg_proto_start(). Within that window,
lagg_proto_start() may see the none protocol and dereference a NULL
pointer.
Fix them by making the none protocol a first-class citizen so that it
has start and input routines just the same as other protocols. Then we
can stop checking it in the fast path, since lagg_proto_start() and
[15 lines not shown]
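The idea can be sketched with a small dispatch table. All names below are
hypothetical stand-ins rather than the lagg(4) code, and returning ENETDOWN
from the none protocol is an assumption made for illustration.

```cpp
#include <cerrno>

// Hypothetical stand-in for a packet buffer.
struct Packet {
  bool freed = false;
};

using StartFn = int (*)(Packet &);

// The none protocol gets a real start routine that drops the packet, so
// the dispatcher never sees a null function pointer.
int none_start(Packet &p) {
  p.freed = true;
  return ENETDOWN;
}

// Placeholder for a working protocol's transmit routine.
int failover_start(Packet &p) {
  (void)p;
  return 0;
}

enum Proto { PROTO_NONE = 0, PROTO_FAILOVER = 1, PROTO_COUNT };

// Every protocol, including none, has an entry in the table.
const StartFn kProtoStart[PROTO_COUNT] = {none_start, failover_start};

// Fast path: no special case for the none protocol.
int proto_start(Proto proto, Packet &p) { return kProtoStart[proto](p); }
```

Because every table entry is callable, a protocol change racing with the
fast path can at worst drop a packet instead of calling through NULL.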
qlnxe: Overhaul setting the multicast MAC filters
When operating the multicast MAC filters, the current usage of
ECORE_FILTER_ADD and ECORE_FILTER_REMOVE is rather misleading.
ECORE_FILTER_ADD reads "adding new filter", but it actually removes
any existing filters and then adds a new one. ECORE_FILTER_REMOVE
reads "removing a filter", but it actually removes all filters.
Let's use ECORE_FILTER_REPLACE and ECORE_FILTER_FLUSH instead to
avoid confusion.
In the current implementation, only one MAC address is passed to
ecore_sp_eth_filter_mcast() and any previously installed filters are
removed, which breaks multicast. That can be observed by either
assigning new IPv6 addresses to the interface or making the interface
a member of a lagg(4) interface with the LACP aggregation protocol.
Fix that by calculating the multicast filter bins directly from the
multicast MAC addresses and replacing the filters every time the bins
change.
[20 lines not shown]
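A sketch of the "compute all bins, replace only on change" approach. The
names are hypothetical, the bin count of 256 is an assumption, and the hash
is a placeholder rather than the function the hardware uses.

```cpp
#include <array>
#include <bitset>
#include <cstdint>
#include <vector>

using Mac = std::array<std::uint8_t, 6>;
using Bins = std::bitset<256>;  // assumed number of filter bins

// Placeholder hash, NOT the device's real bin function.
unsigned bin_from_mac(const Mac &mac) {
  unsigned h = 0;
  for (std::uint8_t b : mac)
    h = (h * 31u + b) & 0xFFu;
  return h;
}

// Builds the full bin set from every multicast address at once.
Bins bins_from_macs(const std::vector<Mac> &macs) {
  Bins bins;
  for (const Mac &m : macs)
    bins.set(bin_from_mac(m));
  return bins;
}

// Returns true when the installed bins had to be replaced. The assignment
// stands in for issuing a single "replace" request to the device.
bool update_filters(Bins &installed, const std::vector<Mac> &macs) {
  const Bins wanted = bins_from_macs(macs);
  if (wanted == installed)
    return false;
  installed = wanted;
  return true;
}
```

Replacing the whole set keeps earlier addresses installed when a new one is
added, and an empty address list naturally flushes the filters.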
qlnxe: Allow tapping the TX packets
Currently only the packets in the RX path can be captured by tcpdump
as the ETHER_BPF_MTAP call in the TX path is missing. Add it so that
packets in both directions can be captured.
PR: 290973
Reviewed by: kbowling
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D54891
(cherry picked from commit 968647502ec21464ad3aecc7577ff0e8dfd41693)
qlnxe: Let ether_ioctl() handle SIOCSIFADDR ioctl
Since the change [1], the init routine qlnx_init() works as intended.
Let ether_ioctl() handle SIOCSIFADDR to simplify the code.
Combined with the change [1], this should be a better fix for PR 287445.
[1] c10e6bc0f007 qlnxe: Avoid reinitializing the interface when it is already initialized
PR: 287445
Reviewed by: kbowling
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D54888
(cherry picked from commit 4012b63889e40bb877bc0e4c8da1792bce472c08)
qlnxe: Avoid reinitializing the interface when it is already initialized
qlnx_init_locked() unconditionally uninitializes the interface, so it
actually reinitializes it. However, the net stack calls the init routine
qlnx_init() to initialize the interface when the first inet or inet6
address is assigned. The SIOCSIFADDR ioctl for the first inet6 address
is handled by ether_ioctl(), so the interface is reinitialized whether
or not it was already initialized.
Add a driver status check to avoid the reinitialization. The further
plan is to remove the SIOCSIFADDR ioctl from the driver and let
ether_ioctl() handle it.
Reviewed by: kbowling
MFC after: 5 days
Differential Revision: https://reviews.freebsd.org/D54887
(cherry picked from commit c10e6bc0f0079e90cb484323ad71d437f1882422)