[ARM] Remove IR from mve vpt mir tests. NFC
As far as I can tell the llvm.arm.mve.vminnm.m intrinsic used in these tests
was the pre-upstream name of llvm.arm.mve.min.predicated. The tests should not
need IR sections, so remove them just relying on the MIR portions.
[mlir][llvm] Fix import of branch weights with "expected" field (#169776)
This commit fixes the import of `branch_weights` metadata from LLVM IR
to the LLVM dialect. Previously, `branch_weights` metadata containing
the `!"expected"` field were rejected because the importer expected
integer weights at operand 1, but found a string.
[OpenMP][flang] Support GPU team reductions on allocatables
Extends the work started in #165714 by supporting team reductions.
Similar to what was done in #165714, this PR introduces proper
allocations, loads, and stores for by-ref reductions in teams-related
callbacks:
* `_omp_reduction_list_to_global_copy_func`,
* `_omp_reduction_list_to_global_reduce_func`,
* `_omp_reduction_global_to_list_copy_func`, and
* `_omp_reduction_global_to_list_reduce_func`.
openssl cms: switch to ASN1_STRING_get0_data()
The deprecated ASN1_STRING_data() will be removed in a future release.
This is one small step towards that.
ok kenjiro
openssl pkcs12: stop reaching into ASN1_STRING
Buy a t: rename hex_prin() to hex_print() and accept an ASN1_STRING so that
we only need to use accessors once. Also avoid a printf %s NULL.
ok kenjiro
[AggressiveInstCombine] Match long high-half multiply (#168396)
This patch adds recognition of high-half multiply by parts into a single
larger multiply.
Considering a multiply made up of high and low parts, we can split the
multiply into:
x * y == (xh*T + xl) * (yh*T + yl)
where `xh == x>>32` and `xl == x & 0xffffffff`. `T = 2^32`.
This expands to
xh*yh*T*T + xh*yl*T + xl*yh*T + xl*yl
which I find it helpful to be drawn as
[ xh*yh ]
[ xh*yl ]
[ xl*yh ]
[ xl*yl ]
We are looking for the "high" half, which is xh*yh + xh*yl>>32 + xl*yh>>32 +
carrys. The carry makes this difficult and there are multiple ways of
[15 lines not shown]
[InstCombine] Fold @llvm.experimental.get.vector.length when cnt <= max_lanes (#169293)
On RISC-V, some loops that the loop vectorizer vectorizes pre-LTO may
turn out to have the exact trip count exposed after LTO, see #164762.
If the trip count is small enough we can fold away the
@llvm.experimental.get.vector.length intrinsic based on this corollary
from the LangRef:
> If %cnt is less than or equal to %max_lanes, the return value is equal
to %cnt.
This on its own doesn't remove the @llvm.experimental.get.vector.length
in #164762 since we also need to teach computeKnownBits about
@llvm.experimental.get.vector.length and the sub recurrence, but this PR
is a starting point.
I've added this in InstCombine rather than InstSimplify since we may
need to insert a truncation (@llvm.experimental.get.vector.length can
[3 lines not shown]