[AArch64][llvm] Remove `+d128` gating on `sysp`, `msrr` and `mrrs` instructions (#178912)
Remove `+d128` gating on `sysp`, `msrr` and `mrrs` instructions.
We removed gating for `sys`, `mrs` and `msr` instructions previously,
on the basis that it doesn't add value, as it doesn't indicate that
any particular system registers or system instructions are available.
Therefore, remove `+d128` gating for these instructions too.
(In upcoming change #178913, some `tlbip` instructions, which are `sysp`
aliases, are allowed to be used with either `+d128` or `+tlbid`. If we didn't
remove this gating, it would require some ugly workarounds in the
code to support the relaxation mandated by the 2025 MemSys specification.
In this change, retain `+d128` gating for all `tlbip` instructions, which
will then be loosened to either `+d128` or `+tlbid` in a subsequent
change.)
[DomFrontier] Fix precedence in assert. NFC (#182239)
This fixes the warning about parentheses around ‘&&’ within ‘||’, until
DFs with multiple roots are supported.
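For illustration only, a minimal sketch of the kind of condition this warning
flags. This is not the actual DomFrontier assert; `checkRoots` and its
parameters are hypothetical names. Since `&&` binds tighter than `||`, adding
the parentheses keeps the meaning and only makes the grouping explicit:

```cpp
#include <cassert>

// Hypothetical stand-in for a condition mixing || and &&.
// Unparenthesized, `a || b && c` already parses as `a || (b && c)`;
// compilers warn because readers often assume otherwise.
bool checkRoots(bool allowMultiRoot, unsigned numRoots, bool hasEntry) {
  // Explicit parentheses: same behavior, no warning.
  return allowMultiRoot || (numRoots == 1 && hasEntry);
}
```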
libpfctl: Sort order of snl attribute parser
snl attribute parsers must be sorted by type, so PF_GS_BCOUNTERS
(16) must follow PF_GS_PCOUNTERS (15). Fix ordering and add a call
to SNL_VERIFY_PARSERS.
Without this fix, byte counters reported by 'pfctl -s info' with
a loginterface are always zero.
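
A rough sketch of the ordering check, assuming the attribute lookup relies on
the table being sorted by type. `AttrParser`, `parsersSorted` and the two
tables are hypothetical and simplified; this is not the snl API or the
SNL_VERIFY_PARSERS implementation:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for an attribute parser entry.
struct AttrParser {
  uint16_t type;     // attribute type used as the lookup key
  const char *name;  // label for illustration only
};

// Returns true when entries are in strictly increasing order of type.
template <std::size_t N>
constexpr bool parsersSorted(const AttrParser (&p)[N]) {
  for (std::size_t i = 1; i < N; ++i)
    if (p[i - 1].type >= p[i].type)
      return false;
  return true;
}

// 15 before 16: the order the fix establishes.
constexpr AttrParser kFixed[] = {{15, "pcounters"}, {16, "bcounters"}};
// 16 before 15: the kind of misordering a verification call would catch.
constexpr AttrParser kMisordered[] = {{16, "bcounters"}, {15, "pcounters"}};

static_assert(parsersSorted(kFixed), "sorted table passes");
static_assert(!parsersSorted(kMisordered), "misordered table is rejected");
```

Checking at build or startup time turns a silently ignored attribute into an
immediate failure.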
PR: 291763
MFC after: 1 week
Reviewed by: kp
Signed-off-by: eborisch at gmail.com
(cherry picked from commit 363b57d579bafa8a52cfb5a1dcb98af821b1ecb6)
[X86] For CMP_MASK_CC/CMP_MASK_SCALAR_CC convert CC from MVT::i32 to MVT::i8. (#182199)
The underlying X86ISD nodes have type profiles that say MVT::i8.
Fixes one of the errors found by #168421.
[X86] Emit ISD::ADD instead of X86ISD::ADD from combineSubSetcc. NFC (#182195)
The flag result isn't used so the X86ISD::ADD would be converted to
ISD::ADD by a DAGCombine immediately after.
Prior to this we could create an X86ISD::ADD with an illegal type and we
were using the wrong VT for the flag result.
[ARM] Replace manual CLS expansion with ISD::CTLS (#178430)
Converts ARM scalar CLS intrinsics to use the unified ISD::CTLS node
instead of custom manual expansion. This addresses the issue
[#174337](https://github.com/llvm/llvm-project/issues/174337).
Co-authored-by: Craig Topper <craig.topper at sifive.com>
[MLIR] Add trivial simplifications for affine mod, div, ceil (#182234)
Add missing trivial folding rules for div and mod affine expressions
when the LHS and RHS are the same.
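A toy sketch of what such folds presumably look like, assuming the usual
identities for a positive divisor (x mod x = 0, x floordiv x = 1,
x ceildiv x = 1). `Op` and `foldSameOperands` are hypothetical names, and
operands are modelled as plain strings rather than MLIR's AffineExpr:

```cpp
#include <cassert>
#include <optional>
#include <string>

enum class Op { Mod, FloorDiv, CeilDiv };

// Hypothetical helper: operands are symbolic names compared by spelling.
// Returns the constant the expression folds to when both sides are the
// same, or nothing when no trivial fold applies.
std::optional<long> foldSameOperands(Op op, const std::string &lhs,
                                     const std::string &rhs) {
  if (lhs != rhs)
    return std::nullopt;
  switch (op) {
  case Op::Mod:
    return 0; // x mod x
  case Op::FloorDiv:
  case Op::CeilDiv:
    return 1; // x floordiv x, x ceildiv x
  }
  return std::nullopt;
}
```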
[LowerMatrixIntrinsics] Avoid use of ptrtoint (#182289)
The ptrtoint result here is used in icmp. However, icmp can already
directly work with pointers, so there's no need to perform the cast.
(I originally wanted to switch this to ptrtoaddr, but that's not really
necessary when we can directly compare on pointers.)
[RISCV] Rename $dest to $passthru. NFC (#182231)
Most instructions use $passthru; the only ones that use $dest seem to
be non-segment NoMask loads. I don't think there is any reason for them
to be different.
InstCombine: Fold bitcast of vector with constant to scalar
Fold bitcast (select cond, val, const) ->
select cond, (bitcast val), (bitcast const)
ROCm device libs have an unfortunate amount of code that does bithacking
on the sign bit of double values by casting to <2 x i32> and operating
on the high element. This breaks value tracking optimizations on the
fp value.
The existing transform would only do this if the input to the select was
also a bitcast with a single use, and if it didn't convert between vector
and scalar.
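
To illustrate why the fold preserves behavior, here is a small C++ model (not
InstCombine code). `Vec2`, `bitCast`, `castOfSelect` and `selectOfCasts` are
hypothetical names; `Vec2` stands in for <2 x i32> and a memcpy-based helper
stands in for the IR bitcast:

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using Vec2 = std::array<uint32_t, 2>; // models <2 x i32>

// Reinterpret the bytes of one type as another of the same size.
template <typename To, typename From>
To bitCast(const From &from) {
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Before: bitcast (select cond, val, const)
double castOfSelect(bool cond, const Vec2 &val, const Vec2 &konst) {
  return bitCast<double>(cond ? val : konst);
}

// After: select cond, (bitcast val), (bitcast const)
double selectOfCasts(bool cond, const Vec2 &val, const Vec2 &konst) {
  return cond ? bitCast<double>(val) : bitCast<double>(konst);
}
```

Both forms pick the same bytes; the second exposes the value as a double on
each arm, which is the form floating-point value tracking can reason about.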