[AArch64] Add sqneg tablegen patterns (#196265)
This adds some tablegen patterns for sqneg instructions, largely copied
from the equivalent MVE patterns. They perform a saturating negation,
effectively just protecting against INT_MIN, which is equivalent to
`ssub_sat 0, R`.
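A minimal C sketch of the scalar semantics of a saturating negation (the function name is illustrative, not from the patch): the only input that needs protecting is INT_MIN, whose negation would overflow.

```c
#include <limits.h>
#include <stdint.h>

/* Saturating negate, matching ssub_sat(0, x): -INT32_MIN overflows,
 * so it is clamped to INT32_MAX; every other value negates normally. */
int32_t sqneg32(int32_t x) {
    return (x == INT32_MIN) ? INT32_MAX : -x;
}
```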
[libclc] Canonicalize 'clspv' to the 'spirv-unknown-vulkan' triple (#196351)
Summary:
The libclc project has clspv support for exporting OpenCL standard
library utilities to Vulkan consumers. This was previously exposed as a
hack into the build system that renamed the triple and relied on macro
defines. Recent changes allowed us to use `vulkan` as an OS for the
SPIR-V target. This should make the intent clearer and allow the
system to inherit the same triple handling the other targets use.
Tested the build, but I will need @rjodinchr and @alan-baker to verify.
AMDGPU/GlobalISel: Switch to extended LLTs
The switch is required to be able to translate bfloat.
After the switch, most of the codegen patterns now require an explicit
type on the register to match, instead of LLT::scalar.
We can still use LLT::scalar for type checks, but new instructions
created during lowerings/combines need to use the proper extended LLT.
The instruction-select test sources are fully switched to i32/f32 so
patterns can match; those for the legalizer and regbanklegalize are left
as-is (they should probably be switched as well).
New functionality worth noting is the lowering of f16 bitcasts through
i32:
```
f16 = g_bitcast i16
->
i32 = g_anyext i16
f16 = g_trunc i32
```
where `f16 = g_trunc i32` is legal.
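At the bit level, the any-extend/trunc pair round-trips the 16-bit payload. A small C sketch of that invariant (an assumption of this sketch: the upper 16 bits of the any-extended value are don't-care bits, modeled here as junk):

```c
#include <stdint.h>

/* Models i32 = g_anyext i16 followed by f16 = g_trunc i32:
 * whatever lands in the high 16 bits of the widened value, the
 * trunc recovers the original half-float bit pattern. */
uint16_t roundtrip_via_i32(uint16_t half_bits) {
    uint32_t widened = (uint32_t)half_bits | 0xDEAD0000u; /* junk high bits */
    return (uint16_t)widened;
}
```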
[SLP] Account for GEP pointer-chain cost when root scalars feed load/store indices
When every external use of the root TreeEntry's scalars is a GEP with a
single load or store user (sharing one access type) and all lanes are
consumed this way, charge the delta between the vector (unknown stride)
and scalar (unit stride) pointer-chain costs once via
TTI::getPointersChainCost, scaled for the root entry. Vectorizing such
a root forces lane extracts or a vector GEP to drive address
computation, which is typically more expensive than keeping the indices
scalar in a unit-stride address chain.
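A hypothetical C loop illustrating the shape being costed (names are illustrative, not from the patch): each root scalar's only external use is a GEP index with a single load user, so keeping the indices scalar yields a unit-stride address chain.

```c
/* Each lane value base+0..base+3 feeds exactly one address
 * computation (p[...]) with a single load user. Vectorizing the
 * index math would force lane extracts or a vector GEP to drive
 * the four scalar loads. */
int sum4(const int *p, int base) {
    return p[base + 0] + p[base + 1] + p[base + 2] + p[base + 3];
}
```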
Reviewers: hiraditya, bababuck, RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/192726
devel/R-cran-rlang: Update to 1.2.0
Current version has been broken for some days.
PR: 295048
Reported by: einar at isnic.is
Approved by: blanket approval
[X86] Add test coverage showing failure to fold freeze(fnearbyint(x)) -> fnearbyint(freeze(x)) (#196521)
Use ftrunc + fnearbyint/fround/froundeven/frint/ftrunc/ffloor/fceil folds to show the failure
Services: Kea DHCPv4/6: build the reservation status from control socket output so that it matches the scope of individual subnets as well. Also add client-id, since it is relevant for IPv4 leases in the default configuration as well.
[mlir][core] in -mlir-print-ir-*, dump the pass options as well (#195198)
This change modifies the header comment of the IR dumped by the
`-mlir-print-ir-*` flags. The new comment contains the exact pass
pipeline run for the pass in question. This is useful when using
`mlir-print-ir-tree-dir`, as it provides the exact reproducing pass
pipeline that can be used on the dumped IR.
For example, when triaging a stack trace with `--mlir-print-ir-before-all`,
the last dumped IR (along with this new comment) can be used to
reproduce the failure with a single pass.
Before:
```
// -----// IR Dump Before CanonicalizerPass (canonicalize) //----- //
```
After:
[7 lines not shown]
Captive Portal: re-introduce hash lookup for accounting purposes
Table redirection allowed for constant-time lookups; with the
migration to pf this was changed to a linear-time lookup.
While here, fix a small edge case that kills states for hosts
flipping primary IPs according to hostwatch. Also make sure
to include the set of ipfw keys in "registered addresses" to make
sure they're properly cleaned up from the table.
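A minimal sketch (not the actual OPNsense code, all names are illustrative) of why the hash lookup matters for accounting: membership checks on registered addresses become O(1) on average instead of a linear scan. This uses a fixed-size open-addressing table keyed on the IPv4 address; 0.0.0.0 doubles as the empty-slot marker.

```c
#include <stdbool.h>
#include <stdint.h>

#define SLOTS 1024u              /* power of two, illustrative size */
static uint32_t slots[SLOTS];    /* 0 marks an empty slot */

/* Knuth multiplicative hash, masked to the table size. */
static uint32_t slot_hash(uint32_t ip) {
    return (ip * 2654435761u) & (SLOTS - 1u);
}

bool addr_insert(uint32_t ip) {
    for (uint32_t i = 0, h = slot_hash(ip); i < SLOTS;
         ++i, h = (h + 1u) & (SLOTS - 1u)) {
        if (slots[h] == ip) return true;               /* already registered */
        if (slots[h] == 0) { slots[h] = ip; return true; }
    }
    return false;                                      /* table full */
}

bool addr_registered(uint32_t ip) {
    for (uint32_t i = 0, h = slot_hash(ip); i < SLOTS;
         ++i, h = (h + 1u) & (SLOTS - 1u)) {
        if (slots[h] == ip) return true;
        if (slots[h] == 0) return false;               /* probe chain ends */
    }
    return false;
}
```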
[AMDGPU] Make VALU instructions defining SGPR non-ignorable (#195270)
This fixes an issue where CSE would incorrectly eliminate an instruction
that produces a lane mask. For example, the second V_CMP_GT in the code
below cannot be replaced with %3, despite both having the same operands,
as that would cause an incorrect exec mask to be calculated in %6:
```
bb.1
%3:sreg_64 = V_CMP_GT_U32_e64 %0:vgpr_32, %1:sreg_32, implicit $exec
%4:sreg_64 = SI_IF_BREAK killed %3:sreg_64, %2:sreg_64, implicit-def dead $scc
SI_LOOP %4:sreg_64, %bb.1, implicit-def dead $exec, implicit-def dead $scc, implicit $exec
S_BRANCH %bb.2
bb.2:
SI_END_CF %4:sreg_64, implicit-def dead $exec, implicit-def dead $scc, implicit $exec
%5:sreg_64 = V_CMP_GT_U32_e64 %0:vgpr_32, %1:sreg_32, implicit $exec
%6:sreg_64 = S_AND_B64 %5:sreg_64, $exec, implicit-def $scc
```
[3 lines not shown]