InstCombine: Rudimentary support of shufflevector in SimplifyDemandedFPClass
This should look more like the computeKnownFPClass handling, with knowledge
of demanded vector elements.
InstCombine: Fix defining undef constant vector elts in SimplifyDemandedFPClass
Fold constants of known single class to the original constant instead of
a new constant. This avoids overdefining vector elements that were originally
undefined with the splat constant.
[SelectionDAG] Remove OPC_EmitStringInteger from isel. (#173936)
Instead emit this as an OPC_EmitInteger, but print the string
when the value is known to be 0..63 (when we don't need a VBR).
Also print the string into a comment when comments are not omitted
so it isn't lost when a VBR is needed.
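As a rough illustration of why 0..63 is the range that needs no VBR continuation byte, here is a minimal sketch. It assumes the integer is zigzag-encoded as a signed value and then split into 7-bit groups with the high bit as a continuation flag. The names `encodeSignedVBR` and `fitsInOneByte` are hypothetical, and the byte layout is an assumption, not taken from the isel table emitter.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical encoder (assumed layout): zigzag-map the signed value so small
// magnitudes stay small, then emit 7 payload bits per byte, with bit 7 set on
// every byte except the last.
std::vector<uint8_t> encodeSignedVBR(int64_t V) {
  uint64_t Z = (static_cast<uint64_t>(V) << 1) ^ static_cast<uint64_t>(V >> 63);
  std::vector<uint8_t> Out;
  while (Z >= 128) {
    Out.push_back(static_cast<uint8_t>((Z & 127) | 128));
    Z >>= 7;
  }
  Out.push_back(static_cast<uint8_t>(Z));
  return Out;
}

// True when the value occupies a single table byte, i.e. no continuation.
bool fitsInOneByte(int64_t V) { return encodeSignedVBR(V).size() == 1; }
```

Under this assumed layout, non-negative values up to 63 take one byte, which is where a symbolic name could stand in for the number directly.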
[SLP] Mark 'and' as incompatible with 'xor %a, 0' operations
Xor with 0 is incompatible with and: 'and %a, 0' results in all zeros
instead of %a.
https://alive2.llvm.org/ce/z/oEVETS
Fixes #174041
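The mismatch can be checked with plain integer arithmetic. The sketch below is illustration only: `xorWithZero`, `andWithZero` and `andIsSafeSubstitute` are hypothetical helpers, not SLP vectorizer code.

```cpp
#include <cstdint>

// Hypothetical helpers modelling the two scalar operations.
constexpr uint32_t xorWithZero(uint32_t A) { return A ^ 0u; } // identity
constexpr uint32_t andWithZero(uint32_t A) { return A & 0u; } // always zero

// True only when rewriting 'xor A, 0' as 'and A, 0' keeps the same value.
constexpr bool andIsSafeSubstitute(uint32_t A) {
  return xorWithZero(A) == andWithZero(A);
}

static_assert(xorWithZero(0x1234u) == 0x1234u, "xor with 0 keeps the operand");
static_assert(andWithZero(0x1234u) == 0u, "and with 0 clears the operand");
static_assert(!andIsSafeSubstitute(0x1234u), "the two results differ");
```

The two results agree only when the operand is itself 0, so the opcodes cannot share the constant 0 as a common second operand.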
Revert -mcpu fix (#174093)
Reverts #173399 and #174004.
#173399 moved MemoryBuffer::getFileOrSTDIN below the -mcpu validation to
fix the `-mcpu=help` flag, but on cross builds the first CPU is
rejected before the “file not found” diagnostic is printed. This failed
lit tests. #174004 introduced a host CPU fallback to fix the cross
compilation issue, but this still fails on NVPTX builders.
This can be revisited when a fix is found that works with the NVPTX
builders.
InstCombine: Handle extractelement in SimplifyDemandedFPClass (#174081)
A lot of boilerplate changes are necessary to do proper elementwise
tracking like SimplifyDemandedBits does.
[mlir][int-range] `IntRangeNarrowingPass` was missing `SparseConstantPropagation` analysis (#174088)
This was causing it to skip nested scf ops in some cases (see `scf.for`
test). Use convenience `loadBaselineAnalyses` func.
[X86][AMX-AVX512] Add *i intrinsics for immediate variants (#173545)
The immediate variants use the low 6 bits as the row index, while the
register variants use the low 16 bits. We cannot select the immediate
variants using the same intrinsic, so add new intrinsics for them.
[VPlan] Only use legacy cost for instructions only used by exit conds. (#174029)
Currently we need to precompute costs for exit conditions, to match the
legacy cost, as they will get replaced by a compare against the
canonical IV (or others, like active-lane-mask or EVL based) and the
original compare will get removed.
This is not true for instructions with users other than the exit
condition. Those will remain, and we can just use the VPlan-based cost
model to get more accurate results.
This improves results in some cases, like
@test_value_in_exit_compare_chain_used_outside because the IV increment
user outside the loop is replaced by computing the final value outside
the loop.
It also fixes a crash introduced by f196b1d66ff (#146525).
PR: https://github.com/llvm/llvm-project/pull/174029