[LLVM][DAGCombiner] Limit extract_subvec(extract_subvec()) combine to vectors of the same type. (#187334)
The index operand of ISD::EXTRACT_SUBVECTOR is implicitly scaled by
vscale, which is effectively always one for fixed-length vectors. When
combining nested extracts, we must ensure they all use the same implicit
scaling; otherwise the transform is not equivalent.
Fixes https://github.com/llvm/llvm-project/issues/186563
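A minimal sketch of why mixed scaling breaks the combine, using a simplified model rather than LLVM code. It assumes the index is multiplied by vscale only when the extract's result type is scalable. `ExtractSketch`, `elementOffset`, `nestedOffset`, and `naiveCombinedOffset` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of one extract: a constant index plus whether the
// result type is scalable (and so whether the index is scaled by vscale).
struct ExtractSketch {
  uint64_t Index;
  bool ResultIsScalable;
};

// First source element selected by a single extract under this model.
uint64_t elementOffset(ExtractSketch E, uint64_t VScale) {
  return E.ResultIsScalable ? E.Index * VScale : E.Index;
}

// Offset selected by extract(extract(X, Inner), Outer), measured in X.
uint64_t nestedOffset(ExtractSketch Inner, ExtractSketch Outer,
                      uint64_t VScale) {
  return elementOffset(Inner, VScale) + elementOffset(Outer, VScale);
}

// Offset selected by the naive combine extract(X, Inner + Outer), which
// keeps the outer result type and simply adds the two indices.
uint64_t naiveCombinedOffset(ExtractSketch Inner, ExtractSketch Outer,
                             uint64_t VScale) {
  ExtractSketch Combined{Inner.Index + Outer.Index, Outer.ResultIsScalable};
  return elementOffset(Combined, VScale);
}
```

Under this model with vscale = 2, a fixed-length extract at index 2 from a scalable extract at index 4 starts at element 4 * 2 + 2 = 10, while the naive combined extract starts at element 6. When both extracts are scalable, or both are fixed-length, the two offsets agree.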
[AArch64][GlobalISel] Add pattern to lower aarch64.neon.sqdmulls.scalar
SDAG was able to lower this intrinsic because it was turned into an AArch64sqdmull ISD node before instruction selection. As aarch64.neon.sqdmull and aarch64.neon.sqdmulls.scalar are two different LLVM intrinsics, we need two separate patterns to lower them correctly.
[STLForwardCompat] Switch transformOptional from direct call to invoke (#186333)
This allows passing a pointer to member (data member or member function)
alongside other callable objects. Also adjusts the return type, as
std::optional of reference types is forbidden.
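A minimal sketch of the idea, assuming C++17. `transformOptionalSketch` and `PointSketch` are hypothetical names, and this is not the in-tree implementation: it only shows how `std::invoke` accepts pointers to members and how decaying the result type avoids `std::optional` of a reference.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <type_traits>
#include <utility>

// Hypothetical demo type with a data member and a member function.
struct PointSketch {
  int X;
  int doubled() const { return 2 * X; }
};

// Applies F to the contained value if there is one. std::invoke handles
// plain callables, pointers to data members, and pointers to member
// functions uniformly. The result type is decayed because invoking a
// pointer to data member yields a reference.
template <typename T, typename Function>
auto transformOptionalSketch(const std::optional<T> &O, Function &&F)
    -> std::optional<std::decay_t<std::invoke_result_t<Function, const T &>>> {
  if (O)
    return std::invoke(std::forward<Function>(F), *O);
  return std::nullopt;
}
```

For example, `transformOptionalSketch(P, &PointSketch::X)` yields a `std::optional<int>` even though invoking `&PointSketch::X` on a `const PointSketch &` produces a `const int &`.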
[TableGen] Use ID{n-m} for outer let statements (#187436)
I came across this by chance.
For outer let statements, if we want to override some bits, we specify
the range list in the form of `<n-m>`. But for inner let statements,
we use `{n-m}`.
This is inconsistent, and I can't find the reason why it was designed
this way. So here we make inner/outer let statements consistent and
remove the duplicated parsing functions.
There is only one in-tree usage, so I think the impact is small.
[Support] Use block numbers for LoopInfo BBMap (#103400)
Replace the DenseMap from blocks to their innermost loop with a vector
indexed by block numbers, when possible. Supporting number updates is
not trivial as we don't store a list of basic blocks, so this is not
implemented.
NB: I'm generally not happy with the way loops are stored. As I think
that there's room for improvement, I don't want to touch the
representation at this point.
Pull Request: https://github.com/llvm/llvm-project/pull/103400
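A minimal sketch of the data-structure change described above, assuming each block carries a small dense number. `BlockSketch`, `LoopSketch`, and `BlockToLoopMapSketch` are hypothetical stand-ins, not the LoopInfo types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for a loop and a numbered basic block.
struct LoopSketch {
  int Id;
};
struct BlockSketch {
  unsigned Number;
};

// Maps a block to its innermost loop through a vector indexed by block
// number instead of a hash map keyed by block pointer.
class BlockToLoopMapSketch {
  std::vector<LoopSketch *> ByNumber;

public:
  void set(const BlockSketch &B, LoopSketch *L) {
    // Grow on demand; blocks without a loop stay null.
    if (B.Number >= ByNumber.size())
      ByNumber.resize(B.Number + 1, nullptr);
    ByNumber[B.Number] = L;
  }

  LoopSketch *lookup(const BlockSketch &B) const {
    return B.Number < ByNumber.size() ? ByNumber[B.Number] : nullptr;
  }
};
```

A lookup becomes a bounds check plus an array load. The cost is that the mapping is only valid while block numbers stay stable, which matches the note above that number updates are not supported.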
[Analysis][NFC] Include LoopInfoImpl only in source file (#187459)
There's no need to include the full LoopInfo implementation in every
source file that uses LoopInfo.
Pull Request: https://github.com/llvm/llvm-project/pull/187459
[DebugInfo] Fix segfault in constructSubprogramScopeDIE with null subprogram type (#184299)
Guard against null DISubroutineType when checking for variadic
parameters in `constructSubprogramScopeDIE`. `DISubprograms` may lack a
type field when using LineTablesOnly emission, causing a null pointer
dereference.
Fixes #184003
Co-authored-by: Shivam Kunwar <phyBrackets at users.noreply.github.com>
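A minimal sketch of the kind of guard described above, using hypothetical stand-in types rather than the debug-info metadata classes. It assumes a variadic signature is marked by a trailing null entry in the type array. `SubroutineTypeSketch`, `SubprogramSketch`, and `isVariadicSketch` are names introduced for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a subroutine type: the return type followed
// by parameter types, with a trailing null entry assumed to mark "...".
struct SubroutineTypeSketch {
  std::vector<const void *> Types;
};

// Hypothetical stand-in for a subprogram whose type may be absent,
// as with line-tables-only emission.
struct SubprogramSketch {
  const SubroutineTypeSketch *Type;
};

bool isVariadicSketch(const SubprogramSketch &SP) {
  const SubroutineTypeSketch *Ty = SP.Type;
  // Guard first: without a type there is nothing to inspect.
  if (!Ty)
    return false;
  return Ty->Types.size() > 1 && Ty->Types.back() == nullptr;
}
```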
[MemorySSA] Fix handling of cross-iteration dependencies for calls (#187291)
The clobber walker has to be careful when it comes to translating
locations across phis. If we're translating across a cycle backedge,
we'll end up working with SSA values from two different cycle iterations
-- something that alias analysis by default assumes is not the case.
To protect against this, the upwards def walk was already discarding the
access size from the memory location if the pointer was not loop
invariant. This (mostly) avoids this issue for memory locations.
However, the same issue also exists for calls. In this case, it's not
possible to adjust the call used for AA queries in a similar way.
Instead, we can make use of the cross-iteration alias analysis mode,
which was added some time ago for these kinds of situations.
The basic change here is that the upwards def walk, when translating
across a phi, will enable the cross-iteration mode for calls.
Unfortunately, quite a few places have to be changed in order to thread
[8 lines not shown]
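A small illustration of the underlying hazard, not of the MemorySSA change itself. It models two one-byte accesses per iteration at offsets I and I + 1 from some base. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Within a single iteration, the accesses at offsets I and I + 1 never
// touch the same byte, so "no alias" is correct for same-iteration values.
bool overlapsWithinIteration(std::size_t TripCount) {
  for (std::size_t I = 0; I < TripCount; ++I) {
    std::size_t P = I;
    std::size_t Q = I + 1;
    if (P == Q)
      return true;
  }
  return false;
}

// Comparing Q from iteration I with P from iteration I + 1 mixes values
// from two iterations, and those two offsets are identical.
bool overlapsAcrossIterations(std::size_t TripCount) {
  for (std::size_t I = 0; I + 1 < TripCount; ++I) {
    std::size_t QPrev = I + 1; // Q as computed in iteration I
    std::size_t PNext = I + 1; // P as computed in iteration I + 1
    if (QPrev == PNext)
      return true;
  }
  return false;
}
```

An analysis that assumes both values come from the same iteration would conclude the accesses are disjoint, which is the kind of wrong answer a cross-iteration mode is meant to rule out.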
[AArch64][SVE] Prefer FMOV for scalar insert into first element of zero. (#187236)
Currently, when inserting a scalar into the first element of a
zero-initialised scalable vector, we zero the register explicitly and
emit a predicated move. However, an FMOV should be preferable as it
implicitly zeros the upper bits of the destination.
libclc: Update f64 trig functions (#187455)
Most of this was originally ported from ROCm
device libs in 2e6ff0c66e180998425776a27579559dc099732f. Merge
in more recent changes.
libclc: Really implement denormal config checks (#187356)
These should be implementable by checking the behavior of
the canonicalize intrinsic. Hack around spirv still failing
on canonicalize by overriding and assuming DAZ for float.
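A minimal sketch of the detection idea in portable C++, assuming default floating-point compilation settings. It does not call the canonicalize intrinsic: the two `...Canonicalize` functions are hypothetical stand-ins for its behavior with and without denormal flushing, and `flushesDenormalsSketch` is an illustrative name.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Stand-in for canonicalize on a target that preserves denormals.
float identityCanonicalize(float X) { return X; }

// Stand-in for canonicalize on a target that flushes denormals to zero,
// keeping the sign of the input.
float flushingCanonicalize(float X) {
  return std::fpclassify(X) == FP_SUBNORMAL ? std::copysign(0.0f, X) : X;
}

// Feeds the smallest positive denormal through the given canonicalize
// function and reports whether it came back as zero.
template <typename CanonicalizeFn>
bool flushesDenormalsSketch(CanonicalizeFn Canonicalize) {
  const float Denormal = std::numeric_limits<float>::denorm_min();
  return Canonicalize(Denormal) == 0.0f;
}
```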