[X86][GlobalISel] Improve carry value selection (#146586)
Generally G_UADDE, G_UADDO, G_USUBE, G_USUBO are used together and it
was enough to simply define EFLAGS. But if extractvalue is used, we end
up with a copy of EFLAGS into a GPR.
Always generate a SETB instruction to put the carry bit into a GPR and a
CMP to set the carry flag back. This gives the correct lowering in all cases.
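As a rough illustration of the semantics involved (not the selector code
itself), the sketch below models one limb of a multi-word add where the
carry is an ordinary value rather than an implicit flag. The names
AddResult and addWithCarry are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Hypothetical result type: the sum plus the carry-out as a plain value,
// which is what a user of extractvalue expects to be able to read.
struct AddResult {
  std::uint64_t Sum;
  bool CarryOut;
};

// Models an add-with-carry step: Sum = A + B + CarryIn, with the unsigned
// overflow reported explicitly instead of living only in a flags register.
inline AddResult addWithCarry(std::uint64_t A, std::uint64_t B, bool CarryIn) {
  std::uint64_t Partial = A + B;            // may wrap around
  bool WrappedFirst = Partial < A;          // wrapped iff the result shrank
  std::uint64_t Sum = Partial + (CarryIn ? 1u : 0u);
  bool WrappedSecond = Sum < Partial;       // adding the carry-in may wrap too
  return {Sum, WrappedFirst || WrappedSecond};
}
```

Chaining two such steps gives a 128-bit add in which the carry-out of the
low limb is passed to the high limb as a normal operand.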
Closes #120029
[SLP]Insert postponed vector value after all uses, if the parent node is PHI
Need to insert the vector value for the postponed gather/buildvector
node after all uses not only if the vector value of the user node is
a phi, but also if the user node itself is a PHI node, which may produce
vector phi + shuffle.
Fixes #162799
[LV] Bail out on loops with switch as latch terminator.
Currently we cannot vectorize loops with latch blocks terminated by a
switch. In the future this could be handled by materializing appropriate
compares.
Fixes https://github.com/llvm/llvm-project/issues/156894.
Fix typo: IsGlobaLinkage -> IsGlobalLinkage in XCOFF (#161960)
Corrects the spelling of 'IsGlobaLinkage' to 'IsGlobalLinkage' in
XCOFF-related code, comments, and tests across the codebase.
TableGen: Account for Unsupported LibcallImpl in bitset size
The Unsupported case is special and doesn't have an entry in the
vector, and is directly emitted as the 0 case. This should be
harmless as it is, but could break if the right number of new
libcalls is added.
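A minimal sketch of why the off-by-one is usually harmless but can still
bite, assuming the bitset is stored in 64-bit words (the word size and all
names below are assumptions for illustration, not the emitter's actual code):

```cpp
#include <cstddef>

// Assumed storage granularity of the bitset; hypothetical for this sketch.
constexpr std::size_t BitsPerWord = 64;

// Number of words needed to hold NumBits bits, rounding up.
constexpr std::size_t wordsFor(std::size_t NumBits) {
  return (NumBits + BitsPerWord - 1) / BitsPerWord;
}

// NumImpls counts the entries in the vector, which excludes Unsupported.
constexpr std::size_t wordsIgnoringUnsupported(std::size_t NumImpls) {
  return wordsFor(NumImpls);
}

// Unsupported occupies value 0, so the valid values are 0..NumImpls.
constexpr std::size_t wordsCountingUnsupported(std::size_t NumImpls) {
  return wordsFor(NumImpls + 1);
}

// Rounding hides the difference for most counts...
static_assert(wordsIgnoringUnsupported(10) == wordsCountingUnsupported(10));
// ...but not when the count lands exactly on a word boundary.
static_assert(wordsIgnoringUnsupported(64) == 1);
static_assert(wordsCountingUnsupported(64) == 2);
```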
[SLP]Support non-ordered copyable argument in non-commutative instructions
If the non-commutative user has several identical operands and at least one
of them (but not the first) is copyable, this case must be taken into
account when calculating the number of dependencies. Otherwise, the
schedule bundle might not be scheduled correctly, causing a compiler
crash.
Fixes #162925
[Clang] Preserve more sugars in constraint evaluation (#162991)
Using the canonical form of SugarConverted was an oversight during the
iteration of e9972debc9. We now retain sugar for better diagnostics.
[ADT] Simplify CheckedInt::from with llvm::to_underlying (NFC) (#163038)
llvm::to_underlying, forward ported from C++23, conveniently packages
static_cast and std::underlying_type_t like so:
static_cast<std::underlying_type_t<EnumTy>>(E)
[ADT] Simplify addEnumValues with llvm::to_underlying (NFC) (#163037)
llvm::to_underlying, forward ported from C++23, conveniently packages
static_cast and std::underlying_type_t like so:
static_cast<std::underlying_type_t<EnumTy>>(E)
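For reference, a self-contained sketch of what such a helper looks like.
toUnderlying and Color are hypothetical stand-ins written for this example;
real code would call llvm::to_underlying instead.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for llvm::to_underlying (C++23 std::to_underlying):
// converts an enumerator to its underlying integer type.
template <typename EnumTy>
constexpr std::underlying_type_t<EnumTy> toUnderlying(EnumTy E) {
  return static_cast<std::underlying_type_t<EnumTy>>(E);
}

// Example enum used only for illustration.
enum class Color : std::uint8_t { Red = 1, Green = 2 };

// The result type follows the enum's underlying type, not int.
static_assert(
    std::is_same_v<decltype(toUnderlying(Color::Red)), std::uint8_t>);
static_assert(toUnderlying(Color::Green) == 2);
```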
[TableGen] Support for optional chain in Selection DAG nodes
This change adds a new property for Selection DAG nodes used in pattern
descriptions: SDNPMayHaveChain. A node with this property may have or
may not have a chain operand. For example, both of the following
variants become valid:
t3: f32,ch = fnearbyint t0, t2
t3: f32 = fnearbyint t2
The specific variant is determined during pattern matching, based on
whether the first operand is a chain (i.e. has the type MVT::Other).
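The check described above can be pictured with the toy model below. It is
only a sketch: ValueType, Operand and hasChainOperand are hypothetical names,
and ValueType::Other merely stands in for MVT::Other.

```cpp
#include <vector>

// Toy value types; Other plays the role of MVT::Other (the chain type).
enum class ValueType { F32, Other };

// Toy operand carrying only its type.
struct Operand {
  ValueType Type;
};

// Returns true if the node is in its chained form, i.e. its first operand
// is a chain. A node that may or may not have a chain is classified by
// this test at match time.
inline bool hasChainOperand(const std::vector<Operand> &Ops) {
  return !Ops.empty() && Ops.front().Type == ValueType::Other;
}
```

In this model the operands of `fnearbyint t0, t2` would be {Other, F32} and
classified as chained, while `fnearbyint t2` would be {F32} and classified as
chainless.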
This feature is intended to be used for floating point operations. They
have side effects in a strictfp environment and are pure functions in
the default FP environment. Currently each such operation requires two
opcodes - one for each kind of FP environment. These opcodes represent
the same operation and are processed similarly, which increases the amount
of code. With this feature, supporting the strictfp environment should be
easier, as it can use the same opcode as the default environment.
JobserverTest.cpp: Suppress a warning. [-Wunused-lambda-capture]
I don't know how to mark an item in a capture list as `maybe_unused`.
I also suspect `i` could be dropped from the by-value capture list.
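One common way to silence this kind of warning is a void cast inside the
lambda body. The sketch below is not the test's actual code; runJob and its
lambda are hypothetical, and it assumes the capture is only used in an
assert, which compiles away under NDEBUG.

```cpp
#include <cassert>

// Hypothetical example: `i` is captured by value but only referenced in an
// assert, so release builds would otherwise see it as an unused capture.
inline int runJob(int i) {
  int Result = 0;
  auto Job = [i, &Result] {
    (void)i; // explicit use, keeps the capture referenced in all builds
    assert(i >= 0 && "job index must be non-negative");
    Result = 1;
  };
  Job();
  return Result;
}
```

The alternative mentioned above, dropping `i` from the capture list, only
works if no build configuration references it in the body.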
[AArch64] Optimize extending loads of small vectors
Reduces the total number of loads and the number of moves between SIMD
registers and general-purpose registers.
[AArch64] Optimize DUP of extending loads to avoid GPR->FPR transfer
Loads the data into the SIMD register, thus sparing a physical register
and a potentially costly movement of data.
[VPlan] Set flags when constructing truncs using VPWidenCastRecipe.
VPWidenCastRecipes with Trunc opcodes were missing the correct OpType
for IR flags. Update createWidenCast to set the correct flags for
truncs, and use it consistently.
Fixes https://github.com/llvm/llvm-project/issues/162374.
[AArch64] Optimize DUP of extending loads to avoid GPR->FPR transfer
Loads the data into the SIMD register, thus sparing a physical register
and a potentially costly movement of data.
[AArch64] Optimize extending loads of small vectors
Reduces the total number of loads and the number of moves between SIMD
registers and general-purpose registers.