[libc++] Resolve LWG4308, correct `iterator` availability for `optional<T&>` (#173948)
Resolves #171345
Implements [proposed resolution for
LWG4308](https://cplusplus.github.io/LWG/issue4308) and removes
`const_iterator` from `optional<T&>`, which was missed.
- Constrains `iterator` to be available only if `T` is not an lvalue
reference or, for `optional<T&>`, if `T` is an object type and is not an
unbounded array
- Add a partial specialization for `__optional_iterator` for `T&`, which
only has the `iterator` type.
- Correct a static assert message as a drive-by
- Move the libcxx specific iterator test into the standard test because
the standard now specifies when the iterator should be available
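
As a rough illustration of the constraint in the first bullet, the rule can be written as a standalone trait. The name `has_optional_iterator_v` is hypothetical and is not the libc++ implementation; it only restates the condition from this message using standard type traits.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical helper (not a libc++ name): true when optional<T> would
// expose `iterator` under the rule described in this commit message.
template <class T, class U = std::remove_reference_t<T>>
inline constexpr bool has_optional_iterator_v =
    // Non-reference T: iterator is available.
    !std::is_lvalue_reference_v<T> ||
    // T is U&: U must be an object type and not an unbounded array.
    (std::is_object_v<U> &&
     !(std::is_array_v<U> && std::extent_v<U> == 0));

static_assert(has_optional_iterator_v<int>);
static_assert(has_optional_iterator_v<int&>);
static_assert(has_optional_iterator_v<int (&)[3]>);  // bounded array is fine
static_assert(!has_optional_iterator_v<int (&)[]>);  // unbounded array
static_assert(!has_optional_iterator_v<int (&)()>);  // function, not an object
```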
[AArch64] - Allow for aggressive unrolling, with non-zero LoopMicroOpBufferSize for Oryon. (#172422)
Because LoopMicroOpBufferSize was 0 in the Oryon machine model,
unrolling based on the runtime trip count was disabled. The new value is
a pseudo value, as Oryon-1 does not have a loop uop buffer in its
microarchitecture. The value 16 is empirical, inspired by the machine
model of cortex-a57, and can be tuned further if required.
[SelectionDAG] Use SLEB128 for signed integers in isel table instead of 'signed rotated'. NFC (#173928)
Previously, we used a VBR that stored the sign bit in bit 0 followed by
the absolute value in subsequent bits.
This patch changes it to use SLEB128, which discards redundant sign bits
but keeps the bits in the same positions. This uses the same number of
bytes to encode values, so it doesn't change the table size.
My goal is to remove OPC_EmitStringInteger as a special opcode type.
Instead, we can print the string directly with OPC_EmitInteger for any
string that has an enum value of 0..63.
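
For readers unfamiliar with the two encodings, here is a minimal sketch of both. The names `Encoded`, `encodeULEB128`, `encodeSignedRotated`, and `encodeSLEB128` are hypothetical and are not the TableGen emitter's functions. The sketch ignores INT64_MIN in the rotated form and assumes `>>` on a negative value is an arithmetic shift (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Hypothetical fixed-size buffer: a 64-bit value needs at most 10 bytes.
struct Encoded {
  std::uint8_t bytes[10] = {};
  unsigned size = 0;
};

// Plain VBR/ULEB128: 7 payload bits per byte, high bit set means "more".
constexpr Encoded encodeULEB128(std::uint64_t v) {
  Encoded out;
  do {
    std::uint8_t byte = v & 0x7f;
    v >>= 7;
    if (v != 0) byte |= 0x80;
    out.bytes[out.size++] = byte;
  } while (v != 0);
  return out;
}

// Old scheme: sign in bit 0, absolute value in the bits above it.
constexpr Encoded encodeSignedRotated(std::int64_t v) {
  std::uint64_t u = v >= 0
      ? (std::uint64_t(v) << 1)
      : (((std::uint64_t(0) - std::uint64_t(v)) << 1) | 1);
  return encodeULEB128(u);
}

// New scheme: SLEB128 stops once the remaining bits are pure sign extension.
constexpr Encoded encodeSLEB128(std::int64_t v) {
  Encoded out;
  bool more = true;
  while (more) {
    std::uint8_t byte = v & 0x7f;
    v >>= 7;  // assumed arithmetic shift
    bool signBit = (byte & 0x40) != 0;
    more = !((v == 0 && !signBit) || (v == -1 && signBit));
    if (more) byte |= 0x80;
    out.bytes[out.size++] = byte;
  }
  return out;
}

// Sample values only; this does not prove the table-size claim.
static_assert(encodeSignedRotated(-1).bytes[0] == 0x03);
static_assert(encodeSLEB128(-1).bytes[0] == 0x7f);
static_assert(encodeSignedRotated(1000).size == encodeSLEB128(1000).size);
```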
[mlir][tensor] Preserve encoding in `CollapseShapeOp::build` (#173720)
This PR updates `CollapseShapeOp::build` so that when the result type is
not explicitly provided, the inferred result type preserves the encoding
of the source tensor.
[GlobalISel] Implement G_UADDO/G_UADDE/G_SADDO/G_SADDE for computeKnownBits (#165497)
Addressing the carry out cases Matt mentioned in #159202.
Note: G_[US]SUB[OE] will be implemented in a different PR.
[SelectionDAG] Use uint8_t instead of unsigned char for isel MatcherTable. (#174014)
These are really the same type, but uint8_t is more accurate since we
make assumptions that a table element is 8 bits when we emit VBRs.
[Clang] Add NUW to the Sub in __builtin_clrsb expansion. (#174010)
The ctlz will produce a value in the range [1..bitwidth]. It can't
produce 0. This means the subtract of 1 will not have unsigned wrap.
It also has no signed wrap, but the optimizer can figure that out on its
own.
It's very likely InstCombine will just drop the NUW when it
canonicalizes to Add, but maybe it will be helpful in some case.
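
The range argument can be checked with a small sketch. It assumes the expansion has the shape `ctlz(x < 0 ? ~x : x) - 1` with ctlz defined as the bit width for a zero input; that shape is an assumption here, not something stated in this message. `countLeadingZeros32` and `clrsb32` are hypothetical names for a 32-bit illustration.

```cpp
#include <cstdint>

// Portable leading-zero count; returns 32 for x == 0.
constexpr unsigned countLeadingZeros32(std::uint32_t x) {
  unsigned n = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (x & mask) == 0;
       mask >>= 1)
    ++n;
  return n;
}

// Count of redundant sign bits, assuming the expansion shape named above.
constexpr unsigned clrsb32(std::int32_t x) {
  std::uint32_t bits = static_cast<std::uint32_t>(x);
  // Inverting negative inputs clears the top bit, so the operand of the
  // leading-zero count always has its sign bit clear.
  std::uint32_t folded = x < 0 ? ~bits : bits;
  unsigned lz = countLeadingZeros32(folded);  // always in [1, 32]
  return lz - 1;                              // lz >= 1, so no unsigned wrap
}

static_assert(clrsb32(0) == 31);
static_assert(clrsb32(-1) == 31);
static_assert(clrsb32(INT32_MIN) == 0);
```

Because the leading-zero count is never 0, the unsigned subtraction of 1 cannot wrap, which is the property the NUW flag records.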
[VPlan] Re-use common cast cost logic for VPReplicateRecipe (NFCI).
Move the logic to compute cast costs to getCostForRecipeWithOpcode and
use it for VPReplicateRecipe.
This should match the costs computed by the legacy cost model for scalar
casts.
[flang][cuda] Make copy to managed variable on host (#174012)
When the LHS has multiple symbols with the managed attribute, still
perform the copy on the host.
InstCombine: Introduce nsz flag on minimum/maximum in SimplifyDemandedFPClass
Alive isn't particularly happy with this in the case where
one of the inputs could be zero, but I think
it's wrong: https://alive2.llvm.org/ce/z/dF7V6k
nsz shouldn't permit introducing a -0 result where
there wasn't one in the input here.
InstCombine: Consider not-inf/nan context when simplifying fmul
When trying to simplify fmul into copysign, consider whether the
result can be nan and whether the flags rule out infinity for the
inputs.