[InstCombine] Fold fpto{u,s}i of int-cast fdiv into {u,s}div (#205853)
Fixes #205305.
Adds an InstCombine fold for the pattern `fpto{u,s}i (fdiv ({u,s}itofp
X), C)` to `{u,s}div X, C`.
Safe when:
- Unsigned: C > 0 and the integer width N <= the FP mantissa width p.
- Signed: C != 0 and N - 1 <= p, excluding (X == INT_MIN, C == -1).
See linked issue for detailed reasoning.
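The conditions above can be checked by brute force at a small width. The following is a minimal sketch, not the InstCombine implementation: it assumes 8-bit integers and `float` (24-bit significand, so N <= p holds), and it assumes C is an integer-valued constant. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Counts (X, C) pairs where fptoui(fdiv(uitofp X, C)) differs from udiv X, C.
// Illustrative only: N = 8 integer bits against float's 24-bit significand.
int countUnsignedMismatches() {
  int Mismatches = 0;
  for (int X = 0; X <= UINT8_MAX; ++X) {
    for (int C = 1; C <= UINT8_MAX; ++C) { // unsigned case requires C > 0
      float Quot = static_cast<float>(X) / static_cast<float>(C);
      assert(Quot >= 0.0f && Quot <= 255.0f); // keeps the conversion in range
      uint8_t ViaFP = static_cast<uint8_t>(Quot); // truncating FP-to-int
      uint8_t ViaInt = static_cast<uint8_t>(X / C);
      Mismatches += (ViaFP != ViaInt);
    }
  }
  return Mismatches;
}

// Same check for the signed case: C != 0, excluding (INT8_MIN, -1).
int countSignedMismatches() {
  int Mismatches = 0;
  for (int X = INT8_MIN; X <= INT8_MAX; ++X) {
    for (int C = INT8_MIN; C <= INT8_MAX; ++C) {
      if (C == 0)
        continue; // division by zero is excluded
      if (X == INT8_MIN && C == -1)
        continue; // quotient 128 does not fit in 8 signed bits
      float Quot = static_cast<float>(X) / static_cast<float>(C);
      int ViaFP = static_cast<int>(Quot); // truncates toward zero
      int ViaInt = X / C;                 // also truncates toward zero
      Mismatches += (ViaFP != ViaInt);
    }
  }
  return Mismatches;
}
```

Both counters are expected to return 0, since neither rounding step can cross an integer boundary at this width.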
[InstCombine] Fold commuted add of udiv/urem by two (#206272) (#207462)
Fixes #206272.
`SimplifyAddWithRemainder` folds `(X / C0) * C1 + (X % C0) * C2`,
treating `and X, lowmask` as a remainder and `lshr X, N` as a division.
The commuted form `add (and X, C), (lshr X, N)` was missed because the
operand-order swap only recognized a real `urem`/`srem`.
Now the fold is tried with both operand orders instead of relying on
that swap. Verified with Alive2.
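As a rough illustration of why operand order should not matter, the sketch below checks the by-two case in plain C++. The rewritten form is an assumption derived from `X = (X / C0) * C0 + X % C0`; the commit message does not state the exact output InstCombine produces, and all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// (X / C0) * C1 + (X % C0) * C2, in wrapping 32-bit arithmetic.
uint32_t generalForm(uint32_t X, uint32_t C0, uint32_t C1, uint32_t C2) {
  assert(C0 != 0);
  return (X / C0) * C1 + (X % C0) * C2;
}

// Substituting X % C0 = X - (X / C0) * C0 gives this equivalent form.
uint32_t rewrittenForm(uint32_t X, uint32_t C0, uint32_t C1, uint32_t C2) {
  assert(C0 != 0);
  return (X / C0) * (C1 - C0 * C2) + X * C2;
}

// By-two case: `and X, 1` is the remainder, `lshr X, 1` is the quotient.
uint32_t maskFirst(uint32_t X) { return (X & 1u) + (X >> 1); }  // commuted
uint32_t shiftFirst(uint32_t X) { return (X >> 1) + (X & 1u); }
uint32_t folded(uint32_t X) { return X - (X >> 1); }

// Checks both operand orders against the folded form on a few samples.
bool foldHoldsOnSamples() {
  for (uint32_t X : {0u, 1u, 2u, 7u, 1000u, 0x7fffffffu, 0xffffffffu}) {
    if (maskFirst(X) != folded(X) || shiftFirst(X) != folded(X))
      return false;
    if (generalForm(X, 2, 1, 1) != rewrittenForm(X, 2, 1, 1))
      return false;
  }
  return true;
}
```

This is only a sanity check on sample values; the Alive2 verification mentioned above covers the actual IR transform.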
Supersedes #207249 (re-opened from the correct account; already
incorporates the both-operand-orders refactor suggested there by
nikic).
Prepared with AI assistance per the [LLVM AI Tool
Policy](https://llvm.org/docs/AIToolPolicy.html); not a "good first
issue".
[3 lines not shown]
[MC][NFC] Store SubTypeKV names as string table (#207580)
This moves the large SubTypeKV arrays to .rodata, as they no longer
contain the key pointers that need to be relocated.
Additionally, remove the largely redundant CPUNames arrays and integrate
the AArch64 aliases into the sorted string table. There was really no
need to introduce these 17 kiB arrays solely for including AArch64
aliases in help output (added in b6c22a4).
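The general idea of replacing key pointers with offsets into one string table can be sketched as follows. This is an illustration under assumptions, not the generated MC tables: the entry layout, the names `Entry`, `Names`, `Table`, and `lookup`, and the sample CPU names are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <string_view>

// All names live in one NUL-separated blob. Entries hold offsets, not
// pointers, so the table needs no dynamic relocations.
constexpr char Names[] = "alpha\0beta\0gamma";

struct Entry {
  uint32_t NameOffset; // offset of the name inside Names
  uint32_t ModelIndex; // hypothetical payload
};

// Sorted by name so that lookup can use binary search.
constexpr Entry Table[] = {{0, 10}, {6, 20}, {11, 30}};

std::string_view nameOf(const Entry &E) {
  return std::string_view(Names + E.NameOffset); // reads up to the next NUL
}

// Returns the matching entry, or nullptr if the name is not present.
const Entry *lookup(std::string_view Key) {
  const Entry *It = std::lower_bound(
      std::begin(Table), std::end(Table), Key,
      [](const Entry &E, std::string_view K) { return nameOf(E) < K; });
  if (It == std::end(Table) || nameOf(*It) != Key)
    return nullptr;
  return It;
}
```

Because `Entry` contains only integers, an array of it can be placed in read-only data without any load-time fixups, which is the property the commit relies on.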
[GlobalISel] Add or_and_and pattern from SelectionDAG (#204618)
This PR adds the `fold or (xor x, y), (x and/or y) --> or x, y` pattern
from SelectionDAG to GlobalISel.
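The identity behind this pattern can be confirmed exhaustively at a small width. A minimal sketch in plain C++ (the function name is hypothetical, and this is not the GlobalISel combiner code):

```cpp
#include <cassert>
#include <cstdint>

// Checks, for every pair of 8-bit values, that
//   (x ^ y) | (x & y) == x | y   and   (x ^ y) | (x | y) == x | y.
bool orOfXorFoldHolds() {
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned Y = 0; Y < 256; ++Y) {
      uint8_t Want = static_cast<uint8_t>(X | Y);
      uint8_t WithAnd = static_cast<uint8_t>((X ^ Y) | (X & Y));
      uint8_t WithOr = static_cast<uint8_t>((X ^ Y) | (X | Y));
      if (WithAnd != Want || WithOr != Want)
        return false;
    }
  }
  return true;
}
```

The bits where x and y differ come from the xor and the bits where both are set come from the and, so together they cover exactly `x | y`.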
[CodeGen][NFC] Store CPU model index in SubTypeKV (#207351)
Instead of storing the pointer to the CPU model, store all CPU models in
an array and store the index. This is preliminary work for moving
SubTypeKV to .rodata.
NB: the scheduling models also take a fair amount of space in
.data.rel.ro, with SchedModels ~13kiB, ModelProcResources ~57kiB.
[GenericDomTreeConstruction] Use 0-based DFS numbering (#207524)
Number DFS-visited nodes from 0 and drop the unused index-0 sentinel
from NumToNode/NumToInfo/IDoms.
`Unvisited = 0` marks unvisited nodes by DFS. 0 is now the DFS root, or
the virtual root for postdominators.
Decrease the inline capacity for NumToNode/NodeInfos, which doesn't seem
to matter. sqlite3's p90 block count is 29.
[libc++] Mark LWG4098 as resolved (#206295)
Already implemented and tested in the scope of the full implementation
for `adjacent_view` (1e15dbe311eb08462e7a68fcb8b5850632e24aff).
Closes #105353
[SPARC] Don't combine misaligned memory ops with BSWAP (#206345)
Doing so would produce a misaligned LD*A/ST*A instruction, which raises
a bus error.
This should fix the failure in the `clamscan` test.
[LifetimeSafety] Support field-sensitivity in lifetime tracking
This patch enables field-sensitivity when tracking lifetimes of nested objects.
- FactsGenerator now generates `PathElement::getField` for `MemberExpr` accesses, mapping fields to loans.
- LoanPropagation now propagates field paths along flow facts, appending fields to base loans.
- Removes false-positive warnings in `invalidations.cpp` where modifications to one field were incorrectly reported as invalidating iterators/pointers to another field.
- Adds comprehensive unit tests checking nested field access and placeholder fields.
TAG=agy
CONV=2cfd8d00-18d7-4a03-8d78-2aba2f9a8f23
[LifetimeSafety][NFC] Update Checker to use prefix comparison interfaces
This patch switches the Checker's expiry and invalidation checks to use `AccessPath::isPrefixOf` instead of equality (`==`).
Since all generated access paths are currently empty, `isPrefixOf` is behaviorally identical to `==` (NFC). This prepares the checker to handle nested paths (fields and container interiors) in subsequent commits.
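To illustrate why the switch is NFC for empty paths, here is a minimal sketch. It assumes an access path is a base plus a list of path elements; the struct below is a hypothetical stand-in that uses strings, not the real `AccessPath` class.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in: the real AccessPath uses different types.
struct AccessPath {
  std::string Base;                  // stand-in for the base object
  std::vector<std::string> Elements; // stand-in for path elements

  bool operator==(const AccessPath &Other) const {
    return Base == Other.Base && Elements == Other.Elements;
  }

  // True if this path names Other itself or an enclosing object of Other.
  bool isPrefixOf(const AccessPath &Other) const {
    return Base == Other.Base && Elements.size() <= Other.Elements.size() &&
           std::equal(Elements.begin(), Elements.end(),
                      Other.Elements.begin());
  }
};
```

With empty element lists on both sides, `isPrefixOf` reduces to comparing bases, which is exactly what `==` does. The two diverge only once a path such as `{"s", {"f"}}` is compared against `{"s", {}}`.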
TAG=agy
CONV=2cfd8d00-18d7-4a03-8d78-2aba2f9a8f23
[LifetimeSafety][NFC] Refactor AccessPath and Loan representations
This patch refactors the internal representations of `AccessPath` and `Loan` to support path elements, preparing for field-sensitive and interior-sensitive lifetime tracking.
- Introduces `PathElement` representing a field or interior dereference.
- Refactors `AccessPath` to contain a base and a list of `PathElement`s.
- Updates `Loan` and `LoanManager` to use the new `AccessPath` structure.
- Refactors debug dump formatting to output path elements if present.
- Updates Checker and FactsGenerator to compile with the new interfaces, keeping logic behaviorally identical (NFC).
TAG=agy
CONV=2cfd8d00-18d7-4a03-8d78-2aba2f9a8f23
[LifetimeSafety] Add multi-block support to buildOriginFlowChain (#204592)
After introducing `buildOriginFlowChain` to use-after-scope diagnostics,
it should support multi-block analysis. This also allows it to be reused
by other diagnostics.
In some loops, `UseFact` may appear before `OriginFlowFact`:
```cpp
void for_loop_use_before_loop_body(MyObj safe) {
  MyObj* p = &safe;
  for (int i = 0; i < 1; ++i) {
    (void)*p;
    MyObj s;
    p = &s;
  }
  (void)*p;
}
```
[6 lines not shown]
[clang][CIR] Add lowering for Neon rounding builtins (#195021)
This PR adds CIR lowering for AArch64 NEON rounding builtins:
- vrnd (trunc), vrnda (round), vrndi (nearbyint), vrndm (floor),
vrndn (roundeven), vrndp (ceil), vrndx (rint)
- vrnd32x, vrnd32z, vrnd64x, vrnd64z (v8.5-a FRINT variants)
The standard rounding builtins lower to the corresponding CIR ops
(cir.trunc, cir.round, etc.). The vrndi_v/vrndiq_v cases are handled
in the common NEON switch since they enter via AArch64SIMDIntrinsicMap
(NEONMAP0). The vrnd32/64 builtins use NEONMAP1 entries with their
aarch64.neon.frint* intrinsic names.
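For readers unfamiliar with the rounding modes named above, the scalar `<cmath>` functions with the same names behave as in the sketch below. This only illustrates the rounding semantics in standard C++ under the default round-to-nearest-even mode; it is not the CIR lowering, and the `Rounded` struct and `roundAll` are hypothetical. `roundeven` is omitted because it is not part of the C++ standard library before the C23-based revisions.

```cpp
#include <cassert>
#include <cmath>

// One result per rounding flavour, named after the list above.
struct Rounded {
  double Trunc;     // vrnd:  toward zero
  double Round;     // vrnda: to nearest, ties away from zero
  double NearbyInt; // vrndi: current rounding mode, no inexact exception
  double Floor;     // vrndm: toward minus infinity
  double Ceil;      // vrndp: toward plus infinity
  double RInt;      // vrndx: current rounding mode, may raise inexact
};

Rounded roundAll(double V) {
  return {std::trunc(V), std::round(V), std::nearbyint(V),
          std::floor(V), std::ceil(V),  std::rint(V)};
}
```

For a tie such as 2.5, `Round` gives 3 while `NearbyInt` and `RInt` give 2 under the default mode, which is the main behavioural difference between the variants.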
The lowering follows the existing implementation in
CodeGen/TargetBuiltins/ARM.cpp.
Prior to this patch, the original neon-intrinsics.c had zero f32
standard rounding tests:
[6 lines not shown]
[RISC-V][RVY] Initial ISAInfo support for RVY
RVY is a new base ISA, so the syntax to enable it is rv32y/rv64y.
Since the compressed instructions reuse the space for Zcf (RV32) and
Zcd (RV64), those are marked as incompatible and the logic for C/Zce
is updated as part of this PR.
RVY can also extend RVE instead of RVI (as is done for CHERIoT), but the
official arch string syntax for that has not been finalized yet.
Related discussion on that includes the "long base name" proposal:
https://lists.riscv.org/g/tech-unprivileged/message/1134
Reviewers: topperc, lenary, jrtc27
Pull Request: https://github.com/llvm/llvm-project/pull/201931
[VPlan] Move consecutive vector pointer construction to VPBuilder (NFC). (#207563)
Introduce VPBuilder::createConsecutiveVectorPointer to create vector
pointers for consecutive accesses. This enables re-use in follow-up
changes.
[clang-format] Fix BlockIndent compat mapping of AlignAfterOpenBracket (#207187)
be11e2b3d25 (#192283) replaced the `[[fallthrough]]` chain in the
`AlignAfterOpenBracket` backward-compatibility switch with explicit
per-case assignments. In the `BAS_BlockIndent` case the
`BreakBeforeCloseBracket{BracedList,Function,If} = true` assignments are
immediately overwritten with `false`, and
`BreakAfterOpenBracket{BracedList,Function,If}` (previously inherited
from the `BAS_AlwaysBreak` case via fallthrough) are never set. As a
result, `AlignAfterOpenBracket: BlockIndent` parses to the same flag set
as `Align`, silently dropping the block-indent style for existing
configurations.
Restore the pre-#192283 mapping and pin the full BlockIndent flag
mapping in ConfigParseTest so the compat shim cannot regress silently
again.
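The shape of the regression can be sketched with a reduced model. Everything below is a simplification for illustration: only the `Function` variants of the two flags are modelled, the `Flags` struct and both mapping functions are hypothetical, and the real compat code in clang-format differs.

```cpp
#include <cassert>

enum BracketAlignmentStyle { BAS_Align, BAS_AlwaysBreak, BAS_BlockIndent };

// Reduced flag set: one representative of each flag family.
struct Flags {
  bool BreakAfterOpenBracketFunction = false;
  bool BreakBeforeCloseBracketFunction = false;
};

// Models the regression: the close-bracket flag is set and then immediately
// overwritten, and the open-bracket flag is never set for BlockIndent.
Flags mapBuggy(BracketAlignmentStyle S) {
  Flags F;
  switch (S) {
  case BAS_Align:
    break;
  case BAS_AlwaysBreak:
    F.BreakAfterOpenBracketFunction = true;
    break;
  case BAS_BlockIndent:
    F.BreakBeforeCloseBracketFunction = true;
    F.BreakBeforeCloseBracketFunction = false; // clobbers the line above
    break;
  }
  return F;
}

// Models the intended mapping: BlockIndent sets both flag families.
Flags mapFixed(BracketAlignmentStyle S) {
  Flags F;
  switch (S) {
  case BAS_Align:
    break;
  case BAS_AlwaysBreak:
    F.BreakAfterOpenBracketFunction = true;
    break;
  case BAS_BlockIndent:
    F.BreakAfterOpenBracketFunction = true;
    F.BreakBeforeCloseBracketFunction = true;
    break;
  }
  return F;
}
```

In the buggy model `BAS_BlockIndent` yields the same flags as `BAS_Align`, which is the silent downgrade described above. A test that pins every flag for `BlockIndent` catches this class of mistake.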
Fixes #207186.
Note for the release branch: if #205920 (backport of #192283 to
release/22.x) lands, this fix needs to be backported together with it.
[MC] Generate FeatureKV with compact string table (#206331)
FeatureKV is responsible for a fair amount of .data.rel.ro size and
relocations; in an all-target build, this amounts to ~139 kiB that need
to be touched on every startup. Therefore, store strings adjacent to the
SubtargetFeatureKV in memory and reference the strings via relative
offsets to avoid dynamic relocations.