[DirectX][ObjectYAML] Attempt to fix flaky PRIVPart.yaml (#206278)
This test was meant to round-trip YAML twice to ensure correct
processing of a non-4-byte-padded PRIV section.
However, the second invocation of yaml2obj had the wrong arguments (it was
reading from the test file instead of stdin). Fix that.
Also split the round-trips into several RUN lines, to make it clear on
which line an error occurs if the test is still flaky.
Reland "Make sanitizer special case list slash-agnostic" (#206250)
This changes the glob matcher for the sanitizer special case format so
that it treats `/` as matching both forward and back slashes.
When dealing with cross-compiles or build systems that don't normalize
slashes, it's possible to run into file paths with inconsistent
slashiness, e.g. `../..\v8/include\v8-internal.h` when [building
chromium](https://g-issues.chromium.org/issues/425364464).
With the current syntax, we can match this using an ugly kludge:
`src:*{/,\\}v8{/,\\}*`. However, since the format is explicitly for
listing file paths, it makes sense to treat `/` as denoting a path
separator rather than a literal forward slash. This allows us to write
the much more natural form `src:*/v8/*` and have it work on any
platform.
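To illustrate the intended matching rule, here is a minimal sketch of a `*`-only wildcard matcher in which a `/` in the pattern accepts either separator. The names `charMatches` and `slashAgnosticMatch` are hypothetical and this is not the actual special case list implementation, which supports more glob syntax.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper: a '/' in the pattern stands for either path
// separator; every other pattern character must match literally.
static bool charMatches(char Pat, char Text) {
  if (Pat == '/')
    return Text == '/' || Text == '\\';
  return Pat == Text;
}

// Iterative wildcard match supporting only '*' (any run of characters,
// including separators). Illustrative only.
bool slashAgnosticMatch(std::string_view Pattern, std::string_view Text) {
  const std::size_t NPos = std::string_view::npos;
  std::size_t P = 0, T = 0;
  std::size_t StarP = NPos, StarT = 0;
  while (T < Text.size()) {
    if (P < Pattern.size() && Pattern[P] == '*') {
      // Remember the star so we can backtrack and let it absorb more text.
      StarP = P++;
      StarT = T;
    } else if (P < Pattern.size() && charMatches(Pattern[P], Text[T])) {
      ++P;
      ++T;
    } else if (StarP != NPos) {
      // Mismatch: let the last star consume one more character.
      P = StarP + 1;
      T = ++StarT;
    } else {
      return false;
    }
  }
  // Only trailing stars may remain in the pattern.
  while (P < Pattern.size() && Pattern[P] == '*')
    ++P;
  return P == Pattern.size();
}
```

Under this sketch, the pattern `*/v8/*` accepts the mixed-separator path quoted above without needing the `{/,\\}` alternation.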
This is technically a behavior change, but it seems very unlikely to
come up in practice. It will only make a difference if a user has a
[16 lines not shown]
[HashRecognize] Rename ByteOrderSwapped to IsBigEndian (NFC) (#206243)
To avoid distinguishing bit-endianness from byte-endianness, rename
ByteOrderSwapped to IsBigEndian, which is algorithm-agnostic. In fact,
CRC is a bitwise algorithm, so it is the bit order that is reversed.
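As a rough illustration of what "bit order reversed" means for CRC, the sketch below shows an MSB-first and an LSB-first CRC-8 step. The function names and the 8-bit width are assumptions chosen for brevity; they are not taken from HashRecognize.

```cpp
#include <cassert>
#include <cstdint>

// Reverse the bit order of one byte.
std::uint8_t reverseBits(std::uint8_t V) {
  std::uint8_t R = 0;
  for (int I = 0; I < 8; ++I) {
    R = static_cast<std::uint8_t>((R << 1) | (V & 1));
    V = static_cast<std::uint8_t>(V >> 1);
  }
  return R;
}

// MSB-first ("big-endian") CRC-8 update for one input byte.
std::uint8_t crc8BigEndian(std::uint8_t Crc, std::uint8_t Byte,
                           std::uint8_t Poly) {
  Crc = static_cast<std::uint8_t>(Crc ^ Byte);
  for (int I = 0; I < 8; ++I)
    Crc = (Crc & 0x80) ? static_cast<std::uint8_t>((Crc << 1) ^ Poly)
                       : static_cast<std::uint8_t>(Crc << 1);
  return Crc;
}

// LSB-first ("little-endian") CRC-8 update; takes the bit-reversed polynomial.
std::uint8_t crc8LittleEndian(std::uint8_t Crc, std::uint8_t Byte,
                              std::uint8_t ReflectedPoly) {
  Crc = static_cast<std::uint8_t>(Crc ^ Byte);
  for (int I = 0; I < 8; ++I)
    Crc = (Crc & 1) ? static_cast<std::uint8_t>((Crc >> 1) ^ ReflectedPoly)
                    : static_cast<std::uint8_t>(Crc >> 1);
  return Crc;
}
```

The two variants are mirror images: feeding bit-reversed input and a bit-reversed polynomial to the LSB-first step yields the bit-reversed result of the MSB-first step, which is why the difference is one of bit order rather than byte order.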
[VPlan] Skip VPInst where mask is only operand in chain in isUsed (NFC) (#206286)
Update isUsedByLoadStoreAddress to skip VPInstructions where the operand
in the use chain is only used as a mask. Those do not contribute to the
load address, so they should not force scalarization.
Fixes a regression with f2459f9e
(https://github.com/llvm/llvm-project/pull/196842).
[X86] Add target verifier
Add an X86 TargetVerify and register it by triple so the
TargetVerifierPass dispatches to it for X86 modules. It performs no
checks yet; the subtarget-dependent checks are added in a follow-up.
[Target] Add target-independent TargetVerifier dispatcher
Introduce a target-dependent IR verification framework that can be run
from target-independent locations.
TargetVerify is a base class each backend subclasses to check a function
for constructs that are invalid for a particular target. Backends
register a factory keyed by Triple::ArchType via registerTargetVerify(),
typically from their LLVMInitialize<Target>Target().
TargetVerifierPass (registered as "target-verifier") is the dispatcher:
it reads the module triple and, if a verifier is registered for that
architecture, runs the generic IR verifier followed by the target's
TargetVerify. It is a no-op for targets that have not registered a
verifier, so it is safe to schedule from generic, target-independent
pipelines (e.g. `opt -passes=target-verifier`).
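The registration and dispatch scheme described above might look roughly like the following self-contained sketch. Only the names `TargetVerify` and `registerTargetVerify` come from this message; the signatures, the `ArchType` stand-in for `Triple::ArchType`, `ToyFunction`, `ToyVerify`, and `runTargetVerifier` are hypothetical and do not reflect the actual LLVM API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for Triple::ArchType (assumption for this sketch).
enum class ArchType { X86, AArch64 };

// Toy stand-in for an IR function.
struct ToyFunction {
  std::string Name;
  bool UsesUnsupportedConstruct;
};

// Base class each backend would subclass.
class TargetVerify {
public:
  virtual ~TargetVerify() = default;
  // Returns true when the function is valid for the target.
  virtual bool run(const ToyFunction &F) = 0;
};

using VerifierFactory = std::function<std::unique_ptr<TargetVerify>()>;

// Registry of factories keyed by architecture.
static std::map<ArchType, VerifierFactory> &verifierRegistry() {
  static std::map<ArchType, VerifierFactory> Registry;
  return Registry;
}

void registerTargetVerify(ArchType Arch, VerifierFactory Factory) {
  verifierRegistry()[Arch] = std::move(Factory);
}

// Hypothetical backend verifier with a single toy check.
class ToyVerify : public TargetVerify {
public:
  bool run(const ToyFunction &F) override {
    return !F.UsesUnsupportedConstruct;
  }
};

// What a backend might do from its initialization hook.
void registerToyX86Verify() {
  registerTargetVerify(ArchType::X86,
                       [] { return std::make_unique<ToyVerify>(); });
}

// Dispatcher: a no-op (reports success) when nothing is registered.
bool runTargetVerifier(ArchType Arch, const ToyFunction &F) {
  auto It = verifierRegistry().find(Arch);
  if (It == verifierRegistry().end())
    return true;
  // The generic IR verifier would run here before the target's checks.
  return It->second()->run(F);
}
```

Keeping the registry lookup in the dispatcher is what makes it harmless to schedule for targets that never register a verifier.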
[VPlan] Drop dead CostCtx/Range args from getScaledReductions (NFC) (#206285)
Neither parameter is referenced; the cost-checking they described moved
to createPartialReductions. Also remove the stale comment.
[LV] Add test for over-eager load scalarization (NFC). (#206283)
Add test showing over-eager load scalarization after f2459f9e
(https://github.com/llvm/llvm-project/pull/196842), when a load is only
used as mask of another load.
[DirectX] Test stripping debug info for DirectX (#206261)
Test that all debug info is stripped after changes in #201336.
Merging this PR to main instead of the stacked #204874.
[ASTMatchers][Docs] make dump_ast_matchers.py read classes from sources (#203784)
With this change, the `dump_ast_matchers.py` script no longer needs to probe
the network to search for classes.
This allows the script to run offline, which is needed for
https://github.com/llvm/llvm-project/pull/165472.
The script now operates on the assumption that all classes in AST/ are
documented at https://llvm.org/doxygen/ (which is true in general unless the
doxygen page is down).
[llubi] Add support for undef values (#205602)
Although we are planning to deprecate the undef value, it is still
widely used in the intermediate results of the pipeline, which blocks
the pass bisection. This patch uses `freeze poison` as a refinement of
undef.
Note that the undef value evaluates to different values each time the
user is executed. So it cannot be cached like other constants. A
temporary buffer is introduced to take ownership of these values and
avoid breaking the interface (although this is a bit ugly...). This will
also be used by a follow-up patch for ptrtoint/inttoptr.
From my experience, it is enough for test case reduction of middle-end
miscompilation bugs (there are still counterexamples like
https://github.com/dtcxzyw/llvm-autoreduce/issues/61). However, when
processing backend miscompilation bugs, lli typically uses a garbage
value, so that llvm-reduce may produce an invalid result. I think we may
need to introduce two flags to mitigate this issue: one for poisoning
[5 lines not shown]
[SmallVector] Out-of-line the trivially-copyable push_back grow path (#206213)
In the approximately trivially-copyable specialization, push_back's grow
path does not early return. Both Clang and GCC likely keep `this` and
`Elt` live across the out-of-line `grow_pod` call, saving and restoring
them in the prologue/epilogue. Shrink wrapping can't sink it (the saved
values are used in the store block the fast path also reaches).
Move the grow-and-store into a noinline `growAndPushBack` helper and
tail call it. The fast path needs no callee-saved registers.
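The shape of the change can be sketched with a toy container. `TinyIntVec` is a hypothetical stand-in, not SmallVector itself, and whether the compiler actually emits a tail call depends on the optimizer; `[[gnu::noinline]]` is a GCC/Clang attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy growable int array illustrating an out-of-line grow-and-store path.
class TinyIntVec {
  int *Data = nullptr;
  unsigned Size = 0;
  unsigned Capacity = 0;

  // Slow path: grow the buffer AND store the element, so the caller has
  // nothing left to do after this call returns.
  [[gnu::noinline]] void growAndPushBack(int Elt) {
    unsigned NewCap = Capacity ? Capacity * 2 : 4;
    int *NewData =
        static_cast<int *>(std::realloc(Data, NewCap * sizeof(int)));
    if (!NewData)
      std::abort();
    Data = NewData;
    Capacity = NewCap;
    Data[Size++] = Elt;
  }

public:
  TinyIntVec() = default;
  TinyIntVec(const TinyIntVec &) = delete;
  TinyIntVec &operator=(const TinyIntVec &) = delete;
  ~TinyIntVec() { std::free(Data); }

  void push_back(int Elt) {
    if (Size >= Capacity)
      return growAndPushBack(Elt); // last action: eligible for a tail call
    Data[Size++] = Elt;            // fast path: no call, nothing to save
  }

  unsigned size() const { return Size; }
  int operator[](unsigned I) const { return Data[I]; }
};
```

Because the slow path performs the store itself, nothing in `push_back` needs to stay live across the call.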
`push_back(int)` drops from 14 to 7 instructions on x86-64.
```
// void vec_pb_int(llvm::SmallVectorImpl<int>&v, int x){ v.push_back(x); }
mov eax, dword ptr [rdi + 8]
cmp eax, dword ptr [rdi + 12]
jae _ZN4llvm23SmallVectorTemplateBaseIiLb1EE15growAndPushBackEi # TAILCALL
mov rcx, qword ptr [rdi]
[12 lines not shown]
[BOLT] Work around BSD sed's lack of in-place editing support (#206183)
BSD sed does not implement `-i` the same way as GNU sed. Use a
copy-and-replace approach instead of in-place editing to ensure
compatibility.
[gn] use `sources` instead of `inputs` for libc++ header copy action (#206263)
`sources` and `inputs` have the same semantics for GN action targets,
but the sync script can only handle `sources`.
Follow-up to cd98648925531663.
[libc++][NFC] Mark random_device::__padding_ as [[maybe_unused]] (#206248)
Instead of pushing and popping warnings we can just mark the offending
member as `[[maybe_unused]]`, improving compile times a bit and
simplifying the code.
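As a small illustration of the technique, the sketch below marks a never-read private member with `[[maybe_unused]]`. `RandomDeviceLike` and its members are hypothetical and are not the libc++ definition.

```cpp
#include <cassert>

// Hypothetical class with a padding member that is never read.
class RandomDeviceLike {
  int Fd = -1;
  // Kept only to preserve the object layout; the attribute documents that
  // the member is intentionally unused, so no warning pragmas are needed.
  [[maybe_unused]] char Padding[4] = {};

public:
  int fd() const { return Fd; }
};
```

The attribute sits on the declaration itself, so the intent stays visible next to the member rather than in surrounding diagnostic push/pop pragmas.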
[AggressiveInstCombine] Factor out the beginning of foldSelectSplitCTTZ/CTLZ into common entry point. NFC (#206220)
Both start by matching a select and an eq/ne compare with 0.
Assisted-by: claude
Reland: [NFC][Support] Implement slash-agnostic path matching in GlobPattern (#206251)
Add a SlashAgnostic option to GlobPattern to allow matching path
separators (both forward slashes and backslashes) agnostically.
When enabled:
- We conservatively reduce the plain prefix and suffix by treating path
separators as metacharacters. This ensures that path separators are
matched via the slash-agnostic state machine rather than plain string
comparison.
- Brackets containing slashes are adjusted to match both separators.
- Character comparisons in the state machine (matchChar) treat '/' and
'\' as equivalent.
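The first and last points can be sketched as follows. `plainPrefixLength` and `sameChar` are hypothetical helpers, and the metacharacter set used here is an assumption; the real GlobPattern internals may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Length of the leading part of the pattern that can be compared as a
// plain string. With SlashAgnostic set, '/' also ends the plain prefix,
// so separators are always handled by the matcher instead.
std::size_t plainPrefixLength(std::string_view Pattern, bool SlashAgnostic) {
  std::string_view Meta = SlashAgnostic ? "?*[{\\/" : "?*[{\\";
  std::size_t Pos = Pattern.find_first_of(Meta);
  return Pos == std::string_view::npos ? Pattern.size() : Pos;
}

// Character comparison that treats '/' and '\' as the same separator.
bool sameChar(char A, char B) {
  auto IsSep = [](char C) { return C == '/' || C == '\\'; };
  return A == B || (IsSep(A) && IsSep(B));
}
```

For the pattern `src/v8/*`, the plain prefix shrinks from `src/v8/` to `src` when the option is enabled, which is the conservative reduction described in the first point.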
For #149886.
Reland of #202854, which was incorrectly reverted in #205409.
https://github.com/llvm/llvm-project/pull/202854#issuecomment-4813462549
[3 lines not shown]