[SLP] Fix GEP cost computation for load vectorization cost estimates
Pass Instruction::Load instead of Instruction::GetElementPtr to
getGEPCosts in isMaskedLoadCompress and CheckForShuffledLoads.
These call sites estimate costs for wide contiguous loads and sub-vector
load patterns, not for masked gather pointer vector formation. Using
Instruction::GetElementPtr incorrectly triggered the gather-style cost
path, which computes vector GEP formation costs. Since the call sites
already add scalarization overhead for pointer vector building
separately, this led to double-counting of pointer costs and inaccurate
vectorization decisions.
Reviewers: hiraditya, RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/191728
[ORC] Move MemoryAccess ownership out of ExecutorProcessControl. (#191715)
Similar to the DylibManager change in e55fb5de0f9, this removes an
unnecessary coupling between ExecutorProcessControl and MemoryAccess,
allowing clients to select MemoryAccess implementations independently.
To simplify the transition, the
ExecutorProcessControl::createDefaultMemoryAccess method will return an
instance of whatever MemoryAccess the ExecutorProcessControl
implementation had been using previously.
ValueTracking: Handle frexp exp in computeKnownConstantRange
Compute the bounds based on the known exponent range.
Only handles IEEE cases since I don't see an easy way to
get the bounds in general.
Test uses instcombine instead of checking for the range
attribute, since apparently attributor doesn't handle introducing
range attributes from computeConstantRange.
[SLP]Fix dominance failure when value is copyable in one PHI entry but non-copyable in another
When a value is treated as a copyable element in one tree entry and as a
non-copyable element in another, both feeding into PHI nodes, the
scheduler could produce vectorized IR where an instruction does not
dominate all its uses. Bail out of scheduling in tryScheduleBundle when
this conflict is detected to prevent generating broken modules.
Fixes #191714
Reviewers:
Pull Request: https://github.com/llvm/llvm-project/pull/191724
[flang][OpenMP] Identify DO loops affected by loop-associated construct
This is to identify iteration variables of DO loops affected by an OpenMP
loop construct. These variables are privatized as per data-sharing rules.
[SLP] Fix GEP cost computation for load vectorization cost estimates
Pass Instruction::Load instead of Instruction::GetElementPtr to
getGEPCosts in isMaskedLoadCompress and CheckForShuffledLoads.
These call sites estimate costs for wide contiguous loads and sub-vector
load patterns, not for masked gather pointer vector formation. Using
Instruction::GetElementPtr incorrectly triggered the gather-style cost
path, which computes vector GEP formation costs. Since the call sites
already add scalarization overhead for pointer vector building
separately, this led to double-counting of pointer costs and inaccurate
vectorization decisions.
Reviewers: hiraditya, RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/191620
[libc++][numeric] P4052R0: Renaming saturation arithmetic functions (#189574)
Implements P4052R0.
Also renames:
- the internal names for consistency.
- test files (no changes to the contents but the function names).
Fixes: #189589
---------
Co-authored-by: A. Jiang <de34 at live.cn>
[AMDGPU][Scheduler] Use MIR-level rematerializer in rematerialization stage (#189491)
This makes the scheduler's rematerialization stage use the
target-independent rematerializer. Previously duplicate logic is
deleted, and restrictions are put in place in the stage so that the same
constraints as before apply on rematerializable registers (as the
rematerializer is able to expose many more rematerialization
opportunities than what the stage can track at the moment). Consequently
it is not expected that this change improves performance overall, but it
is a first step toward being able to use the rematerializer's more
advanced capabilities during scheduling.
This is *not* a NFC for 2 reasons.
- Score equalities between two rematerialization candidates with
otherwise equivalent score are decided by their corresponding register's
index handle in the rematerializer (previously the pointer to their
state object's value). This is determined by the rematerializer's
register collection order, which is different from the stage's old
[12 lines not shown]
[flang][OpenMP] Implement GetGeneratedNestDepthWithReason
For a loop-nest-generating construct this function returns the number of
loops in the generated loop nest.
A loop-nest-transformation construct can be thought as replacing N nested
loops with K nested loops, where
N = GetAffectedNestDepthWithReason
K = GetGeneratedNestDepthWithReason
[JITLink] Use NonOwningSymbolStringPtrs in ExternalSymbolsMap. (#191634)
SymbolStringPtr comparisons should be more efficient that string
comparisons. Fixes a FIXME.
[llvm-readobj][ELF] Use WrappedError to filter duplicates
Switch from StringError to WrappedError. Errors of the form "Prefix:
Error" can now be filtered out based on the underlying error while
preserving distinct prefixes, resulting in clearer llvm-readobj output.
[Object][ELF] Pass Error to WarningHandler
Warning consumers may need to handle errors based on their type. Pass
the Error object instead of a string representation to enable this. This
also brings WarningHandler in line with Support/WithColor.h.