[mlir][ODS][NFC] Deduplicate `ref` and `qualified` handling (#91080)
Both the attribute and type format generator and the op format generator
independently implemented the parsing and verification of the `ref` and
`qualified` directives with little to no differences.
This PR moves the implementation of these into the common `FormatParser`
class to deduplicate the implementations.
[LV,LAA] Don't vectorize loops with load and store to invar address.
Code checking stores to invariant addresses and reductions made an
incorrect assumption that the case of both a load & store to the same
invariant address does not need to be handled.
In some cases when vectorizing with runtime checks, there may be
dependences with a load and store to the same address, storing a
reduction value.
Update LAA to separately track if there was a store-store and a
load-store dependence with an invariant address.
Bail out early if there was a load-store dependence with an invariant
address. If there was a store-store one, still apply the logic checking
if they all store a reduction.
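As a rough illustration of the pattern described above, the hypothetical function below loads from and stores to the same loop-invariant address (`*acc`) on every iteration. The function and parameter names are invented for this sketch and do not come from the patch or its tests.

```cpp
#include <cstddef>

// Hypothetical example: `acc` is loop-invariant, and every iteration both
// loads from it and stores to it. If `acc` may alias `src`, the running
// sum cannot simply be kept in a register for the whole loop.
void accumulate(int *acc, const int *src, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    *acc = *acc + src[i]; // load and store to the same invariant address
}
```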
[Transforms] Use StringRef::operator== instead of StringRef::equals (NFC) (#91072)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator==/!= outnumber StringRef::equals by a factor of
31 under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
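A minimal sketch of the readability point, using `std::string_view` as a stand-in for `StringRef` so that it compiles without LLVM headers (an assumption of this example; `isNotFoo` is a hypothetical helper):

```cpp
#include <string_view>

// std::string_view stands in for llvm::StringRef here.
bool isNotFoo(std::string_view s) {
  // With StringRef::equals this would read: !s.equals("foo")
  return s != "foo";
}
```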
Revert "llvm/lib/CodeGen/TargetSchedule.cpp:132:12: warning: Assert statement modifies 'NIter'" (#91079)
Reverts llvm/llvm-project#90982
NIter was only declared in !NDEBUG, and only used for assertions - so it
was correct that it was incremented inside the assertion. (& in fact now
the non-asserts build fails, because the variable is incremented even
though it isn't declared)
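The pattern can be sketched as follows. This is a hypothetical stand-in, not the actual TargetSchedule.cpp code: the counter exists only in asserts builds, so the increment has to live inside `assert()` to disappear together with the declaration when `NDEBUG` is defined.

```cpp
#include <cassert>

// Hypothetical function illustrating a debug-only iteration counter.
int countDownToZero(int start) {
#ifndef NDEBUG
  int NIter = 0; // only declared in asserts builds
#endif
  int steps = 0;
  while (start > 0) {
    --start;
    ++steps;
    // The increment is part of the assert expression, so it is compiled
    // out along with the declaration in non-asserts builds.
    assert(++NIter <= 1000 && "loop failed to converge");
  }
  return steps;
}
```

Moving `++NIter` out of the `assert` would reference an undeclared variable when `NDEBUG` is defined, which is the build failure the revert describes.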
[clang][CodeGen] Propagate pragma set fast-math flags to floating point builtins (#90377)
This is a fix for issue #87758, where fast-math flags are not
propagated to all builtins.
It seems that fast-math flags set by pragmas were only propagated to calls
of unary floating point builtins. This patch also propagates them to
binary and ternary floating point builtins.
[Support] Use StringRef::operator== instead of StringRef::equals (NFC) (#91042)
I'm planning to remove StringRef::equals in favor of
StringRef::operator==.
- StringRef::operator== outnumbers StringRef::equals by a factor of 25
under llvm/ in terms of their usage.
- The elimination of StringRef::equals brings StringRef closer to
std::string_view, which has operator== but not equals.
- S == "foo" is more readable than S.equals("foo"), especially for
!Long.Expression.equals("str") vs Long.Expression != "str".
[libc++] Adjust some of the [rand.dist] critical values that are too strict (#88669)
- Most critical values are determined empirically by running each test 51
times with a different PRNG seed and finding the smallest symmetric
interval around the median that contains 90% of the sample means,
variances, etc.
- For the Kolmogorov-Smirnov tests, the alpha=0.1 critical value for
large N is 1.224/sqrt(N).
- For normally distributed variates, the sample kurtosis is distributed as
Normal(0, 24/N). For N=1e5, this gives a 90% confidence interval of
0+/-0.0255. For Binomial(40, 0.25), which is approximately normal, the
kurtosis is -0.0167, so the relative 90% CI is large, on the order of
[3 lines not shown]
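The two closed-form values quoted above can be reproduced with a short sketch. The function names are hypothetical, and the constant 1.645 (the standard normal 95th percentile, giving a two-sided 90% interval) is supplied by this example rather than taken from the commit.

```cpp
#include <cmath>

// alpha = 0.1 Kolmogorov-Smirnov critical value for large N, as quoted above.
double ksCritical(double n) { return 1.224 / std::sqrt(n); }

// Half-width of the two-sided 90% interval for the sample kurtosis of
// normal variates, treating 24/N as the variance of the estimator.
double kurtosisHalfWidth90(double n) { return 1.645 * std::sqrt(24.0 / n); }
```

For N=1e5, `kurtosisHalfWidth90` gives roughly 0.0255, matching the interval stated in the commit message.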
[DAG] Fold freeze(shuffle(x,y,m)) -> shuffle(freeze(x),freeze(y),m) (#90952)
If the shuffle mask contains no undef elements, then we can move the freeze through a shuffle node.
This requires special case handling to create a new ShuffleVectorSDNode.
Includes VECTOR_SHUFFLE support for isGuaranteedNotToBeUndefOrPoison / canCreateUndefOrPoison.
[MLIR] Extend floating point parsing support (#90442)
Parsing support for floating point types was missing a few features:
1. Parsing floating point attributes from integer literals was supported
only for types with a bitwidth less than or equal to 64.
2. Downstream users could not use `AsmParser::parseFloat` to parse float
types which are printed as integer literals.
This commit addresses both these points. It extends
`Parser::parseFloatFromIntegerLiteral` to support arbitrary bitwidth,
and exposes a new API to parse arbitrary floating point values given an
fltSemantics as input. The usage of this new API is introduced in the
Test Dialect.
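To illustrate what parsing a float from an integer literal means, here is a minimal sketch that assumes the literal is the raw bit pattern of the value. It covers only 32-bit IEEE-754 floats and does not use the MLIR API; `floatFromBits` is a hypothetical helper.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit integer literal as an IEEE-754 binary32 value.
float floatFromBits(std::uint32_t bits) {
  float value;
  static_assert(sizeof(value) == sizeof(bits), "expects a 32-bit float");
  std::memcpy(&value, &bits, sizeof(value)); // bitwise copy, no conversion
  return value;
}
```

For example, `0x3F800000` maps to `1.0f`. Supporting arbitrary bitwidths, as the commit does, needs a representation that is not tied to a native type.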
[BOLT] Fix runOnEachFunctionWithUniqueAllocId (#90039)
When runOnEachFunctionWithUniqueAllocId is invoked with
ForceSequential=true, then the current implementation runs the function
with AllocId==0, which is the Id for the shared, non-unique, default
AnnotationAllocator.
However, the documentation for runOnEachFunctionWithUniqueAllocId
states:
```
/// Perform the work on each BinaryFunction except those that are rejected
/// by SkipPredicate, and create a unique annotation allocator for each
/// task. This should be used whenever the work function creates annotations to
/// allow thread-safe annotation creation.
```
Therefore, even when ForceSequential==true, a unique AllocId should be
used, i.e. different from 0.
[21 lines not shown]
[InstCombine] Do not request non-splat vector support in code reviews (NFC) (#90709)
The InstCombine contributor guide already says:
> Handle non-splat vector constants if doing so is free, but do
> not add handling for them if it adds any additional complexity
> to the code.
This change strengthens this guideline to explicitly discourage
asking (new) contributors to implement non-splat support during code
reviews. Doing so will almost certainly increase the number of
necessary review iterations, or result in outright contradictory review
feedback, as different people are willing to accept a different degree
of complexity for non-splat vector support.
[lld] Error on unsupported split stack (#88063)
Targets with no `-fsplit-stack` support now emit `ld.lld: error: target
doesn't support split stacks` instead of `UNREACHABLE executed` with a
backtrace asking the user to report a bug.
Resolves #88061
Avoid buffer hoisting from parallel loops (#90735)
This change corrects invalid behavior in the `--buffer-loop-hoisting`
pass. The pass is in charge of extracting buffer
allocations (e.g., `memref.alloca`) from loop regions (e.g., `scf.for`)
when possible. This works for loops with sequential execution
semantics. However, a buffer allocated in the body of a parallel loop
may be accessed concurrently by multiple threads, each storing its own
local data. Extracting such a buffer from the loop causes all threads to
wrongly share the same memory region.
In the following example, dimension 1 of the input tensor is reversed.
Dimension 0 is traversed with a parallel loop.
```
func.func @f(%input: memref<2x3xf32>) -> memref<2x3xf32> {
%c0 = index.constant 0
%c1 = index.constant 1
%c2 = index.constant 2
[28 lines not shown]