[IndVars] Strengthen inference of samesign flags (#170363)
When reviewing another change, I noticed that we were failing to infer
samesign for two cases: 1) an unsigned comparison, and 2) when both
arguments were known negative.
Using CVP and InstCombine as a reference, we need to be careful not to
allow eq/ne comparisons. I'm not yet sure why that restriction exists,
so for now I am going with the low-risk change. I may return to
investigate that in a follow-up.
Compile time results look like noise to me, see:
https://llvm-compile-time-tracker.com/compare.php?from=49a978712893fcf9e5f40ac488315d029cf15d3d&to=2ddb263604fd7d538e09dc1f805ebc30eb3ffab0&stat=instructions:u
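The inference rule described in this commit can be sketched as follows. This is a minimal standalone model, not the IndVars code: `Pred`, `SignedRange`, `isRelational`, `haveSameSign`, and `canInferSameSign` are hypothetical names, and plain signed bounds stand in for whatever sign facts the pass actually has available.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not LLVM types.
enum class Pred { EQ, NE, ULT, ULE, UGT, UGE, SLT, SLE, SGT, SGE };

// Inclusive signed bounds known for a value.
struct SignedRange {
  int64_t Min;
  int64_t Max;
};

// Equality predicates are excluded, mirroring the conservative choice
// described in the commit message.
bool isRelational(Pred P) { return P != Pred::EQ && P != Pred::NE; }

// True when every value in both ranges has the same sign bit.
bool haveSameSign(SignedRange A, SignedRange B) {
  assert(A.Min <= A.Max && B.Min <= B.Max && "empty ranges are not modeled");
  bool BothNonNegative = A.Min >= 0 && B.Min >= 0;
  bool BothNegative = A.Max < 0 && B.Max < 0;
  return BothNonNegative || BothNegative;
}

// Signed and unsigned relational predicates are treated alike: the flag
// only depends on the operands agreeing in sign.
bool canInferSameSign(Pred P, SignedRange LHS, SignedRange RHS) {
  return isRelational(P) && haveSameSign(LHS, RHS);
}
```

In this model an unsigned comparison of two non-negative values and a signed comparison of two negative values both qualify, while any eq/ne comparison is rejected regardless of the operand ranges.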
[MachineBasicBlock] Don't split loop header successor if the terminator is unanalyzable (#170146)
Fixes https://github.com/llvm/llvm-project/issues/170051
The previous implementation allows splitting the successor if it's the
loop header, regardless of whether the terminator of `this` is
analyzable.
[mlir][xegpu] Change `index` arithmetic ops to `arith` ops. (#170390)
Index ops cause some issues during SIMT distribution because they don't
have the `Elementwise` mappable trait. This PR replaces all index
arithmetic ops with matching `arith` dialect ops.
[SCEV] Factor out utility for proving same sign of two SCEVs [nfc] (#170376)
This is a slightly different API than ConstantRange's
areInsensitiveToSignednessOfICmpPredicate. The only actual difference
(beyond naming) is the handling of empty ranges (i.e. unreachable code).
I wanted to keep the existing SCEV behavior for the unreachable code as
we should be folding that to poison, not reasoning about samesign. I
tried the other variant locally, and saw no test changes.
[LSR] Make OptimizeLoopTermCond able to handle some non-cmp conditions (#165590)
Currently OptimizeLoopTermCond can only convert a cmp instruction to
using a postincrement induction variable, which means it can't handle
predicated loops where the termination condition comes from
get_active_lane_mask. Relax this restriction so that we can handle any
kind of instruction, though only if it's the instruction immediately
before the branch (except for possibly an extractelement).
[Clang] prevent crash on invalid nested name specifiers with a single colon (#169246)
Fixes #167905
---
This patch addresses an issue where invalid nested name specifier
sequences containing a single colon (`a:c::`) could be treated during
recovery as valid scope specifiers, which in turn led to a crash
https://github.com/llvm/llvm-project/blob/c543615744d61e0967b956c402e310946d741570/clang/lib/Parse/ParseExprCXX.cpp#L404-L418
For malformed inputs like `a:c::`, the single colon recovery incorrectly
triggers and produces an `annot_cxxscope`. When tentative parsing later
runs
[9 lines not shown]
[Support] Support debug counters in non-assertion builds (#170468)
This enables the use of debug counters in (non-assertion) release
builds. This is useful to enable debugging without having to switch to
an assertion-enabled build, which may not always be easy.
After some recent improvements, always supporting debug counters no
longer has measurable overhead.
[MemoryBuiltins] Consider index type size when aggregating gep offsets (#132365)
Main goal here is to fix some bugs seen with LowerConstantIntrinsics
pass and the lowering of llvm.objectsize.
In ObjectSizeOffsetVisitor::computeImpl we are using an external
analysis together with stripAndAccumulateConstantOffsets. The idea
is to compute the Min/Max value of individual offsets within a GEP.
The bug solved here is that when doing the Min/Max comparisons the
external analysis wasn't considering the index type size (given by
the data layout); it was simply using the type from the IR. Since a
GEP is defined as sign-extending or truncating its indices, we need
to consider the index type size in the external analysis.
This solves a regression (false ubsan warnings) seen after commit
https://github.com/llvm/llvm-project/commit/02b8ee281947f6cb39c7eb3c4bbba59322e9015b
(https://github.com/llvm/llvm-project/pull/117849).
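To illustrate why the index type size matters, the sketch below models the sext/truncate step on plain 64-bit integers. `sextOrTruncToIndexWidth` is a hypothetical helper, not an LLVM function, and the index widths used are assumed data layouts. An offset that looks large in its IR type can become small, or negative, once reduced to the index width, which changes the outcome of any Min/Max comparison.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reduce a 64-bit IR-level index value to an index
// type of IndexBits bits, then sign-extend the result back to 64 bits.
int64_t sextOrTruncToIndexWidth(int64_t Value, unsigned IndexBits) {
  assert(IndexBits >= 1 && IndexBits <= 64 && "unsupported index width");
  if (IndexBits == 64)
    return Value;
  // Truncate: keep only the low IndexBits bits.
  uint64_t Mask = (uint64_t{1} << IndexBits) - 1;
  uint64_t Low = static_cast<uint64_t>(Value) & Mask;
  // Sign-extend: flip the sign bit, then subtract it again.
  uint64_t SignBit = uint64_t{1} << (IndexBits - 1);
  return static_cast<int64_t>((Low ^ SignBit) - SignBit);
}
```

With an assumed 32-bit index type, the i64 value `0x100000001` is seen as `1`, and `0xFFFFFFFF` is seen as `-1`; comparing the unreduced i64 values instead would order them differently.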
[Clang] adjust caret placement for the suggested attribute location for enum class (#168092)
Fixes #163224
---
This patch addresses the issue by correcting the caret insertion
location for attributes incorrectly positioned before an enum. The
location is now derived from the associated `EnumDecl`: for named enums,
the attribute is placed before the identifier, while for anonymous enum
definitions, it is placed before the opening brace, with a fallback to
the semicolon when no brace is present.
For example:
```cpp
[[nodiscard]] enum class E1 {};
```
[4 lines not shown]
[clang-format] Ignore C++ keywords when formatting Verilog (#167984)
In the sample below, the `private` identifier is the name of the type,
and the `try` identifier is the name of the variable.
new
```SystemVerilog
begin
private try;
end
```
old
```SystemVerilog
begin
private
try
[3 lines not shown]
Fix lit testing to support standalone testing (#170365)
To be able to test lit without a configured LLVM build, we need to
support invocations that do not go through lit.site.cfg and thus
don't have llvm_config set up.
[flang][OpenMP] Rename OmpLoopRangeClause to OmpLooprangeClause, NFC (#170370)
The convention is to change spelling from snake_case to UpperCamel, and
use the result as a stem in derived names, e.g.
- spelling is "some_clause" -> stem is SomeClause
- spelling is "someclause" -> stem is Someclause
Member of the OmpClause variant is <stem> itself, e.g. Looprange as in
parser::OmpClause::Looprange.
Specific clause class name is Omp<stem>Clause, e.g. OmpLooprangeClause.
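The naming convention above can be written down as a small string transformation. This is only an illustration of the rule: `clauseStem` and `clauseClassName` are hypothetical helpers and do not exist in flang.

```cpp
#include <cctype>
#include <string>

// Convert a clause spelling such as "some_clause" to its UpperCamel stem.
// Underscores mark word boundaries; spellings without underscores only get
// their first letter capitalized.
std::string clauseStem(const std::string &Spelling) {
  std::string Stem;
  bool StartOfWord = true;
  for (char C : Spelling) {
    if (C == '_') {
      StartOfWord = true;
      continue;
    }
    unsigned char UC = static_cast<unsigned char>(C);
    Stem += static_cast<char>(StartOfWord ? std::toupper(UC) : UC);
    StartOfWord = false;
  }
  return Stem;
}

// Build the specific clause class name, Omp<stem>Clause.
std::string clauseClassName(const std::string &Spelling) {
  return "Omp" + clauseStem(Spelling) + "Clause";
}
```

Under this rule the spelling "looprange" has no word boundary, so its stem is `Looprange` and the class name is `OmpLooprangeClause`, which matches the rename in this commit.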
[flang] Support kind/index lookup inside of EQUIVALENCE (#170056)
Turn off "in EQUIVALENCE" check for processing of array subscripts,
since subscripts themselves are not part of the EQUIVALENCE.
Fixes #169590
[AArch64] Add bitcasts for lowering saturating add/sub and shift intrinsics. (#161840)
This is a follow-up patch to #157680. In this patch, we add explicit
bitcasts to floating-point types when lowering saturating add/sub and
shift NEON scalar intrinsics using SelectionDAG, so they can be picked
up by the patterns added in the first part of this series. To do that,
we have to create new nodes for these intrinsics, which operate on
floating-point types, and wrap them in bitcast nodes.
[RISCV] Fix corner cases after #170070 (#170438)
There are two fixes:
1. Clear kill flags for `FalseReg` in `foldVMergeToMask`; otherwise the
MachineVerifier fails because a killed virtual register is used.
2. Restrict `lookThruCopies` to only look through COPYs with
one non-debug use.
This was found when backporting #170070 to the 21.x branch.
[ValueTracking] Support scalable vector splats in computeKnownBits (#170345)
Similar to https://github.com/llvm/llvm-project/pull/170325, this patch
adds support for scalable vector splats in computeKnownBits.
[Clang] Fix `PPChainedCallbacks::EmbedFileNotFound()` (#170293)
We've had internal test failures since #166188 landed. The root cause is
that `PPChainedCallbacks::EmbedFileNotFound()` incorrectly calls
`PPCallbacks::FileNotFound()` instead of `PPCallbacks::EmbedFileNotFound()`.
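The shape of this kind of forwarding bug can be shown with a small standalone model. `Callbacks`, `ChainedCallbacks`, and `EmbedRecorder` are hypothetical, simplified stand-ins rather than the Clang classes, and the way the two results are combined here is an assumption made for the sketch.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a callback interface with two
// distinct "not found" hooks.
struct Callbacks {
  virtual ~Callbacks() = default;
  virtual bool FileNotFound(const std::string &) { return false; }
  virtual bool EmbedFileNotFound(const std::string &) { return false; }
};

// Forwards every hook to two underlying callback objects.
struct ChainedCallbacks : Callbacks {
  Callbacks *First;
  Callbacks *Second;
  ChainedCallbacks(Callbacks *First, Callbacks *Second)
      : First(First), Second(Second) {}

  bool FileNotFound(const std::string &FileName) override {
    bool Skip = First->FileNotFound(FileName);
    Skip |= Second->FileNotFound(FileName);
    return Skip;
  }

  bool EmbedFileNotFound(const std::string &FileName) override {
    // Forwarding to FileNotFound() here would compile, because both hooks
    // share a signature, but the embed-specific overrides would never run.
    bool Skip = First->EmbedFileNotFound(FileName);
    Skip |= Second->EmbedFileNotFound(FileName);
    return Skip;
  }
};

// A client that only overrides the embed hook and counts its invocations.
struct EmbedRecorder : Callbacks {
  int EmbedCalls = 0;
  bool EmbedFileNotFound(const std::string &) override {
    ++EmbedCalls;
    return true;
  }
};
```

Because the two hooks have identical signatures in this model, the compiler cannot catch a mix-up; a test that chains two `EmbedRecorder` objects and checks `EmbedCalls` on each is one way to notice it.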