[InstCombine] Fold lshr 1, X into zext (X == 0) (#200669)
This PR implements the missed optimisation reported in #200538.
`1 >> X` produces 1 only when X == 0, and 0 for all other in-range
values. Fold it directly into `zext (icmp eq X, 0)`.
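
The equivalence can be checked exhaustively for a 32-bit value. The sketch below is only an illustration in C++, not the InstCombine code; the helper names `lshr_one`, `folded`, and `check_fold_all_in_range` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Reference form: 1 >> x on a 32-bit unsigned value (x must be < 32).
inline uint32_t lshr_one(uint32_t x) { return 1u >> x; }

// Folded form: zext(x == 0).
inline uint32_t folded(uint32_t x) { return static_cast<uint32_t>(x == 0); }

// Compare both forms for every in-range shift amount.
inline void check_fold_all_in_range() {
  for (uint32_t x = 0; x < 32; ++x)
    assert(lshr_one(x) == folded(x));
}
```

Shift amounts of 32 or more are left out on purpose: the fold only needs to hold for in-range values.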
[AArch64][SME] Add multi-vector load opcodes to getMemOpInfo (#200238)
We recently started emitting these in
84fab943b5740ec273e9f8d238ea8420033320a4, which now means we can hit an
unhandled opcode error in AArch64InstrInfo::getMemOpInfo when resolving
stack offsets.
Fixes #200034
[M68k][MC] Add MC support for PCI w/ base displacement addressing mode (#200696)
Program Counter Indirect with Index (PCI) is augmented in M68020+ with
(1) a larger displacement (up to 32-bit), and (2) an index scaling factor.
We call this PCIBD (PCI with Base Displacement) to distinguish it from the
older PCI.
Since all the components inside PCIBD are optional, including the index
register, we can actually use it to replace the PCD (PC displacement)
addressing mode on newer machines in order to leverage the larger
displacement.
This is the first step to support 32-bit memory addresses on M68020+
machines.
[AArch64][SVE] Handle multi-vector load/store opcodes in frame-index elimination
Lowering a wide scalable load from a stack object produces an
LD1*_{2Z,4Z}_IMM[_PSEUDO] with a frame-index base. getMemOpInfo() and
getLoadStoreImmIdx() had no entries for these SME2/SVE2p1 multi-vector
opcodes, so PEI crashed.
[clang-format] Recognize Verilog class item qualifiers (#199085)
old
```SystemVerilog
class Packet
extern protected virtual function int send
(int value);
endclass : Packet
```
new
```SystemVerilog
class Packet
extern protected virtual function int send
(int value);
endclass : Packet
```
[3 lines not shown]
[clang-format] Remove the blank line in the function try block (#199086)
old with config `{SeparateDefinitionBlocks: Always}`
```C++
void foo() try {
// do something
} catch (const std::exception &e) {
// handle exception
}
```
new
```C++
void foo() try {
// do something
} catch (const std::exception &e) {
[7 lines not shown]
```
[ARM] Fix some fp16 Shuffle lowering without +fullfp16 (#200688)
Without fullfp16, f16 is not a legal type, meaning we need to be careful
with how we legalize shuffle vector and buildvector operations that cannot
be treated more optimally using shuffles.
[AMDGPU] Use v_rsq_f32 for f16 rsqrt on targets without 16-bit insts (#200646)
On gfx6/gfx7 the f16 1.0/sqrt(x) pattern was not folded to a reciprocal
square root because performFDivCombine bailed out whenever f16 fsqrt was
not a legal operation. f16 fsqrt is Custom (promoted) on these targets,
so the combine never fired and the full f32 fdiv expansion was emitted.
Split the legality check: when same-type fsqrt is legal (gfx8+), keep
emitting the native rsq. For f16 without a legal fsqrt, compute the
reciprocal square root in f32 with v_rsq_f32 and round back. This is
accurate enough for f16, and needs no denormal scaling because every f16
value extends to a normal f32 and an f16 rsq result is never denormal.
bf16 is intentionally left expanded: it shares f32's exponent range, so
bf16 denormals would extend to f32 denormals that v_rsq_f32 does not
handle.
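
The range argument above can be checked with plain arithmetic. The sketch below is not the AMDGPU code. It uses the standard IEEE binary16 and bfloat16 format limits, and all helper names are made up for this example.

```cpp
#include <cassert>
#include <cfloat>
#include <cmath>

// IEEE binary16 limits: smallest denormal, smallest normal, largest finite.
inline double f16_min_denormal() { return std::ldexp(1.0, -24); }
inline double f16_min_normal() { return std::ldexp(1.0, -14); }
inline double f16_max() { return 65504.0; }

// bfloat16 has 8 exponent bits and 7 mantissa bits, so its smallest
// denormal is 2^-126 * 2^-7 = 2^-133.
inline double bf16_min_denormal() { return std::ldexp(1.0, -133); }

// True if the magnitude is at or above the smallest normal f32 (FLT_MIN).
inline bool is_normal_f32_magnitude(double v) {
  return v >= static_cast<double>(FLT_MIN);
}

// Smallest rsq result over positive finite f16 inputs: 1/sqrt(max input).
inline double f16_min_rsq() { return 1.0 / std::sqrt(f16_max()); }
```

The smallest f16 denormal is still a normal f32, and the smallest possible rsq result stays above the f16 normal threshold. The smallest bf16 denormal falls below `FLT_MIN`, which is the case the commit leaves expanded.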
Fixes #76948
Co-authored-by: Claude Opus 4.8 <noreply at anthropic.com>
[NFC][LLVM] Fix Intrinsics.td to adhere to 80 col limit (#199346)
Verified that there is no difference in the tablegen generated files for
intrinsics except line number changes in the comments in
IntrinsicEnums.inc.
[NFC][LLVM] Remove redundant verifier type checks for some intrinsics (#200658)
Remove the following redundant type checks:
* `[s|u]div_fix*` intrinsics: existing checks in `isSignatureValid` already
verify that arg0 and arg1 are ints or int vectors (since they use
`llvm_anyint_ty`), and arg2 is declared as i32, so the checks related to
it are also redundant.
* `lrint` family: the result is `llvm_anyint_ty` and the argument is
`llvm_anyfloat_ty`, so one of the checks is redundant.
[TailCallElim] Drop poison-generating flags on reassociated accumulators (#200624)
For example, if you have recursion like
  int prod(int n) {
    if (n == 0) return 1;
    return prod(n-1) * f(n);
  }
then logically this computes (((f(1) * f(2)) * f(3)) * f(4)) * ... * f(n).
But TailCallElim reassociates this, computing instead
((f(n) * f(n-1)) * f(n-2)) * ...
If the operator (* in this case) had poison-generating flags like
nsw, those may not still apply after reassociation. (For example,
suppose in this example f(1) returns 0 -- in that case the original
multiplication cannot overflow, but the new one still might.)
Fix this by clearing the poison-generating flags after reassociating.
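
The overflow difference can be reproduced in a small sketch. This is an illustration only, not the pass itself: `f` is a hypothetical function chosen so that f(1) is 0 and every other input returns a large factor, and the overflow check widens to 64 bits instead of relying on signed wraparound.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical f: f(1) == 0, every other input yields a large factor.
inline int32_t f(int n) { return n == 1 ? 0 : 100000; }

// Multiply in 64 bits; return true if the 32-bit signed product overflows.
// On success the product is stored in `out`.
inline bool mul_overflows(int32_t a, int32_t b, int32_t &out) {
  int64_t wide = static_cast<int64_t>(a) * static_cast<int64_t>(b);
  if (wide < INT32_MIN || wide > INT32_MAX)
    return true;
  out = static_cast<int32_t>(wide);
  return false;
}

// Original association: ((1 * f(1)) * f(2)) * ... * f(n).
inline bool original_order_overflows(int n) {
  int32_t acc = 1;
  for (int i = 1; i <= n; ++i)
    if (mul_overflows(acc, f(i), acc))
      return true;
  return false;
}

// Reassociated accumulator: ((1 * f(n)) * f(n-1)) * ... * f(1).
inline bool reassociated_order_overflows(int n) {
  int32_t acc = 1;
  for (int i = n; i >= 1; --i)
    if (mul_overflows(acc, f(i), acc))
      return true;
  return false;
}
```

With n = 3 the original order multiplies by 0 first and never overflows, while the reassociated order computes 100000 * 100000 before reaching f(1). An `nsw` flag that was valid on the original multiply is therefore not valid on the new one.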
[TableGen] Add !switch operator (#199659)
This patch adds a syntactic sugar operator to TableGen named `!switch`,
to simplify use cases where a user needs to conditionally use a value
based on an exact key match. It supports variadic case arguments (0 or
more). It requires a default value, which makes the grammar stricter and
simpler to parse. I think the flexibility cost is negligible: it is
considered best practice in software design for switch expressions (or
statements) on arbitrary types to always provide a default.
At parse time, after key and value type-checking, we reduce the
`!switch` expression to `!cond`, since the two share essentially all of
the downstream logic. The implementation also extracts the pre-reduction
type-checking shared by `!switch` and `!cond` into
`TGParser::resolveInitTypes`.
Motivation: switch-behaving `!cond` value selection in `llvm/lib/Target`
e.g. from `llvm/lib/Target/AArch64/AArch64InstrFormats.td`:
```
[11 lines not shown]
```
[clang-tidy] `use-ranges`: preserve iterator results with `.begin()` (#196036)
Preserve used iterator results for `remove`, `partition`,
`stable_partition`, and `rotate`-style replacements by appending
`.begin()` where the ranges algorithm returns a subrange.
Fix #124794
Assisted by Codex.
[lit] Add --check to run only selected RUN lines from a test
`llvm-lit --check=LIST <test>` keeps only the listed RUN directives in
the test and discards the rest. LIST is a comma-separated mix
of 0-indexed integers and ranges (e.g. `--check=0,2,4-6`). The
selection is applied to the parseIntegratedTestScript output.
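
As a rough illustration of the LIST syntax, the sketch below expands a string such as `0,2,4-6` into a set of indices. It is a C++ sketch of the selection semantics only, not lit's Python implementation. The helper name `parse_check_list` is hypothetical, and treating ranges as inclusive on both ends is an assumption.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper: expands "0,2,4-6" into {0, 2, 4, 5, 6}.
// Assumes ranges are inclusive on both ends and the input is well formed.
inline std::set<int> parse_check_list(const std::string &list) {
  std::set<int> selected;
  std::istringstream in(list);
  std::string item;
  while (std::getline(in, item, ',')) {
    if (item.empty())
      continue;  // Skip empty entries such as a trailing comma.
    std::size_t dash = item.find('-');
    if (dash == std::string::npos) {
      selected.insert(std::stoi(item));  // Single 0-indexed integer.
    } else {
      int first = std::stoi(item.substr(0, dash));
      int last = std::stoi(item.substr(dash + 1));
      for (int i = first; i <= last; ++i)
        selected.insert(i);  // Every index in the range.
    }
  }
  return selected;
}
```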
Run tests via
`llvm-lit --check=0 llvm/utils/lit/tests/Inputs/check-filter/sample.ll`,
`llvm-lit --check=1 llvm/utils/lit/tests/Inputs/check-filter/sample.ll`,
`llvm/utils/lit/lit.py llvm/utils/lit/tests/check-filter.py`.
[clang-tidy] Avoid unsafe `use-default-member-init` fixes (#191607)
Suppress `modernize-use-default-member-init` diagnostics when moving a
constructor initializer into a default member initializer would
reference a declaration not visible from the field declaration.
Add `IgnoreNonVisibleReferences` to allow preserving the warning without
emitting unsafe fix-its, and document the new behavior.
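
A minimal sketch of the kind of code where the fix-it would be unsafe, assuming a constant declared after the class. The names `Widget` and `kDefaultSize` are made up for this example.

```cpp
#include <cassert>

struct Widget {
  Widget();
  // Rewriting this to `int size = kDefaultSize;` would not compile:
  // kDefaultSize is not yet declared at this point in the file.
  int size;
};

// Declared after the class, so it is visible to the constructor
// definition below but not to the field declaration above.
static const int kDefaultSize = 42;

Widget::Widget() : size(kDefaultSize) {}
```

The constructor initializer is valid where it stands. Moving it into a default member initializer would reference a name that is not visible from the field declaration.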
Fixes #156412
Assisted by Codex
[clang-tidy] `use-ranges`: avoid unsafe result fix-its
Preserve callable results with .fun, allow structured-binding-safe rewrites, and keep diagnostics while suppressing unsafe fix-its when ranges result objects do not match the original result shape.
Assisted by Codex.
[clang-tidy] `use-ranges`: preserve output results
Preserve used output iterator results for output algorithm replacements by appending .out where the ranges algorithm returns an algorithm result object.
Fix #110223
Assisted by Codex.
[clang-tidy] `use-ranges`: preserve remove iterator results
Preserve used iterator results for remove, partition, stable_partition, and rotate-style replacements by appending .begin() where the ranges algorithm returns a subrange.
Fix #124794
Assisted by Codex.