[X86] Replace custom minmax reduction pattern matching with ISD::VEC_REDUCE_*MIN/MAX support (#194473)
Support middle-end reduction integer min/max patterns instead of relying
on the ExpandReductions pass and then matching the expanded pattern in
DAG.
Middle-end reduction pattern recognition now matches
SelectionDAG::matchBinOpReduction (including partial reductions), and
it's better to improve future handling in InstCombine/VectorCombine
wherever possible.
Fixes #194624
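For readers unfamiliar with what an integer min reduction computes, here is a
minimal scalar sketch in C++. It is not LLVM code: `reduce_umin` is a
hypothetical name, and the halving loop only models the shuffle-and-min tree
shape that an expanded reduction takes before it is matched again in the DAG.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reduce N lanes (N a power of two) to their
// unsigned minimum by repeatedly folding the upper half onto the lower.
template <std::size_t N>
std::uint32_t reduce_umin(std::array<std::uint32_t, N> v) {
  static_assert(N > 0 && (N & (N - 1)) == 0,
                "lane count must be a power of two");
  for (std::size_t half = N / 2; half >= 1; half /= 2) {
    // One "shuffle + umin" step: lane i takes min(lane i, lane i + half).
    for (std::size_t i = 0; i < half; ++i)
      v[i] = std::min(v[i], v[i + half]);
  }
  return v[0]; // Lane 0 now holds the minimum of all lanes.
}
```

A reduction node expresses the same result as a single operation, which leaves
the choice of instruction sequence to the target.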
Revert "[AArch64] Use ADDP tree for v16i8 to i16 bitmask extraction (#192974)" (#198985)
This reverts commit 42cc9b53bf83ebb778755fd51a9b277cb71740d5 as it is
tripping up on type legalization.
[mlir] only verify moved symbols in transform (#197882)
When merging named sequences from an external module in the transform
interpreter, only run the inliner verification for operations that were
actually moved rather than all pre-existing operations. This avoids
verifying inlining conditions for operations that wouldn't be inlined by
this logic, and is also more parsimonious.
Reverts #195770 but keeps the test. This is a more generic fix.
[AArch64][GlobalISel] Do not clamp s16 G_FCONSTANT. (#198983)
This should be an NFC as s16 FCONSTANT is already legal and handled
later whether fp16 is available or not.
[libc] Make FreeListHeap::free ignore null pointers. (#198834)
Update `FreeListHeap::free` to return immediately when passed `nullptr`.
This matches the `free(nullptr)` behaviour required by the C standard
and avoids running pointer validation on a null pointer, which allows
the hermetic malloc test libc/test/src/stdlib/malloc_test.cpp to pass.
From the C standard's description of the free function:
```
The free function causes the space pointed to by
ptr to be deallocated, that is, made available for
further allocation. If ptr is a null pointer,
no action occurs.
```
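The early return can be sketched as follows. This is a toy model, not the
LLVM libc implementation: `ToyHeap` and its live-allocation counter are
assumptions made for illustration, and the counter check stands in for the
real pointer validation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical toy heap; it only counts live allocations.
class ToyHeap {
public:
  void *allocate(std::size_t size) {
    void *p = std::malloc(size);
    if (p)
      ++live_;
    return p;
  }

  void free(void *ptr) {
    // free(NULL) is a no-op, so return before any validation runs.
    if (ptr == nullptr)
      return;
    // Stand-in for pointer validation; it must never see a null pointer.
    assert(live_ > 0 && "free of a pointer this heap does not own");
    --live_;
    std::free(ptr);
  }

  std::size_t live() const { return live_; }

private:
  std::size_t live_ = 0;
};
```

With the guard in place, freeing a null pointer leaves the heap state
untouched regardless of whether any allocation is live.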
[WebAssembly] Cost model for F32 memory interleave (#198531)
Set the costs to match their i32 counterparts, but don't cost 4x v4f32.
wasm-perf doesn't show any particular uplift and the tests show that,
when we're mixing integers and floats, we often interleave anyway. But
this change should be good for arithmetic on structures of 2xf32.
remove redundant uses of `isa` caught by clang-tidy (NFC) (#192813)
These calls to `isa` are always true. Also includes a drive-by cleanup
of a use of `isa_and_nonnull` where the value was already null-checked.
Caught by applying https://github.com/llvm/llvm-project/pull/191081
[clang-tidy] detect uses of llvm::isa that are always true (#191081)
Warns when performing a dynamic type check that is always true, either
because the dynamic type is the same as the static type, or because the
static type derives from the dynamic type.
Supported functions:
- isa
- isa_and_present
- isa_and_nonnull
Related PR: https://github.com/llvm/llvm-project/pull/189274
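To show why such checks are always true, here is a self-contained sketch that
does not use LLVM headers. `toy_isa`, `Shape`, `Circle`, and `Square` are
hypothetical names; the helper is only modelled on the `classof` convention
used by LLVM-style RTTI, with upcasts and same-type checks answered statically.

```cpp
#include <cassert>
#include <type_traits>

struct Shape {
  enum Kind { K_Circle, K_Square };
  explicit Shape(Kind k) : kind(k) {}
  Kind kind;
};

struct Circle : Shape {
  Circle() : Shape(K_Circle) {}
  static bool classof(const Shape *s) { return s->kind == K_Circle; }
};

struct Square : Shape {
  Square() : Shape(K_Square) {}
  static bool classof(const Shape *s) { return s->kind == K_Square; }
};

// Static type already is (or derives from) To: no dynamic check is needed.
template <typename To, typename From>
bool toy_isa_impl(const From *, std::true_type) { return true; }

// Otherwise ask the target class to inspect the object's kind tag.
template <typename To, typename From>
bool toy_isa_impl(const From *v, std::false_type) { return To::classof(v); }

// Hypothetical stand-in for an isa-style check.
template <typename To, typename From>
bool toy_isa(const From *v) {
  return toy_isa_impl<To>(v, std::is_base_of<To, From>{});
}
```

Here `toy_isa<Shape>(&circle)` and `toy_isa<Circle>(&circle)` cannot be false
when `circle` has static type `Circle`, which is the kind of call the check
is described as flagging.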
[libc][NFC] Lowercase standard identifiers in YAML files (#198854)
Update YAML files to use lowercase identifiers for standards.
In header.py, canonical identifiers for standards are explicitly defined
in lowercase and mapped to their pretty names for display. This change
ensures that all YAML files use the lowercase identifiers (posix, linux,
bsd, gnu) expected by the header generation tool.
Assisted-by: Automated tooling, human reviewed.
[ARM] Copy all flags when creating LDM (#197898)
This just adds the Operand instead of trying to handle flags
individually, similar to the AArch64LoadStoreOptimizer.
Fixes #196779
[compiler-rt][builtins] A few fixes for cpu_model files (#198957)
- Fix a typo in the word "features" in an include guard
- Correct the file header and include guard in cpu_model.h after the
file was renamed
[LoongArch] Fix musttail with indirect arguments by forwarding incoming pointers
When a `musttail` call passes arguments indirectly (fp128 on LA32, i128
on LA32), the backend allocates a stack temporary and hands the callee a
pointer. The tail call deallocates the caller's frame, and the pointer
dangles.
Fix by forwarding the incoming indirect pointers instead. They point to
the caller's caller's frame, which stays valid after the tail call.
Forwarded formal parameters reuse the pointer directly; computed values
get stored into the incoming buffer first.
The pointers are saved in virtual registers (`CopyToReg`/`CopyFromReg`)
rather than SDValues. The SelectionDAG is cleared between basic blocks
and musttail calls can appear in non-entry blocks, so storing raw
SDValues across BBs is unsound (this was the bug that led to the revert
in 501417baa60f). The vreg save only fires when the function has
musttail calls; other functions see no codegen change.
[3 lines not shown]
[RISCV] Add TuneJumpIsExpensive (#191374)
We had `setJumpIsExpensive(true)` before 18.x but it was removed
in #74647. This feature allows users to tune the ISel behavior.
Now that #80124 and #178394 have landed, tuning branches and selects
should be more flexible.
This is an alternative to #191158.
[RISCV][P-ext] Custom-lower SELECT for v4i16/v8i8 on RV32 (#198723)
SELECT was Expand for RV32 64-bit packed types, producing 40-80 lines of
stack-based per-element scalarization. Make it Custom for v4i16/v8i8 and
extend the existing isPExtPackedType branch in lowerSELECT to bitcast to
an integer of matching width: single-GPR types select on XLenVT
directly, while RV32 double-wide types select on i64 which
type-legalizes to two scalar selects on the i32 halves.
v2i32 is left to natural type-legalization since it splits cleanly into
two scalar i32 selects without a Custom hook.
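The double-wide case can be modelled in plain C++ as below. This is a sketch
of the idea only, not the lowering code: `V4I16` and `select_packed` are
hypothetical names, and the two 32-bit halves stand in for the i32 pieces
that an i64 select splits into on RV32.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of a 64-bit packed v4i16 value.
struct V4I16 {
  std::uint16_t lane[4];
};
static_assert(sizeof(V4I16) == 8, "V4I16 must be exactly 64 bits");

// Select between two packed values without touching individual lanes:
// reinterpret each operand as two 32-bit halves and select per half.
V4I16 select_packed(bool cond, V4I16 a, V4I16 b) {
  std::uint32_t a_half[2], b_half[2], r_half[2];
  std::memcpy(a_half, &a, sizeof a); // "bitcast" to integer halves
  std::memcpy(b_half, &b, sizeof b);
  r_half[0] = cond ? a_half[0] : b_half[0]; // scalar select, low half
  r_half[1] = cond ? a_half[1] : b_half[1]; // scalar select, high half
  V4I16 r;
  std::memcpy(&r, r_half, sizeof r); // "bitcast" back to the packed type
  return r;
}
```

Two scalar selects replace the per-element scalarization, since the condition
is a single scalar and applies to every lane equally.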
[NFC][ASan] Factor out ASan call insertion behind a single call (#198650)
The ASan pass directly injects function calls into the IR using
getOrInsertFunction() at every call site. Refactor the disparate call
sites behind an AsanFunctionInserter class. This allows us to add pre-
and post-processing logic for all inserted functions at once.
Signed-off-by: Emil Tsalapatis <emil at etsalapatis.com>
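The shape of such a refactoring can be sketched without LLVM headers. All
names below (`ToyModule`, `ToyFunctionInserter`, the hook types) are
hypothetical and are not the interface of the AsanFunctionInserter class; the
sketch only shows how routing every insertion through one method gives a
single place for pre- and post-processing.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a module's symbol table.
class ToyModule {
public:
  // Return the id of an existing declaration, or create a new one.
  int getOrInsert(const std::string &name) {
    auto it = decls_.find(name);
    if (it != decls_.end())
      return it->second;
    int id = static_cast<int>(decls_.size());
    decls_.emplace(name, id);
    return id;
  }
  std::size_t size() const { return decls_.size(); }

private:
  std::map<std::string, int> decls_;
};

// Hypothetical wrapper: every insertion goes through insert(), so hooks
// run for all inserted functions instead of being repeated per call site.
class ToyFunctionInserter {
public:
  using Hook = std::function<void(const std::string &)>;

  ToyFunctionInserter(ToyModule &m, Hook pre, Hook post)
      : module_(m), pre_(std::move(pre)), post_(std::move(post)) {}

  int insert(const std::string &name) {
    if (pre_)
      pre_(name); // pre-processing hook
    int id = module_.getOrInsert(name);
    if (post_)
      post_(name); // post-processing hook
    return id;
  }

private:
  ToyModule &module_;
  Hook pre_, post_;
};
```

Call sites then ask the inserter for a function by name and never talk to the
module directly.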