[clang][bytecode][NFC] Add FullExpression scopes (#170705)
And use them instead of the extending decl. This is close to what the
current interpreter is doing.
This is NFC right now but fixes a problem I encountered while looking
into the expansion statement stuff.
[IR] Fix vector.splice verifier scaling by vscale for fixed length vectors (#170807)
Currently we multiply the known minimum number of elements by vscale
even if the vector in question is fixed-length, so the verifier sometimes
misses fixed-length vectors with out-of-bounds indices.
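The intended check can be sketched roughly as follows. This is a minimal illustration in C, not the verifier's actual code: `splice_index_in_bounds` and its parameters are hypothetical names, and the accepted index range (-N <= idx < N, where N is the known minimum element count) is an assumption made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: returns true if a splice index is provably in
 * bounds. min_elts is the known minimum element count; vscale_min is a
 * lower bound on vscale and only matters for scalable vectors. */
bool splice_index_in_bounds(int64_t idx, int64_t min_elts,
                            bool is_scalable, int64_t vscale_min) {
    int64_t known_min = min_elts;
    /* Scale by vscale only for scalable vectors; a fixed-length vector
     * has exactly min_elts elements. */
    if (is_scalable)
        known_min *= vscale_min;
    /* Assumed valid range: -known_min <= idx < known_min. */
    return idx >= -known_min && idx < known_min;
}
```

With this shape, index 5 on a fixed 4-element vector is rejected regardless of the vscale bound, while the same index on a scalable vector with a minimum vscale of 2 is accepted.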
[mlir:python] Add manual typing annotations to `mlir.register_*` functions. (#170627)
This PR adds manual typing annotations to the `register_operation` and
`register_(type|value)_caster` functions in the main `mlir` module.
Since those functions return the result `nb::cpp_function`, which is of
type `nb::object`, the automatic typing annotations are of the form `def
f() -> object`. This isn't particularly precise and leads to type
checking errors when the functions are used. Manually defining the
annotation with `nb::sig` solves the problem.
Signed-off-by: Ingo Müller <ingomueller at google.com>
[DAG] Fold mul 0 -> 0 when expanding mul into parts. (#168780)
If the upper bits are zero, but we expand the multiply and then
immediately convert it into a libcall, there is no opportunity to optimize
away the mul. Do the fold in getNode to make sure extending multiplies
optimize cleanly.
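To illustrate why the fold matters, here is a small sketch in C of a 64-bit multiply expanded into 32-bit parts. It is not the SelectionDAG code: `build_mul`, `mul64_expanded` and `mul_nodes_created` are hypothetical names, and the fold here happens on runtime values, whereas the compiler folds on operands known to be zero at compile time.

```c
#include <stdint.h>

/* Counts how many "real" multiplies were created (illustration only). */
unsigned mul_nodes_created;

/* Hypothetical stand-in for a node builder: folds x * 0 to 0 before any
 * multiply is created. */
uint64_t build_mul(uint64_t x, uint64_t y) {
    if (x == 0 || y == 0)
        return 0;                 /* fold: no multiply needed */
    ++mul_nodes_created;
    return x * y;
}

/* Low 64 bits of a * b, expanded into 32-bit parts:
 * a * b = al*bl + ((ah*bl + al*bh) << 32)  (mod 2^64). */
uint64_t mul64_expanded(uint64_t a, uint64_t b) {
    uint64_t al = (uint32_t)a, ah = a >> 32;
    uint64_t bl = (uint32_t)b, bh = b >> 32;
    uint64_t lo = build_mul(al, bl);
    uint64_t cross = build_mul(ah, bl) + build_mul(al, bh);
    return lo + (cross << 32);
}
```

When both operands have zero upper halves, as with zero-extended inputs, both cross terms fold away and only one multiply remains.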
[ORC] Add void function support to CallViaEPC, CallSPSViaEPC. (#170800)
Adds support for calling void functions. Calls to void functions return
Error to capture any IPC/RPC failure.
[Clang] Extend __builtin_counted_by_ref to support pointers with 'counted_by' (#170750)
The __builtin_counted_by_ref builtin was previously limited to flexible
array members (FAMs). This change extends it to also support pointer
members that have the 'counted_by' attribute.
The 'counted_by' attribute can be applied to both FAMs and pointer
members:
    struct fam_struct {
        int count;
        int array[] __attribute__((counted_by(count)));
    };
    struct ptr_struct {
        int count;
        int *buf __attribute__((counted_by(count)));
    };
[24 lines not shown]
Adding Matching and Inference Functionality to Propeller-PR4: Implement matching and inference and create clusters (#167622)
This PR re-submits the previously reverted
PR (https://github.com/llvm/llvm-project/pull/165868) and fixes the
return type mismatch error.
Co-authored-by: lifengxiang1025 <lifengxiang at kuaishou.com>
Co-authored-by: zcfh <wuminghui03 at kuaishou.com>
[llvm][mlir][OpenMP] Support translation for linear clause in omp.wsloop and omp.simd (#139386)
This patch adds support for LLVM translation of the linear clause on
omp.wsloop (except for linear modifiers).
[RISCV][llvm] Support VFADD, VFSUB, VFMUL codegen for Zvfbfa (#170612)
Support both fixed-length vectors and scalable vectors.
Note: the VP version will not be supported for trivial instructions
since they're going to be removed soon.
[RISCV][llvm] Support PSLL codegen for P extension (#170074)
There's no instruction that takes a vector shift amount, so we have to
scalarize it and rebuild the vector.
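The scalarize-and-rebuild approach can be sketched in C as follows. This is only a model of the idea on a packed pair of 16-bit lanes, not the generated code; `shl_v2i16_scalarized` is a hypothetical name, and shift amounts are assumed to be less than 16.

```c
#include <stdint.h>

/* Hypothetical sketch: shift each 16-bit lane of `vec` left by the
 * matching lane of `amts`, one lane at a time. */
uint32_t shl_v2i16_scalarized(uint32_t vec, uint32_t amts) {
    uint32_t result = 0;
    for (int lane = 0; lane < 2; ++lane) {
        uint16_t elt = (uint16_t)(vec >> (16 * lane));    /* extract lane */
        uint16_t amt = (uint16_t)(amts >> (16 * lane));   /* extract amount */
        uint16_t shifted = (uint16_t)(elt << amt);        /* scalar shift */
        result |= (uint32_t)shifted << (16 * lane);       /* rebuild */
    }
    return result;
}
```

Each lane is truncated back to 16 bits after the shift, so bits shifted out of one lane never leak into its neighbor.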
[RISCV] Inserting indirect jumps with X7 for Zicfilp (#170683)
`BranchRelaxation` uses `RISCVInstrInfo::insertIndirectBranch` to insert
an indirect branch if the jump target is out of range. Currently it uses
register scavenging to find a free register to use for the indirect
target. If Zicfilp is enabled, we need to use X7 so that the jump will
be treated as a software guarded branch.
Co-authored-by: Yeting Kuo <46629943+yetingk at users.noreply.github.com>
[MLIR] Add fusability query to TilingInterface (#166502)
This introduces `isOpFusableWithProducer/Consumer` methods to the
TilingInterface that enable querying whether a tilable op can be fused
into a given set of producer slices or consumer slice without generating
IR. This is needed to enable use of the tiling interface in pattern
rewrites, as without this any pattern rewrite that tries to invoke the
method to tile is allowed to generate IR and fail.
[libclc] Add OpenCL atomic_*_explicit builtins (#168318)
Implement atomic_*_explicit (e.g. atomic_store_explicit) with
memory_order plus optional memory_scope.
OpenCL memory_order maps 1:1 to Clang (e.g. OpenCL memory_order_relaxed
== Clang __ATOMIC_RELAXED), so we pass it unchanged to the clc_atomic_*
functions, which forward to Clang's __scoped_atomic_* builtins.
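The pass-through can be modeled on the host with a short C sketch. All names prefixed with `my_` are hypothetical and are not the libclc functions; the sketch uses the unscoped `__atomic_*` builtins available in GCC and Clang and leaves memory_scope out entirely.

```c
#include <stdint.h>

/* Hypothetical order enum: enumerators are defined as the compiler's
 * __ATOMIC_* values, so no translation table is needed. */
typedef enum {
    my_memory_order_relaxed = __ATOMIC_RELAXED,
    my_memory_order_acquire = __ATOMIC_ACQUIRE,
    my_memory_order_release = __ATOMIC_RELEASE,
    my_memory_order_acq_rel = __ATOMIC_ACQ_REL,
    my_memory_order_seq_cst = __ATOMIC_SEQ_CST
} my_memory_order;

/* Lower layer (stand-in for a clc_atomic_* function). */
void my_clc_atomic_store(int32_t *p, int32_t v, int order) {
    __atomic_store_n(p, v, order);
}

int32_t my_clc_atomic_load(int32_t *p, int order) {
    return __atomic_load_n(p, order);
}

/* Upper layer: the order is forwarded unchanged. */
void my_atomic_store_explicit(int32_t *p, int32_t v, my_memory_order order) {
    my_clc_atomic_store(p, v, (int)order);
}

int32_t my_atomic_load_explicit(int32_t *p, my_memory_order order) {
    return my_clc_atomic_load(p, (int)order);
}

/* Usage: store with release ordering, then load with acquire ordering. */
int32_t store_then_load(int32_t value) {
    int32_t slot = 0;
    my_atomic_store_explicit(&slot, value, my_memory_order_release);
    return my_atomic_load_explicit(&slot, my_memory_order_acquire);
}
```

Because the enumerators share the builtin's numeric values, the wrapper is a plain cast rather than a switch over orders.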
Other changes:
* Add __opencl_get_clang_memory_scope helper in opencl/utils.h (OpenCL
scope -> Clang scope).
* Correct atomic_compare_exchange return type to bool.
* Fix atomic_compare_exchange to return true when the value stored at
the pointer equals the expected value.
* Remove volatile from CLC functions so that volatile isn't present in
LLVM IR.
* Add '-fdeclare-opencl-builtins -finclude-default-header' flag to
include
[3 lines not shown]