DAG: Remove softPromoteHalfType
Remove the now unimplemented target hook and associated DAG machinery
for the old half legalization path.
Really fixes #97975
R600: Remove softPromoteHalfType
Also includes a somewhat hacky, minimal change to kernel arguments
lowered as f16, to avoid assertions once softPromoteHalfType is
removed. Half support was never really implemented for r600; there
just happened to be a few incidental tests which included a half
argument (and which were not even meaningful, since the function body
folded to nothing because callable functions are not supported).
AMDGPU: Move softPromoteHalfType override to R600 only
As expected the code is much worse, but more correct.
We could do a better job with source modifier management around
fp16_to_fp/fp_to_fp16.
AMDGPU/GlobalISel: Regbanklegalize rules for G_UNMERGE_VALUES
Move G_UNMERGE_VALUES handling to AMDGPURegBankLegalizeRules.cpp.
Fix sgpr S16 unmerge by lowering it to shifts on S32.
Previously sgpr S16 unmerge was selected using the _lo16 and _hi16 subreg
indexes, which are exclusive to vgpr register classes.
For the remaining cases we do a trivial mapping that assigns the same
reg bank, vgpr or sgpr, to all operands.
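The shift-based idea can be illustrated outside the compiler. The following is only a sketch in plain C of splitting one 32-bit scalar into two 16-bit halves using 32-bit operations; the type and function names are hypothetical and are not taken from the patch, and the mask on the low half is there for clarity rather than as a claim about what the lowering emits.

```c
#include <stdint.h>

/* Hypothetical illustration: each 16-bit half is kept in a 32-bit
 * container, mirroring the idea of doing the work on S32 values. */
typedef struct {
    uint32_t lo; /* bits 15:0 of the source */
    uint32_t hi; /* bits 31:16 of the source */
} unmerged16_pair;

static unmerged16_pair unmerge_s16_via_shift(uint32_t src)
{
    unmerged16_pair out;
    out.lo = src & 0xFFFFu; /* low half: keep the bottom 16 bits */
    out.hi = src >> 16;     /* high half: logical shift right by 16 */
    return out;
}
```

No sub-register access is needed in this form, which is the property the message relies on for sgpr values.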
[mlir][bufferization] Cache SymbolTableCollection for CallOp types (#176909)
Use the BufferizationState symbol table cache when resolving CallOp
callee types in getBufferType(), avoiding repeated SymbolTableCollection
creation. Add a const accessor (backed by a mutable cache) so const
state can reuse the same tables. Completes a marked TODO.
xdr_string: don't leak strings with xdr_free
Historically (and in a small amount of older software such as OpenAFS),
developers would attempt to free XDR strings with:
    xdr_free((xdrproc_t)xdr_string, &string)
This resulted in xdr_free calling xdr_string with only two intentional
arguments and whatever was left in the third argument register. If the
register held a sufficiently small number, xdr_string would return FALSE
and not free the string (no one checks the return values).
Software should instead free strings with:
    xdr_free((xdrproc_t)xdr_wrapstring, &string)
Because buggy software exists in the wild, act as though xdr_wrapstring
was used in the XDR_FREE case and plug these leaks.
[5 lines not shown]
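The failure mode described above can be modelled without the real RPC headers. This is a minimal sketch, assuming hypothetical stand-ins (demo_xdr, demo_string, demo_wrapstring, demo_free) that only imitate the shape of the XDR routines; none of these names or bodies come from the actual library.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins, not the real <rpc/xdr.h> declarations. */
enum demo_op { DEMO_ENCODE, DEMO_DECODE, DEMO_FREE };
typedef struct { enum demo_op op; } demo_xdr;
typedef int (*demo_xdrproc_t)(demo_xdr *, void *);

/* Three-argument routine in the style of xdr_string. As in the message,
 * the length check runs even when freeing, so a small maxsize makes it
 * return 0 without releasing the string. */
static int demo_string(demo_xdr *x, char **sp, unsigned int maxsize)
{
    if (*sp == NULL)
        return 1;
    if (strlen(*sp) > maxsize)
        return 0;
    if (x->op == DEMO_FREE) {
        free(*sp);
        *sp = NULL;
    }
    return 1;
}

/* Two-argument wrapper in the style of xdr_wrapstring: it supplies the
 * largest possible limit, so the signature matches demo_xdrproc_t. */
static int demo_wrapstring(demo_xdr *x, void *sp)
{
    return demo_string(x, (char **)sp, ~0u);
}

/* Free helper in the style of xdr_free: always passes two arguments. */
static void demo_free(demo_xdrproc_t proc, void *obj)
{
    demo_xdr x = { DEMO_FREE };
    (void)proc(&x, obj);
}
```

Because the wrapper has exactly two parameters, passing it to demo_free needs no cast and leaves no argument undefined.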
rpc/xdr.h: make xdrproc_t always take two arguments
The type of xdrproc_t is clearly defined in the comments as a function
with two arguments, an XDR * and a void * (sometimes spelled caddr_t).
It was initially defined as:
    typedef bool_t (*xdrproc_t)();
At some point people started giving it a non-empty argument list.
Unfortunately, there has been widespread disagreement about how arguments
are passed. There seems to have been a widespread view that passing a
three-argument function pointer as an xdrproc_t should be allowed. Most
notable is xdr_string, which takes a maximum length parameter. This led
to all sorts of prototypes (all of which have been present in the
FreeBSD source tree):
FreeBSD userspace (nominally from tirpc, but seemingly local):
    typedef bool_t (*xdrproc_t)(XDR *, ...);
FreeBSD kernel, glibc:
[36 lines not shown]
firewall: lowercase for protocol values required for DNAT
Use the ChangeCase BaseField extension because it's already being tested,
and add more tests and safeguards so that the cache knows which case is
in use (including when 'any' needs to be used).
The 'any' value is a bit tricky here. Force it to lowercase in all cases
since it wasn't uppercased before either.
Also fix the display of anti-lockout protocol for consistency.
[clang][bytecode] Finish support for `msvc::constexpr` (#177388)
Keep track of whether an `InterpFrame` is allowed to call
`msvc::constexpr` functions via two new opcodes.
[clang][bytecode][NFC] Move some opcode impls to the source file (#177543)
They aren't templated, so move them to Interp.cpp to make the header
file a bit shorter.
[mlir][spirv] Add Conv operations for TOSA Extended Instruction Set (001000.1) (#176908)
This patch expands support for the TOSA Extended Instruction Set
(001000.1) to the SPIR-V dialect in MLIR. The TOSA extended instruction
set provides a standardized set of machine learning operations designed
to be used within `spirv.ARM.Graph` operations (corresponding to
OpGraphARM in SPV_ARM_graph) and typed with `!spirv.arm.tensor<...>`
(corresponding to OpTypeTensorARM in SPV_ARM_tensor).
The change introduces:
* Extended dialect plumbing for import, serialization, and
deserialization of the TOSA extended instruction set.
* The `spirv.Tosa.*Conv*` convolution operations from the TOSA extended
instruction set, each lowering to the corresponding `OpExtInst`.
* Verification enforcing that the new convolution operations appear only
within `spirv.ARM.Graph` regions, operate on `!spirv.arm.tensor<...>`
types, and are well-formed according to the TOSA 001000.1 specification.
All convolution operations from TOSA 001000.1 extended instructions are
[11 lines not shown]
[X86] Enable custom lowering of 256/512-bit vXi32 and vXi64 CLMUL nodes (#177554)
Similar to 128-bit v4i32/v2i64 support, these can now be efficiently
lowered to PCLMUL nodes through unrolling, shuffle combining and
concatenation.
If the target only supports PCLMUL then they will remain as 128-bit
nodes, but if VPCLMULQDQ is supported then they should merge into wider
types.
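For readers unfamiliar with the operation, here is a scalar C sketch of a carry-less multiply and of handling a wide vector as independent 128-bit pieces. It assumes the low 32 bits of each product are kept; the function names and the chunked loop are hypothetical and only model the unrolling idea, not the actual DAG lowering.

```c
#include <stddef.h>
#include <stdint.h>

enum { LANES_PER_128 = 4 }; /* four 32-bit lanes in a 128-bit piece */

/* Carry-less multiply: like a shift-and-add multiply, but partial
 * products are combined with XOR instead of addition. Only the low
 * 32 bits of the result are kept here. */
static uint32_t clmul32(uint32_t a, uint32_t b)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; ++i) {
        if ((b >> i) & 1u)
            r ^= a << i;
    }
    return r;
}

/* One 128-bit piece: four independent 32-bit lanes. */
static void clmul_v4i32(const uint32_t *a, const uint32_t *b, uint32_t *out)
{
    for (int lane = 0; lane < LANES_PER_128; ++lane)
        out[lane] = clmul32(a[lane], b[lane]);
}

/* A wider vector (n assumed to be a multiple of 4, e.g. 8 or 16 lanes)
 * is processed piece by piece and the results land side by side. */
static void clmul_wide(const uint32_t *a, const uint32_t *b,
                       uint32_t *out, size_t n)
{
    for (size_t i = 0; i + LANES_PER_128 <= n; i += LANES_PER_128)
        clmul_v4i32(a + i, b + i, out + i);
}
```

Each lane is independent of the others, which is why splitting into 128-bit pieces and concatenating the results preserves the element-wise semantics.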