AMDGPU: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3
Codegen for v_dual_dot2acc_f32_f16/bf16 for targets that only have the VOP3
version of the instruction.
Since there is no VOP2 version, introduce a temporary MIR DOT2ACC pseudo
that is selected when there are no src_modifiers. This DOT2ACC pseudo
has src2 tied to dst (like the VOP2 version); post-RA pseudo expansion
restores the pseudo to the VOP3 version of the instruction.
CreateVOPD will recognize such a VOP3 pseudo and generate v_dual_dot2acc.
[SLP]Disable modeling disjoint reduction or as bitcast for big endian
On big-endian targets the reduction cannot be modeled as a bitcast; it
needs to be supported as a reversion/bswap instead, so just disable it
for now.
[libc++] Fix gdb pretty printer for strings (#176882)
The gdb pretty printer for strings reports an error when printing a
string that is small enough to fit inline in the string object. The
problem is that the lazy_string method can't be applied directly to an
array value. The fix is to cast the array to a pointer and apply
lazy_string to that value.
graphics/glslang: Update to 16.2.0
While here, add "shared" flavor for installing shared
libraries. (Default is static.)
Changelog: https://github.com/KhronosGroup/glslang/blob/16.2.0/CHANGES.md
PR: 292737
Reported by: Eric Camachat <eric at camachat.org>,
vvd
Co-authored-by: Eric Camachat <eric at camachat.org>
AMDGPU: Cleanup the handling of flags in getTgtMemIntrinsic
Some of the flag handling seems a bit inconsistent and dodgy, but this
is meant to be a pure refactoring for now.
commit-id:99911619
hwpstate_amd(4): Fix BITS_WITH_VALUE()/SET_BITS_VALUE() to obey the mask
While here, rename an argument of BITS_VALUE() to be consistent with the
other macros.
Reviewed by: aokblast
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54997
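A minimal sketch of what "obeying the mask" could mean for the macros named
in the commit above. This is not the driver's code: the macro bodies, the
argument names, and the mask_shift() helper are assumptions made for
illustration. The point shown is that a value wider than the field is
truncated to the mask instead of spilling into neighboring bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: index of the lowest set bit of a non-zero mask. */
static inline unsigned
mask_shift(uint64_t mask)
{
	unsigned shift = 0;

	assert(mask != 0);
	while ((mask & 1) == 0) {
		mask >>= 1;
		shift++;
	}
	return (shift);
}

/* Extract the field selected by 'mask' from 'reg'. */
#define	BITS_VALUE(mask, reg)						\
	(((uint64_t)(reg) & (mask)) >> mask_shift(mask))

/* Shift 'val' into the field; bits that do not fit the mask are dropped. */
#define	BITS_WITH_VALUE(mask, val)					\
	(((uint64_t)(val) << mask_shift(mask)) & (mask))

/* Replace the field in 'var', leaving every bit outside 'mask' untouched. */
#define	SET_BITS_VALUE(var, mask, val)					\
	((var) = ((var) & ~(uint64_t)(mask)) | BITS_WITH_VALUE(mask, val))
```

With these definitions, storing 0x1AB into an 8-bit field at mask 0xFF00
keeps only 0xAB; without the final "& (mask)" the extra bit would land in
bit 16, outside the field.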
hwpstate_amd(4): Rename CPPC register macros
Rename them to be closer to AMD's official terminology, except for the
"Lowest Non-Linear Performance" field, which we label 'EFFICIENT_PERF' to
be closer to Intel's term ("Most Efficient Performance") and to avoid
possible confusion.
No functional change (intended).
Reviewed by: aokblast
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54998
x86: x86_msr_op(): Simplify assertions
Simplify them by moving them into more natural places, i.e., default
cases of 'switch' statements.
No functional change (intended).
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54996
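The idea of placing an assertion in the default case of a 'switch', as
described in the commit above, can be sketched as follows. The enum, its
constants, and msr_apply_op() are hypothetical names invented for this
illustration and are not taken from x86_msr_op().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation codes, for illustration only. */
enum sketch_op {
	SKETCH_OP_ANDNOT,
	SKETCH_OP_OR,
	SKETCH_OP_WRITE
};

/* Compute the new register value for 'op' applied to 'cur' with 'arg'. */
static uint64_t
msr_apply_op(enum sketch_op op, uint64_t cur, uint64_t arg)
{
	switch (op) {
	case SKETCH_OP_ANDNOT:
		return (cur & ~arg);
	case SKETCH_OP_OR:
		return (cur | arg);
	case SKETCH_OP_WRITE:
		return (arg);
	default:
		/*
		 * The validity check sits where an invalid value is
		 * actually detected, instead of in a separate up-front
		 * assertion that has to list every valid operation.
		 */
		assert(!"msr_apply_op: unknown operation");
		return (cur);
	}
}
```

A new operation then only needs a new 'case'; there is no second list of
valid values to keep in sync.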
x86: x86_msr_op(): MSR_OP_LOCAL: Disable interrupts on atomic ops
On MSR_OP_LOCAL and non-naturally-atomic operations (MSR_OP_ANDNOT and
MSR_OP_OR), there is no guarantee that we are not interrupted between
reading and writing the MSR, and that interruption could actually
perform some operation on that MSR, which would be lost.
Prevent that problem by temporarily disabling interrupts around MSR
manipulation.
Reviewed by: kib
Discussed with: markj
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D54996
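The lost-update window described in the commit above can be modeled in
user space. Everything below is a simulation under stated assumptions: the
"MSR" is a plain variable, the interrupt is a flag delivered at an explicit
point, and all function names are hypothetical. It is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static uint64_t fake_msr;		/* stands in for the hardware MSR */
static bool intr_enabled = true;	/* simulated interrupt-enable flag */
static bool intr_pending;		/* simulated interrupt awaiting delivery */

/* Simulated handler that modifies the same MSR (sets bit 0). */
static void
fake_intr_handler(void)
{
	fake_msr |= 0x1;
}

/* Deliver the pending interrupt, but only while interrupts are enabled. */
static void
deliver_pending(void)
{
	if (intr_enabled && intr_pending) {
		intr_pending = false;
		fake_intr_handler();
	}
}

/* Disable simulated interrupts and return the previous state. */
static bool
sim_intr_disable(void)
{
	bool was = intr_enabled;

	intr_enabled = false;
	return (was);
}

/* Restore the saved state; a pending interrupt may run at this point. */
static void
sim_intr_restore(bool was)
{
	intr_enabled = was;
	deliver_pending();
}

/* Read-modify-write with an open window between read and write. */
static void
msr_or_unprotected(uint64_t bits)
{
	uint64_t v = fake_msr;		/* read */

	deliver_pending();		/* handler may update fake_msr here */
	fake_msr = v | bits;		/* write back a stale value */
}

/* Same operation with interrupts disabled around the manipulation. */
static void
msr_or_protected(uint64_t bits)
{
	bool saved = sim_intr_disable();
	uint64_t v = fake_msr;

	deliver_pending();		/* no-op: interrupts are off */
	fake_msr = v | bits;
	sim_intr_restore(saved);	/* handler runs on up-to-date value */
}
```

In the unprotected variant the handler's bit is overwritten by the stale
value; in the protected variant the handler runs only after the write, so
both updates survive.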
[AMDGPU] Implement llvm.sponentry (#176357)
In some of our use cases, the GPU runtime stores some data at the top of
the stack. It figures out where it's safe to store it by using the PAL
metadata generated by the backend, which includes the total stack size.
However, the metadata does not include the space reserved at the bottom
of the stack for the trap handler when CWSR is enabled in dynamic VGPR
mode. This space is reserved dynamically based on whether or not the
code is running on the compute queue. Therefore, the runtime needs a way
to take that into account.
Add support for `llvm.sponentry`, which should return the base of the
stack, skipping over any reserved areas. This allows us to keep this
computation in one place rather than duplicate it between the backend
and the runtime.
The implementation for functions that set up their own stack uses a pseudo
[8 lines not shown]
[mlir][spirv] Add Activation operators to TOSA Extended Instruction Set (001000.1) (#178620)
In detail, the Activation operators introduced are:
spirv.Tosa.{Clamp,Erf,Sigmoid,Tanh}, along with dialect and
serialization round-trip tests.
Signed-off-by: Davide Grohmann <davide.grohmann at arm.com>