Revert "AMDGPU: Codegen for v_dual_dot2acc_f32_f16/bf16 from VOP3"
This reverts commit 47f6a19181b426baa03182ab6a7a41e16b35301d.
Breaks MIOpen; no proper fix yet.
NAS-140509 / 27.0.0-BETA.1 / fix test_legacy_api test failures (#18620)
Commit ad71fbed02 caused this test to start failing. Fix the test
to account for the changes in that commit.
NAS-140510 / 26.0.0-BETA.2 / fix creating containers (by yocalebo) (#18622)
All of the IDMAP-related container API tests began to fail. They
started failing after commit 7711549fbd, which made creating a container
fail if it can't set the hostname. That's entirely inappropriate, since
failing to set a hostname on a container should be non-fatal.
Furthermore, there is no need to fork+exec to nsenter to set the
hostname. The init process of a container will read /etc/hostname and
/etc/hosts during boot.
With these changes, a container is created and the hostname is properly
set to the container's name and I can execute "sudo" commands without
warnings...
```
# hostname
calebtest123
# sudo true
# echo $?
[4 lines not shown]
[clang] Fix array filler lowering for _BitInt arrays (#189954)
Sometimes we use an array of bytes to represent `_BitInt` types in memory.
When this is the case, the lowered array filler expression reaches
`ConstantEmitter::emitForMemory` already with the memory type, which is an
array of i8 instead of a single iN, so `cast<llvm::ConstantInt>` was
failing within `ConstantEmitter::emitForMemory`. This patch fixes the
assertion failure by not attempting any type change if the type is
already correct.
Fixes https://github.com/llvm/llvm-project/issues/189643
Assisted-by: claude in FileCheck CHECK lines fixing
NAS-140510 / 27.0.0-BETA.1 / fix creating containers (#18621)
All of the IDMAP-related container API tests began to fail. They
started failing after commit 7711549fbd, which made creating a container
fail if it can't set the hostname. That's entirely inappropriate, since
failing to set a hostname on a container should be non-fatal.
Furthermore, there is no need to fork+exec to nsenter to set the
hostname. The init process of a container will read /etc/hostname and
/etc/hosts during boot.
With these changes, a container is created and the hostname is properly
set to the container's name and I can execute "sudo" commands without
warnings...
```
# hostname
calebtest123
# sudo true
# echo $?
0
[AMDGPU][Scheduler] Use MIR-level rematerializer in rematerialization stage
This makes the scheduler's rematerialization stage use the
target-independent rematerializer. Previously duplicated logic is
deleted, and restrictions are put in place in the stage so that the
same constraints as before apply to rematerializable registers (as the
rematerializer is able to expose many more rematerialization
opportunities than what the stage can track at the moment).
Consequently it is not expected that this change improves performance
overall, but it is a first step toward being able to use the
rematerializer's more advanced capabilities during scheduling.
This is *not* an NFC change, for two reasons.
- Ties between two rematerialization candidates with otherwise
equal scores are decided by their corresponding
register's index handle in the rematerializer (previously the pointer
to their state object's value). This is determined by the
rematerializer's register collection order, which is different from
[10 lines not shown]
[DAG] Propagate OrZero and DemandedElts for min/max in isKnownToBeAPowerOfTwo (#182369)
Fixes #181643
For queries like `isKnownToBeAPowerOfTwo(V, OrZero=true)`, if an operand
is known to be "pow2-or-zero" but not strictly non-zero power-of-two,
the min/max case currently returns false even when the result remains
pow2-or-zero.
For instance:
- `A = select cond, 4, 0` (A is pow2-or-zero)
- `R = umin(A, 16)`
`R` is always in `{0, 4}` and querying `isKnownToBeAPowerOfTwo(R,
OrZero=true)` should be true.
Added unit tests for the baseline and failing cases; `OrZero` and
`DemandedElts` are now propagated correctly.
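The `umin` case above can be checked with a small Python sketch. It models values as plain unsigned integers rather than SelectionDAG nodes, and the helper names `is_pow2` and `is_pow2_or_zero` are illustrative only, not LLVM APIs.

```python
def is_pow2(x: int) -> bool:
    """True for a strictly non-zero power of two."""
    return x > 0 and (x & (x - 1)) == 0


def is_pow2_or_zero(x: int) -> bool:
    """The relaxed OrZero=true predicate: zero is accepted as well."""
    return x == 0 or is_pow2(x)


# A = select cond, 4, 0 ; R = umin(A, 16)
results = {min(a, 16) for a in (4, 0)}

# min/max always return one of their operands, so if both operands are
# pow2-or-zero the result is too. Checked exhaustively for 8-bit values.
candidates = [v for v in range(256) if is_pow2_or_zero(v)]
closed_under_min_max = all(
    is_pow2_or_zero(min(a, b)) and is_pow2_or_zero(max(a, b))
    for a in candidates
    for b in candidates
)
```

Here `results` contains a zero, so the strict predicate cannot hold for `R`, while the relaxed one does.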
[X86] LowerShiftByScalarImmediate - vXi8 shl(X,2) - prefer PADDB+PADDB pair over PSLLW+PAND (#186095)
On all targets, (V)PADDB is at least as fast as (V)PSLLW (usually faster)
and usually as fast as (V)PAND, and it avoids having to load a mask, so
for left shifts by 2 a pair of (V)PADDB is a better choice than
(V)PSLLW+(V)PAND.
This is only necessary if we're avoiding a (V)PAND mask; otherwise we
just need a single (V)PSLLW.
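The equivalence of the two sequences can be illustrated with a scalar Python model of the byte lanes. This is only a sketch: the functions below are hypothetical stand-ins named after the instructions, operating on lists of byte values with each pair of bytes treated as one little-endian 16-bit word, and they say nothing about instruction timing.

```python
def paddb(a, b):
    """Byte-wise add with wraparound, one result per lane."""
    return [(x + y) & 0xFF for x, y in zip(a, b)]


def psllw(v, n):
    """Shift each 16-bit word (two adjacent bytes, low byte first) left by n."""
    out = []
    for i in range(0, len(v), 2):
        word = v[i] | (v[i + 1] << 8)
        word = (word << n) & 0xFFFF
        out += [word & 0xFF, word >> 8]
    return out


def pand(a, mask):
    """Byte-wise AND."""
    return [x & m for x, m in zip(a, mask)]


def shl2_padd_pair(v):
    """x << 2 per byte as (x + x) + (x + x): two adds, no mask needed."""
    doubled = paddb(v, v)
    return paddb(doubled, doubled)


def shl2_shift_and_mask(v):
    """x << 2 per byte as a word shift plus a mask that clears the two
    bits leaked from the low byte into the high byte."""
    return pand(psllw(v, 2), [0xFC] * len(v))
```

Both functions should produce `(x << 2) & 0xFF` in every lane; the word-shift variant needs the `0xFC` mask only because the top two bits of the low byte spill into the high byte.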
go122: build w/ GOMAXPROCS=1
The build frequently hits a concurrency/GC-related panic[1] on at
least earmv7hf. Could not reproduce with later versions of Go.
[1] fatal error: workbuf is not empty
PR kern/60159 PR kern/55665 pg_jobc assertions unsafe
It has long been known that the way that the pg_jobc field in
struct pgrp works (or doesn't perhaps) is understood by no-one,
yet it does seem to function.
Its (sole) purpose is to determine whether a process group should
be subject to terminal generated stop signals (SIGTSTP, SIGTTIN, SIGTTOU).
That is, if there is no parent process of any of the jobs in the
process group, which is in a different process group than the one in
question (or it would also be affected by the signal) which can arrange
to send a SIGCONT to the process group to restart it, or some other
signal to terminate it, then the kernel must not stop that process
group, or it would (could) remain stopped, orphaned, forever (or
until some human notices and manually kills it, using some signal or
other).
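The condition described above can be sketched in Python. This is not the kernel's pg_jobc bookkeeping; it recomputes the answer from scratch over a hypothetical process table. The `Proc` record and `is_orphaned` function are illustrative names, and the same-session check is an assumption taken from the POSIX definition of an orphaned process group rather than from this commit message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int
    pgid: int  # process group id
    sid: int   # session id
    ppid: int  # parent pid


def is_orphaned(pgid, procs):
    """True if no member of the group has a parent that sits in a different
    process group of the same session, i.e. nobody is positioned to send
    SIGCONT, so terminal stop signals must not stop the group."""
    by_pid = {p.pid: p for p in procs}
    for p in procs:
        if p.pgid != pgid:
            continue
        parent = by_pid.get(p.ppid)
        if parent is not None and parent.pgid != pgid and parent.sid == p.sid:
            return False  # this parent could restart the group
    return True


init = Proc(pid=1, pgid=1, sid=1, ppid=0)
shell = Proc(pid=10, pgid=10, sid=10, ppid=1)
job = Proc(pid=20, pgid=20, sid=10, ppid=10)
job_child = Proc(pid=21, pgid=20, sid=10, ppid=20)
```

With the shell present, group 20 has a parent outside the group and may be stopped; once the shell is gone from the table, the group counts as orphaned.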
However, the extremely convoluted calculations required to maintain
[26 lines not shown]