[AMDGPU] Fix ShiftAmt32Imm to use unsigned comparison (#199052)
ShiftAmt32Imm used a signed 'Imm < 32' predicate, which incorrectly
matched negative immediates such as -1. It guards the scalar fshr fast path:
  def : GCNPat<(UniformTernaryFrag<fshr> i32:$src0, i32:$src1,
                 (i32 ShiftAmt32Imm:$src2)),
               (i32 (EXTRACT_SUBREG (S_LSHR_B64 ..., $src2), sub0))>;
When fshl(scalar, X, Z) is lowered via expandFunnelShift for any
constant Z in [0, 31], the generic code converts it to fshr(..., ~Z) or
fshr(..., -Z), producing a negative shift amount. Because all such
values satisfy Imm < 32 in a signed comparison, ShiftAmt32Imm matched
and the pattern passed the negative immediate directly to S_LSHR_B64
without the S_AND_B32 masking. S_LSHR_B64 then shifted by the wrong
amount, producing an incorrect result.
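The mismatch can be reproduced with a small model. This is only a sketch: the function names are hypothetical, the immediate is modelled as a 64-bit two's-complement value, and the 64-bit shift is assumed to consume the low 6 bits of its amount while fshr on i32 uses the amount modulo 32.

```python
MASK32 = 0xFFFFFFFF

def signed_pred(imm):
    # Old predicate: signed comparison, so every negative value passes.
    return imm < 32

def unsigned_pred(imm, bits=64):
    # Fixed predicate: reinterpret the immediate as unsigned before comparing.
    return (imm & ((1 << bits) - 1)) < 32

def fshr32(hi, lo, amt):
    # Reference funnel shift right on i32: the amount is taken modulo 32.
    concat = ((hi & MASK32) << 32) | (lo & MASK32)
    return (concat >> (amt & 31)) & MASK32

def lshr64_sub0(hi, lo, amt):
    # Model of the fast path: 64-bit logical shift right, then the low half.
    # Assumption: the shift only looks at the low 6 bits of the amount.
    concat = ((hi & MASK32) << 32) | (lo & MASK32)
    return (concat >> (amt & 63)) & MASK32
```

For amounts in [0, 31] the two shifts agree, so the fast path is safe there. For -1 the reference shifts by 31 while the modelled 64-bit shift uses 63, which is why such immediates have to fall through to the masked pattern.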
Fix by changing the predicate to an unsigned comparison so that only
values in [0, 31] match, and negative values fall through to the general
[8 lines not shown]
[SystemZ] Don't fold memops after SSA if tied regs don't match. (#197475)
When foldMemoryOperandImpl() is called during register allocation,
folding into a reg/mem opcode mustn't be done if the tied def and use
operands do not end up referencing the same register.
Fixes #197414
[Hexagon] Fix up vector predicate before compressing it for bitcast (#199283)
In a v64i1 vector predicate, each i1 is represented by 2 bits of the
predicate register. The predicate register needs to be fixed up before we
compress it.
Signed-off-by: Alexey Karyakin <akaryaki at qti.qualcomm.com>
Co-authored-by: Ikhlas Ajbar <iajbar at quicinc.com>
[AMDGPU] Refactor insertRelease into insertWriteback + insertWait (NFC) (#199486)
A release consists of two actions: write back the current cache, and
wait for "relevant" outstanding operations to complete. With the new
memory model, it is possible to disable the cache write-back using
"non-av". This patch cleanly separates the existing implementation so
that the write-backs can be selectively applied after checking for
non-av semantics.
Part of a stack:
- #199486
- #199621
- #199489
- #199622
Assisted-By: Claude Opus 4.6
---------
Co-authored-by: Pierre van Houtryve <pierre.vanhoutryve at amd.com>
NAS-141173 / jbof: recover drive slots missing from fabric discovery
After hardwire_host succeeds, query the NVMe-oF discovery log from
each controller's configured fabric paths and compare to the BMC's
drive list. Any slot reported by the BMC as Enabled with an Endpoint,
but absent from at least one controller's discovery output, is
power-cycled via Drive.Reset (ForceOff then On) before attach_drives
runs.
HA-aware: the active controller queries discovery locally and via
failover.call_remote against the standby, then unions the missing-NQN
sets so that a slot missing only from the standby's view is still recovered.
Recovery actions run on the active only; Drive.Reset is global to the
slot so both controllers see the re-registered NQN on the next
attach_drives.
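The selection rule can be sketched as follows. This is a rough illustration, not the middleware code: the function name, the dictionary keys (`slot`, `state`, `endpoint`, `nqn`) and the shape of the per-controller discovery data are all hypothetical.

```python
def slots_to_reset(bmc_slots, discovery_by_controller):
    """Return slot ids the BMC reports as usable but that at least one
    controller did not see in its NVMe-oF discovery output."""
    # Only slots that are Enabled and have an Endpoint are candidates.
    expected = {
        s["nqn"]: s["slot"]
        for s in bmc_slots
        if s["state"] == "Enabled" and s.get("endpoint")
    }
    missing = set()
    for seen_nqns in discovery_by_controller.values():
        # Union the per-controller gaps, so one controller missing an NQN
        # is enough to schedule that slot for recovery.
        missing |= set(expected) - set(seen_nqns)
    return sorted(expected[nqn] for nqn in missing)
```

With an active controller that sees every NQN and a standby that misses one, the slot behind the missing NQN is still returned.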
Adds four private methods:
- list_drive_slots_with_endpoints
- reset_drive_slot
[8 lines not shown]
[flang][OpenMP] Fix copyprivate crash with unlimited polymorphic pointer (#199768)
Lowering a copyprivate clause whose list item is an unlimited
polymorphic pointer (class(*), pointer) crashed in TypeInfo::typeScan.
The scan descends through the fir.class box and the fir.ptr, reaching a
`none` element type, which the terminal assertion did not allow.
Fixes #198770
[clang] fix getTemplateInstantiationArgs
This implements a new strategy for collecting the template arguments, by
relying on the qualifiers and template parameter lists to navigate the template
context of out-of-line definitions.
This greatly simplifies the signature of that function by removing a bunch
of workarounds and simplifying a couple that weren't removed yet.
Since this now relies on qualifiers and template parameter lists,
this patch expends most of its effort making sure these are placed,
transformed and propagated to template instantiations.
Also makes the explicit specialization AST nodes stop abusing the template
parameter lists to store their own template parameter list, by creating a
dedicated field for it, similar to partial specializations.
[lld][ELF] Exclude SHT_NOBITS sections from LMA overlap checks (#196423)
In embedded applications it's sometimes useful to load a section at the
same virtual address as the .bss section. For example, one possible use
case is for temporary code/data that is only needed for a short time
when the program is starting up:
  MEMORY {
    RAM  : ORIGIN = 0x100000, LENGTH = 1M
    INIT : ORIGIN = 0x200000, LENGTH = 1M
  }
  SECTIONS {
    .text : { *(.text); } > RAM
    .bss (NOLOAD) : { *(.bss); } > RAM
    .init : AT(LOADADDR(.bss)) { *(.init); } > INIT
  }
The .init section gets placed in the file immediately after the .text
section. At startup the .init section contents are copied to the INIT
region before zeroing .bss. Once the .init section is no longer needed
[14 lines not shown]
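The effect of excluding SHT_NOBITS sections from the LMA overlap check can be modelled in a few lines. This is a sketch, not lld's implementation: the `Section` type, the `lma_overlaps` helper and the addresses and sizes are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Section:
    name: str
    lma: int      # load address
    size: int
    nobits: bool = False  # True for SHT_NOBITS sections such as .bss

def lma_overlaps(sections, skip_nobits=True):
    """Return (earlier, later) name pairs whose load ranges overlap."""
    candidates = [
        s for s in sections
        if s.size > 0 and not (skip_nobits and s.nobits)
    ]
    candidates.sort(key=lambda s: s.lma)
    clashes = []
    widest = None  # section whose load range reaches furthest so far
    for cur in candidates:
        if widest is not None and widest.lma + widest.size > cur.lma:
            clashes.append((widest.name, cur.name))
        if widest is None or cur.lma + cur.size > widest.lma + widest.size:
            widest = cur
    return clashes

layout = [
    Section(".text", lma=0x100000, size=0x1000),
    Section(".bss", lma=0x101000, size=0x2000, nobits=True),
    Section(".init", lma=0x101000, size=0x800),
]
```

In this model, `lma_overlaps(layout, skip_nobits=False)` reports `.bss` against `.init`, while the default call reports nothing, because a NOBITS section occupies no bytes in the file.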
NAS-141173 / jbof: retry hardwire_host with fabric-card reset on failure
When jbof.create or jbof.update cannot validate communication with the
expansion shelf on the first attempt, parallel-reset both IOMs' fabric
cards via the Oem NetworkAdapter.Reset Redfish action and retry
hardwire_host once.
Each reset returns when the BMC reports the restart complete (~28s
each, issued in parallel). A 30-second post-reset settle is added to
give the fabric data plane time to start answering ARP/NDP before the
retry validates again.
The retry path is gated on at least one initiator-side path having
failed to validate. On hosts that succeed on the first attempt, the
reset code is not reached and behavior is unchanged.
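The control flow reads roughly like the sketch below. All names are hypothetical: `validate` stands in for the path validation and returns the list of failed paths, `reset_fabric_card` stands in for the Redfish reset of one IOM, and `sleep` is injectable so the settle delay can be skipped in tests.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def hardwire_with_retry(validate, reset_fabric_card, ioms,
                        settle_seconds=30.0, sleep=time.sleep):
    """Validate once; on failure reset every IOM's fabric card in
    parallel, wait for the fabric to settle, then validate once more."""
    failed = validate()
    if not failed:
        # First attempt succeeded: the reset code is never reached.
        return {"retried": False, "failed_paths": []}
    with ThreadPoolExecutor(max_workers=len(ioms)) as pool:
        # Each call blocks until its reset is reported complete;
        # list() waits for all of them and re-raises any exception.
        list(pool.map(reset_fabric_card, ioms))
    sleep(settle_seconds)
    return {"retried": True, "failed_paths": validate()}
```

The retry is deliberately limited to one pass, so a shelf that still fails after the reset surfaces its failed paths to the caller instead of looping.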
HA: the retry runs once on whichever controller invoked the create.
The fabric-card reset is global to the IOM, so a single reset benefits
both controllers' paths to that IOM.
[2 lines not shown]
[clang] fix finding class template instantiation pattern for member specializations (#199979)
Stop treating the member that a member specialization specializes as
the instantiation pattern of that specialization.
Split off from https://github.com/llvm/llvm-project/pull/199528
[flang][OpenMP] Support in_reduction on target
Teach Flang lowering and MLIR OpenMP translation to carry
in_reduction through omp.target.
The translation looks up the task reduction-private storage with
__kmpc_task_reduction_get_th_data and binds the target region's
in_reduction block argument to that private pointer, so uses inside the
region do not keep referring to the original variable.
The patch also preserves in_reduction operands in the TargetOp builder
path and makes sure target in_reduction list items are mapped into the
target region when needed.
import ports/devel/py-pydantic-settings, fixes/ok landry@
Pydantic Settings provides optional Pydantic features for loading a
settings or config class from environment variables or secrets files.
rpki-client: use sentinel idiom for timegm(3) error check
We currently fail on ASN.1 times before the epoch. There is nothing wrong
in principle with those. Both UTCTime and GeneralizedTime can represent
such times and we should be able to accept them.
Modern OpenSSL and LibreSSL ensure in ASN1_TIME_to_tm() that the times are
well formed according to the DER, so this call is really only a translation
step.
ok claudio deraadt
Reapply "[clang][ssaf][NFC] Rework how the Force linker anchors are defined and used" (#194693)
This reverts commit 582958c4337f539e650096c0257a322315298e1a.
Drop "const" from these anchor variables, as is already done in clang-tidy.
Turns out, MSVC likely doesn't conform with the C++ standard and makes
`const volatile` global variables have *internal* linkage - while they
should have *external* linkage.
https://eel.is/c++draft/basic.link#3.2
```
(3) The name of an entity that belongs to a namespace scope has internal linkage if it is the name of
(3.1) a variable, variable template, function, or function template that is explicitly declared static; or
(3.2) a non-template variable of non-volatile const-qualified type, unless
(3.2.1) it is declared in the purview of a module interface unit (outside the private-module-fragment, if any) or module partition, or
(3.2.2) it is explicitly declared extern, or
(3.2.3) it is inline, or
(3.2.4) it was previously declared and the prior declaration did not have internal linkage; or
[3 lines not shown]