[lldb] A few small code modernizations and cleanups [NFC] (#182656)
I was reading through ObjectContainerBSDArchive and came across some
dead method decls and a `shared_ptr` typedef in
`ObjectContainerBSDArchive::Archive` for a shared_ptr<Archive>, which was
a little unclear when reading a decl like `shared_ptr archive_sp;` for a
local variable.
libjail: avoid a double-free in the MAC label bits
As written, we'll repeatedly jps_free() the first element, which is
obviously bogus. Fix it to index appropriately.
Fixes: db3b39f063d9f ("libjail: extend struct handlers [...]")
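For illustration only, a minimal sketch of the bug class described above.
This is not the libjail code: `struct elem`, `elem_release()` and both loop
functions are hypothetical, and the release routine only counts calls so
the buggy variant can be shown without undefined behavior.

```c
#include <stddef.h>

/* Hypothetical element type; free_count records how often it was released. */
struct elem {
    int free_count;
};

/* Hypothetical stand-in for a per-element release routine such as
 * jps_free(); the real signature is not shown in the commit message. */
void elem_release(struct elem *e)
{
    e->free_count++;
}

/* Buggy shape: the counter advances, but the base pointer is released on
 * every iteration, so element 0 is released n times and the rest never. */
void release_all_buggy(struct elem *arr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        elem_release(arr);
}

/* Fixed shape: index into the array so each element is released once. */
void release_all_fixed(struct elem *arr, size_t n)
{
    for (size_t i = 0; i < n; i++)
        elem_release(&arr[i]);
}
```

With a real deallocator in place of the counter, the buggy loop frees
element 0 repeatedly and leaks every other element.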
[AArch64] Optimize 64-bit constant vector builds (#177076)
This patch optimizes the creation of constant 64-bit vectors (e.g.,
v2i32, v4i16) by avoiding expensive loads from the constant pool. The
optimization works by packing the constant vector elements into a single
i64 immediate and bitcasting the result to the target vector type. This
replaces a memory access with more efficient immediate materialization.
To ensure this transformation is efficient, a check is performed to
verify that the immediate can be generated in two or fewer mov
instructions. If it requires more, the compiler falls back to using the
constant pool.
The optimization is disabled for big-endian targets for now.
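For illustration only, a rough sketch of the packing step and a simplified
cost check. The function names are hypothetical and none of this is the
backend code. Assumptions: little-endian lane order, and a cost estimate
that counts only nonzero 16-bit chunks (one MOVZ plus one MOVK per further
chunk). The real check may also recognize cheaper encodings, which this
sketch ignores.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack n lanes of elem_bits each into one 64-bit immediate, lane 0 in the
 * lowest bits. The caller must ensure n * elem_bits == 64. */
uint64_t pack_lanes(const uint64_t *elems, size_t n, unsigned elem_bits)
{
    uint64_t mask = (elem_bits == 64) ? UINT64_MAX
                                      : ((UINT64_C(1) << elem_bits) - 1);
    uint64_t imm = 0;
    for (size_t i = 0; i < n; i++)
        imm |= (elems[i] & mask) << (i * elem_bits);
    return imm;
}

/* Simplified estimate: number of nonzero 16-bit chunks, with a minimum of
 * one instruction for an all-zero value. */
unsigned estimate_mov_count(uint64_t imm)
{
    unsigned count = 0;
    for (unsigned shift = 0; shift < 64; shift += 16)
        if ((imm >> shift) & 0xFFFF)
            count++;
    return count ? count : 1;
}

/* Returns 1 if the immediate fits the "two or fewer mov" budget under the
 * simplified estimate, 0 if the constant pool should be used instead. */
int use_immediate(uint64_t imm)
{
    return estimate_mov_count(imm) <= 2;
}
```

Under these assumptions a v2i32 {1, 2} packs to 0x0000000200000001, which
has two nonzero chunks and stays an immediate, while a v4i16 {1, 2, 3, 4}
packs to 0x0004000300020001, has four, and falls back to the constant pool.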
[lldb] Merge interfaces into lldbPluginScriptInterpreterPython (NFC) (#182962)
Make the interfaces part of lldbPluginScriptInterpreterPython instead of
putting them into their own static library. This avoids the need for an
extra static archive and more importantly a bunch of code duplication
between the two CMakeLists.txt.
17886 loader.efi: efi_redirect_exceptions does use uninitialized pointer
Reviewed by: Robert Mustacchi <rm+illumos at fingolfin.org>
Approved by: Gordon Ross <gordon.w.ross at gmail.com>
17885 loader.efi: free_tables() appears to free tss_pa twice.
Reviewed by: Robert Mustacchi <rm+illumos at fingolfin.org>
Approved by: Gordon Ross <gordon.w.ross at gmail.com>
17884 loader.efi: tss_pa setup seems to be flawed in trap.c
Reviewed by: Robert Mustacchi <rm+illumos at fingolfin.org>
Approved by: Gordon Ross <gordon.w.ross at gmail.com>
net-p2p/jackett: Update to 0.24.1127
Changelog: https://github.com/Jackett/Jackett/releases
PR: 293204
Reported by: Ralf van der Enden <tremere at cainites.net> (maintainer)
Approved by: Submitter is maintainer
[Clang][AMDGPU] Change __fp16 to _Float16 in GFX1250 CVT builtin definitions (#182893)
Change the type signature of the `gfx1250 cvt` builtins from `__fp16` to
`_Float16` in the TableGen builtin definitions.
[DWARFLinker] Use DIEEntry for backward ref_addr references (#181881)
The classic DWARF linker avoids `DIEEntry` for `DW_FORM_ref_addr`
references, using raw `DIEInteger` values with manual offset computation
instead. A stale FIXME explains this was because "the implementation
calls back to DwarfDebug to find the unit offset", but this is no longer
true. `DIEEntry` resolves offsets via
`DIEUnit::getDebugSectionOffset()`, which has no `DwarfDebug`
dependency.
The real constraint is that forward references may point to placeholder
DIEs that never get adopted into a unit tree (due to ODR pruning), so
`DIEEntry` cannot resolve them (a test failed while refactoring this).
Backward references, however, are safe: the target DIE is already cloned
and parented in a unit tree.
EC2 AMIs: Add .trim() to filtering script
The FreeBSD website uses HTML Tidy, which adds whitespace inside the
table of EC2 AMIs; I didn't notice this when I was testing locally
because it didn't run there. This results in the filtering breaking
since e.g. "ufs" does not match "\nufs\n".
Adding .trim() to the filtering script removes the extra whitespace
which HTML Tidy added.
PR: 293397
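For illustration only, the mismatch can be reproduced outside the browser.
The real filtering script is JavaScript and uses .trim(). The sketch below
is a C analogue with hypothetical helper names (`trim_copy`,
`cell_matches`) and an assumed 64-byte cell limit.

```c
#include <ctype.h>
#include <string.h>

/* Copy src into dst with leading and trailing whitespace removed.
 * dst must have room for strlen(src) + 1 bytes. */
void trim_copy(char *dst, const char *src)
{
    const char *start = src;
    while (*start && isspace((unsigned char)*start))
        start++;
    const char *end = start + strlen(start);
    while (end > start && isspace((unsigned char)end[-1]))
        end--;
    size_t len = (size_t)(end - start);
    memcpy(dst, start, len);
    dst[len] = '\0';
}

/* Compare a table cell against the wanted filter value after trimming.
 * Cells of 64 bytes or more are treated as non-matching (assumed limit). */
int cell_matches(const char *cell, const char *wanted)
{
    char buf[64];
    if (strlen(cell) >= sizeof(buf))
        return 0;
    trim_copy(buf, cell);
    return strcmp(buf, wanted) == 0;
}
```

A plain comparison of "\nufs\n" against "ufs" fails, while
cell_matches("\nufs\n", "ufs") succeeds.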
[NVPTX] Scalarize v2f32 instructions if input operand guarantees need for register coalescing (#180113)
The support of f32 packed instructions in #126337 revealed performance
regressions on certain kernels. In one case, the cause comes from
loading a v4f32 from shared memory but then accessing them as {r0, r2}
and {r1, r3} from the full load of {r0, r1, r2, r3}.
This access pattern guarantees that the registers require a coalescing
operation, which increases register pressure and degrades performance.
The fix here is to check whether we can prove that a v2f32 operand comes
from non-contiguous vector extracts and, if so, scalarize the operation
so the coalescing operation is no longer needed.
I've found that ptxas can see through the extra unpacks/repacks of
contiguous registers this causes in MIR. However, in the full test case,
the packing of the final scalar->vector results does generate additional
costs, especially since the only users unpack them. An additional MIR
pass is possible to catch the case
[4 lines not shown]
heimdal: Pass the correct pointer to realloc when growing a string buffer
The realloc in my_fgetln was trying to grow the pointer to the string
buffer, not the string buffer itself.
In function 'my_fgetln',
    inlined from 'mit_prop_dump' at crypto/heimdal/kdc/mit_dump.c:156:19:
crypto/heimdal/kdc/mit_dump.c:119:13: error: 'realloc' called on unallocated object 'line' [-Werror=free-nonheap-object]
  119 |         n = realloc(buf, *sz + (*sz >> 1));
      |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
crypto/heimdal/kdc/mit_dump.c: In function 'mit_prop_dump':
crypto/heimdal/kdc/mit_dump.c:139:11: note: declared here
  139 |     char *line = NULL;
      |           ^~~~
Reviewed by: rmacklem, cy
Fixes: a93e1b731ae4 ("heimdal-kadmin: Add support for the -f dump option")
Differential Revision: https://reviews.freebsd.org/D54933
(cherry picked from commit 03d8ac948b1ad9c419b294c3129b7da58d818363)
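For illustration only, a minimal sketch of the pointer-to-pointer pattern
involved. This is not the my_fgetln code: `grow_buffer` is a hypothetical
helper, and the 16-byte minimum step is an assumption for the sketch.

```c
#include <stdlib.h>

/* Grow the caller's heap buffer by roughly 1.5x.
 * buf points at the caller's `char *` variable; *buf is the heap block.
 * Returns 0 on success, -1 on failure (old buffer and size untouched). */
int grow_buffer(char **buf, size_t *sz)
{
    size_t new_sz = *sz + (*sz >> 1);
    if (new_sz == *sz)          /* sizes 0 and 1 would not grow at all */
        new_sz = *sz + 16;

    /* Wrong: realloc(buf, new_sz) would pass the address of the caller's
     * pointer variable, which never came from malloc. */
    char *n = realloc(*buf, new_sz);
    if (n == NULL)
        return -1;

    *buf = n;
    *sz = new_sz;
    return 0;
}
```

A caller holding `char *line = NULL; size_t sz = 0;` would call
`grow_buffer(&line, &sz)` and later `free(line)`.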
heimdal: Pass the correct pointer to free in an error case
This fixes a warning reported by GCC 14 on stable/14:
crypto/heimdal/lib/hdb/keys.c:241:13: warning: 'free' called on pointer 'ext' with nonzero offset 16 [-Wfree-nonheap-object]
  241 |             free(hist_keys);
      |             ^~~~~~~~~~~~~~~
crypto/heimdal/lib/hdb/keys.c:234:15: note: returned from 'calloc'
  234 |         ext = calloc(1, sizeof (*ext));
      |               ^~~~~~~~~~~~~~~~~~~~~~~~
Reviewed by: rmacklem, cy
Fixes: 5000d023a446 ("heimdal-kadmin: Add support for the -f dump option")
Differential Revision: https://reviews.freebsd.org/D54932
(cherry picked from commit b26a7af438f36dcde86f39a681123cc2140affb2)
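For illustration only, a minimal sketch of how freeing an interior pointer
arises and what the corrected error path looks like. The struct layout and
`build_ext` are hypothetical. The warning only tells us that the freed
pointer sat at a nonzero offset inside the calloc'ed block.

```c
#include <stdlib.h>

/* Hypothetical payload embedded inside the allocated object. */
struct hist_keys {
    int len;
};

/* Hypothetical container; keys lives at a nonzero offset. */
struct ext {
    int mandatory;
    long type_tag;
    struct hist_keys keys;
};

/* Returns 0 and stores the object in *out on success; returns -1 on
 * allocation failure or when `fail` simulates a later error. */
int build_ext(int fail, struct ext **out)
{
    struct ext *ext = calloc(1, sizeof(*ext));
    if (ext == NULL)
        return -1;

    struct hist_keys *hist_keys = &ext->keys;   /* interior pointer */
    hist_keys->len = 0;

    if (fail) {
        /* Wrong: free(hist_keys) passes an address inside the block.
         * Only the pointer calloc returned may be passed to free. */
        free(ext);
        return -1;
    }

    *out = ext;
    return 0;
}
```

The caller owns the returned object and releases it with `free()` on the
same pointer that `build_ext` stored in `*out`.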
[CHR] Skip regions containing convergent calls (#180882)
CHR (Control Height Reduction) merges multiple biased branches into a
single speculative check, cloning the region into hot/cold paths. On
GPU targets, the merged branch may be divergent (evaluated per-thread),
splitting the wavefront: some threads take the hot path, others the
cold path.
A convergent call like ds_bpermute (a cross-lane operation on AMDGPU)
requires a specific set of threads to be active — when thread X reads
from thread Y, thread Y must be active and participating in the same
call. After CHR cloning, thread Y may have gone to the cold path while
thread X is on the hot path, so the hot-path ds_bpermute reads a stale
register value from thread Y instead of the intended value.
This caused a miscompilation in rocPRIM's lookback scan: CHR duplicated
a region containing ds_bpermute, and the hot-path copy executed with a
different set of active threads, reading incorrect cross-lane data and
causing a memory access fault.
[2 lines not shown]