[AMDGPU] Limit allocation of lo128 registers for occupancy
The parent change allows allocation of lo128 VGPRs from all 4 banks.
That may produce an undesirable allocation that leaves a hole of up to
128 registers, for example when v0-v127 are allocated and v128-v255
are free.
Limit the available allocation order to the occupancy. Both hard
occupancy limits and occupancy achieved during scheduling are
considered. In this case it is better to spill a register than to drop
occupancy.
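A minimal sketch of the idea, not the actual AMDGPU register allocator code. It assumes the lo128 set is the one the parent change describes (v0-v127, v256-v383, v512-v639, v768-v895) and that the target occupancy translates into a simple VGPR budget; the function names and the budget value are hypothetical.

```python
TOTAL_VGPRS = 1024  # the "1K registers" case described by the parent change


def lo128_registers(total=TOTAL_VGPRS):
    """Registers usable in the short 16-bit form: index bit with value 128 is clear."""
    return [r for r in range(total) if (r & 128) == 0]


def limit_to_budget(order, vgpr_budget):
    """Keep only registers below the budget implied by the target occupancy.

    vgpr_budget is a hypothetical input; the real limit comes from the
    hardware occupancy tables and from the occupancy reached in scheduling.
    """
    return [r for r in order if r < vgpr_budget]


# With a budget of 256 VGPRs only v0-v127 remain allocatable as lo128, so the
# allocator spills instead of reaching into v256+ and lowering occupancy.
order = limit_to_budget(lo128_registers(), vgpr_budget=256)
print(order[0], order[-1], len(order))
```

Raising the budget to 384 would make v256-v383 available again, which is the trade-off the commit resolves in favor of occupancy.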
[DTLTO] Fix handling of multi-module bitcode inputs (#174624)
This change fixes two issues when processing multi-module bitcode files
in DTLTO:
1. The DTLTO archive handling code incorrectly uses
getSingleBitcodeModule(), which asserts when the bitcode file contains
more than one module.
2. The temporary file containing the contents of an input archive member
was not emitted for multi-module bitcode files. This was due to
incorrect logic for recording whether a bitcode input contains any
ThinLTO modules. In a typical multi-module bitcode file, the first
module is a ThinLTO module while a subsequent auxiliary module is
non-ThinLTO. When modules are processed in order, the auxiliary module
causes the entire bitcode file to be classified as non-ThinLTO, and the
archive-member emission logic then incorrectly skips it.
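The second issue can be illustrated with a small sketch. This is an illustrative reconstruction of the failure mode described above, not the LLVM code; the module dictionaries and both function names are hypothetical.

```python
def is_thinlto_last_wins(modules):
    """Buggy pattern: every module overwrites the flag, so the last one decides."""
    flag = False
    for module in modules:
        flag = module["thinlto"]
    return flag


def is_thinlto_any(modules):
    """Fixed pattern: the file counts as ThinLTO if any module is ThinLTO."""
    return any(module["thinlto"] for module in modules)


# Typical multi-module file: a ThinLTO module followed by an auxiliary
# non-ThinLTO module.
typical = [
    {"name": "main", "thinlto": True},
    {"name": "aux", "thinlto": False},
]

print(is_thinlto_last_wins(typical))  # archive member would be skipped
print(is_thinlto_any(typical))        # archive member is emitted
```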
In addition, this patch adds a test that verifies that multi-module
bitcode files can be successfully linked with DTLTO. The test reproduces
[2 lines not shown]
[CIR] Make cir.alloca alignment mandatory (#172663)
Fixed a crash in `CIRToLLVMAllocaOpLowering` where `cir.alloca`
operations without an explicit alignment attribute caused failures.
Modified the ODS definition of `cir.alloca` to use
`ConfinedAttr<I64Attr, [IntMinValue<0>]>`. This ensures the attribute is
always present.
Added a regression test in `clang/test/CIR/Lowering/alloca.cir`.
---------
Co-authored-by: Sirui Mu <msrlancern at gmail.com>
[MC/DC] Create dedicated MCDCCondBitmapAddr for each Decision (#125411)
MCDCCondBitmapAddr is moved from `CodeGenFunction` into `MCDCState` and
created for each Decision.
In `maybeCreateMCDCCondBitmap`, allocate bitmaps for all valid Decisions
and emit them ordered by ID, to prevent nondeterminism.
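A minimal sketch of why ordering by ID matters, assuming the Decisions live in a container whose iteration order is not stable. The `Decision` class and `emission_order` helper are hypothetical and only model the ordering step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    id: int
    valid: bool


def emission_order(decisions):
    """Drop invalid Decisions and return the rest sorted by ID."""
    return sorted((d for d in decisions if d.valid), key=lambda d: d.id)


# A set has no guaranteed iteration order, so sorting by ID is what makes
# the emitted sequence reproducible from run to run.
pending = {Decision(2, True), Decision(0, True), Decision(1, False)}
print([d.id for d in emission_order(pending)])
```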
devel/freebsd-git-devtools: Update to 2026-01-05 snapshot
Base commits since last update:
1c8dafe61887 - git-arc: Try to improve documentation
684c762485d3 - git-arc: Try to make patching more useful
Sponsored by: The FreeBSD Foundation
dns/dnsmasq: update to v2.92 + inotify patch
Changelog: https://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2026q1/018380.html
We keep the local patch to enable inotify on FreeBSD 15,
which was only merged after the release but had been in this port
already.
Make it so the pkg-message is printed on new installs and upgrades.
[X86] Lower scalar llvm.clmul intrinsics to PCLMULQDQ (#175189) (#175216)
Add support for lowering scalar llvm.clmul intrinsics (i8/i16/i32/i64)
to the PCLMULQDQ hardware instruction on X86 targets with the PCLMUL
feature, instead of using the default software expansion.
The lowering:
- Extends smaller types to the target's native width (i64 on x86-64, i32
on i686)
- Uses SCALAR_TO_VECTOR to create vectors (v2i64 on x86-64, v4i32 with
bitcast to v2i64 on i686)
- Performs X86ISD::PCLMULQDQ with immediate 0x00
- Extracts the result and truncates back to the original type
i8/i16/i32 CLMUL is enabled on both 32-bit and 64-bit targets. i64
CLMUL/CLMULH is only enabled on 64-bit targets.
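For reference, a software model of carry-less multiplication and of the extend/multiply/truncate shape of the lowering. This is a sketch, not the SelectionDAG code: the function names are hypothetical, and zero-extension is assumed here (the low bits of a carry-less product depend only on the low bits of the operands, so the truncated result does not depend on how the upper bits were filled).

```python
def clmul_full(a, b, width):
    """Full carry-less product of two width-bit values (XOR instead of add)."""
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    result = 0
    for i in range(width):
        if (b >> i) & 1:
            result ^= a << i
    return result


def clmul(a, b, width):
    """Carry-less multiply truncated to width bits."""
    return clmul_full(a, b, width) & ((1 << width) - 1)


def clmul_via_wide(a, b, width, native=64):
    """Mimic the lowering: extend, one native-width multiply, truncate."""
    narrow_mask = (1 << width) - 1
    wide = clmul_full(a & narrow_mask, b & narrow_mask, native)
    low_element = wide & ((1 << native) - 1)  # model of extracting the low element
    return low_element & narrow_mask


print(hex(clmul(0x03, 0x03, 8)))
print(clmul_via_wide(0xB7, 0x5D, 8) == clmul(0xB7, 0x5D, 8))
```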
Also adds ISD::CLMULH i64 support by extracting the upper element from
[2 lines not shown]
NAS-139304 / 26.04 / Convert ALLOWED_BUILTIN_GIDS to frozenset (#18023)
Correcting issues with #17894
The LocalAdminGroups included non-admin groups. Split those groups into
a separate enum class.
Renamed the class to more clearly indicate they are 'builtin' groups.
Changed the ALLOWED_BUILTIN_GIDS set to a frozenset and populate it with
the values of the new enum classes.
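A minimal sketch of the pattern, assuming two enum classes along the lines described above. The class names, member names, and GID values are placeholders, not the ones used in the middleware.

```python
import enum


class BuiltinAdminGroups(enum.IntEnum):
    # Hypothetical member and GID, for illustration only.
    EXAMPLE_ADMINISTRATORS = 544


class BuiltinNonAdminGroups(enum.IntEnum):
    # Hypothetical members and GIDs, for illustration only.
    EXAMPLE_USERS = 545
    EXAMPLE_GUESTS = 546


# A frozenset cannot be modified after creation, so callers cannot
# accidentally widen the set of allowed builtin GIDs at runtime.
ALLOWED_BUILTIN_GIDS = frozenset(
    member.value
    for group_enum in (BuiltinAdminGroups, BuiltinNonAdminGroups)
    for member in group_enum
)

print(545 in ALLOWED_BUILTIN_GIDS)
```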
This passes all CI tests related to 'privilege' and manual targeted
testing.
Add additional LUN health checks to standby_after_start
- Ensure that all expected IQNs and LUNs are present
- Ensure that SCST deems the LUN healthy to add to copy manager
Improve ALUA handling with locked or disabled extents
Previously iscsi.target.active_targets did not return any targets
where any LUNs were either disabled or locked. This prevented the
STANDBY node from offering these targets when ALUA was enabled.
Once this was rectified, improvements to LUN status change handling
were required for both disable/enable and lock/unlock.
To optimize handling of LUN locking, an optional do_reload parameter
was added to iscsi.alua.removed_target_extent.
Ensure standby_fix_cluster_mode does not run too soon
Ensure that we have reached a certain point in standby_after_start
before allowing standby_fix_cluster_mode to run.
For HA systems do not use systemd to start scst
Instead it will be started by vrrp_master and, if ALUA is enabled, by
vrrp_backup. This allows finer control.
[mlir][Tensor] Add rank-reducing slice in generatedSlices (#174248)
When `replaceExtractSliceWithTiledProducer` creates a rank-reducing
slice to handle type mismatches, it should be tracked in
`generatedSlices` so downstream cleanup patterns (like IREE's
FoldExtractSliceOfBroadcast) can process it.
This PR also fixes an infinite loop in getUntiledProducerFromSliceSource
where adding the slice to generatedSlices caused the fusion worklist to
repeatedly try to re-fuse producers already inside the innermost loop;
the fix skips producers that are already inside the innermost loop via
an isProperAncestor check.
Added a lit test (@fuse_through_rank_reducing_slice) demonstrating
correct fusion through rank-reducing slices. Note that demonstrating the
generatedSlices tracking benefit requires a cleanup pattern
(SwapExtractSliceWithFillPatterns) to consume the slice; IREE's full CI
suite (iree-org/iree#23012) validates this works correctly in practice
with patterns like FoldExtractSliceOfBroadcast.
[3 lines not shown]
[LLDB][NativePDB] Introduce PdbAstBuilderClang (#175840)
This changes `PdbAstBuilder` to a language-neutral abstract interface
and moves all of its functionality to the `PdbAstBuilderClang` derived
class.
All Clang-specific methods with external callers are now public methods
on `PdbAstBuilderClang`. `TypeSystemClang` and `UdtRecordCompleter` use
`PdbAstBuilderClang` directly.
Did my best to clean up includes and unused methods.
RFC for context:
https://discourse.llvm.org/t/rfc-lldb-make-pdbastbuilder-language-agnostic/89117
[AMDGPU] Allow allocation of lo128 registers from all banks
We can encode 16-bit operands in a short form for VGPRs [0..127].
When we have 1K registers available we can in fact allocate 4
times more from all 4 banks. That, however, requires an allocatable
class for these operands. While for most instructions the alternative
is only the longer VOP3 form, for V_FMAAMK/FMADAK_F16 it would
simply prohibit the encoding, because these do not have VOP3 forms.
A straightforward solution would be to create a register class
with all registers having bit 8 of the encoding zero, i.e. to
create a register class with holes punched in it: [0-127, 256-383,
512-639, 768-895]. LLVM, however, does not like register classes
with punched holes when they also have subregisters. The cross-product
of all classes explodes and some combinations of a 'class
having a common subreg with another' become impossible. Just
doing so explodes our register info to 4+ GB, which is also uncompilable.
The solution proposed is to define _lo128 RC with contiguous 896
[17 lines not shown]
NAS-139317 / 25.10.2 / Call ha_panic for ix-reboot.service as we do for ix-shutdown (by bmeagherix) (#18029)
Recently in PR #17833 the `ha_panic` script was added and called on
`ExecStop` in `ix-shutdown.service`.
This PR makes a similar change to `ix-reboot.service` for the same
reason.
Original PR: https://github.com/truenas/middleware/pull/18028
Co-authored-by: Brian M <brian.meagher at ixsystems.com>