[CodeGen] Set pseudo probe desc comdat symbol to external for COFF (#176706)
lld-link deduplicates COMDAT sections only when the COMDAT symbol
is external.
[X86AsmBackend] Check fixup value overflow (#176827)
GNU Assembler has generic error checking for overflowed fixup values:
```
y.s:5: Error: value of 8000000000000000 too large for field of 4 bytes at 0000000000000004
```
In contrast, for a long time we have only had an assertion that may fail.
https://reviews.llvm.org/D70652 improved the status by adding an
overflow check for PC-relative fixups, but missed other cases (#116899).
This patch improves the overflow check to resemble GAS.
For `.long x`, GAS accepts `x` if its value is in the range `(-2**32,
2**32)`. This design allows `.long x` to work regardless of signedness.
When a symbol is involved, GAS supports both `.long sym-0xffffffff` and
`.long sym+1`, as well as `.long sym+0xffffffff` and `.long sym-1`.
However, `.long sym+0x100000000` is rejected in favor of `.long sym+0`.
[13 lines not shown]
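The `.long x` range rule described in this message can be sketched as a small check. This is an illustration under stated assumptions, not the code from the patch: `fitsInField` is a hypothetical helper name, and applying the same open interval to other field sizes is an assumption that the visible text does not confirm.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): returns true when `value` fits a
// field of `sizeBytes` bytes under the GAS-style rule quoted above, i.e. when
// it lies in the open interval (-2**bits, 2**bits) with bits = 8 * sizeBytes.
// Assumes 1 <= sizeBytes <= 7 so that 2**bits is representable in int64_t.
constexpr bool fitsInField(std::int64_t value, unsigned sizeBytes) {
  const std::int64_t limit = std::int64_t{1} << (sizeBytes * 8);
  return value > -limit && value < limit;
}
```

With `sizeBytes` set to 4, both `0xffffffff` and `-0xffffffff` are accepted, while `0x100000000` is rejected, which matches the signedness-agnostic behavior described for `.long`.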
[AMDGPU] si-peephole-sdwa: Handle V_PACK_B32_F16_e64 (WIP)
Change si-peephole-sdwa to eliminate V_PACK_B32_F16_e64 instructions
by changing the second operand to write to the upper word of the
destination directly.
[AMDGPU] Enable ISD::{FSIN,FCOS} custom lowering to work on v2f16 (#176382)
Currently ISD::FSIN and ISD::FCOS of type MVT::v2f16 are legalized by
first expanding and then using a custom lowering on the resulting f16
instructions. This ordering prevents using packed math variants of the
instructions introduced by the legalization (e.g. the multiplication) and
makes it difficult to deal with the resulting IR in peephole
optimizations (e.g. si-peephole-sdwa).
Change the legalization action for ISD::FSIN and ISD::FCOS of type
MVT::v2f16 to Custom and change the custom trig lowering to deal
with vectors.
[Offload][Tests] Non-contiguous_update_to_tests (#169623)
PR #144635 enabled non-contiguous updates for both `update from` and
`update to` clauses, but tests for `update to` were missing. This PR
adds those missing tests to ensure coverage.
[update_mc_test_checks] Support --show-inst output
This is useful to check that the correct registers were used in cases
where different register classes use the same name in asm input/output.
Pull Request: https://github.com/llvm/llvm-project/pull/174011
[NFCI][AMDGPU] Use X-macro to reduce boilerplate in `GCNSubtarget.h`
`GCNSubtarget.h` contained a large amount of repetitive code following the pattern `bool HasXXX = false;` for member declarations and `bool hasXXX() const { return HasXXX; }` for getters. This boilerplate made the file unnecessarily long and harder to maintain.
This patch introduces an X-macro pattern `GCN_SUBTARGET_HAS_FEATURE` that consolidates 129 simple subtarget features into a single list. The macro is expanded twice: once in the protected section to generate member variable declarations, and once in the public section to generate the corresponding getter methods. This reduces the file by approximately 265 lines while preserving the exact same API and functionality. Features with complex getter logic or inconsistent naming conventions are left as manual implementations for future improvement.
Ideally, these could be generated by TableGen using `GET_SUBTARGETINFO_MACRO`, similar to the X86 backend. However, `AMDGPU.td` has several issues that prevent direct adoption: duplicate field names (e.g., `DumpCode` is set by both `FeatureDumpCode` and `FeatureDumpCodeLower`), and inconsistent naming conventions where many features don't have the `Has` prefix (e.g., `FlatAddressSpace`, `GFX10Insts`, `FP64`). Fixing these issues would require renaming fields in `AMDGPU.td` and updating all references, which is left for future work.
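The double expansion described in this message can be sketched as follows. This is a minimal illustration of the X-macro technique, not the contents of `GCNSubtarget.h`: the list macro `DEMO_HAS_FEATURE_LIST`, the class `DemoSubtarget`, the feature names, and the `enableWideBar` setter are all hypothetical.

```cpp
// Hypothetical feature list; each X(Name) entry stands for one boolean feature.
#define DEMO_HAS_FEATURE_LIST(X)                                               \
  X(FastFoo)                                                                   \
  X(WideBar)                                                                   \
  X(PackedBaz)

class DemoSubtarget {
protected:
  // First expansion: one `bool HasXXX = false;` member per list entry.
#define DEMO_DECL_MEMBER(Name) bool Has##Name = false;
  DEMO_HAS_FEATURE_LIST(DEMO_DECL_MEMBER)
#undef DEMO_DECL_MEMBER

public:
  // Second expansion: one `bool hasXXX() const` getter per list entry.
#define DEMO_DECL_GETTER(Name)                                                 \
  bool has##Name() const { return Has##Name; }
  DEMO_HAS_FEATURE_LIST(DEMO_DECL_GETTER)
#undef DEMO_DECL_GETTER

  // Illustration-only setter so the example can flip one feature.
  void enableWideBar() { HasWideBar = true; }
};
```

Adding a feature then means adding a single `X(...)` line to the list, and the member and its getter cannot drift apart in spelling.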
[AMDGPU] Pre-commit test for WMMA NOP hoisting optimization (#176745)
Add test showing current behavior where V_NOP instructions for WMMA
coexecution hazards are inserted inside loop bodies at the use-site. A
future patch will hoist these NOPs to loop preheaders to reduce
redundant execution.
---------
Co-authored-by: Christudasan Devadasan <christudasan.devadasan at amd.com>
[Linalg] Support i1 data type in matchConvolutionOpOfType utility (#176704)
-- Extend bodyMatcherForConvolutionOps to recognize arith.ori/arith.andi
for i1 element types (in addition to add/mul for integer/float types)
for accumulation and multiplication.
-- Similarly, extend bodyMatcherForSumPoolOps to recognize arith.ori for
i1 accumulation (in addition to add for integer/float types).
Signed-off-by: Abhishek Varma <abhvarma at amd.com>
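The idea behind treating `arith.andi` as multiplication and `arith.ori` as accumulation for i1 can be illustrated with plain booleans. The sketch below is not the MLIR matcher; `convolve1DBool` is a hypothetical function that applies a 1-D sliding window without padding.

```cpp
#include <cstddef>
#include <vector>

// Illustration only (not the MLIR matcher): a 1-D sliding-window convolution
// over i1-like values, where AND plays the role of multiplication and OR
// plays the role of accumulation.
std::vector<bool> convolve1DBool(const std::vector<bool> &input,
                                 const std::vector<bool> &filter) {
  if (filter.empty() || input.size() < filter.size())
    return {};
  std::vector<bool> output(input.size() - filter.size() + 1, false);
  for (std::size_t i = 0; i < output.size(); ++i) {
    bool acc = false;
    for (std::size_t k = 0; k < filter.size(); ++k)
      acc = acc || (input[i + k] && filter[k]); // ori(acc, andi(in, flt))
    output[i] = acc;
  }
  return output;
}
```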
[clang][bytecode] Handle corner condition for sign negation (#176390)
RHS = -RHS works in most cases; however, the behaviour is undefined when
RHS is INTXX_MIN. In that case, we should use INTXX_MAX instead.
Fixes #176271.
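The corner case can be sketched with a small helper. This is an illustration of the substitution described above, not the bytecode interpreter's actual code; `negateOrMax` is a hypothetical name.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper (not the interpreter's code): negate a signed integer,
// substituting the maximum when the input is the minimum, because -min is
// not representable and evaluating it is undefined behaviour.
template <typename T>
constexpr T negateOrMax(T value) {
  static_assert(std::is_integral<T>::value && std::is_signed<T>::value,
                "signed integer types only");
  if (value == std::numeric_limits<T>::min())
    return std::numeric_limits<T>::max();
  return static_cast<T>(-value);
}
```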
Attach stat contexts to tx and rx rings, and prepare for the bigger stat buffer size
used on newer hardware generations.
tested by stsp@ as part of a bigger diff
ok dlg@
lualoader: fix pruning of non-existent default kernel
Removing the kernel from the list of available kernels is sufficient to
avoid rendering it in the list, but we need more for booting to actually
work. Notably, the 'kernel' loader.conf var was left at its
default value, so if one didn't use the kernel selector in the menu then
we'd try to boot the nonexistent 'kernel' instead of the new default
(first autodetected).
There's room to improve the error messages here, but for now let's just
make it actually work correctly.
PR: 292232
Fixes: d04415c520b03 ("loader: lua: remove the default kernel [...]")
(cherry picked from commit e30086ab4c8778ea70a3b19e83546ce1b4a16492)
bectl: log modifying functions to zpool history
Modeled directly after the method used by the zfs/zpool commands: flag
commands with a "please log me" flag, and when there, reconstruct the
command line. On success, call the library function to add it to the
log.
(Majority of the change by Rob; minor edits by kevans@)
Signed-off-by: Rob Norris <rob.norris at klarasystems.com>
Co-authored-by: Kyle Evans <kevans at FreeBSD.org>
Sponsored by: Modirum MDPay
Sponsored by: Klara, Inc.
(cherry picked from commit 2a87929671e6e6919c18f2c25d60f2c73c3d18f4)
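The flow described above (flag a command, reconstruct its command line, log it on success) can be sketched roughly as follows. Every name here is hypothetical: `Command`, `reconstructCommandLine`, `dispatch`, and the `addToHistory` callback, which stands in for the real library call, are not taken from bectl.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical command descriptor: `logToHistory` is the "please log me" flag.
struct Command {
  std::string name;
  bool logToHistory;
  std::function<int(const std::vector<std::string> &)> run; // 0 on success
};

// Rebuild the command line from its arguments, separated by single spaces.
std::string reconstructCommandLine(const std::vector<std::string> &args) {
  std::string line;
  for (const std::string &arg : args) {
    if (!line.empty())
      line += ' ';
    line += arg;
  }
  return line;
}

// Run a command and, if it is flagged and succeeded, hand the reconstructed
// command line to `addToHistory` (a stand-in for the real library call).
int dispatch(const Command &cmd, const std::vector<std::string> &args,
             const std::function<void(const std::string &)> &addToHistory) {
  const int rc = cmd.run(args);
  if (rc == 0 && cmd.logToHistory)
    addToHistory(reconstructCommandLine(args));
  return rc;
}
```

Keeping the flag in the command table means read-only commands never reach the logging path, and failed commands leave no history entry.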
makedev(9): drop an additional note about cdevpriv dtors
It was previously somewhat safe for these to call destroy_dev(9), but
doing so will now also cause a deadlock, in the same fashion that calling
it from d_close already would. Amend the note to point this out, in case
it's useful for someone.
Reviewed by: imp, kib, markj
(cherry picked from commit 90314c04f10f583c37c59ec51fd628e3deaf3622)
libc: report _SC_NPROCESSORS_ONLN more accurately in cpu-limited jails
We don't support CPU hotplug, but we do support cpuset(8) restrictions
on jails (including prison0, which uses cpuset 1). The process cannot
widen its cpuset beyond its root set, so it makes sense to instead
report the number of cpus enabled there rather than the total number
in the system.
This change is effectively a nop for the majority of systems and jails
in the wild, though it does reduce the performance of this query now
that we can't take advantage of AT_NCPUS being provided in the auxinfo.
The implementation here is notably different from Linux, which does not
take cgroups into account. Linux does, however, take CPU hotplug into
account, so the value can already diverge from (and be lower than) the
configured count there. Reporting what the process can actually be
scheduled on therefore doesn't really diverge in semantics.
Reviewed by: kib
[2 lines not shown]
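For reference, the query discussed in this message is made through sysconf(3). The wrappers below are a minimal sketch; the function names are hypothetical, and `_SC_NPROCESSORS_ONLN` and `_SC_NPROCESSORS_CONF` are widely available extensions rather than strict POSIX, so this assumes a platform such as FreeBSD or Linux that defines them.

```cpp
#include <unistd.h>

// Number of CPUs currently reported as online, or -1 on error. Per the
// message above, on FreeBSD this is the value the change makes reflect the
// process's root cpuset.
long onlineCpuCount() { return sysconf(_SC_NPROCESSORS_ONLN); }

// Number of CPUs configured in the system, or -1 on error.
long configuredCpuCount() { return sysconf(_SC_NPROCESSORS_CONF); }
```

A caller sizing a thread pool would typically use `onlineCpuCount()`, since that value is meant to track what the process can actually be scheduled on.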
libutil: defer setting the MAC label until after the login class
MAC policies, like mac_biba(4), may forbid changing the login class once
a label has been applied. For setting up the initial login context,
this isn't really expected and in fact may break some class-based
configuration.
Defer setting the MAC label until after the login class is set, and
remove the requirement that we have a pwd entry since the label is
pulled from the login class -- we only use pwd for syslog in this path.
Patch is largely by Kevin Barry, with some modifications and this commit
message by kevans@.
PR: 177698
Reviewed by: des, olce
Co-authored-by: Kevin Barry <ta0kira gmail com>
(cherry picked from commit 98edcbcce0a4650084bd86e704cfa38bf590250c)
libc: fix description issues in mac_text(3)/mac_free(3)
mac_text(3) as-written would seem to indicate that a `mac_t` should be
freed with free(3), but this isn't the case. From the context in which
the change was introduced, and from COMPATIBILITY, one can infer that
this was intended to describe *text in `mac_to_text`, so move the comment
there.
PR: 179832
Co-authored-by: Priit Järv <priit cc ttu ee>
(cherry picked from commit 081218b7a2006e5b6783e51f66fd751871ac1272)