sched_ule: Fix off by one in preempt_thresh definition
Since 'preempt_thresh' is set to PRI_MIN_KERN by default, and the
considered thread's priority is compared with that threshold using
'<=', PRI_MIN_KERN threads can actually preempt other threads, unlike
the other non-interrupt kernel threads (between PRI_MIN_KERN + 1 and
PRI_MAX_KERN).
So, replace the comparison operator '<=' by '<'. The alternative would
be to change the default value, but changing the comparison instead has
the benefit of being consistent with the 0 setting (which forbids
preemption entirely), since it becomes possible to allow only threads
with priority 0 to preempt (by setting the threshold to 1).
Consequently, we also change the default value for the FULL_PREEMPTION
option by adding 1 to PRI_MAX_IDLE (in practice, that does not make any
difference in the current setting, since no preemption will happen if
the new priority value is not strictly lower than the current one, and
PRI_MAX_IDLE is PRI_MAX, the highest possible priority).
[8 lines not shown]
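The effect of swapping '<=' for '<' can be modelled in a few lines. This
is only a sketch: the function names are hypothetical, the numeric value
used for PRI_MIN_KERN is an arbitrary stand-in, and every other condition
the real scheduler checks is ignored. Lower numbers mean higher priority.

```python
# Arbitrary stand-in value; the real constant is defined by the kernel.
PRI_MIN_KERN = 48


def may_preempt_old(pri, cur_pri, thresh):
    """Model of the old check: threshold compared with '<='."""
    if thresh == 0 or pri >= cur_pri:
        return False
    return pri <= thresh


def may_preempt_new(pri, cur_pri, thresh):
    """Model of the new check: threshold compared with '<'.

    With non-negative priorities, a threshold of 0 forbids preemption
    without a special case, and a threshold of 1 admits only priority 0.
    """
    return pri < cur_pri and pri < thresh
```

With the default threshold, a PRI_MIN_KERN thread preempts in the old
model but not in the new one, which is the off-by-one described above.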
nfs_commonkrpc.c: Improve handling of NFSv4.1/4.2 recovery
Commit 4d80d4913e79 fixed a long-standing bug in the recovery
code. However, glebius@ reported seeing multiple
recovery cycles with this patch during an NFSv4.1/4.2
server reboot.
This commit should minimize the risk of multiple
recovery cycles.
PR: 294925
(cherry picked from commit ea4886f2829bf33866c8c0c60b14a9641fc54b40)
[X86][APX] Implement push+push2+push pre-alignment strategy for PP2 (#205031)
Replace the dummy "push %rax" stack-alignment padding for APX push2/pop2
(PP2) with a push+push2+push strategy: when an even number of callee-saved
GPRs is involved, a single CSR push provides the 16-byte alignment instead
of a throwaway push %rax, and the remaining registers use push2/pop2. The
padForPush2Pop2 flag and its associated dummy push, SUB/LEA padding, and
SEH_StackAlloc emission in spill/restoreCalleeSavedRegisters are removed.
BuildStackAdjustment now uses NF (no-flags) variants of ADD/SUB, but
only as a smaller replacement for LEA, i.e. only when EFLAGS must be preserved
across the adjustment. When EFLAGS is dead the plain SUB/ADD is kept, which is
shorter than the EVEX-encoded NF form. The NF opcodes are 64-bit
(SUB64ri32_NF/ADD64ri32_NF), so they are not used for the x32 ABI, and
they are recognized in mergeSPUpdates and the epilogue backward scan.
Update LIT tests accordingly.
Assisted-by: Claude Opus 4.8 (1M context) <noreply at anthropic.com>
nfs: Fix argument typo to avoid a crash
A typo resulted in the wrong argument for a bytewise
comparison that could result in a crash if
the incorrect argument was not a valid pointer.
This patch fixes the argument.
While investigating this, I noticed that the
correct argument was not being filled in as
required, so this patch fixes that, as well.
Somehow, recovery from a NFSv4.1/4.2 server
crash worked during testing, so this was not
detected. The bug/patch only affects NFS
client mounts using NFSv4.1/4.2.
PR: 294925
(cherry picked from commit 4d80d4913e79c8b5918b1f04c1c7b38e6c76b9b4)
[VectorCombine] Fold zero tests of or/umax reductions (#205622)
Recognize equality and inequality tests against zero on vector.reduce.or
and vector.reduce.umax. When profitable, replace the scalar reduction
and compare with a lane-wise comparison followed by an i1 reduce.or or
reduce.and.
Run the existing zero-preserving reduction fold first to retain its more
specific canonicalization opportunities.
Proof: https://alive2.llvm.org/ce/z/pyoTwP
Fixes https://github.com/llvm/llvm-project/issues/205028
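The equivalence behind the fold can be checked on small inputs with a
scalar model. This is a sketch, not the pass itself: lanes are modelled
as non-negative Python integers, the helper names are hypothetical, and
the pairing of "== 0" with reduce.and and "!= 0" with reduce.or is an
assumption inferred from the description above.

```python
from functools import reduce
from itertools import product
from operator import or_


def reduce_or(lanes):
    """Scalar model of vector.reduce.or."""
    return reduce(or_, lanes, 0)


def reduce_umax(lanes):
    """Scalar model of vector.reduce.umax on unsigned lanes."""
    return max(lanes)


def all_lanes_zero(lanes):
    """Lane-wise 'x == 0' followed by an i1 reduce.and."""
    return all(x == 0 for x in lanes)


def any_lane_nonzero(lanes):
    """Lane-wise 'x != 0' followed by an i1 reduce.or."""
    return any(x != 0 for x in lanes)


def check_equivalence(num_lanes=3, max_value=3):
    """Exhaustively compare both forms on small unsigned vectors."""
    for lanes in product(range(max_value + 1), repeat=num_lanes):
        for red in (reduce_or, reduce_umax):
            if (red(lanes) == 0) != all_lanes_zero(lanes):
                return False
            if (red(lanes) != 0) != any_lane_nonzero(lanes):
                return False
    return True
```

The linked Alive2 proof is the authoritative argument; the exhaustive
check here only illustrates why the rewrite is sound for unsigned lanes.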
[Instrumentor] Add runtime examples: [1/N] A flop counter (#205698)
This adds an instrumentor-tools folder to compiler-rt to showcase use
cases of the instrumentor. The initial example is a program that, via
instrumentation, counts the number of flops performed. Call and
intrinsic support will follow after #198042.
This is the second try with more CMake magic after
https://github.com/llvm/llvm-project/pull/205221 failed on some
platforms.
Partially developed by Claude (AI), tested and verified by me.
build.7: explain how to build KBI-compatible standalone module
Reviewed by: imp, kevans
Sponsored by: The FreeBSD Foundation
MFC after: 3 days
Differential revision: https://reviews.freebsd.org/D57859
vmd(8): prevent OOB reads in 32 and 64-bit ELF loaders.
Malformed ELF files could cause reads past the section headers.
For ELF64 files, malformed section metadata could cause out-of-bounds
reads of heap-allocated buffers.
Reported by Frank Denis.
Discussed with and "go for it" from mlarkin@
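A generic version of this kind of check can be sketched as follows. It
is not the vmd(8) code: the function names are hypothetical, only the
little-endian ELF64 header layout is handled, and the check covers just
the section header table extent. Python integers cannot overflow, so a
C implementation would also need explicit overflow checks.

```python
import struct

# Little-endian Elf64_Ehdr: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")
ELF64_SHDR_SIZE = 64  # size of one Elf64_Shdr entry


def section_headers_in_bounds(image):
    """Return True if the section header table lies inside the image."""
    if len(image) < ELF64_EHDR.size:
        return False
    fields = ELF64_EHDR.unpack_from(image)
    e_shoff, e_shentsize, e_shnum = fields[6], fields[11], fields[12]
    if e_shnum == 0:
        return True  # no section headers to read
    if e_shentsize < ELF64_SHDR_SIZE:
        return False  # entries too small to hold an Elf64_Shdr
    return e_shoff + e_shentsize * e_shnum <= len(image)


def make_header(e_shoff, e_shentsize, e_shnum, total_size):
    """Build a zero-padded test image with the given section metadata."""
    header = ELF64_EHDR.pack(b"\x7fELF" + bytes(12), 0, 0, 0, 0, 0,
                             e_shoff, 0, 64, 0, 0,
                             e_shentsize, e_shnum, 0)
    return header.ljust(total_size, b"\0")
```

A loader would run such a check before indexing into the buffer, and
reject the file if it fails.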
[flang][openacc] add acc.routine op for external names added in bind clauses. (#205591)
This adds acc.routine ops for the func.func ops that declare the
external functions named in device-specific bind clauses. This is needed
for the ACCRoutineToGPUFunc pass to move the function declaration into
the correct region.
This is a follow-up from
[#203088](https://github.com/llvm/llvm-project/pull/203088) which
unblocked the original pass that was stalling bind clauses, but failed
further down the pipeline.
[CIR] Implement Direct+canFlatten in CallConvLowering
ArgKind::Direct with a multi-field coerced struct and the canFlatten flag
means the coerced struct is passed as one scalar wire argument per field.
CallConvLowering was passing it as a single aggregate, ignoring canFlatten.
A new getFlattenedCoercedType helper recognizes the Direct+canFlatten arg
shape. At the callee, insertArgCoercion replaces the single block argument
with N scalar block args, stores each into an alloca of the coerced struct
type, reloads it, and coerces back to the original argument type when the
coerced struct type differs from the original. The Ignore-drop loop and
updateArgAttrs account for the N block-argument slots a flattened arg
occupies; updateArgAttrs also shapes them on the sret return path.
At the call site, when the operand type differs from the coerced struct
type the operand is coerced through a memory slot and each field is read
from that slot with cir.get_member + cir.load (via a new emitCoercionToMemory
helper that returns the coerce-slot pointer without loading the whole
aggregate); when the types already match each field is extracted directly
[7 lines not shown]
Revert "[libc++] P3798R1: The unexpected in std::expected (#204826)" (#205597)
Reverts 45a65bb48b5925707f43d08e30df2263a5e4e268.
Currently, there is no consensus among LWG and standard library
maintainers that P3798R1 should be applied as a Defect Report. So it is
better to revert the paper application for now and then reapply it as an
addition in C++29 when C++29 mode is ready.
[llvm][GVNSink] Avoid non-deterministic iteration order over NeededPHIs
The iteration order of DenseSet is not guaranteed, which affects the
output of code generated with GVNSink enabled. This can cause code to be
emitted in a different order, affect section ordering, and in some cases
has been reported to result in larger binaries due to increased
padding between sections.
This patch addresses this by using SetVector, which has a deterministic
iteration order.
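The property being relied on is that a set-vector remembers insertion
order while still rejecting duplicates. A minimal model of that idea is
sketched below; the class is an illustrative stand-in written for this
note, not LLVM's implementation.

```python
class SetVector:
    """Set with deterministic (insertion-order) iteration."""

    def __init__(self):
        self._items = []     # preserves insertion order
        self._seen = set()   # gives fast membership tests

    def insert(self, item):
        """Insert item; return False if it was already present."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self._items.append(item)
        return True

    def __contains__(self, item):
        return item in self._seen

    def __iter__(self):
        return iter(self._items)

    def __len__(self):
        return len(self._items)


def collect(names):
    """Insert names in order and return the resulting iteration order."""
    needed = SetVector()
    for name in names:
        needed.insert(name)
    return list(needed)
```

Iterating such a container visits elements in the order they were first
inserted on every run, so anything derived from the traversal is
reproducible.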
[flang][semantics][OpenACC] Warn for DEFAULT(NONE) scalars by default (#205683)
Change OpenACC `DEFAULT(NONE)` scalar handling to use the
pre-OpenACC-3.2 scalar behavior by default while emitting a warning.
Scalars referenced in a `default(none)` compute region without an
explicit data clause now warn by default instead of erroring. Arrays and
other non-scalars still error under `default(none)`.
Users can opt into OpenACC 3.2 strict scalar behavior with
`-fopenacc-default-none-scalars-strict`, and the default scalar warning
can be suppressed with `-Wno-openacc-default-none-scalars-strict`.
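The resulting diagnostic policy can be summarised as a small decision
function. This is a sketch of the behavior described above, not flang
code: the function and parameter names are hypothetical, and the two
booleans stand in for the strict flag and the warning suppression.

```python
def default_none_diagnostic(is_scalar, strict=False, warning_enabled=True):
    """Diagnostic for a variable referenced under default(none)
    without an explicit data clause.

    Returns "error", "warning", or None (no diagnostic).
    """
    if not is_scalar:
        return "error"    # arrays and other non-scalars still error
    if strict:
        return "error"    # opt-in OpenACC 3.2 strict scalar behavior
    if warning_enabled:
        return "warning"  # new default for scalars
    return None           # warning suppressed
```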
Add nicer dark and light colour sets (themes) used on terminals with 256
or more colours. These are currently based on emacs but could change.
Terminals with fewer colours use the ANSI colours. A new "theme" option
overrides the detected theme (set to "terminal" to go back to ANSI
colours).
libusb: don't treat EINVAL from USB_FS_COMPLETE as device detach
ugen20_process() treats any non-EBUSY errno returned by USB_FS_COMPLETE
as device detach and returns LIBUSB20_ERROR_OTHER. This causes libusb10
to set device_is_gone and fail all subsequent transfers with
LIBUSB_ERROR_NO_DEVICE.
However, USB_FS_COMPLETE can also return EINVAL when a completion
references an endpoint that no longer exists, for example after
SET_INTERFACE or SET_CONFIG removes and recreates endpoints. This is a
transient condition and does not indicate device detach.
Treat EINVAL the same as EBUSY and stop draining completions. This
prevents a guest selecting an isochronous streaming altsetting from
permanently breaking the passed-through device.
Reviewed by: bapt
Event: Halifax Hackathon 202606
Location: Peggy's Cove Rock
[2 lines not shown]
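The change in errno handling can be modelled as a small classifier. This
is a sketch only: the function names and the returned strings are
hypothetical, and the real logic lives in the C function
ugen20_process().

```python
import errno


def classify_old(err):
    """Old behavior: every errno except EBUSY is taken as a detach."""
    if err == errno.EBUSY:
        return "stop_draining"
    return "device_gone"


def classify_new(err):
    """New behavior: EINVAL is transient and handled like EBUSY."""
    if err in (errno.EBUSY, errno.EINVAL):
        return "stop_draining"
    return "device_gone"
```

Under the new rule an EINVAL from a stale endpoint ends the completion
loop without marking the device as gone, while other errors still do.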
[CIR] Wire const goto labels into indirect branch (#201644)
A computed goto through a constant dispatch table -- the GNU static
dispatch-table idiom `static const void *tbl[] = {&&L1, &&L2}; goto *tbl[i];`
-- reached `errorNYI("Indirect goto without a goto block")` in
`emitIndirectGotoStmt`. #203644 emits the label-address constant (the
value-like `#cir.block_addr_info`) into the table, but it takes a label's
address in a constant context without registering the label as address-taken,
so no indirect-goto block exists for the following `goto *tbl[i]` to branch to.
(#203644 landed the constant attribute, its lowering, and the GotoSolver label
retention; this is the remaining dispatch wiring.)
`VisitAddrLabelExpr` in the constant emitter now records each label via
`takeAddressOfConstantLabel`, which instantiates the indirect-goto block and
tracks the label; `finishIndirectBranch` then adds those labels as
`cir.indirect_br` successors alongside the existing op-form labels. A label
named more than once in a table is kept as a distinct successor each time, to
match classic codegen.
[8 lines not shown]
Revert "[Clang] Optionally use NewPM to run CodeGen Pipeline" (#205943)
Reverts llvm/llvm-project#205928
This is missing dependencies in a shared-libraries build. Will
investigate offline.
[SLP]Fix crash erasing reduced value extract still used by reduction
A reduced value vectorized in an operand subtree is replaced by an
extractelement that can be excluded from another reduction group's
candidates as incompatible, yet it is still consumed by the final
reduction. Keep such excluded extracts externally used so they are not
erased while vectorizing that group.
Fixes #205886
Reviewers:
Pull Request: https://github.com/llvm/llvm-project/pull/205942