Linux/linux adc1e5carch/x86/platform/efi quirks.c, drivers/firmware/efi/libstub relocate.c mem.c

Merge tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi

Pull EFI fixes from Ard Biesheuvel:

 - Fix issues in EFI graceful recovery on x86 introduced by changes to
   the kernel mode FPU APIs

 - I-cache coherency fixes for the LoongArch EFI stub

 - Locking fix for EFI pstore

 - Code tweak for efivarfs

* tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efi: Restore IRQ state in EFI page fault handler
  x86/efi: Fix graceful fault handling after FPU softirq changes
  efi/libstub: Synchronize instruction cache after kernel relocation
  efi/loongarch: Implement efi_cache_sync_image()
  efi/libstub: Move efi_relocate_kernel() into its only remaining user

    [2 lines not shown]
DeltaFile
+0-166drivers/firmware/efi/libstub/relocate.c
+82-0drivers/firmware/efi/libstub/mem.c
+80-0drivers/firmware/efi/libstub/loongarch-stub.c
+11-2arch/x86/platform/efi/quirks.c
+0-7drivers/firmware/efi/libstub/efistub.h
+1-4fs/efivarfs/super.c
+174-1795 files not shown
+186-18311 files

Linux/linux e809480arch/loongarch/include/asm asm-prototypes.h, arch/loongarch/include/asm/vdso gettimeofday.h

Merge tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson

Pull LoongArch fixes from Huacai Chen:
 "Fix some build and runtime issues after 32BIT Kconfig option enabled,
  improve the platform-specific PCI controller compatibility, drop
  custom __arch_vdso_hres_capable(), and fix a lot of KVM bugs"

* tag 'loongarch-fixes-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
  LoongArch: KVM: Move unconditional delay into timer clear scenery
  LoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software
  LoongArch: KVM: Move AVEC interrupt injection into switch loop
  LoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte()
  LoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read()
  LoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS
  LoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry
  LoongArch: KVM: Compile switch.S directly into the kernel
  LoongArch: vDSO: Drop custom __arch_vdso_hres_capable()
  LoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang()
  LoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup

    [3 lines not shown]
DeltaFile
+3-32arch/loongarch/kvm/main.c
+23-11arch/loongarch/kvm/interrupt.c
+16-6arch/loongarch/kvm/switch.S
+20-0arch/loongarch/include/asm/asm-prototypes.h
+8-2arch/loongarch/kvm/timer.c
+0-6arch/loongarch/include/asm/vdso/gettimeofday.h
+70-5712 files not shown
+90-6518 files

Linux/linux 74fe02cinclude/linux workqueue.h, kernel workqueue.c

Merge tag 'wq-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq

Pull workqueue fixes from Tejun Heo:

 - Fix devm_alloc_workqueue() passing a va_list as a positional arg to
   the variadic alloc_workqueue() macro, which garbled wq->name and
   skipped lockdep init on the devm path. Fold both noprof entry points
   onto a va_list helper.

   Also, annotate it using __printf(1, 0)
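
A minimal user-space sketch of the pattern this fix describes: two variadic entry points that both forward to a single va_list helper, instead of one variadic function passing its va_list to another variadic function. All names here (demo_wq, demo_wq_init_va() and friends) are hypothetical and are not the kernel's workqueue API; the format attribute is the GCC/Clang spelling of what the kernel wraps as __printf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#if defined(__GNUC__)
#define DEMO_PRINTF(fmt_idx, arg_idx) __attribute__((format(printf, fmt_idx, arg_idx)))
#else
#define DEMO_PRINTF(fmt_idx, arg_idx)
#endif

/* Hypothetical stand-in for a workqueue; only the name matters here. */
struct demo_wq {
	char name[32];
};

/*
 * va_list helper: the single place that formats the name.
 * An argument index of 0 tells the compiler to check the format string
 * even though the arguments arrive as a va_list.
 */
DEMO_PRINTF(2, 0)
void demo_wq_init_va(struct demo_wq *wq, const char *fmt, va_list args)
{
	assert(wq && fmt);
	vsnprintf(wq->name, sizeof(wq->name), fmt, args);
}

/* Plain variadic entry point: collect the arguments, call the helper. */
DEMO_PRINTF(2, 3)
void demo_wq_init(struct demo_wq *wq, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	demo_wq_init_va(wq, fmt, args);
	va_end(args);
}

/* Second entry point (think of a managed variant) using the same helper. */
DEMO_PRINTF(2, 3)
void demo_wq_init_managed(struct demo_wq *wq, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	/*
	 * Calling demo_wq_init(wq, fmt, args) here would be the bug:
	 * the va_list object would be consumed as the first format argument.
	 */
	demo_wq_init_va(wq, fmt, args);
	va_end(args);
}
```

With this layout both entry points produce the same name for the same format and arguments, because neither one re-enters a variadic function.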

* tag 'wq-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Annotate alloc_workqueue_va() with __printf(1, 0)
  workqueue: fix devm_alloc_workqueue() va_list misuse
DeltaFile
+20-9kernel/workqueue.c
+4-2include/linux/workqueue.h
+24-112 files

Linux/linux 11f0007Documentation/admin-guide/cgroup-v1 memcg_test.rst, include/linux cgroup-defs.h

Merge tag 'cgroup-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

Pull cgroup fixes from Tejun Heo:

 - During v6.19, cgroup task unlink was moved from do_exit() to after the
   final task switch to satisfy a controller invariant. That left the kernel
   seeing tasks past exit_signals() longer than userspace expected, and
   several v7.0 follow-ups tried to bridge the gap by making rmdir wait for
   the kernel side. None held up.

   The latest is an A-A deadlock when rmdir is invoked by the reaper of
   zombies whose pidns teardown the rmdir itself is waiting on, which
   points at the synchronizing approach being fundamentally wrong.

   Take a different approach: drop the wait, leave rmdir's user-visible
   side returning as soon as cgroup.procs is empty, and defer the css
   percpu_ref kill that drives ->css_offline() until the cgroup is fully
   depopulated.


    [16 lines not shown]
DeltaFile
+114-130kernel/cgroup/cgroup.c
+2-4Documentation/admin-guide/cgroup-v1/memcg_test.rst
+2-2include/linux/cgroup-defs.h
+118-1363 files

Linux/linux de95ad9include/linux cgroup.h, include/linux/sched ext.h

Merge tag 'sched_ext-for-7.1-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext

Pull sched_ext fixes from Tejun Heo:

 - Fix idle CPU selection returning prev_cpu outside the task's cpus_ptr
   when the BPF caller's allowed mask was wider. Stable backport.

 - Two opposite-direction gaps in scx_task_iter's cgroup-scoped mode
   versus the global mode:

    - Tasks past exit_signals() are filtered by the cgroup walk but kept
      by global. Sub-scheduler enable abort leaked __scx_init_task()
      state. Add a CSS_TASK_ITER_WITH_DEAD flag to cgroup's task
      iterator (scx_task_iter is its only user) and use it.

    - Tasks past sched_ext_dead() are still returned, tripping
      WARN_ON_ONCE() in callers or making them touch torn-down state.
      Mark and skip under the per-task rq lock.


    [4 lines not shown]
DeltaFile
+30-11kernel/sched/ext.c
+6-6kernel/sched/ext_idle.c
+5-3kernel/cgroup/cgroup.c
+1-0include/linux/cgroup.h
+1-0include/linux/sched/ext.h
+43-205 files

Linux/linux 50fb0bcdrivers/scsi/device_handler scsi_dh_alua.c, drivers/scsi/hisi_sas hisi_sas_v3_hw.c

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "All in drivers.

  The largest change is the ufs one which has to introduce a new
  function to check the power state before doing the update and the most
  widely encountered one is the obvious change to sg to not use
  GFP_ATOMIC"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: target: iscsi: reject invalid size Extended CDB AHS
  scsi: ufs: core: Fix bRefClkFreq write failure in HS-LSS mode
  scsi: hisi_sas: Fix sparse warnings in prep_ata_v3_hw()
  scsi: pmcraid: Fix typo in comments
  scsi: scsi_dh_alua: Increase default ALUA timeout to maximum spec value
  scsi: smartpqi: Silence a recursive lock warning
  scsi: mpt3sas: Limit NVMe request size to 2 MiB
  scsi: sg: Don't use GFP_ATOMIC in sg_start_req()
  scsi: target: configfs: Bound snprintf() return in tg_pt_gp_members_show()
DeltaFile
+28-2drivers/ufs/core/ufshcd.c
+18-4drivers/target/iscsi/iscsi_target.c
+13-1drivers/scsi/mpt3sas/mpt3sas_scsih.c
+5-0include/ufs/unipro.h
+1-1drivers/scsi/hisi_sas/hisi_sas_v3_hw.c
+1-1drivers/scsi/device_handler/scsi_dh_alua.c
+66-94 files not shown
+70-1210 files

Linux/linux 13ad98edrivers/video/fbdev udlfb.c, drivers/video/fbdev/core fb_defio.c

Merge tag 'fbdev-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev

Pull fbdev fixes from Helge Deller:
 "Four small patches for fbdev, of which two are important: One fixes
  the bitmap font generation and the other prevents a possible
  use-after-free in udlfb:

   - Fix rotating fonts by 180 degrees (Thomas Zimmermann)

   - Drop duplicate include of linux/module.h in fb_defio (Chen Ni)

   - Add vm_ops in udlfb to prevent use-after-free (Rajat Gupta)

   - ipu-v3: clean up kernel-doc warnings (Randy Dunlap)"

* tag 'fbdev-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/linux-fbdev:
  fbdev: udlfb: add vm_ops to dlfb_ops_mmap to prevent use-after-free
  lib/fonts: Fix bit position when rotating by 180 degrees
  fbdev: defio: Remove duplicate include of linux/module.h
  fbdev: ipu-v3: clean up kernel-doc warnings
DeltaFile
+30-1drivers/video/fbdev/udlfb.c
+11-5include/video/imx-ipu-image-convert.h
+1-1lib/fonts/font_rotate.c
+1-0include/video/udlfb.h
+0-1drivers/video/fbdev/core/fb_defio.c
+43-85 files

Linux/linux 9207d47drivers/infiniband/hw/hns hns_roce_qp.c hns_roce_srq.c, drivers/infiniband/hw/mana qp.c

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:

 - Several error unwind misses on system calls in mlx5, mana, ocrdma,
   vmw_pvrdma, mlx4, and hns

 - More rxe bugs processing network packets

 - User triggerable races in mlx5 when destroying and creating the
   same object when the FW returns the same object ID

 - Incorrect passing of an IPv6 address through netlink
   RDMA_NL_LS_OP_IP_RESOLVE

 - Add memory ordering for mlx5's lock avoidance pattern

 - Protect mana from kernel memory overflow


    [24 lines not shown]
DeltaFile
+10-6drivers/infiniband/hw/mana/qp.c
+13-1drivers/infiniband/sw/rxe/rxe_resp.c
+7-6drivers/net/ethernet/mellanox/mlx4/srq.c
+10-3drivers/infiniband/hw/hns/hns_roce_qp.c
+6-6drivers/infiniband/hw/hns/hns_roce_srq.c
+11-0drivers/infiniband/sw/rxe/rxe_recv.c
+57-2214 files not shown
+113-4520 files

Linux/linux 4e38654drivers/media/platform/qcom/camss camss.c camss-csiphy.c, drivers/media/platform/qcom/iris iris_vpu4x.c iris_vpu_common.c

Merge tag 'media/v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - rc: ttusbir: fix inverted error logic

 - Venus/Iris fixes:
      - Kconfig cross compile build testing for x86
      - Use-after-free fix for internal buffers
      - dma_free_attrs size fix
      - Switch to hardware mode clocks
      - Use-after-free fix for a concurrency path
      - Fix H265D_MAX_SLICE size for sc7280 devices

 - camss: fix some clock-related issues

* tag 'media/v7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: qcom: camss: avoid format string warning
  media: qcom: camss: Add missing clocks for VFE lite on sa8775p

    [9 lines not shown]
DeltaFile
+40-40drivers/media/platform/qcom/camss/camss.c
+13-11drivers/media/platform/qcom/iris/iris_vpu4x.c
+10-6drivers/media/platform/qcom/iris/iris_vpu_common.c
+7-3drivers/media/platform/qcom/camss/camss-csiphy.c
+3-6drivers/media/platform/qcom/iris/iris_vpu3x.c
+5-3drivers/media/platform/qcom/iris/iris_buffer.c
+78-6914 files not shown
+99-9420 files

Linux/linux 2c340aaarch/x86/include/asm efi.h, arch/x86/mm fault.c

x86/efi: Restore IRQ state in EFI page fault handler

The kernel's softirq API does not permit re-enabling softirqs while IRQs
are disabled. The reason for this is that local_bh_enable() will not
only re-enable delivery of softirqs over the back of IRQs, it will also
handle any pending softirqs immediately, regardless of whether IRQs are
enabled at that point.

For this reason, commit

  d02198550423 ("x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs")

disables softirqs only when IRQs are enabled, as it is not permitted
otherwise, but also unnecessary, given that asynchronous softirq
delivery never happens to begin with while IRQs are disabled.
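
The conditional disable described above can be modelled in user space as follows. This is a toy sketch, not the kernel's FPU code: struct demo_cpu, demo_fpu_begin() and demo_fpu_end() are hypothetical names, and the nesting counter stands in for the softirq-disable count.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy CPU state; both fields are hypothetical stand-ins for kernel state. */
struct demo_cpu {
	bool irqs_disabled;   /* models the current IRQ state */
	int bh_disable_depth; /* models the softirq-disable nesting count */
};

/* Enter an FPU section: disable softirqs only when IRQs are enabled. */
void demo_fpu_begin(struct demo_cpu *cpu)
{
	if (!cpu->irqs_disabled)
		cpu->bh_disable_depth++;
}

/* Leave an FPU section: apply the same condition on the way out. */
void demo_fpu_end(struct demo_cpu *cpu)
{
	if (!cpu->irqs_disabled) {
		assert(cpu->bh_disable_depth > 0);
		cpu->bh_disable_depth--;
	}
}
```

In this model the begin/end pair balances only if the IRQ state is the same at both calls. If irqs_disabled flips from false to true between them, the depth stays at 1 after demo_fpu_end().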

However, this does mean that entering a kernel mode FPU section with
IRQs enabled and leaving it with IRQs disabled leads to problems, as
identified by Sashiko [0]: the EFI page fault handler is called from

    [17 lines not shown]
DeltaFile
+10-1arch/x86/platform/efi/quirks.c
+2-1arch/x86/include/asm/efi.h
+1-1arch/x86/mm/fault.c
+13-33 files

Linux/linux a293ec2tools/testing/selftests kselftest_harness.h kselftest.h

Merge tag 'linux_kselftest-fixes-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:

 - Fix extra test number increment in ksft_exit_skip() that results in
   incorrect KTAP result

 - Fix regression introduced by addition of explicit constructor orders
   for fixture tests. This addition broke the ordering of those relative
   to non-fixture tests and the reverse-constructor-order detection

* tag 'linux_kselftest-fixes-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: harness: Restore order of test functions
  selftests: kselftest: fix wrong test number in ksft_exit_skip
DeltaFile
+6-6tools/testing/selftests/kselftest_harness.h
+1-1tools/testing/selftests/kselftest.h
+7-72 files

Linux/linux d876954Documentation/admin-guide/cgroup-v1 memcg_test.rst

docs: cgroup-v1: Update charge-commit section

Commit 1d8f136a421f ("memcg/hugetlb: remove memcg hugetlb
try-commit-cancel protocol") removed mem_cgroup_commit_charge() and
mem_cgroup_cancel_charge(), but the docs still refer to those functions.
There is no longer any charge cancellation.

Update the docs to match the code.

Signed-off-by: T.J. Mercier <tjmercier at google.com>
Signed-off-by: Tejun Heo <tj at kernel.org>
DeltaFile
+2-4Documentation/admin-guide/cgroup-v1/memcg_test.rst
+2-41 files

Linux/linux b34c827kernel/sched ext_idle.c

sched_ext: idle: Recheck prev_cpu after narrowing allowed mask

scx_select_cpu_dfl() narrows @allowed to @cpus_allowed & @p->cpus_ptr
when the BPF caller supplies a @cpus_allowed that differs from
@p->cpus_ptr and @p doesn't have full affinity. However,
@is_prev_allowed was computed against the original (wider)
@cpus_allowed, so the prev_cpu fast paths could pick a @prev_cpu that
is in @cpus_allowed but not in @p->cpus_ptr, violating the intended
invariant that the returned CPU is always usable by @p. The kernel
masks this via the SCX_EV_SELECT_CPU_FALLBACK fallback, but the
behavior contradicts the documented contract.

Move the @is_prev_allowed evaluation past the narrowing block so it
tests against the final @allowed mask.
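
The ordering issue can be sketched with plain bitmasks. This is a simplified model, assuming one bit per CPU and always narrowing; the real function narrows only under the conditions described above. The demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One bit per CPU; a toy replacement for a kernel cpumask. */
typedef uint64_t demo_cpumask;

bool demo_cpu_in(demo_cpumask mask, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	return (mask >> cpu) & 1;
}

/* Old order: prev_cpu is tested against the caller's wider mask. */
bool demo_prev_allowed_before(demo_cpumask cpus_allowed,
			      demo_cpumask cpus_ptr, int prev_cpu)
{
	bool is_prev_allowed = demo_cpu_in(cpus_allowed, prev_cpu);
	demo_cpumask allowed = cpus_allowed & cpus_ptr; /* narrowed too late */

	(void)allowed;
	return is_prev_allowed;
}

/* New order: narrow first, then test prev_cpu against the final mask. */
bool demo_prev_allowed_after(demo_cpumask cpus_allowed,
			     demo_cpumask cpus_ptr, int prev_cpu)
{
	demo_cpumask allowed = cpus_allowed & cpus_ptr;

	return demo_cpu_in(allowed, prev_cpu);
}
```

For example, with a caller mask covering CPUs 0-3 and a task affinity covering CPUs 0-1, CPU 2 passes the old check but fails the new one.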

Fixes: ee9a4e92799d ("sched_ext: idle: Properly handle invalid prev_cpu during idle selection")
Cc: stable at vger.kernel.org # v6.16+
Assisted-by: Claude <noreply at anthropic.com>
Signed-off-by: David Carlier <devnexen at gmail.com>

    [2 lines not shown]
DeltaFile
+6-6kernel/sched/ext_idle.c
+6-61 files

Linux/linux c7e4e4ddrivers/char/ipmi ipmi_si_intf.c ipmi_ssif.c

Merge tag 'for-linus-7.1-2' of https://github.com/cminyard/linux-ipmi

Pull IPMI fixes from Corey Minyard:
 "Fix a number of issues that came up recently

  The first two fixes are workarounds for buggy IPMI hardware. The
  hardware says it has data for the IPMI driver to read constantly, so
  the driver reads the data constantly, causing any new requests to be
  blocked.

  The first fix was to check for invalid data right when the data was
  read from the device and stop the operation there (there was a later
  check for invalid data, but it could not stop the operation at that
  point). It turned out the device was providing good data, so that
  didn't fix the issue, but it's still a good check.

  The second fix stops fetching this data after a few fetches and allows
  other operations to occur. The driver won't work very well, but at
  least it won't wedge. This seems to fix the issue.

    [13 lines not shown]
DeltaFile
+56-14drivers/char/ipmi/ipmi_si_intf.c
+22-2drivers/char/ipmi/ipmi_ssif.c
+78-162 files

Linux/linux ff9eda4include/linux/sched ext.h, kernel/sched ext.c

sched_ext: Skip past-sched_ext_dead() tasks in scx_task_iter_next_locked()

scx_task_iter's cgroup-scoped mode can return tasks whose
sched_ext_dead() has already completed: cgroup_task_dead() removes
from cset->tasks after sched_ext_dead() in finish_task_switch() and is
irq-work deferred on PREEMPT_RT. The global mode is fine -
sched_ext_dead() removes from scx_tasks via list_del_init() first.

Callers (sub-sched enable prep/abort/apply, scx_sub_disable(),
scx_fail_parent()) assume returned tasks are still on @sch and trip
WARN_ON_ONCE() or operate on torn-down state otherwise.

Set %SCX_TASK_OFF_TASKS in sched_ext_dead() under @p's rq lock and
have scx_task_iter_next_locked() skip flagged tasks under the same
lock. Setter and reader serialize on the per-task rq lock - no race.
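
A single-threaded sketch of the mark-and-skip idea, assuming an array in place of the cgroup task list. DEMO_TASK_OFF_TASKS, struct demo_task and the functions are hypothetical; the comments mark where the per-task rq lock would be held in the kernel.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_TASK_OFF_TASKS 0x1u /* hypothetical flag, set once a task is dead */

struct demo_task {
	int pid;
	unsigned int flags;
};

/* Teardown path: flag the task (in the kernel, under the task's rq lock). */
void demo_task_dead(struct demo_task *p)
{
	assert(p);
	p->flags |= DEMO_TASK_OFF_TASKS;
}

/*
 * Iterator step: return the next task that is not flagged, or NULL.
 * The flag test is where the kernel would hold the same rq lock as the
 * setter, so the two cannot interleave.
 */
struct demo_task *demo_iter_next(struct demo_task *tasks, size_t n, size_t *pos)
{
	while (*pos < n) {
		struct demo_task *p = &tasks[(*pos)++];

		if (p->flags & DEMO_TASK_OFF_TASKS)
			continue; /* already torn down, skip it */
		return p;
	}
	return NULL;
}
```

Callers then only see tasks that have not been through teardown, even though the dead task is still present in the underlying list.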

Signed-off-by: Tejun Heo <tj at kernel.org>
DeltaFile
+26-9kernel/sched/ext.c
+1-0include/linux/sched/ext.h
+27-92 files

Linux/linux 60f21a2include/linux cgroup.h, kernel/cgroup cgroup.c

cgroup, sched_ext: Include exiting tasks in cgroup iter

a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup") made
css_task_iter_advance() skip exiting tasks so cgroup.procs stays consistent
with waitpid() visibility. Unfortunately, this broke scx_task_iter.

scx_task_iter walks either scx_tasks (global) or a cgroup subtree via
css_task_iter() and the two modes are expected to cover the same set of
tasks. After the above change the cgroup-scoped mode silently skips tasks
past exit_signals() that are still on scx_tasks.

scx_sub_enable_workfn()'s abort path is one of the symptoms: an exiting
SCX_TASK_SUB_INIT task can race past the cgroup iter leaking
__scx_init_task() state. Other iterations share the same gap.

Add CSS_TASK_ITER_WITH_DEAD to opt out of the skip and use it from
scx_task_iter().
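
The opt-out flag can be illustrated with a small iterator over an array. This is a sketch under assumptions: DEMO_ITER_WITH_DEAD, struct demo_ctask and demo_count_visible() are hypothetical and only mirror the shape of the change, not cgroup's actual iterator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_ITER_WITH_DEAD 0x1u /* hypothetical: also visit exiting tasks */

struct demo_ctask {
	int pid;
	bool exiting; /* models a task past exit_signals() */
};

/* Count the tasks an iteration with the given flags would visit. */
size_t demo_count_visible(const struct demo_ctask *tasks, size_t n,
			  unsigned int flags)
{
	size_t i, count = 0;

	assert(tasks || n == 0);
	for (i = 0; i < n; i++) {
		/* Default behaviour: hide exiting tasks from the walk. */
		if (tasks[i].exiting && !(flags & DEMO_ITER_WITH_DEAD))
			continue;
		count++;
	}
	return count;
}
```

A default walk keeps the userspace-facing view consistent, while a caller that needs the same coverage as a global list passes the flag.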

Fixes: b0e4c2f8a0f0 ("sched_ext: Implement cgroup subtree iteration for scx_task_iter")

    [2 lines not shown]
DeltaFile
+5-3kernel/cgroup/cgroup.c
+4-2kernel/sched/ext.c
+1-0include/linux/cgroup.h
+10-53 files

Linux/linux 93618edinclude/linux cgroup-defs.h, kernel/cgroup cgroup.c

cgroup: Defer css percpu_ref kill on rmdir until cgroup is depopulated

A chain of commits going back to v7.0 reworked rmdir to satisfy the
controller invariant that a subsystem's ->css_offline() must not run while
tasks are still doing kernel-side work in the cgroup.

[1] d245698d727a ("cgroup: Defer task cgroup unlink until after the task is done switching out")
[2] a72f73c4dd9b ("cgroup: Don't expose dead tasks in cgroup")
[3] 1b164b876c36 ("cgroup: Wait for dying tasks to leave on rmdir")
[4] 4c56a8ac6869 ("cgroup: Fix cgroup_drain_dying() testing the wrong condition")
[5] 13e786b64bd3 ("cgroup: Increment nr_dying_subsys_* from rmdir context")

[1] moved task cset unlink from do_exit() to finish_task_switch() so a
task's cset link drops only after the task has fully stopped scheduling.
That made tasks past exit_signals() linger on cset->tasks until their final
context switch, which led to a series of problems as what userspace expected
to see after rmdir diverged from what the kernel needs to wait for. [2]-[5]
tried to bridge that divergence: [2] filtered the exiting tasks from
cgroup.procs; [3] had rmdir(2) sleep in TASK_UNINTERRUPTIBLE for them; [4]

    [54 lines not shown]
DeltaFile
+114-130kernel/cgroup/cgroup.c
+2-2include/linux/cgroup-defs.h
+116-1322 files

Linux/linux 088f65earch/x86/platform/efi quirks.c

x86/efi: Fix graceful fault handling after FPU softirq changes

Since commit d02198550423 ("x86/fpu: Improve crypto performance by
making kernel-mode FPU reliably usable in softirqs"), kernel_fpu_begin()
calls fpregs_lock() which uses local_bh_disable() instead of the
previous preempt_disable(). This sets SOFTIRQ_OFFSET in preempt_count
during the entire EFI runtime service call, causing in_interrupt() to
return true in normal task context.

The graceful page fault handler efi_crash_gracefully_on_page_fault()
uses in_interrupt() to bail out for faults in real interrupt context.
With SOFTIRQ_OFFSET now set, the handler always bails out, leaving EFI
firmware page faults unhandled. This escalates to die() which also sees
in_interrupt() as true and calls panic("Fatal exception in interrupt"),
resulting in a hard system freeze. On systems with buggy firmware that
triggers page faults during EFI runtime calls (e.g., accessing unmapped
memory in GetTime()), this causes an unrecoverable hang instead of the
expected graceful EFI_ABORTED recovery.
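
A simplified model of why in_interrupt() starts returning true: the check looks at the softirq and hardirq fields of preempt_count, and disabling softirqs sets bits in the softirq field. The field widths and DEMO_* constants below are assumptions for illustration (the real layout also has an NMI field), and demo_in_hardirq() is shown only as an example of a check that ignores the softirq field, not as the content of the actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed, simplified preempt_count layout. */
#define DEMO_PREEMPT_OFFSET 0x00000001u
#define DEMO_PREEMPT_MASK   0x000000ffu
#define DEMO_SOFTIRQ_OFFSET 0x00000100u
#define DEMO_SOFTIRQ_MASK   0x0000ff00u
#define DEMO_HARDIRQ_MASK   0x000f0000u

/* Old behaviour: only the preemption field changes. */
unsigned int demo_preempt_disable(unsigned int count)
{
	assert((count & DEMO_PREEMPT_MASK) != DEMO_PREEMPT_MASK);
	return count + DEMO_PREEMPT_OFFSET;
}

/* New behaviour: the softirq field changes. */
unsigned int demo_bh_disable(unsigned int count)
{
	assert((count & DEMO_SOFTIRQ_MASK) != DEMO_SOFTIRQ_MASK);
	return count + DEMO_SOFTIRQ_OFFSET;
}

/* True if any softirq or hardirq bit is set. */
bool demo_in_interrupt(unsigned int count)
{
	return (count & (DEMO_SOFTIRQ_MASK | DEMO_HARDIRQ_MASK)) != 0;
}

/* True only for the hardirq field. */
bool demo_in_hardirq(unsigned int count)
{
	return (count & DEMO_HARDIRQ_MASK) != 0;
}
```

In this model, task context under demo_preempt_disable() is not "in interrupt", but the same context under demo_bh_disable() is, while demo_in_hardirq() stays false in both.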


    [14 lines not shown]
DeltaFile
+1-1arch/x86/platform/efi/quirks.c
+1-11 files

Linux/linux 8de779ddrivers/video/fbdev udlfb.c, include/video udlfb.h

fbdev: udlfb: add vm_ops to dlfb_ops_mmap to prevent use-after-free

dlfb_ops_mmap() uses remap_pfn_range() to map vmalloc framebuffer pages
to userspace but sets no vm_ops on the VMA. This means the kernel cannot
track active mmaps. When dlfb_realloc_framebuffer() replaces the backing
buffer via FBIOPUT_VSCREENINFO, existing mmap PTEs are not invalidated.
On USB disconnect, dlfb_ops_destroy() calls vfree() on the old pages
while userspace PTEs still reference them, resulting in a use-after-free:
the process retains read/write access to freed kernel pages.

Add vm_operations_struct with open/close callbacks that maintain an
atomic mmap_count on struct dlfb_data. In dlfb_realloc_framebuffer(),
check mmap_count and return -EBUSY if the buffer is currently mapped,
preventing buffer replacement while userspace holds stale PTEs.
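
The counting scheme can be sketched in user space with C11 atomics. This is an illustration only: struct demo_fb and the demo_* functions are hypothetical, malloc() stands in for the vmalloc'ed framebuffer, and a real driver would also need locking between the check and the replacement.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical framebuffer state with a count of live mappings. */
struct demo_fb {
	atomic_int mmap_count;
	void *buf;
	size_t len;
};

/* Called when a mapping is created or duplicated. */
void demo_vma_open(struct demo_fb *fb)
{
	atomic_fetch_add(&fb->mmap_count, 1);
}

/* Called when a mapping goes away. */
void demo_vma_close(struct demo_fb *fb)
{
	int old = atomic_fetch_sub(&fb->mmap_count, 1);

	assert(old > 0);
	(void)old; /* keep NDEBUG builds free of unused-variable warnings */
}

/* Refuse to replace the buffer while anything still maps it. */
int demo_realloc_framebuffer(struct demo_fb *fb, size_t new_len)
{
	void *new_buf;

	if (atomic_load(&fb->mmap_count) > 0)
		return -EBUSY;

	new_buf = malloc(new_len);
	if (!new_buf)
		return -ENOMEM;

	free(fb->buf);
	fb->buf = new_buf;
	fb->len = new_len;
	return 0;
}
```

While a mapping is open the old buffer is never freed, so no stale mapping can point at released memory; the cost is that a resize request fails with -EBUSY until the mapping is closed.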

Tested with PoC using dummy_hcd + raw_gadget USB device emulation.

Signed-off-by: Rajat Gupta <rajgupt at qti.qualcomm.com>
Acked-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>

    [2 lines not shown]
DeltaFile
+30-1drivers/video/fbdev/udlfb.c
+1-0include/video/udlfb.h
+31-12 files

Linux/linux 2433f3farch/loongarch/kvm interrupt.c

LoongArch: KVM: Fix HW timer interrupt lost when inject interrupt by software

With a passthrough HW timer, the timer interrupt is injected by HW. When
an emulated CPU interrupt such as SIP0/SIP1/IPI is injected by software,
the HW timer interrupt may be lost.

Here check whether there is timer tick value inversion before and after
injecting emulated CPU interrupt by software, timer enabling by reading
timer cfg register is skipped. If the timer tick value is detected with
changing, then timer should be enabled. And inject a timer interrupt by
software if there is.

Cc: <stable at vger.kernel.org>
Fixes: f45ad5b8aa93 ("LoongArch: KVM: Implement vcpu interrupt operations").
Signed-off-by: Bibo Mao <maobibo at loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+14-0arch/loongarch/kvm/interrupt.c
+14-01 files

Linux/linux 5a873d7arch/loongarch/kvm timer.c

LoongArch: KVM: Move unconditional delay into timer clear scenery

When a timer interrupt arrives in the guest kernel, the guest clears the
timer interrupt and programs the timer with the next event.

During this stage, the timer tick is -1 and the timer interrupt status is
disabled in the ESTAT register. The KVM hypervisor needs to write zero to
the timer tick register, wait for the timer interrupt injection from the
HW side, and then clear the timer interrupt.

So there is a 2-cycle delay in the KVM hypervisor to emulate this
scenario, and the delay is unnecessary if there is no need to clear the
timer interrupt.

Move the 2-cycle delay into the timer clear scenario, add a timer ESTAT
check after the delay, and set the max timer expire value if the timer
interrupt still has not arrived.

Cc: stable at vger.kernel.org

    [2 lines not shown]
DeltaFile
+8-2arch/loongarch/kvm/timer.c
+8-21 files

Linux/linux 6debfffarch/loongarch/kvm interrupt.c

LoongArch: KVM: Move AVEC interrupt injection into switch loop

When the AVEC interrupt controller is emulated in user space, the AVEC
interrupt is injected by software like the SIP0/SIP1/TI/IPI interrupts.
Move the AVEC interrupt injection into the switch loop as well.

Cc: stable at vger.kernel.org
Signed-off-by: Bibo Mao <maobibo at loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+9-11arch/loongarch/kvm/interrupt.c
+9-111 files

Linux/linux 81e1877arch/loongarch/kvm mmu.c

LoongArch: KVM: Use kvm_set_pte() in kvm_flush_pte()

kvm_flush_pte() is the only caller that directly assigns *pte instead
of using the kvm_set_pte() wrapper. Use the wrapper for consistency with
the rest of the file.

No functional change intended.

Cc: stable at vger.kernel.org
Reviewed-by: Bibo Mao <maobibo at loongson.cn>
Signed-off-by: Tao Cui <cuitao at kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+1-1arch/loongarch/kvm/mmu.c
+1-11 files

Linux/linux f26faaearch/loongarch/kvm exit.c

LoongArch: KVM: Fix missing EMULATE_FAIL in kvm_emu_mmio_read()

In the ldptr (0x24...0x27) opcode decoding path, the default case only
breaks out but without setting "ret" value to EMULATE_FAIL. This leaves
run->mmio.len uninitialized (stale from a previous MMIO operation) while
"ret" value remains EMULATE_DO_MMIO, causing the code to proceed with an
incorrect MMIO length.

Add "ret = EMULATE_FAIL" to match the other default branches in the same
function (e.g. the 0x28...0x2e and 0x38 cases).
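
The failure mode can be reproduced with a small decoder. This is a sketch, assuming made-up opcode values and a hypothetical struct demo_mmio; it only shows why the default branch must set the failure code.

```c
#include <assert.h>

enum demo_emul {
	DEMO_EMULATE_DO_MMIO,
	DEMO_EMULATE_FAIL,
};

/* Hypothetical stand-in for the per-run MMIO state. */
struct demo_mmio {
	unsigned int len;
};

/* Decode a load opcode; opcode values 0 and 1 are made up for the example. */
enum demo_emul demo_decode_load(unsigned int op, struct demo_mmio *mmio)
{
	enum demo_emul ret = DEMO_EMULATE_DO_MMIO;

	assert(mmio);
	switch (op) {
	case 0:
		mmio->len = 4;
		break;
	case 1:
		mmio->len = 8;
		break;
	default:
		/*
		 * Without this assignment, ret stays DEMO_EMULATE_DO_MMIO
		 * and the caller proceeds with whatever len held before.
		 */
		ret = DEMO_EMULATE_FAIL;
		break;
	}
	return ret;
}
```

An unknown opcode now reports failure and leaves the stale length untouched, so the caller never acts on it.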

Cc: stable at vger.kernel.org
Reviewed-by: Bibo Mao <maobibo at loongson.cn>
Signed-off-by: Tao Cui <cuitao at kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+1-0arch/loongarch/kvm/exit.c
+1-01 files

Linux/linux b323a44arch/loongarch/kvm switch.S

LoongArch: KVM: Fix "unreliable stack" for kvm_exc_entry

Insert the appropriate UNWIND hint into the kvm_exc_entry assembly
function to guide the generation of correct ORC table entries, thereby
solving the timeout problem ("unreliable stack") while loading the
livepatch-sample module on a physical machine running virtual machines
with multiple vcpus.

Cc: stable at vger.kernel.org
Signed-off-by: Xianglai Li <lixianglai at loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+1-1arch/loongarch/kvm/switch.S
+1-11 files

Linux/linux 5203012arch/loongarch Kbuild, arch/loongarch/include/asm asm-prototypes.h kvm_host.h

LoongArch: KVM: Compile switch.S directly into the kernel

If we directly compile the switch.S file into the kernel, the address of
the kvm_exc_entry function will definitely be within the DMW memory area.
Therefore, we will no longer need to perform a copy relocation of the
kvm_exc_entry.

So this patch compiles switch.S directly into the kernel and removes the
copy relocation logic for the kvm_exc_entry function.

Cc: stable at vger.kernel.org
Signed-off-by: Xianglai Li <lixianglai at loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+3-32arch/loongarch/kvm/main.c
+20-0arch/loongarch/include/asm/asm-prototypes.h
+15-5arch/loongarch/kvm/switch.S
+2-1arch/loongarch/kvm/Makefile
+0-3arch/loongarch/include/asm/kvm_host.h
+1-1arch/loongarch/Kbuild
+41-426 files

Linux/linux b3e31a6arch/loongarch/kvm vm.c

LoongArch: KVM: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS

It doesn't make sense to return the recommended maximum number of vCPUs
which exceeds the maximum possible number of vCPUs.
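
The cap amounts to taking a minimum. The sketch below assumes the recommended value is derived from the number of online CPUs, which is not stated in this message; demo_recommended_vcpus() is a hypothetical helper, not a KVM function.

```c
#include <assert.h>

/*
 * Return the recommended vCPU count, never exceeding the hard maximum.
 * online_cpus is an assumed source for the recommendation.
 */
unsigned int demo_recommended_vcpus(unsigned int online_cpus,
				    unsigned int max_vcpus)
{
	assert(max_vcpus > 0);
	return online_cpus < max_vcpus ? online_cpus : max_vcpus;
}
```

On a host with more online CPUs than the maximum vCPU count, the recommended value is clamped to the maximum instead of exceeding it.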

Other architectures have already done this, such as commit 57a2e13ebdda
("KVM: MIPS: Cap KVM_CAP_NR_VCPUS by KVM_CAP_MAX_VCPUS")

Cc: stable at vger.kernel.org
Reviewed-by: Bibo Mao <maobibo at loongson.cn>
Signed-off-by: Qiang Ma <maqianga at uniontech.com>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+1-1arch/loongarch/kvm/vm.c
+1-11 files

Linux/linux 7e2c41barch/loongarch/include/asm/vdso gettimeofday.h

LoongArch: vDSO: Drop custom __arch_vdso_hres_capable()

The custom definition is identical to the generic fallback one.

So remove it.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh at linutronix.de>
Signed-off-by: Huacai Chen <chenhuacai at loongson.cn>
DeltaFile
+0-6arch/loongarch/include/asm/vdso/gettimeofday.h
+0-61 files

Linux/linux 49f3384arch/loongarch/pci acpi.c

LoongArch: Use per-root-bridge PCIH flag to skip mem resource fixup

When firmware enables 64-bit PCI host bridge support, some root bridges
already provide valid 64-bit mem resource windows through ACPI.

In this case, the LoongArch-specific mem resource high-bits fixup in
acpi_prepare_root_resources() should not be applied unconditionally.
Otherwise, the kernel may override the native resource layout derived
from firmware, and later BAR assignment can fail to place device BARs
into the intended 64-bit address space correctly.

Add a per-root-bridge ACPI flag, PCIH, and evaluate it from the current
root bridge device scope. When PCIH is set, skip the mem resource high-
bits fixup path and let the kernel use the firmware-provided resource
description directly. When PCIH is absent or cleared, keep the existing
behavior and continue filling the high address bits from the host bridge
address.

This makes the behavior per-root-bridge configurable and avoids breaking

    [8 lines not shown]
DeltaFile
+5-0arch/loongarch/pci/acpi.c
+5-01 files

Linux/linux 8dfa2f8arch/loongarch/pci pci.c

LoongArch: Fix potential ADE in loongson_gpu_fixup_dma_hang()

The switch case in loongson_gpu_fixup_dma_hang() may not be DC2 or DC3,
and readl(crtc_reg) will then access a random address, because "device"
is read from "base+PCI_DEVICE_ID" and "base" is derived from
"pdev->devfn+1". This goes wrong when my platform has a discrete GPU
inserted:

lspci -tv
-[0000:00]-+-00.0  Loongson Technology LLC Hyper Transport Bridge Controller
...
           +-06.0  Loongson Technology LLC LG100 GPU
           +-06.2  Loongson Technology LLC Device 7a37
...

Add a default switch case to fix the panic as below:

 Kernel ade access[#1]:
 CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.6.136-loong64-desktop-hwe+ #4
 pc 90000000017e5534 ra 90000000017e54c0 tp 90000001002f8000 sp 90000001002fb6c0

    [60 lines not shown]
DeltaFile
+3-0arch/loongarch/pci/pci.c
+3-01 files