7.0
HIGH
CVE-2024-26976
Here is the title for the vulnerability: "KVM Kernel Module Unload Deadlock and Page Table Leak"
Description

In the Linux kernel, the following vulnerability has been resolved: KVM: Always flush async #PF workqueue when vCPU is being destroyed Always flush the per-vCPU async #PF workqueue when a vCPU is clearing its completion queue, e.g. when a VM and all its vCPUs is being destroyed. KVM must ensure that none of its workqueue callbacks is running when the last reference to the KVM _module_ is put. Gifting a reference to the associated VM prevents the workqueue callback from dereferencing freed vCPU/VM memory, but does not prevent the KVM module from being unloaded before the callback completes. Drop the misguided VM refcount gifting, as calling kvm_put_kvm() from async_pf_execute() if kvm_put_kvm() flushes the async #PF workqueue will result in deadlock. async_pf_execute() can't return until kvm_put_kvm() finishes, and kvm_put_kvm() can't return until async_pf_execute() finishes: WARNING: CPU: 8 PID: 251 at virt/kvm/kvm_main.c:1435 kvm_put_kvm+0x2d/0x320 [kvm] Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel kvm irqbypass CPU: 8 PID: 251 Comm: kworker/8:1 Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: events async_pf_execute [kvm] RIP: 0010:kvm_put_kvm+0x2d/0x320 [kvm] Call Trace: <TASK> async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK> ---[ end trace 0000000000000000 ]--- INFO: task kworker/8:1:251 blocked for more than 120 seconds. Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/8:1 state:D stack:0 pid:251 ppid:2 flags:0x00004000 Workqueue: events async_pf_execute [kvm] Call Trace: <TASK> __schedule+0x33f/0xa40 schedule+0x53/0xc0 schedule_timeout+0x12a/0x140 __wait_for_common+0x8d/0x1d0 __flush_work.isra.0+0x19f/0x2c0 kvm_clear_async_pf_completion_queue+0x129/0x190 [kvm] kvm_arch_destroy_vm+0x78/0x1b0 [kvm] kvm_put_kvm+0x1c1/0x320 [kvm] async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK> If kvm_clear_async_pf_completion_queue() actually flushes the workqueue, then there's no need to gift async_pf_execute() a reference because all invocations of async_pf_execute() will be forced to complete before the vCPU and its VM are destroyed/freed. And that in turn fixes the module unloading bug as __fput() won't do module_put() on the last vCPU reference until the vCPU has been freed, e.g. if closing the vCPU file also puts the last reference to the KVM module. Note that kvm_check_async_pf_completion() may also take the work item off the completion queue and so also needs to flush the work queue, as the work will not be seen by kvm_clear_async_pf_completion_queue(). Waiting on the workqueue could theoretically delay a vCPU due to waiting for the work to complete, but that's a very, very small chance, and likely a very small delay. kvm_arch_async_page_present_queued() unconditionally makes a new request, i.e. will effectively delay entering the guest, so the remaining work is really just: trace_kvm_async_pf_completed(addr, cr2_or_gpa); __kvm_vcpu_wake_up(vcpu); mmput(mm); and mmput() can't drop the last reference to the page tables if the vCPU is still alive, i.e. the vCPU won't get stuck tearing down page tables. Add a helper to do the flushing, specifically to deal with "wakeup all" work items, as they aren't actually work items, i.e. are never placed in a workqueue. Trying to flush a bogus workqueue entry rightly makes __flush_work() complain (kudos to whoever added that sanity check). Note, commit 5f6de5cbebee ("KVM: Prevent module exit until al ---truncated---

INFO

Published Date :

May 1, 2024, 6:15 a.m.

Last Modified :

July 3, 2024, 1:50 a.m.

Source :

416baaa9-dc9f-4396-8d5f-8c081fb06d67

Remotely Exploitable :

No

Impact Score :

5.9

Exploitability Score :

1.0
Affected Products

The following products are affected by CVE-2024-26976 vulnerability. Even if cvefeed.io is aware of the exact versions of the products that are affected, the information is not represented in the table below.

ID Vendor Product Action
1 Linux linux_kernel

We scan GitHub repositories to detect new proof-of-concept exploits. Following list is a collection of public exploits and proof-of-concepts, which have been published on GitHub (sorted by the most recently updated).

Results are limited to the first 15 repositories due to potential performance issues.

The following list is the news that have been mention CVE-2024-26976 vulnerability anywhere in the article.

The following table lists the changes that have been made to the CVE-2024-26976 vulnerability over time.

Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.

  • CVE Modified by 134c704f-9b21-4f2e-91b3-4a467353bcc0

    Jul. 03, 2024

    Action Type Old Value New Value
    Added CWE CISA-ADP CWE-400
    Added CVSS V3.1 CISA-ADP AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:H/A:H
  • CVE Modified by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

    Jun. 27, 2024

    Action Type Old Value New Value
    Added Reference kernel.org https://lists.debian.org/debian-lts-announce/2024/06/msg00020.html [No types assigned]
  • CVE Modified by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

    Jun. 25, 2024

    Action Type Old Value New Value
    Added Reference kernel.org https://lists.debian.org/debian-lts-announce/2024/06/msg00017.html [No types assigned]
  • CVE Modified by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

    May. 29, 2024

    Action Type Old Value New Value
  • CVE Modified by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

    May. 14, 2024

    Action Type Old Value New Value
  • CVE Received by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

    May. 01, 2024

    Action Type Old Value New Value
    Added Description In the Linux kernel, the following vulnerability has been resolved: KVM: Always flush async #PF workqueue when vCPU is being destroyed Always flush the per-vCPU async #PF workqueue when a vCPU is clearing its completion queue, e.g. when a VM and all its vCPUs is being destroyed. KVM must ensure that none of its workqueue callbacks is running when the last reference to the KVM _module_ is put. Gifting a reference to the associated VM prevents the workqueue callback from dereferencing freed vCPU/VM memory, but does not prevent the KVM module from being unloaded before the callback completes. Drop the misguided VM refcount gifting, as calling kvm_put_kvm() from async_pf_execute() if kvm_put_kvm() flushes the async #PF workqueue will result in deadlock. async_pf_execute() can't return until kvm_put_kvm() finishes, and kvm_put_kvm() can't return until async_pf_execute() finishes: WARNING: CPU: 8 PID: 251 at virt/kvm/kvm_main.c:1435 kvm_put_kvm+0x2d/0x320 [kvm] Modules linked in: vhost_net vhost vhost_iotlb tap kvm_intel kvm irqbypass CPU: 8 PID: 251 Comm: kworker/8:1 Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: events async_pf_execute [kvm] RIP: 0010:kvm_put_kvm+0x2d/0x320 [kvm] Call Trace: <TASK> async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK> ---[ end trace 0000000000000000 ]--- INFO: task kworker/8:1:251 blocked for more than 120 seconds. Tainted: G W 6.6.0-rc1-e7af8d17224a-x86/gmem-vm #119 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/8:1 state:D stack:0 pid:251 ppid:2 flags:0x00004000 Workqueue: events async_pf_execute [kvm] Call Trace: <TASK> __schedule+0x33f/0xa40 schedule+0x53/0xc0 schedule_timeout+0x12a/0x140 __wait_for_common+0x8d/0x1d0 __flush_work.isra.0+0x19f/0x2c0 kvm_clear_async_pf_completion_queue+0x129/0x190 [kvm] kvm_arch_destroy_vm+0x78/0x1b0 [kvm] kvm_put_kvm+0x1c1/0x320 [kvm] async_pf_execute+0x198/0x260 [kvm] process_one_work+0x145/0x2d0 worker_thread+0x27e/0x3a0 kthread+0xba/0xe0 ret_from_fork+0x2d/0x50 ret_from_fork_asm+0x11/0x20 </TASK> If kvm_clear_async_pf_completion_queue() actually flushes the workqueue, then there's no need to gift async_pf_execute() a reference because all invocations of async_pf_execute() will be forced to complete before the vCPU and its VM are destroyed/freed. And that in turn fixes the module unloading bug as __fput() won't do module_put() on the last vCPU reference until the vCPU has been freed, e.g. if closing the vCPU file also puts the last reference to the KVM module. Note that kvm_check_async_pf_completion() may also take the work item off the completion queue and so also needs to flush the work queue, as the work will not be seen by kvm_clear_async_pf_completion_queue(). Waiting on the workqueue could theoretically delay a vCPU due to waiting for the work to complete, but that's a very, very small chance, and likely a very small delay. kvm_arch_async_page_present_queued() unconditionally makes a new request, i.e. will effectively delay entering the guest, so the remaining work is really just: trace_kvm_async_pf_completed(addr, cr2_or_gpa); __kvm_vcpu_wake_up(vcpu); mmput(mm); and mmput() can't drop the last reference to the page tables if the vCPU is still alive, i.e. the vCPU won't get stuck tearing down page tables. Add a helper to do the flushing, specifically to deal with "wakeup all" work items, as they aren't actually work items, i.e. are never placed in a workqueue. Trying to flush a bogus workqueue entry rightly makes __flush_work() complain (kudos to whoever added that sanity check). Note, commit 5f6de5cbebee ("KVM: Prevent module exit until al ---truncated---
    Added Reference kernel.org https://git.kernel.org/stable/c/ab2c2f5d9576112ad22cfd3798071cb74693b1f5 [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/82e25cc1c2e93c3023da98be282322fc08b61ffb [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/f8730d6335e5f43d09151fca1f0f41922209a264 [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/83d3c5e309611ef593e2fcb78444fc8ceedf9bac [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/b54478d20375874aeee257744dedfd3e413432ff [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/a75afe480d4349c524d9c659b1a5a544dbc39a98 [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/4f3a3bce428fb439c66a578adc447afce7b4a750 [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/caa9af2e27c275e089d702cfbaaece3b42bca31b [No types assigned]
    Added Reference kernel.org https://git.kernel.org/stable/c/3d75b8aa5c29058a512db29da7cbee8052724157 [No types assigned]
EPSS is a daily estimate of the probability of exploitation activity being observed over the next 30 days. Following chart shows the EPSS score history of the vulnerability.
CWE - Common Weakness Enumeration

While CVE identifies specific instances of vulnerabilities, CWE categorizes the common flaws or weaknesses that can lead to vulnerabilities. CVE-2024-26976 is associated with the following CWEs:

Common Attack Pattern Enumeration and Classification (CAPEC)

Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of the common attributes and approaches employed by adversaries to exploit the CVE-2024-26976 weaknesses.

CVSS31 - Vulnerability Scoring System
Attack Vector
Attack Complexity
Privileges Required
User Interaction
Scope
Confidentiality
Integrity
Availability