CVE-2026-43404
mm: Fix a hmm_range_fault() livelock / starvation problem
Description
In the Linux kernel, the following vulnerability has been resolved: mm: Fix a hmm_range_fault() livelock / starvation problem If hmm_range_fault() fails a folio_trylock() in do_swap_page, trying to acquire the lock of a device-private folio for migration, to ram, the function will spin until it succeeds grabbing the lock. However, if the process holding the lock is depending on a work item to be completed, which is scheduled on the same CPU as the spinning hmm_range_fault(), that work item might be starved and we end up in a livelock / starvation situation which is never resolved. This can happen, for example if the process holding the device-private folio lock is stuck in migrate_device_unmap()->lru_add_drain_all() sinc lru_add_drain_all() requires a short work-item to be run on all online cpus to complete. A prerequisite for this to happen is: a) Both zone device and system memory folios are considered in migrate_device_unmap(), so that there is a reason to call lru_add_drain_all() for a system memory folio while a folio lock is held on a zone device folio. b) The zone device folio has an initial mapcount > 1 which causes at least one migration PTE entry insertion to be deferred to try_to_migrate(), which can happen after the call to lru_add_drain_all(). c) No or voluntary only preemption. This all seems pretty unlikely to happen, but indeed is hit by the "xe_exec_system_allocator" igt test. Resolve this by waiting for the folio to be unlocked if the folio_trylock() fails in do_swap_page(). Rename migration_entry_wait_on_locked() to softleaf_entry_wait_unlock() and update its documentation to indicate the new use-case. Future code improvements might consider moving the lru_add_drain_all() call in migrate_device_unmap() to be called *after* all pages have migration entries inserted. That would eliminate also b) above. v2: - Instead of a cond_resched() in hmm_range_fault(), eliminate the problem by waiting for the folio to be unlocked in do_swap_page() (Alistair Popple, Andrew Morton) v3: - Add a stub migration_entry_wait_on_locked() for the !CONFIG_MIGRATION case. (Kernel Test Robot) v4: - Rename migrate_entry_wait_on_locked() to softleaf_entry_wait_on_locked() and update docs (Alistair Popple) v5: - Add a WARN_ON_ONCE() for the !CONFIG_MIGRATION version of softleaf_entry_wait_on_locked(). - Modify wording around function names in the commit message (Andrew Morton) (cherry picked from commit a69d1ab971a624c6f112cea61536569d579c3215)
INFO
Published Date :
May 8, 2026, 3:16 p.m.
Last Modified :
May 8, 2026, 3:16 p.m.
Remotely Exploit :
No
Source :
416baaa9-dc9f-4396-8d5f-8c081fb06d67
Affected Products
The following products are affected by CVE-2026-43404
vulnerability.
Even if cvefeed.io is aware of the exact versions of the
products
that
are
affected, the information is not represented in the table below.
No affected product recoded yet
Solution
- Wait for folio to be unlocked if trylock fails.
- Rename migration function and update documentation.
- Consider moving lru_add_drain_all() call later.
- Add stub migration entry wait for !CONFIG_MIGRATION.
References to Advisories, Solutions, and Tools
Here, you will find a curated list of external links that provide in-depth
information, practical solutions, and valuable tools related to
CVE-2026-43404.
CWE - Common Weakness Enumeration
While CVE identifies
specific instances of vulnerabilities, CWE categorizes the common flaws or
weaknesses that can lead to vulnerabilities. CVE-2026-43404 is
associated with the following CWEs:
Common Attack Pattern Enumeration and Classification (CAPEC)
Common Attack Pattern Enumeration and Classification
(CAPEC)
stores attack patterns, which are descriptions of the common attributes and
approaches employed by adversaries to exploit the CVE-2026-43404
weaknesses.
We scan GitHub repositories to detect new proof-of-concept exploits. Following list is a collection of public exploits and proof-of-concepts, which have been published on GitHub (sorted by the most recently updated).
Results are limited to the first 15 repositories due to potential performance issues.
The following list is the news that have been mention
CVE-2026-43404 vulnerability anywhere in the article.
The following table lists the changes that have been made to the
CVE-2026-43404 vulnerability over time.
Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.
-
New CVE Received by 416baaa9-dc9f-4396-8d5f-8c081fb06d67
May. 08, 2026
Action Type Old Value New Value Added Description In the Linux kernel, the following vulnerability has been resolved: mm: Fix a hmm_range_fault() livelock / starvation problem If hmm_range_fault() fails a folio_trylock() in do_swap_page, trying to acquire the lock of a device-private folio for migration, to ram, the function will spin until it succeeds grabbing the lock. However, if the process holding the lock is depending on a work item to be completed, which is scheduled on the same CPU as the spinning hmm_range_fault(), that work item might be starved and we end up in a livelock / starvation situation which is never resolved. This can happen, for example if the process holding the device-private folio lock is stuck in migrate_device_unmap()->lru_add_drain_all() sinc lru_add_drain_all() requires a short work-item to be run on all online cpus to complete. A prerequisite for this to happen is: a) Both zone device and system memory folios are considered in migrate_device_unmap(), so that there is a reason to call lru_add_drain_all() for a system memory folio while a folio lock is held on a zone device folio. b) The zone device folio has an initial mapcount > 1 which causes at least one migration PTE entry insertion to be deferred to try_to_migrate(), which can happen after the call to lru_add_drain_all(). c) No or voluntary only preemption. This all seems pretty unlikely to happen, but indeed is hit by the "xe_exec_system_allocator" igt test. Resolve this by waiting for the folio to be unlocked if the folio_trylock() fails in do_swap_page(). Rename migration_entry_wait_on_locked() to softleaf_entry_wait_unlock() and update its documentation to indicate the new use-case. Future code improvements might consider moving the lru_add_drain_all() call in migrate_device_unmap() to be called *after* all pages have migration entries inserted. That would eliminate also b) above. v2: - Instead of a cond_resched() in hmm_range_fault(), eliminate the problem by waiting for the folio to be unlocked in do_swap_page() (Alistair Popple, Andrew Morton) v3: - Add a stub migration_entry_wait_on_locked() for the !CONFIG_MIGRATION case. (Kernel Test Robot) v4: - Rename migrate_entry_wait_on_locked() to softleaf_entry_wait_on_locked() and update docs (Alistair Popple) v5: - Add a WARN_ON_ONCE() for the !CONFIG_MIGRATION version of softleaf_entry_wait_on_locked(). - Modify wording around function names in the commit message (Andrew Morton) (cherry picked from commit a69d1ab971a624c6f112cea61536569d579c3215) Added Reference https://git.kernel.org/stable/c/7e6e2fc91d4b9b12ec6e137019532568ebcf2680 Added Reference https://git.kernel.org/stable/c/94b6d0ba4b640ba23bb6c708a59316e74e5ede63 Added Reference https://git.kernel.org/stable/c/b570f37a2ce480be26c665345c5514686a8a0274