CVE-2023-53351
drm/sched: Check scheduler work queue before calling timeout handling
Description
In the Linux kernel, the following vulnerability has been resolved: drm/sched: Check scheduler work queue before calling timeout handling During an IGT GPU reset test we see again oops despite of commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling timeout handling). It uses ready condition whether to call drm_sched_fault which unwind the TDR leads to GPU reset. However it looks the ready condition is overloaded with other meanings, for example, for the following stack is related GPU reset : 0 gfx_v9_0_cp_gfx_start 1 gfx_v9_0_cp_gfx_resume 2 gfx_v9_0_cp_resume 3 gfx_v9_0_hw_init 4 gfx_v9_0_resume 5 amdgpu_device_ip_resume_phase2 does the following: /* start the ring */ gfx_v9_0_cp_gfx_start(adev); ring->sched.ready = true; The same approach is for other ASICs as well : gfx_v8_0_cp_gfx_resume gfx_v10_0_kiq_resume, etc... As a result, our GPU reset test causes GPU fault which calls unconditionally gfx_v9_0_fault and then drm_sched_fault. However now it depends on whether the interrupt service routine drm_sched_fault is executed after gfx_v9_0_cp_gfx_start is completed which sets the ready field of the scheduler to true even for uninitialized schedulers and causes oops vs no fault or when ISR drm_sched_fault is completed prior gfx_v9_0_cp_gfx_start and NULL pointer dereference does not occur. Use the field timeout_wq to prevent oops for uninitialized schedulers. The field could be initialized by the work queue of resetting the domain. v1: Corrections to commit message (Luben)
INFO
Published Date :
Sept. 17, 2025, 3:15 p.m.
Last Modified :
Sept. 17, 2025, 3:15 p.m.
Remotely Exploit :
No
Source :
416baaa9-dc9f-4396-8d5f-8c081fb06d67
Affected Products
The following products are affected by CVE-2023-53351
vulnerability.
Even if cvefeed.io
is aware of the exact versions of the
products
that
are
affected, the information is not represented in the table below.
No affected product recoded yet
Solution
- Update the Linux kernel to the latest version.
- Apply the specific patch for drm/sched.
- Verify scheduler initialization before timeout handling.
- Ensure interrupt service routine timing is correct.
References to Advisories, Solutions, and Tools
Here, you will find a curated list of external links that provide in-depth
information, practical solutions, and valuable tools related to
CVE-2023-53351
.
URL | Resource |
---|---|
https://git.kernel.org/stable/c/2da5bffe9eaa5819a868e8eaaa11b3fd0f16a691 | |
https://git.kernel.org/stable/c/c43a96fc00b662cef1ef0eb22d40441ce2abae8f |
CWE - Common Weakness Enumeration
While CVE identifies
specific instances of vulnerabilities, CWE categorizes the common flaws or
weaknesses that can lead to vulnerabilities. CVE-2023-53351
is
associated with the following CWEs:
Common Attack Pattern Enumeration and Classification (CAPEC)
Common Attack Pattern Enumeration and Classification
(CAPEC)
stores attack patterns, which are descriptions of the common attributes and
approaches employed by adversaries to exploit the CVE-2023-53351
weaknesses.
We scan GitHub repositories to detect new proof-of-concept exploits. Following list is a collection of public exploits and proof-of-concepts, which have been published on GitHub (sorted by the most recently updated).
Results are limited to the first 15 repositories due to potential performance issues.
The following list is the news that have been mention
CVE-2023-53351
vulnerability anywhere in the article.
The following table lists the changes that have been made to the
CVE-2023-53351
vulnerability over time.
Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.
-
New CVE Received by 416baaa9-dc9f-4396-8d5f-8c081fb06d67
Sep. 17, 2025
Action Type Old Value New Value Added Description In the Linux kernel, the following vulnerability has been resolved: drm/sched: Check scheduler work queue before calling timeout handling During an IGT GPU reset test we see again oops despite of commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling timeout handling). It uses ready condition whether to call drm_sched_fault which unwind the TDR leads to GPU reset. However it looks the ready condition is overloaded with other meanings, for example, for the following stack is related GPU reset : 0 gfx_v9_0_cp_gfx_start 1 gfx_v9_0_cp_gfx_resume 2 gfx_v9_0_cp_resume 3 gfx_v9_0_hw_init 4 gfx_v9_0_resume 5 amdgpu_device_ip_resume_phase2 does the following: /* start the ring */ gfx_v9_0_cp_gfx_start(adev); ring->sched.ready = true; The same approach is for other ASICs as well : gfx_v8_0_cp_gfx_resume gfx_v10_0_kiq_resume, etc... As a result, our GPU reset test causes GPU fault which calls unconditionally gfx_v9_0_fault and then drm_sched_fault. However now it depends on whether the interrupt service routine drm_sched_fault is executed after gfx_v9_0_cp_gfx_start is completed which sets the ready field of the scheduler to true even for uninitialized schedulers and causes oops vs no fault or when ISR drm_sched_fault is completed prior gfx_v9_0_cp_gfx_start and NULL pointer dereference does not occur. Use the field timeout_wq to prevent oops for uninitialized schedulers. The field could be initialized by the work queue of resetting the domain. v1: Corrections to commit message (Luben) Added Reference https://git.kernel.org/stable/c/2da5bffe9eaa5819a868e8eaaa11b3fd0f16a691 Added Reference https://git.kernel.org/stable/c/c43a96fc00b662cef1ef0eb22d40441ce2abae8f