0.0
NA
CVE-2023-53351
drm/sched: Check scheduler work queue before calling timeout handling
Description

In the Linux kernel, the following vulnerability has been resolved: drm/sched: Check scheduler work queue before calling timeout handling During an IGT GPU reset test we see again oops despite of commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling timeout handling). It uses ready condition whether to call drm_sched_fault which unwind the TDR leads to GPU reset. However it looks the ready condition is overloaded with other meanings, for example, for the following stack is related GPU reset : 0 gfx_v9_0_cp_gfx_start 1 gfx_v9_0_cp_gfx_resume 2 gfx_v9_0_cp_resume 3 gfx_v9_0_hw_init 4 gfx_v9_0_resume 5 amdgpu_device_ip_resume_phase2 does the following: /* start the ring */ gfx_v9_0_cp_gfx_start(adev); ring->sched.ready = true; The same approach is for other ASICs as well : gfx_v8_0_cp_gfx_resume gfx_v10_0_kiq_resume, etc... As a result, our GPU reset test causes GPU fault which calls unconditionally gfx_v9_0_fault and then drm_sched_fault. However now it depends on whether the interrupt service routine drm_sched_fault is executed after gfx_v9_0_cp_gfx_start is completed which sets the ready field of the scheduler to true even for uninitialized schedulers and causes oops vs no fault or when ISR drm_sched_fault is completed prior gfx_v9_0_cp_gfx_start and NULL pointer dereference does not occur. Use the field timeout_wq to prevent oops for uninitialized schedulers. The field could be initialized by the work queue of resetting the domain. v1: Corrections to commit message (Luben)

INFO

Published Date :

Sept. 17, 2025, 3:15 p.m.

Last Modified :

Sept. 17, 2025, 3:15 p.m.

Remotely Exploit :

No

Source :

416baaa9-dc9f-4396-8d5f-8c081fb06d67
Affected Products

The following products are affected by CVE-2023-53351 vulnerability. Even if cvefeed.io is aware of the exact versions of the products that are affected, the information is not represented in the table below.

No affected product recoded yet

Solution
Update the Linux kernel to fix scheduler work queue handling for GPU resets.
  • Update the Linux kernel to the latest version.
  • Apply the specific patch for drm/sched.
  • Verify scheduler initialization before timeout handling.
  • Ensure interrupt service routine timing is correct.
References to Advisories, Solutions, and Tools

Here, you will find a curated list of external links that provide in-depth information, practical solutions, and valuable tools related to CVE-2023-53351.

URL Resource
https://git.kernel.org/stable/c/2da5bffe9eaa5819a868e8eaaa11b3fd0f16a691
https://git.kernel.org/stable/c/c43a96fc00b662cef1ef0eb22d40441ce2abae8f
CWE - Common Weakness Enumeration

While CVE identifies specific instances of vulnerabilities, CWE categorizes the common flaws or weaknesses that can lead to vulnerabilities. CVE-2023-53351 is associated with the following CWEs:

Common Attack Pattern Enumeration and Classification (CAPEC)

Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of the common attributes and approaches employed by adversaries to exploit the CVE-2023-53351 weaknesses.

We scan GitHub repositories to detect new proof-of-concept exploits. Following list is a collection of public exploits and proof-of-concepts, which have been published on GitHub (sorted by the most recently updated).

Results are limited to the first 15 repositories due to potential performance issues.

The following list is the news that have been mention CVE-2023-53351 vulnerability anywhere in the article.

The following table lists the changes that have been made to the CVE-2023-53351 vulnerability over time.

Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.

  • New CVE Received by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

    Sep. 17, 2025

    Action Type Old Value New Value
    Added Description In the Linux kernel, the following vulnerability has been resolved: drm/sched: Check scheduler work queue before calling timeout handling During an IGT GPU reset test we see again oops despite of commit 0c8c901aaaebc9 (drm/sched: Check scheduler ready before calling timeout handling). It uses ready condition whether to call drm_sched_fault which unwind the TDR leads to GPU reset. However it looks the ready condition is overloaded with other meanings, for example, for the following stack is related GPU reset : 0 gfx_v9_0_cp_gfx_start 1 gfx_v9_0_cp_gfx_resume 2 gfx_v9_0_cp_resume 3 gfx_v9_0_hw_init 4 gfx_v9_0_resume 5 amdgpu_device_ip_resume_phase2 does the following: /* start the ring */ gfx_v9_0_cp_gfx_start(adev); ring->sched.ready = true; The same approach is for other ASICs as well : gfx_v8_0_cp_gfx_resume gfx_v10_0_kiq_resume, etc... As a result, our GPU reset test causes GPU fault which calls unconditionally gfx_v9_0_fault and then drm_sched_fault. However now it depends on whether the interrupt service routine drm_sched_fault is executed after gfx_v9_0_cp_gfx_start is completed which sets the ready field of the scheduler to true even for uninitialized schedulers and causes oops vs no fault or when ISR drm_sched_fault is completed prior gfx_v9_0_cp_gfx_start and NULL pointer dereference does not occur. Use the field timeout_wq to prevent oops for uninitialized schedulers. The field could be initialized by the work queue of resetting the domain. v1: Corrections to commit message (Luben)
    Added Reference https://git.kernel.org/stable/c/2da5bffe9eaa5819a868e8eaaa11b3fd0f16a691
    Added Reference https://git.kernel.org/stable/c/c43a96fc00b662cef1ef0eb22d40441ce2abae8f
EPSS is a daily estimate of the probability of exploitation activity being observed over the next 30 days. Following chart shows the EPSS score history of the vulnerability.
Vulnerability Scoring Details
No CVSS metrics available for this vulnerability.