CVE-2024-35931

5.5

MEDIUM CVSS 3.1

drm/amdgpu: Skip do PCI error slot reset during RAS recovery

Overview

Description

In the Linux kernel, the following vulnerability has been resolved:

drm/amdgpu: Skip do PCI error slot reset during RAS recovery

Why: A PCI error slot reset may be triggered after an uncorrectable error (UE) is injected into the UMC multiple times, which caused a system hang.

[ 557.371857] amdgpu 0000:af:00.0: amdgpu: GPU reset succeeded, trying to resume
[ 557.373718] [drm] PCIE GART of 512M enabled.
[ 557.373722] [drm] PTB located at 0x0000031FED700000
[ 557.373788] [drm] VRAM is lost due to GPU reset!
[ 557.373789] [drm] PSP is resuming...
[ 557.547012] mlx5_core 0000:55:00.0: mlx5_pci_err_detected Device state = 1 pci_status: 0. Exit, result = 3, need reset
[ 557.547067] [drm] PCI error: detected callback, state(1)!!
[ 557.547069] [drm] No support for XGMI hive yet...
[ 557.548125] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 0. Enter
[ 557.607763] mlx5_core 0000:55:00.0: wait vital counter value 0x16b5b after 1 iterations
[ 557.607777] mlx5_core 0000:55:00.0: mlx5_pci_slot_reset Device state = 1 pci_status: 1. Exit, err = 0, result = 5, recovered
[ 557.610492] [drm] PCI error: slot reset callback!!
...
[ 560.689382] amdgpu 0000:3f:00.0: amdgpu: GPU reset(2) succeeded!
[ 560.689546] amdgpu 0000:5a:00.0: amdgpu: GPU reset(2) succeeded!
[ 560.689562] general protection fault, probably for non-canonical address 0x5f080b54534f611f: 0000 [#1] SMP NOPTI
[ 560.701008] CPU: 16 PID: 2361 Comm: kworker/u448:9 Tainted: G OE 5.15.0-91-generic #101-Ubuntu
[ 560.712057] Hardware name: Microsoft C278A/C278A, BIOS C2789.5.BS.1C11.AG.1 11/08/2023
[ 560.720959] Workqueue: amdgpu-reset-hive amdgpu_ras_do_recovery [amdgpu]
[ 560.728887] RIP: 0010:amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
[ 560.736891] Code: ff 41 89 c6 e9 1b ff ff ff 44 0f b6 45 b0 e9 4f ff ff ff be 01 00 00 00 4c 89 e7 e8 76 c9 8b ff 44 0f b6 45 b0 e9 3c fd ff ff <48> 83 ba 18 02 00 00 00 0f 84 6a f8 ff ff 48 8d 7a 78 be 01 00 00
[ 560.757967] RSP: 0018:ffa0000032e53d80 EFLAGS: 00010202
[ 560.763848] RAX: ffa00000001dfd10 RBX: ffa0000000197090 RCX: ffa0000032e53db0
[ 560.771856] RDX: 5f080b54534f5f07 RSI: 0000000000000000 RDI: ff11000128100010
[ 560.779867] RBP: ffa0000032e53df0 R08: 0000000000000000 R09: ffffffffffe77f08
[ 560.787879] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
[ 560.795889] R13: ffa0000032e53e00 R14: 0000000000000000 R15: 0000000000000000
[ 560.803889] FS: 0000000000000000(0000) GS:ff11007e7e800000(0000) knlGS:0000000000000000
[ 560.812973] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 560.819422] CR2: 000055a04c118e68 CR3: 0000000007410005 CR4: 0000000000771ee0
[ 560.827433] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 560.835433] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 560.843444] PKRU: 55555554
[ 560.846480] Call Trace:
[ 560.849225] <TASK>
[ 560.851580] ? show_trace_log_lvl+0x1d6/0x2ea
[ 560.856488] ? show_trace_log_lvl+0x1d6/0x2ea
[ 560.861379] ? amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
[ 560.867778] ? show_regs.part.0+0x23/0x29
[ 560.872293] ? __die_body.cold+0x8/0xd
[ 560.876502] ? die_addr+0x3e/0x60
[ 560.880238] ? exc_general_protection+0x1c5/0x410
[ 560.885532] ? asm_exc_general_protection+0x27/0x30
[ 560.891025] ? amdgpu_device_gpu_recover.cold+0xbf1/0xcf5 [amdgpu]
[ 560.898323] amdgpu_ras_do_recovery+0x1b2/0x210 [amdgpu]
[ 560.904520] process_one_work+0x228/0x3d0

How: During RAS recovery, a mode-1 reset is issued from RAS fatal error handling and is expected to reset all the nodes in a hive, so there is no need to issue another mode-1 reset during this procedure.
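
The idea in the "How" paragraph can be illustrated with a toy model. This is not the kernel patch: the class, method, and field names below are hypothetical, and the model only assumes that the PCI slot reset callback can tell whether RAS recovery is already in progress and, if so, reports success without issuing a second reset.

```python
from dataclasses import dataclass
from enum import Enum


class SlotResetResult(Enum):
    """Hypothetical stand-in for the result of a PCI slot reset callback."""
    RECOVERED = "recovered"


@dataclass
class ToyGpu:
    """Toy device model; every name here is illustrative, not a kernel symbol."""
    in_ras_recovery: bool = False
    mode1_resets: int = 0

    def mode1_reset(self) -> None:
        # Count resets so the effect of the guard is observable.
        self.mode1_resets += 1

    def pci_slot_reset(self) -> SlotResetResult:
        # Guard: RAS recovery already resets the device, so skip the
        # redundant reset and report the slot as recovered.
        if self.in_ras_recovery:
            return SlotResetResult.RECOVERED
        self.mode1_reset()
        return SlotResetResult.RECOVERED

    def ras_do_recovery(self, pci_error_arrives: bool = False) -> None:
        # RAS fatal error handling issues the mode-1 reset itself.
        self.in_ras_recovery = True
        try:
            self.mode1_reset()
            if pci_error_arrives:
                # A PCI error callback that fires mid-recovery.
                self.pci_slot_reset()
        finally:
            self.in_ras_recovery = False
```

In this model, a PCI error that arrives while `ras_do_recovery` is running leaves `mode1_resets` at 1, whereas a slot reset outside recovery still performs its own reset.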

Details

INFO

Published Date: May 19, 2024, 11:15 a.m.

Last Modified: Sept. 24, 2025, 6:36 p.m.

Remotely Exploitable: No

Source: 416baaa9-dc9f-4396-8d5f-8c081fb06d67

Impact

Affected Products

The following products are affected by the CVE-2024-35931 vulnerability. Even if cvefeed.io is aware of the exact affected versions, that information is not shown in the table below.

ID	Vendor	Product
1	Linux	linux_kernel

Total Affected Vendors: 1 | Products: 1

Scoring

CVSS Scores

The Common Vulnerability Scoring System is a standardized framework for assessing the severity of vulnerabilities in software and systems. We collect and display CVSS scores from various sources for each CVE.

Score	Version	Severity	Vector	Exploitability Score	Impact Score	Source
5.5	CVSS 3.1	MEDIUM	AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H	1.8	3.6	[email protected]
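
The base score of 5.5 can be reproduced from the CVSS 3.1 vector recorded in this entry's change history (AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H). The sketch below uses the metric weights and round-up rule from the public CVSS v3.1 specification. The function names are illustrative, and only scope-unchanged vectors are handled.

```python
import math

# Metric weights from the CVSS v3.1 specification (scope unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place using the v3.1 integer-based rule."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Compute the CVSS 3.1 base score for a scope-unchanged vector."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")

    # Impact sub-score from the confidentiality/integrity/availability metrics.
    iss = 1 - (
        (1 - CIA[metrics["C"]])
        * (1 - CIA[metrics["I"]])
        * (1 - CIA[metrics["A"]])
    )
    impact = 6.42 * iss

    # Exploitability sub-score from the attack metrics.
    exploitability = (
        8.22
        * AV[metrics["AV"]]
        * AC[metrics["AC"]]
        * PR[metrics["PR"]]
        * UI[metrics["UI"]]
    )

    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H"))  # prints 5.5
```

The only impact in this vector is on availability (A:H), which is consistent with the system hang described above.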

Solution

Update the kernel package to a version that includes the fix for the amdgpu PCI error slot reset issue.

Update the affected kernel packages.
Reboot the system to apply the updates.
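
As a first-pass check before and after updating, a kernel release string can be compared against the version range recorded for this CVE (from 4.2 inclusive up to 6.8.6 exclusive). The sketch below is a minimal illustration with hypothetical helper names. It assumes upstream version numbering, so it can misreport distribution kernels that backport the fix without changing the base version; the vendor's advisory remains the authority in that case.

```python
import re
from typing import Tuple

# Range taken from the CPE configuration listed for this CVE.
AFFECTED_FROM = (4, 2, 0)  # inclusive lower bound
FIXED_IN = (6, 8, 6)       # exclusive upper bound


def parse_kernel_release(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def in_affected_range(release: str) -> bool:
    """Return True if the upstream version falls inside the recorded range."""
    return AFFECTED_FROM <= parse_kernel_release(release) < FIXED_IN
```

On a running system, `platform.release()` from the Python standard library returns the release string to pass to `in_affected_range`.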

Public PoC/Exploit Available at GitHub

CVE-2024-35931 has 1 public PoC/exploit listed on GitHub.

References

References to Advisories, Solutions, and Tools

Here, you will find a curated list of external links that provide in-depth information, practical solutions, and valuable tools related to CVE-2024-35931.

URL	Resource
https://git.kernel.org/stable/c/395ca1031acf89d8ecb26127c544a71688d96f35	Patch
https://git.kernel.org/stable/c/601429cca96b4af3be44172c3b64e4228515dbe1	Patch

CWE - Common Weakness Enumeration

While CVE identifies specific instances of vulnerabilities, CWE categorizes the common flaws or weaknesses that can lead to vulnerabilities. CVE-2024-35931 is associated with the following CWE: NVD-CWE-noinfo.

Common Attack Pattern Enumeration and Classification (CAPEC)

Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of the common attributes and approaches employed by adversaries to exploit the CVE-2024-35931 weaknesses.

We scan GitHub repositories to detect new proof-of-concept exploits. The following list is a collection of public exploits and proofs of concept published on GitHub, sorted by most recently updated.

ARPSyndicate/cve-scores

EPSS & VEDAS Score Aggregator for CVEs

cve vulnerability exploit epss vedas exploit-maturity

Updated: 4 months, 1 week ago

258 stars, 38 forks, 38 watchers

Created: April 13, 2021, 4:50 a.m. This repository has also been linked to 150 different CVEs.

Results are limited to the first 15 repositories due to potential performance issues.


The following table lists the changes that have been made to the CVE-2024-35931 vulnerability over time.

Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.

Initial Analysis by [email protected]

Sep. 24, 2025

Action	Type	New Value
Added	CVSS V3.1	AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H
Added	CWE	NVD-CWE-noinfo
Added	CPE Configuration	OR cpe:2.3:o:linux:linux_kernel:*:*:*:*:*:*:*:* versions from (including) 4.2 up to (excluding) 6.8.6
Added	Reference Type	kernel.org: https://git.kernel.org/stable/c/395ca1031acf89d8ecb26127c544a71688d96f35 Types: Patch
Added	Reference Type	kernel.org: https://git.kernel.org/stable/c/601429cca96b4af3be44172c3b64e4228515dbe1 Types: Patch
Added	Reference Type	CVE: https://git.kernel.org/stable/c/395ca1031acf89d8ecb26127c544a71688d96f35 Types: Patch
Added	Reference Type	CVE: https://git.kernel.org/stable/c/601429cca96b4af3be44172c3b64e4228515dbe1 Types: Patch

CVE Modified by af854a3a-2127-422b-91ae-364da2661108

Nov. 21, 2024

Action	Type	Old Value	New Value
Added	Reference		https://git.kernel.org/stable/c/395ca1031acf89d8ecb26127c544a71688d96f35
Added	Reference		https://git.kernel.org/stable/c/601429cca96b4af3be44172c3b64e4228515dbe1

CVE Modified by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

May 29, 2024

Action	Type	Old Value	New Value

CVE Received by 416baaa9-dc9f-4396-8d5f-8c081fb06d67

May 19, 2024

Action	Type	New Value
Added	Description	In the Linux kernel, the following vulnerability has been resolved: drm/amdgpu: Skip do PCI error slot reset during RAS recovery (full text, including the kernel log, as given in the Description section above)
Added	Reference	kernel.org https://git.kernel.org/stable/c/395ca1031acf89d8ecb26127c544a71688d96f35 [No types assigned]
Added	Reference	kernel.org https://git.kernel.org/stable/c/601429cca96b4af3be44172c3b64e4228515dbe1 [No types assigned]

EPSS is a daily estimate of the probability of exploitation activity being observed over the next 30 days.
