CVE-2024-5206

4.7

MEDIUM CVSS 3.1

Sensitive Data Leakage in sklearn.feature_extraction.text.TfidfVectorizer in scikit-learn/scikit-learn

Overview

Description

A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer, specifically in versions up to and including 1.4.1.post1, which was fixed in version 1.5.0. The vulnerability arises from the unexpected storage of all tokens present in the training data within the `stop_words_` attribute, rather than only storing the subset of tokens required for the TF-IDF technique to function. This behavior leads to the potential leakage of sensitive information, as the `stop_words_` attribute could contain tokens that were meant to be discarded and not stored, such as passwords or keys. The impact of this vulnerability varies based on the nature of the data being processed by the vectorizer.
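The mechanism described above can be illustrated with a toy model. This is a minimal sketch, not scikit-learn's implementation: the function name `fit_vocabulary`, the whitespace tokenizer, and the sample documents are all assumptions made for illustration. It shows how document-frequency pruning splits tokens into a kept vocabulary and a discarded set, and why retaining that discarded set on a fitted object keeps rare, possibly sensitive tokens around.

```python
from collections import Counter


def fit_vocabulary(docs, min_df=1, max_features=None):
    """Toy document-frequency pruning (hypothetical, not scikit-learn code).

    Returns the kept vocabulary and the set of discarded tokens.
    """
    doc_freq = Counter()
    for doc in docs:
        # Count each token once per document (document frequency).
        doc_freq.update(set(doc.lower().split()))

    kept = {tok for tok, n in doc_freq.items() if n >= min_df}
    if max_features is not None:
        ranked = sorted(kept, key=lambda tok: (-doc_freq[tok], tok))
        kept = set(ranked[:max_features])

    # Everything seen during fitting that did not make it into the vocabulary.
    discarded = set(doc_freq) - kept
    return sorted(kept), discarded


docs = [
    "reset password for alice",
    "reset password for bob",
    "temporary password hunter2 issued",
]
vocabulary, discarded = fit_vocabulary(docs, min_df=2)
print(vocabulary)              # ['for', 'password', 'reset']
print("hunter2" in discarded)  # True
```

In this sketch the rare token `hunter2` is pruned from the vocabulary, yet it survives in the discarded set. If that set is stored on the fitted object, as the description says `stop_words_` was, anyone who can read the object can recover it.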

Details

INFO

Published Date: June 6, 2024, 7:16 p.m.

Last Modified: June 17, 2026, 8:15 a.m.

Remotely Exploitable: No

Source: [email protected]

Impact

Affected Products

The following products are affected by the CVE-2024-5206 vulnerability. Even where cvefeed.io knows the exact affected versions, that information is not shown in the table below.

ID	Vendor	Product
1	Scikit-learn	scikit-learn

Total Affected Vendors: 1 | Products: 1

Scoring

CVSS Scores

The Common Vulnerability Scoring System is a standardized framework for assessing the severity of vulnerabilities in software and systems. We collect and display CVSS scores from various sources for each CVE.

Version	Severity	Source
CVSS		134c704f-9b21-4f2e-91b3-4a467353bcc0
CVSS 3.0	MEDIUM	[email protected]
CVSS 3.1	MEDIUM	[email protected]
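The 4.7 base score can be reproduced from the CVSS 3.1 vector recorded in the change history (`AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N`). The following is a minimal sketch of the CVSS v3.1 base-score equations, restricted to Scope: Unchanged; the function names `roundup` and `base_score` are illustrative and not part of any library.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged only).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector without the 'CVSS:3.1/' prefix."""
    metrics = dict(part.split(":") for part in vector.split("/"))
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles Scope: Unchanged")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}

    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N"))  # 4.7
print(base_score("AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N"))  # 5.3
```

The second call uses the originally submitted vector with a network attack vector. Under the same equations it scores 5.3, so the later change from `AV:N` to `AV:L` lowered the score to 4.7 while the severity stayed MEDIUM.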

Solution

This information is provided by third-party feeds.

Update the affected python3-scikit-learn package.
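Whether an environment still needs the update can be checked programmatically. The sketch below compares a version string against the fixed release, 1.5.0. The helper names are hypothetical, and the parser is deliberately simple: it looks only at the leading numeric components, so a pre-release such as `1.5.0rc1` is treated as 1.5.0, which is an assumption rather than a confirmed statement about where the patch landed.

```python
import re
from importlib import metadata

FIXED_RELEASE = (1, 5, 0)  # first release containing the fix, per the CVE record


def release_tuple(version):
    """Extract (major, minor, patch) from a version string such as '1.4.1.post1'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())


def is_affected(version):
    """Return True if the given scikit-learn version predates the fix."""
    return release_tuple(version) < FIXED_RELEASE


def installed_scikit_learn_is_affected():
    """Check the installed distribution; return None if it is not installed."""
    try:
        return is_affected(metadata.version("scikit-learn"))
    except metadata.PackageNotFoundError:
        return None


print(is_affected("1.4.1.post1"))  # True
print(is_affected("1.5.0"))        # False
```

If the check reports an affected version, upgrading through the same channel that installed the package (the distribution's `python3-scikit-learn` package or the Python package index) is the remedy named above.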

Public PoC/Exploit Available on GitHub

CVE-2024-5206 has 6 public PoCs/exploits available on GitHub.

References

References to Advisories, Solutions, and Tools

Here, you will find a curated list of external links that provide in-depth information, practical solutions, and valuable tools related to CVE-2024-5206.

URL	Resource
https://github.com/scikit-learn/scikit-learn/commit/70ca21f106b603b611da73012c9ade7cd8e438b8	Patch
https://huntr.com/bounties/14bc0917-a85b-4106-a170-d09d5191517c	Third Party Advisory

CWE - Common Weakness Enumeration

While CVE identifies specific instances of vulnerabilities, CWE categorizes the common flaws or weaknesses that can lead to vulnerabilities. CVE-2024-5206 is associated with the following CWEs:

CWE-921: Storage of Sensitive Data in a Mechanism without Access Control

CWE-922: Insecure Storage of Sensitive Information
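Where an affected version cannot be upgraded right away, one commonly cited mitigation is to delete the `stop_words_` attribute from a fitted vectorizer before persisting it. Documentation for the affected releases described the attribute as being for introspection only and safe to remove before pickling. The sketch below assumes that guidance applies; `strip_stop_words` is a hypothetical helper, and a `SimpleNamespace` stands in for a fitted vectorizer so the example runs without scikit-learn installed.

```python
import pickle
from types import SimpleNamespace


def strip_stop_words(estimator):
    """Delete a fitted ``stop_words_`` attribute if present.

    Returns True if the attribute was removed, False if it was absent.
    """
    if hasattr(estimator, "stop_words_"):
        delattr(estimator, "stop_words_")
        return True
    return False


# Stand-in for a fitted vectorizer from an affected release (assumption).
fitted = SimpleNamespace(
    vocabulary_={"password": 0, "reset": 1},
    stop_words_={"hunter2", "alice", "bob"},
)

leaky_payload = pickle.dumps(fitted)
removed = strip_stop_words(fitted)
clean_payload = pickle.dumps(fitted)

print(b"hunter2" in leaky_payload)  # True
print(b"hunter2" in clean_payload)  # False
```

Stripping the attribute only affects what is serialized from that point on. Model files that were already saved or shared still contain the discarded tokens and would need to be regenerated.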

Common Attack Pattern Enumeration and Classification (CAPEC)

Common Attack Pattern Enumeration and Classification (CAPEC) stores attack patterns, which are descriptions of the common attributes and approaches employed by adversaries to exploit the CVE-2024-5206 weaknesses.

We scan GitHub repositories to detect new proof-of-concept exploits. The following list collects public exploits and proofs of concept published on GitHub (sorted by most recently updated).

hamizan-azman/llm-supply-chain-poc

LLM supply chain security PoC shooting ranges

Python Dockerfile Shell C

Updated: 4 months, 1 week ago

0 stars 0 forks 0 watchers

Created: Feb. 25, 2026, 7:06 a.m. This repo has also been linked to 5 different CVEs.

nrjain1997/python-security-test

No description provided.

Python

Updated: 7 months ago

0 stars 0 forks 0 watchers

Created: Dec. 12, 2025, 11:52 a.m. This repo has also been linked to 5 different CVEs.

tim-gowan/python-security-benchmark

Code to test SAST and SCA Reachability

Python

Updated: 7 months ago

0 stars 0 forks 0 watchers

Created: Dec. 9, 2025, 8:29 p.m. This repo has also been linked to 5 different CVEs.

Jasonyu77/ai-vuln-analysis

Analyzing Typical Vulnerabilities in the AI Ecosystem

Updated: 10 months, 3 weeks ago

0 stars 0 forks 0 watchers

Created: Aug. 19, 2025, 9:38 a.m. This repo has also been linked to 4 different CVEs.

RedF0xSec/Weather-Forecasting-App

Weather forecasting web application focusing on containerization with Docker and Kubernetes. The system leverages machine learning for weather predictions while ensuring scalability, high availability, and security through advanced deployment strategies.

Dockerfile Python Jupyter Notebook

Updated: 1 year, 4 months ago

0 stars 0 forks 0 watchers

Created: March 8, 2025, 4:11 p.m. This repo has also been linked to 3 different CVEs.

vahinitech/imu2text

IMU2Text: A hybrid CNN+GNN pipeline for handwriting recognition and trajectory prediction using IMU data with state-of-the-art accuracy (99.74%).

deeplearning handwriting-recognition imu sensor-simulation cnn cnn-classification gnn graph-neural-networks imu-data meachine-learning multi-task-learning

Python TeX MATLAB

Updated: 3 days, 14 hours ago

5 stars 2 forks 2 watchers

Created: Nov. 19, 2024, 6:38 a.m. This repo has also been linked to 2 different CVEs.

Results are limited to the first 15 repositories due to potential performance issues.

The following table lists the changes that have been made to the CVE-2024-5206 vulnerability over time.

Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.

CVE Modified by [email protected]

Jun. 17, 2026

Action	Type	Old Value	New Value
Added	Affected		[{'vendor': 'scikit-learn', 'product': 'scikit-learn/scikit-learn', 'versions': [{'status': 'affected', 'version': 'unspecified', 'lessThan': '1.5.0', 'versionType': 'custom'}]}]

CVE Modified by 134c704f-9b21-4f2e-91b3-4a467353bcc0

Jun. 17, 2026

Action	Type	Old Value	New Value
Added	Affected		[{'cpes': ['cpe:2.3:a:scikit-learn:scikit-learn::::::::'], 'vendor': 'scikit-learn', 'product': 'scikit-learn', 'versions': [{'status': 'affected', 'version': '0', 'lessThan': '1.5.0', 'versionType': 'custom'}], 'defaultStatus': 'unknown'}]
Added	SSVC		{'id': 'CVE-2024-5206', 'role': 'CISA Coordinator', 'options': [{'exploitation': 'poc'}, {'automatable': 'no'}, {'technicalImpact': 'partial'}], 'version': '2.0.3', 'timestamp': '2024-06-07T15:11:02.549686Z'}

CVE Modified by af854a3a-2127-422b-91ae-364da2661108

Nov. 21, 2024

Action	Type	Old Value	New Value
Added	Reference		https://github.com/scikit-learn/scikit-learn/commit/70ca21f106b603b611da73012c9ade7cd8e438b8
Added	Reference		https://huntr.com/bounties/14bc0917-a85b-4106-a170-d09d5191517c

Initial Analysis by [email protected]

Oct. 24, 2024

Action	Type	Old Value	New Value
Added	CVSS V3.1		NIST AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N
Changed	Reference Type	https://github.com/scikit-learn/scikit-learn/commit/70ca21f106b603b611da73012c9ade7cd8e438b8 No Types Assigned	https://github.com/scikit-learn/scikit-learn/commit/70ca21f106b603b611da73012c9ade7cd8e438b8 Patch
Changed	Reference Type	https://huntr.com/bounties/14bc0917-a85b-4106-a170-d09d5191517c No Types Assigned	https://huntr.com/bounties/14bc0917-a85b-4106-a170-d09d5191517c Third Party Advisory
Added	CWE		NIST CWE-922
Added	CPE Configuration		OR cpe:2.3:a:scikit-learn:scikit-learn::::::python:: versions up to (excluding) 1.5.0

CVE Modified by [email protected]

Jun. 17, 2024

Action	Type	Old Value	New Value
Removed	CVSS V3	huntr.dev AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N
Added	CVSS V3		huntr.dev AV:L/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N

CVE Received by [email protected]

Jun. 06, 2024

Action	Type	New Value
Added	Description	A sensitive data leakage vulnerability was identified in scikit-learn's TfidfVectorizer, specifically in versions up to and including 1.4.1.post1, which was fixed in version 1.5.0. The vulnerability arises from the unexpected storage of all tokens present in the training data within the `stop_words_` attribute, rather than only storing the subset of tokens required for the TF-IDF technique to function. This behavior leads to the potential leakage of sensitive information, as the `stop_words_` attribute could contain tokens that were meant to be discarded and not stored, such as passwords or keys. The impact of this vulnerability varies based on the nature of the data being processed by the vectorizer.
Added	Reference	huntr.dev https://huntr.com/bounties/14bc0917-a85b-4106-a170-d09d5191517c [No types assigned]
Added	Reference	huntr.dev https://github.com/scikit-learn/scikit-learn/commit/70ca21f106b603b611da73012c9ade7cd8e438b8 [No types assigned]
Added	CWE	huntr.dev CWE-921
Added	CVSS V3	huntr.dev AV:N/AC:H/PR:L/UI:N/S:U/C:H/I:N/A:N
