CVE-2026-42440
Apache OpenNLP: OOM DoS via Unbounded Array Allocation in AbstractModelReader
Description
OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader Versions Affected: before 2.5.9 before 3.0.0-M3 Description: The AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source. A crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load. The practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins. Mitigation: * 2.x users should upgrade to 2.5.9. * 3.x users should upgrade to 3.0.0-M3. Note: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default. Users who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.
INFO
Published Date :
May 4, 2026, 5:16 p.m.
Last Modified :
May 6, 2026, 6:09 p.m.
Remotely Exploit :
Yes !
Source :
[email protected]
CVSS Scores
| Score | Version | Severity | Vector | Exploitability Score | Impact Score | Source |
|---|---|---|---|---|---|---|
| CVSS 3.1 | HIGH | 134c704f-9b21-4f2e-91b3-4a467353bcc0 |
Solution
- Upgrade to Apache OpenNLP version 2.5.9 or later.
- Upgrade to Apache OpenNLP version 3.0.0-M3 or later.
- Validate untrusted model files before loading.
- Set OPENNLP_MAX_ENTRIES system property if needed.
References to Advisories, Solutions, and Tools
Here, you will find a curated list of external links that provide in-depth
information, practical solutions, and valuable tools related to
CVE-2026-42440.
| URL | Resource |
|---|---|
| https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo | Mailing List Vendor Advisory |
| http://www.openwall.com/lists/oss-security/2026/05/01/21 | Mailing List Third Party Advisory |
CWE - Common Weakness Enumeration
While CVE identifies
specific instances of vulnerabilities, CWE categorizes the common flaws or
weaknesses that can lead to vulnerabilities. CVE-2026-42440 is
associated with the following CWEs:
Common Attack Pattern Enumeration and Classification (CAPEC)
Common Attack Pattern Enumeration and Classification
(CAPEC)
stores attack patterns, which are descriptions of the common attributes and
approaches employed by adversaries to exploit the CVE-2026-42440
weaknesses.
We scan GitHub repositories to detect new proof-of-concept exploits. Following list is a collection of public exploits and proof-of-concepts, which have been published on GitHub (sorted by the most recently updated).
Results are limited to the first 15 repositories due to potential performance issues.
The following list is the news that have been mention
CVE-2026-42440 vulnerability anywhere in the article.
The following table lists the changes that have been made to the
CVE-2026-42440 vulnerability over time.
Vulnerability history details can be useful for understanding the evolution of a vulnerability, and for identifying the most recent changes that may impact the vulnerability's severity, exploitability, or other characteristics.
-
Initial Analysis by [email protected]
May. 06, 2026
Action Type Old Value New Value Added CPE Configuration OR *cpe:2.3:a:apache:opennlp:*:*:*:*:*:*:*:* versions up to (excluding) 2.5.9 *cpe:2.3:a:apache:opennlp:3.0.0:m1:*:*:*:*:*:* *cpe:2.3:a:apache:opennlp:3.0.0:m2:*:*:*:*:*:* Added Reference Type CVE: http://www.openwall.com/lists/oss-security/2026/05/01/21 Types: Mailing List, Third Party Advisory Added Reference Type Apache Software Foundation: https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo Types: Mailing List, Vendor Advisory -
CVE Modified by 134c704f-9b21-4f2e-91b3-4a467353bcc0
May. 05, 2026
Action Type Old Value New Value Added CVSS V3.1 AV:N/AC:L/PR:N/UI:N/S:U/C:N/I:N/A:H -
CVE Modified by af854a3a-2127-422b-91ae-364da2661108
May. 04, 2026
Action Type Old Value New Value Added Reference http://www.openwall.com/lists/oss-security/2026/05/01/21 -
New CVE Received by [email protected]
May. 04, 2026
Action Type Old Value New Value Added Description OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader Versions Affected: before 2.5.9 before 3.0.0-M3 Description: The AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source. A crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read; so the attacker pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it. Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load. The practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins. Mitigation: * 2.x users should upgrade to 2.5.9. * 3.x users should upgrade to 3.0.0-M3. Note: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default. Users who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks. Added CWE CWE-789 Added Reference https://lists.apache.org/thread/s8xlkx1gqbxfsq48py5h6jphjvgqp1jo