CVE-2026-42440: CWE-789: Memory Allocation with Excessive Size Value in Apache Software Foundation Apache OpenNLP

Published: Mon May 04 2026 (05/04/2026, 16:40:32 UTC)
Source: CVE Database V5
Vendor/Project: Apache Software Foundation
Product: Apache OpenNLP

Description

OOM Denial of Service via Unbounded Array Allocation in Apache OpenNLP AbstractModelReader

Versions Affected:
* before 2.5.9
* before 3.0.0-M3

Description:
The AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() each read a 32-bit signed integer count field from a binary model stream and pass that value directly to an array allocation (new String[numOutcomes], new int[numOCTypes][], new String[NUM_PREDS]) without validating that the value is non-negative or within a reasonable bound. The count is therefore fully attacker-controlled when the model file originates from an untrusted source.

A crafted .bin model file in which any of these count fields is set to Integer.MAX_VALUE (or any value large enough to exhaust the available heap) triggers an OutOfMemoryError at the array allocation itself, before the corresponding label or pattern data is consumed from the stream. The error occurs very early in deserialization: for a GIS model, getOutcomes() is reached after only the model-type string, the correction constant, and the correction parameter have been read. The attacker therefore pays no meaningful size cost to weaponize a payload, and a single small file can crash a JVM that loads it.

Any code path that deserializes a .bin model is affected, including direct use of GenericModelReader and any higher-level component that delegates to it during model load. The practical impact is denial of service against processes that load model files from untrusted or semi-trusted origins.

Mitigation:
* 2.x users should upgrade to 2.5.9.
* 3.x users should upgrade to 3.0.0-M3.

Note: The fix introduces an upper bound on each of the three count fields, checked before array allocation; counts that are negative or exceed the bound cause an IllegalArgumentException to be thrown and the read to fail fast with no large allocation. The default bound is 10,000,000, which is well above the entry counts of legitimate OpenNLP models but far below any value that would threaten heap exhaustion. Deployments that legitimately need to load models with more entries than the default can raise the limit at JVM startup by setting the OPENNLP_MAX_ENTRIES system property to the desired positive integer (e.g. -DOPENNLP_MAX_ENTRIES=50000000); invalid or non-positive values fall back to the default.

Users who cannot upgrade immediately should treat all .bin model files as untrusted input unless their provenance is verified, and should avoid loading models supplied by end users or fetched from third-party repositories without integrity checks.
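The unchecked-count pattern and the bounded fix described above can be sketched as follows. This is an illustrative reconstruction, not OpenNLP's actual source: the class and method names (BoundedCountReader, readOutcomesUnchecked, readOutcomesBounded) are hypothetical, and only the overall shape (read a count with readInt(), allocate, versus validate first and throw IllegalArgumentException) reflects the advisory.

```java
import java.io.DataInputStream;
import java.io.IOException;

public class BoundedCountReader {
    // Mirrors the fix's default upper bound of 10,000,000 entries.
    static final int DEFAULT_MAX_ENTRIES = 10_000_000;

    // Vulnerable pattern: the count comes straight off the stream and is
    // used as an array length, so a crafted file with Integer.MAX_VALUE
    // here triggers an OutOfMemoryError at the allocation itself.
    static String[] readOutcomesUnchecked(DataInputStream in) throws IOException {
        int numOutcomes = in.readInt();   // fully attacker-controlled
        return new String[numOutcomes];   // may throw OutOfMemoryError
    }

    // Fixed pattern: reject negative or oversized counts before allocating,
    // so the read fails fast with no large allocation.
    static String[] readOutcomesBounded(DataInputStream in, int maxEntries)
            throws IOException {
        int numOutcomes = in.readInt();
        if (numOutcomes < 0 || numOutcomes > maxEntries) {
            throw new IllegalArgumentException(
                "Model count " + numOutcomes + " outside [0, " + maxEntries + "]");
        }
        return new String[numOutcomes];
    }
}
```

Note that a simple `numOutcomes < 0` check alone would not be enough: a positive count near Integer.MAX_VALUE still exhausts the heap, which is why the fix pairs the sign check with an upper bound.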

AI-Powered Analysis

Machine-generated threat intelligence

Last updated: 05/04/2026, 17:22:04 UTC

Technical Analysis

Apache OpenNLP versions prior to 2.5.9 and 3.0.0-M3 contain a vulnerability (CWE-789) where the AbstractModelReader methods getOutcomes(), getOutcomePatterns(), and getPredicates() read 32-bit signed integer counts from a binary model stream and use these values directly to allocate arrays without validating that the counts are non-negative or within reasonable bounds. An attacker can craft a .bin model file with excessively large count values (e.g., Integer.MAX_VALUE) to trigger an OutOfMemoryError during array allocation, causing a denial of service by crashing the JVM process loading the model. The error occurs early in deserialization, requiring only a small crafted file to exploit. The vulnerability affects all code paths that deserialize .bin models, including GenericModelReader and higher-level components. The vendor fixed the issue by enforcing an upper bound (default 10,000,000) on these counts, throwing an IllegalArgumentException if exceeded, preventing large allocations. Users can adjust this bound via the OPENNLP_MAX_ENTRIES system property if needed. The fix is available in Apache OpenNLP 2.5.9 and 3.0.0-M3.
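The advisory's override mechanism for the bound can be sketched as below. The property name OPENNLP_MAX_ENTRIES, the 10,000,000 default, and the fall-back behavior for invalid or non-positive values come from the advisory; the exact parsing logic is an assumption for illustration.

```java
public class ModelEntryLimit {
    // Default upper bound stated in the advisory.
    static final int DEFAULT_MAX_ENTRIES = 10_000_000;

    // Resolves the effective bound at startup: a positive integer in the
    // OPENNLP_MAX_ENTRIES system property raises (or lowers) the limit;
    // anything invalid or non-positive falls back to the default.
    static int maxEntries() {
        String raw = System.getProperty("OPENNLP_MAX_ENTRIES");
        if (raw == null) {
            return DEFAULT_MAX_ENTRIES;
        }
        try {
            int v = Integer.parseInt(raw.trim());
            return v > 0 ? v : DEFAULT_MAX_ENTRIES;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_ENTRIES;
        }
    }
}
```

In practice the property would be supplied on the command line, e.g. `java -DOPENNLP_MAX_ENTRIES=50000000 …`, as the advisory suggests.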

Potential Impact

The vulnerability allows an attacker to cause a denial of service by crashing the JVM process that loads a maliciously crafted .bin model file with excessively large count fields. This results in an OutOfMemoryError during array allocation, preventing the affected application from functioning properly. There is no indication of code execution or data corruption beyond denial of service. The impact is limited to processes that load untrusted or semi-trusted model files.

Mitigation Recommendations

A fix is available in Apache OpenNLP versions 2.5.9 and 3.0.0-M3, which introduce upper bounds on array allocation counts to prevent excessive memory use. Users should upgrade to these versions to remediate the vulnerability. For users unable to upgrade immediately, it is recommended to treat all .bin model files as untrusted input unless their provenance is verified and to avoid loading models supplied by end users or from third-party repositories without integrity checks. The fix enforces a default maximum count of 10,000,000 entries, which can be adjusted via the OPENNLP_MAX_ENTRIES system property if legitimately larger models are required.
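One way to implement the recommended integrity check before loading a model is to pin a known SHA-256 digest for each trusted .bin file and refuse files that do not match. This is a minimal sketch of that idea, not part of OpenNLP itself; the class and method names are hypothetical, and it assumes a digest published by a trusted source alongside the model.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.HexFormat;

public class ModelIntegrity {
    // Returns true only when the file's SHA-256 digest matches a pinned
    // hex value; callers should skip loading the model otherwise.
    static boolean matchesKnownDigest(Path model, String expectedSha256Hex)
            throws Exception {
        byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(Files.readAllBytes(model));
        return HexFormat.of().formatHex(digest).equalsIgnoreCase(expectedSha256Hex);
    }
}
```

A digest check like this only verifies provenance; it does not make an untrusted model safe, so it complements rather than replaces the upgrade.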

Technical Details

Data Version: 5.2
Assigner Short Name: apache
Date Reserved: 2026-04-27T12:43:14.347Z
CVSS Version: null
State: PUBLISHED
Remediation Level: null

Threat ID: 69f8d219cbff5d86103970b3

Added to database: 5/4/2026, 5:06:33 PM

Last enriched: 5/4/2026, 5:22:04 PM

Last updated: 5/4/2026, 6:14:33 PM


