
CVE-2026-40682: CWE-611 Improper Restriction of XML External Entity Reference in Apache Software Foundation Apache OpenNLP

0
Unknown
Vulnerability, CVE-2026-40682, cve, cve-2026-40682, cwe-611
Published: Mon May 04 2026 (05/04/2026, 16:55:55 UTC)
Source: CVE Database V5
Vendor/Project: Apache Software Foundation
Product: Apache OpenNLP

Description

XML External Entity (XXE) via Unsanitized Dictionary Parsing in Apache OpenNLP DictionaryEntryPersistor

Versions Affected: before 2.5.9, before 3.0.0-M3

Description: The DictionaryEntryPersistor class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing. When create(InputStream, EntryInserter) is invoked, the only feature set on the XMLReader is namespace support; external entity resolution and DOCTYPE declarations remain fully enabled. An attacker who can supply a crafted dictionary file (e.g., a stop-word list or domain dictionary) containing a malicious DOCTYPE declaration can trigger local file disclosure via file:// entity references or server-side request forgery via http:// entity references during SAX parsing, before the application processes a single dictionary entry.

This is inconsistent with the project's own XmlUtil.createSaxParser() helper, which correctly sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl and is used by all other XML parsing paths in the codebase. The public Dictionary(InputStream) constructor delegates directly to this method and is the documented API for loading user-supplied dictionaries, making untrusted input a realistic scenario.

Mitigation: 2.x users should upgrade to 2.5.9. 3.x users should upgrade to 3.0.0-M3. Users who cannot upgrade immediately should ensure that all dictionary files are sourced from trusted origins and should consider wrapping the Dictionary(InputStream) constructor with input validation that rejects any XML containing a DOCTYPE declaration before it reaches the parser.

AI-Powered Analysis

Machine-generated threat intelligence

Last updated: 05/04/2026, 17:22:40 UTC

Technical Analysis

Apache OpenNLP versions before 2.5.9, and 3.x pre-releases before 3.0.0-M3, contain an XXE vulnerability in the DictionaryEntryPersistor class. The class initializes a static SAXParserFactory at class-load time without enabling FEATURE_SECURE_PROCESSING or disabling DTD processing, so external entity resolution and DOCTYPE declarations stay enabled. When create(InputStream, EntryInserter) is called, an attacker who supplies a crafted dictionary XML file with a malicious DOCTYPE declaration can trigger local file disclosure (file:// entity references) or server-side request forgery (http:// entity references) during SAX parsing. The public Dictionary(InputStream) constructor, the documented API for loading user-supplied dictionaries, delegates directly to this method, which makes untrusted input a realistic attack vector. Other XML parsing paths in the project go through XmlUtil.createSaxParser(), which sets FEATURE_SECURE_PROCESSING and disallow-doctype-decl. The vendor recommends upgrading to 2.5.9 (2.x) or 3.0.0-M3 (3.x).
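To make the missing hardening concrete, the following is a minimal sketch of a SAX reader configured with the two settings the advisory names, using only standard JAXP calls. It is not the OpenNLP patch and not the body of XmlUtil.createSaxParser(); the class name HardenedSaxExample, its helper methods, the sample documents, and the placeholder path file:///path/to/secret are all hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Apache OpenNLP.
final class HardenedSaxExample {

    /** Builds a namespace-aware XMLReader that refuses any DOCTYPE declaration. */
    static XMLReader newHardenedReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // The two settings the advisory says were missing.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory.newSAXParser().getXMLReader();
    }

    /** Returns true if the hardened reader accepts the document, false if it rejects it. */
    static boolean parses(String xml) throws Exception {
        XMLReader reader = newHardenedReader();
        DefaultHandler handler = new DefaultHandler();
        reader.setContentHandler(handler);
        reader.setErrorHandler(handler); // fatalError() rethrows the SAXParseException
        try {
            reader.parse(new InputSource(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8))));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // Element names are illustrative; only the DOCTYPE matters here.
        String clean = "<dictionary><entry><token>the</token></entry></dictionary>";
        String crafted =
                "<!DOCTYPE dictionary [<!ENTITY x SYSTEM \"file:///path/to/secret\">]>"
              + "<dictionary><entry><token>&x;</token></entry></dictionary>";
        System.out.println("clean accepted:   " + parses(clean));
        System.out.println("crafted accepted: " + parses(crafted));
    }
}
```

With disallow-doctype-decl enabled, the parser fails as soon as it reaches the DOCTYPE declaration, so the external entity is never resolved.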

Potential Impact

An attacker able to supply a crafted dictionary XML file to a vulnerable Apache OpenNLP version can exploit this XXE vulnerability to read local files on the server (via file:// entity references) or to trigger server-side request forgery (via http:// entity references). Either outcome can lead to unauthorized disclosure of sensitive information or interaction with internal network resources. Because the entities are resolved during SAX parsing, the exposure occurs before any dictionary entry is processed. The impact is limited to scenarios where untrusted dictionary files are processed by the vulnerable API.

Mitigation Recommendations

Fixed versions are available: 2.x users should upgrade to Apache OpenNLP 2.5.9, and 3.x users to 3.0.0-M3. If upgrading immediately is not possible, ensure all dictionary files are sourced from trusted origins. Additionally, implement input validation that rejects any XML containing a DOCTYPE declaration before it reaches the Dictionary(InputStream) constructor. These measures reduce the risk of exploitation until an upgrade can be performed.
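One possible shape for the interim input validation is sketched below: buffer the untrusted stream, pre-parse it with a reader that disallows DOCTYPE declarations, and hand a fresh stream to the dictionary loader only if that check passes. The class DoctypeGuard and the method rejectDoctype are hypothetical names, not OpenNLP APIs. The sketch assumes Java 9+ (for InputStream.readAllBytes) and dictionaries small enough to hold in memory; it also rejects XML that is malformed for any other reason.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper; not part of Apache OpenNLP.
final class DoctypeGuard {

    /**
     * Reads the whole stream, rejects it if it contains a DOCTYPE declaration
     * (or is not well-formed), and returns a fresh stream over the same bytes.
     */
    static InputStream rejectDoctype(InputStream untrusted) throws IOException {
        byte[] xml = untrusted.readAllBytes(); // no size cap in this sketch
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            DefaultHandler handler = new DefaultHandler();
            reader.setContentHandler(handler);
            reader.setErrorHandler(handler); // fatal errors are rethrown
            reader.parse(new InputSource(new ByteArrayInputStream(xml)));
        } catch (SAXException | ParserConfigurationException e) {
            throw new IOException("Rejected dictionary XML: " + e.getMessage(), e);
        }
        return new ByteArrayInputStream(xml);
    }
}
```

A caller on a vulnerable version might then write `new Dictionary(DoctypeGuard.rejectDoctype(in))` instead of `new Dictionary(in)`. Pre-parsing was chosen here over scanning the text for the string "<!DOCTYPE" because a text scan can be confused by encodings and comments; the cost is that each dictionary is parsed twice.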


Technical Details

Data Version: 5.2
Assigner Short Name: apache
Date Reserved: 2026-04-14T17:21:09.189Z
CVSS Version: null
State: PUBLISHED
Remediation Level: null

Threat ID: 69f8d216cbff5d8610397041

Added to database: 5/4/2026, 5:06:30 PM

Last enriched: 5/4/2026, 5:22:40 PM

Last updated: 5/5/2026, 5:58:01 AM
