CVE-2026-0599: CWE-400 Uncontrolled Resource Consumption in huggingface huggingface/text-generation-inference

Severity: High
Tags: Vulnerability, CVE-2026-0599, cve, cve-2026-0599, cwe-400
Published: Mon Feb 02 2026 (02/02/2026, 10:36:25 UTC)
Source: CVE Database V5
Vendor/Project: huggingface
Product: huggingface/text-generation-inference

Description

CVE-2026-0599 is a high-severity vulnerability in huggingface/text-generation-inference version 3.3.6 that allows unauthenticated remote attackers to cause uncontrolled resource consumption. The flaw occurs during input validation in VLM (Vision Language Model) mode, where the system scans for Markdown image links and performs blocking HTTP GET requests to fetch external images. This process reads the entire response body into memory and clones it before decoding, which can lead to network bandwidth saturation, memory inflation, and CPU overutilization. The vulnerability can be triggered even if the request is ultimately rejected for exceeding token limits. Default deployments lacking memory limits and authentication are especially vulnerable, potentially resulting in host crashes. The issue is fixed in version 3.3.7.

AI-Powered Analysis

Last updated: 02/02/2026, 11:14:26 UTC

Technical Analysis

CVE-2026-0599 is classified under CWE-400 (Uncontrolled Resource Consumption) and affects huggingface/text-generation-inference version 3.3.6. The flaw is in input validation in Vision Language Model (VLM) mode, where the router scans inputs for Markdown image links. On detecting such links, the router performs a blocking HTTP GET request to fetch the external image, reads the entire response body into memory, and clones it before decoding, with no size or time limit imposed. Unauthenticated remote attackers can exploit this unbounded fetch to exhaust resources: saturating network bandwidth, inflating memory usage, and overloading the CPU. The fetch happens even if the request is later rejected for exceeding token limits, so an attacker can cause a denial of service without any input being processed successfully. Default deployments make this worse because they have no memory limits and no authentication, which raises the risk of crashing the host machine. No exploits are currently known in the wild, but exploitation requires no authentication or user interaction, so the risk is significant. The vendor fixed the issue in version 3.3.7, presumably by adding resource constraints and/or authentication requirements.
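
The core problem described above is a read with no upper bound. As a minimal sketch of the opposite pattern, and not the vendor's actual fix (which this entry does not describe), the Python below reads a body in chunks and aborts once a byte budget is exceeded. The names `read_bounded` and `BodyTooLarge`, and the 1 MiB default, are hypothetical choices for illustration.

```python
import io


class BodyTooLarge(Exception):
    """Raised when a body exceeds the configured byte budget."""


def read_bounded(stream, max_bytes=1024 * 1024, chunk_size=64 * 1024):
    """Read from a file-like object, refusing to buffer more than max_bytes.

    Hypothetical helper: `stream` is any object with a .read(n) method,
    such as an HTTP response body opened in streaming mode.
    """
    buf = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end of stream reached within budget
            return bytes(buf)
        if len(buf) + len(chunk) > max_bytes:
            # Abort without appending the chunk that would exceed the budget.
            raise BodyTooLarge(f"body exceeds {max_bytes} bytes")
        buf.extend(chunk)


if __name__ == "__main__":
    # In-memory stand-ins for a small and an oversized response body.
    print(len(read_bounded(io.BytesIO(b"x" * 10), max_bytes=16)))
    try:
        read_bounded(io.BytesIO(b"x" * 100), max_bytes=16, chunk_size=8)
    except BodyTooLarge as exc:
        print("rejected:", exc)
```

A size cap alone would not address slow-responding servers; a real fetch would also need connection and read timeouts, which this sketch leaves out.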

Potential Impact

For European organizations deploying huggingface/text-generation-inference version 3.3.6 or earlier, this vulnerability poses a significant risk of denial-of-service (DoS) attacks. Attackers can remotely trigger resource exhaustion by submitting inputs containing Markdown image links pointing to large or slow-responding external resources, causing excessive memory and CPU consumption and network bandwidth saturation. This can lead to service outages, degraded performance, and potential crashes of critical AI inference infrastructure. Organizations relying on this software for AI-driven applications, including natural language and vision-language processing, may experience operational disruptions. The lack of authentication in default deployments increases exposure, especially for publicly accessible inference services. Such outages could impact customer-facing services, internal automation, or research environments. Additionally, the resource exhaustion could be leveraged as a smokescreen for other attacks or to degrade trust in AI services. Given the growing adoption of Hugging Face tools in Europe, the threat is material and requires prompt mitigation.

Mitigation Recommendations

European organizations should immediately upgrade huggingface/text-generation-inference to version 3.3.7 or later, where the vulnerability is fixed. Until upgrading, implement strict network egress controls to restrict outbound HTTP requests from the inference service, preventing arbitrary external image fetching. Configure resource limits such as memory caps, CPU quotas, and request timeouts at the container or orchestration level (e.g., Kubernetes resource limits) to contain potential resource exhaustion. Enable authentication and authorization mechanisms to restrict access to the inference API, especially if deployed in public or semi-public environments. Implement input validation or sanitization to detect and block Markdown image links before processing. Monitor system resource usage and network traffic for anomalous spikes indicative of exploitation attempts. Consider deploying web application firewalls (WAFs) or API gateways with rate limiting and request inspection to mitigate abuse. Finally, maintain an incident response plan to quickly isolate and remediate affected systems if exploitation is suspected.
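
The recommendation to detect Markdown image links before processing could be sketched as a pre-filter placed in front of the inference API. The regular expression below is a simplified approximation of Markdown image syntax, and `ALLOWED_HOSTS`, `find_image_urls`, and `is_request_allowed` are hypothetical names. The pattern the router itself uses is not given in this entry and may match differently, so a filter like this should complement egress controls rather than replace them.

```python
import re
from urllib.parse import urlparse

# Simplified approximation of a Markdown image link: ![alt](url ...)
# Captures the URL up to the first whitespace or closing parenthesis.
IMAGE_LINK = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)")

# Hypothetical allowlist of hosts the service may fetch images from.
ALLOWED_HOSTS = {"images.example.internal"}


def find_image_urls(prompt):
    """Return every URL that appears inside a Markdown image link."""
    return IMAGE_LINK.findall(prompt)


def is_request_allowed(prompt, allowed_hosts=ALLOWED_HOSTS):
    """Reject prompts whose image links point at non-allowlisted HTTP(S) hosts."""
    for url in find_image_urls(prompt):
        parsed = urlparse(url)
        if parsed.scheme in ("http", "https") and parsed.hostname not in allowed_hosts:
            return False
    return True


if __name__ == "__main__":
    print(is_request_allowed("Describe ![cat](http://evil.example/huge.png)"))
    print(is_request_allowed("Describe ![cat](https://images.example.internal/cat.png)"))
```

Only `http` and `https` links are checked because those are the ones that would trigger an outbound fetch; whether other URL forms need handling depends on the deployment.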

Technical Details

Data Version: 5.2
Assigner Short Name: @huntr_ai
Date Reserved: 2026-01-05T11:35:41.938Z
CVSS Version: 3.0
State: PUBLISHED

Threat ID: 698083b8f9fa50a62f37059e

Added to database: 2/2/2026, 11:00:08 AM

Last enriched: 2/2/2026, 11:14:26 AM

Last updated: 2/2/2026, 1:06:34 PM
