CVE-2026-0599: CWE-400 Uncontrolled Resource Consumption in huggingface huggingface/text-generation-inference
CVE-2026-0599 is a high-severity vulnerability in huggingface/text-generation-inference version 3.3.6 that allows unauthenticated remote attackers to cause uncontrolled resource consumption. The flaw occurs during input validation in VLM (Vision Language Model) mode, where the system scans for Markdown image links and performs blocking HTTP GET requests to fetch external images. This process reads the entire response body into memory and clones it before decoding, which can lead to network bandwidth saturation, memory inflation, and CPU overutilization. The vulnerability can be triggered even if the request is ultimately rejected for exceeding token limits. Default deployments lacking memory limits and authentication are especially vulnerable, potentially resulting in host crashes. The issue is fixed in version 3.3.7.
AI Analysis
Technical Summary
CVE-2026-0599 is classified under CWE-400 (Uncontrolled Resource Consumption) and affects huggingface/text-generation-inference version 3.3.6. The flaw sits in the input validation path of Vision Language Model (VLM) mode, where the router parses inputs for Markdown image links. For each link it finds, the router performs a blocking HTTP GET request to fetch the external image, reads the entire response body into memory, and clones it before decoding, with no size or time limit imposed. An unauthenticated remote attacker can exploit this unbounded fetching to exhaust resources: saturating network bandwidth, inflating memory usage, and overloading the CPU. Because the fetch happens during validation, the cost is incurred even if the request is later rejected for exceeding token limits, so an attacker can cause denial of service without any input being successfully processed. The default deployment configuration makes this worse because it sets no memory limits and requires no authentication, which increases the risk of crashing the host. No exploits are currently reported in the wild, but exploitation is easy (no authentication or user interaction is required), which makes this a significant risk. The vendor addressed the issue in version 3.3.7, presumably by adding resource constraints and/or authentication requirements.
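To make the missing safeguard concrete, here is a minimal Python sketch of a bounded fetch: the body is streamed in chunks and the read is aborted once a size cap or a time budget is exceeded. This is only an illustration of the general technique, not the code shipped in version 3.3.7 (the actual router is not written in Python); the names `read_capped`, `fetch_image_capped`, `MAX_IMAGE_BYTES`, and `FETCH_BUDGET_S`, as well as the limit values, are assumptions chosen for the example.

```python
import io
import time
import urllib.request

# Hypothetical limits, chosen only for illustration.
MAX_IMAGE_BYTES = 10 * 1024 * 1024  # 10 MiB cap on the response body
FETCH_BUDGET_S = 5.0                # overall time budget for reading the body


class FetchLimitExceeded(Exception):
    """Raised when a response exceeds the size cap or the time budget."""


def read_capped(stream, max_bytes=MAX_IMAGE_BYTES, budget_s=FETCH_BUDGET_S,
                chunk_size=64 * 1024):
    """Read a file-like object in chunks, aborting on size or time overrun."""
    deadline = time.monotonic() + budget_s
    buf = bytearray()
    while True:
        # The deadline is checked between chunks; a single blocking read is
        # bounded separately by the socket timeout set in fetch_image_capped.
        if time.monotonic() > deadline:
            raise FetchLimitExceeded("time budget exceeded")
        chunk = stream.read(chunk_size)
        if not chunk:
            return bytes(buf)
        if len(buf) + len(chunk) > max_bytes:
            raise FetchLimitExceeded("size cap exceeded")
        buf += chunk


def fetch_image_capped(url):
    """Fetch a URL, holding at most MAX_IMAGE_BYTES of the body in memory."""
    with urllib.request.urlopen(url, timeout=FETCH_BUDGET_S) as resp:
        return read_capped(resp)


# Local demonstration with an in-memory stream (no network access needed).
small_body = read_capped(io.BytesIO(b"x" * 100), max_bytes=100)
```

The important design point is that the cap is enforced while streaming, so an oversized body is rejected after at most `max_bytes` plus one chunk has been read, rather than after the whole response has been buffered.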
Potential Impact
For European organizations deploying huggingface/text-generation-inference version 3.3.6 or earlier, this vulnerability poses a significant risk of denial-of-service (DoS) attacks. Attackers can remotely trigger resource exhaustion by submitting inputs containing Markdown image links pointing to large or slow-responding external resources, causing excessive memory and CPU consumption and network bandwidth saturation. This can lead to service outages, degraded performance, and potential crashes of critical AI inference infrastructure. Organizations relying on this software for AI-driven applications, including natural language and vision-language processing, may experience operational disruptions. The lack of authentication in default deployments increases exposure, especially for publicly accessible inference services. Such outages could impact customer-facing services, internal automation, or research environments. Additionally, the resource exhaustion could be leveraged as a smokescreen for other attacks or to degrade trust in AI services. Given the growing adoption of Hugging Face tools in Europe, the threat is material and requires prompt mitigation.
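As a rough illustration of the memory inflation described above, the following Python sketch estimates how much memory is held in response buffers when several malicious requests are in flight at once. It assumes only what the advisory states (each body is read fully and then cloned, so about two copies exist per request) and ignores decoding overhead; the function name and the figures in the example are hypothetical.

```python
def peak_buffer_bytes(body_bytes: int, concurrent_requests: int,
                      copies: int = 2) -> int:
    """Estimate memory held in body buffers across concurrent fetches.

    copies=2 models the original body plus its clone, as described in the
    advisory; decoding overhead is deliberately left out.
    """
    return body_bytes * copies * concurrent_requests


GIB = 1024 ** 3

# Hypothetical scenario: 8 concurrent requests, each linking a 1 GiB file.
estimate = peak_buffer_bytes(body_bytes=1 * GIB, concurrent_requests=8)
print(f"{estimate / GIB:.0f} GiB held in buffers")
```

Under these assumed numbers the buffers alone would account for 16 GiB, which is why a deployment without memory limits can be pushed into a host crash by a handful of requests.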
Mitigation Recommendations
European organizations should immediately upgrade huggingface/text-generation-inference to version 3.3.7 or later, where the vulnerability is fixed. Until upgrading, implement strict network egress controls to restrict outbound HTTP requests from the inference service, preventing arbitrary external image fetching. Configure resource limits such as memory caps, CPU quotas, and request timeouts at the container or orchestration level (e.g., Kubernetes resource limits) to contain potential resource exhaustion. Enable authentication and authorization mechanisms to restrict access to the inference API, especially if deployed in public or semi-public environments. Implement input validation or sanitization to detect and block Markdown image links before processing. Monitor system resource usage and network traffic for anomalous spikes indicative of exploitation attempts. Consider deploying web application firewalls (WAFs) or API gateways with rate limiting and request inspection to mitigate abuse. Finally, maintain an incident response plan to quickly isolate and remediate affected systems if exploitation is suspected.
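The input-filtering recommendation could be sketched as a small pre-check at an API gateway or proxy placed in front of the inference service. The Python example below finds Markdown image links that point at remote HTTP(S) URLs and rejects prompts whose links fall outside an allowlist. The regular expression is an approximation of Markdown image syntax and is not taken from the text-generation-inference source; the function names and the allowlisted host are hypothetical.

```python
import re
from urllib.parse import urlparse

# Approximate pattern for Markdown images: ![alt](url "optional title")
MD_IMAGE = re.compile(r"!\[[^\]]*\]\(\s*([^)\s]+)[^)]*\)")

# Hypothetical allowlist of hosts the service may fetch images from.
ALLOWED_IMAGE_HOSTS = {"images.example.internal"}


def remote_image_urls(prompt: str) -> list[str]:
    """Return the http(s) URLs referenced by Markdown image links."""
    return [
        url for url in MD_IMAGE.findall(prompt)
        if urlparse(url).scheme in ("http", "https")
    ]


def is_prompt_allowed(prompt: str) -> bool:
    """Allow a prompt only if every remote image host is allowlisted."""
    return all(
        urlparse(url).hostname in ALLOWED_IMAGE_HOSTS
        for url in remote_image_urls(prompt)
    )


blocked = is_prompt_allowed("Describe ![cat](https://evil.example/huge.bin)")
allowed = is_prompt_allowed("![ok](https://images.example.internal/a.png)")
```

A filter like this is a stopgap rather than a fix: it only helps if its pattern is at least as permissive as the one the service itself uses to find links, so egress controls and the upgrade to 3.3.7 remain the more reliable measures.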
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Denmark
Technical Details
- Data Version: 5.2
- Assigner Short Name: @huntr_ai
- Date Reserved: 2026-01-05T11:35:41.938Z
- CVSS Version: 3.0
- State: PUBLISHED
Threat ID: 698083b8f9fa50a62f37059e
Added to database: 2/2/2026, 11:00:08 AM
Last enriched: 2/2/2026, 11:14:26 AM
Last updated: 2/2/2026, 1:06:34 PM
Related Threats
CVE-2026-1757: Missing Release of Memory after Effective Lifetime in Red Hat Red Hat Enterprise Linux 10 (Medium)
CVE-2025-7105: CWE-400 Uncontrolled Resource Consumption in danny-avila danny-avila/librechat (Medium)
CVE-2025-6208: CWE-400 Uncontrolled Resource Consumption in run-llama run-llama/llama_index (Medium)
CVE-2025-10279: CWE-379 Creation of Temporary File in Directory with Insecure Permissions in mlflow mlflow/mlflow (High)
CVE-2024-5986: CWE-73 External Control of File Name or Path in h2oai h2oai/h2o-3 (Critical)