CVE-2026-0599: CWE-400 Uncontrolled Resource Consumption in huggingface/text-generation-inference
A vulnerability in huggingface/text-generation-inference version 3.3.6 allows unauthenticated remote attackers to exploit unbounded external image fetching during input validation in VLM mode. The issue arises when the router scans inputs for Markdown image links and performs a blocking HTTP GET request, reading the entire response body into memory and cloning it before decoding. This behavior can lead to resource exhaustion, including network bandwidth saturation, memory inflation, and CPU overutilization. The vulnerability is triggered even if the request is later rejected for exceeding token limits. The default deployment configuration, which lacks memory usage limits and authentication, exacerbates the impact, potentially crashing the host machine. The issue is resolved in version 3.3.7.
AI Analysis
Technical Summary
CVE-2026-0599 is a vulnerability classified under CWE-400 (Uncontrolled Resource Consumption) affecting huggingface/text-generation-inference version 3.3.6. It arises in Vision Language Model (VLM) mode during input validation, where the router scans inputs for Markdown image links. For each link it finds, the router performs a blocking HTTP GET request to fetch the external image, reads the entire response body into memory, and clones it before decoding, with no size or time limit imposed. Unauthenticated remote attackers can exploit this unbounded fetching to exhaust resources: saturating network bandwidth, inflating memory usage, and overloading the CPU. Notably, the fetch happens even if the request is later rejected for exceeding token limits, so an attacker can cause denial-of-service conditions without ever submitting an input that would be processed. The default deployment configuration makes matters worse because it sets no memory usage limits and requires no authentication, which raises the risk of crashing the host machine. No exploits are currently reported in the wild, but because exploitation needs neither authentication nor user interaction, the risk is significant. The issue is resolved in version 3.3.7, presumably by adding resource constraints, authentication requirements, or both.
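The missing size limit described above can be illustrated with a minimal sketch of a bounded read. This is not the actual text-generation-inference code or the 3.3.7 fix (the router is a separate code base and the advisory does not describe the patch); the names `read_bounded`, `BodyTooLarge`, and `MAX_IMAGE_BYTES`, as well as the 10 MiB cap, are assumptions chosen for illustration.

```python
import io

# Hypothetical cap; the advisory does not state what limit, if any, the fix uses.
MAX_IMAGE_BYTES = 10 * 1024 * 1024


class BodyTooLarge(Exception):
    """Raised when a response body exceeds the configured byte limit."""


def read_bounded(stream, max_bytes=MAX_IMAGE_BYTES, chunk_size=64 * 1024):
    """Read a binary stream in chunks and stop once max_bytes is exceeded.

    Unlike a plain stream.read(), this never buffers much more than
    max_bytes, so an oversized body is rejected early instead of being
    held in memory in full.
    """
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise BodyTooLarge(f"body exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    # A small in-memory stream stands in for an HTTP response body.
    print(len(read_bounded(io.BytesIO(b"x" * 1000), max_bytes=2000)))
    try:
        read_bounded(io.BytesIO(b"x" * 5000), max_bytes=2000)
    except BodyTooLarge as exc:
        print("rejected:", exc)
```

The same function accepts any object with a `read(n)` method, so it could wrap the response returned by `urllib.request.urlopen(url, timeout=...)`. Note that such a timeout bounds individual socket operations rather than the whole transfer, so a slow-responding server would still need a separate overall deadline.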
Potential Impact
For European organizations deploying the affected huggingface/text-generation-inference version 3.3.6, this vulnerability poses a significant risk of denial-of-service (DoS) attacks. Attackers can remotely trigger resource exhaustion by submitting inputs containing Markdown image links that point to large or slow-responding external resources, causing excessive memory and CPU consumption and network bandwidth saturation. This can lead to service outages, degraded performance, and crashes of critical AI inference infrastructure. Organizations relying on this software for AI-driven applications, including natural language and vision-language processing, may experience operational disruptions. The lack of authentication in default deployments increases exposure, especially for publicly accessible inference services. Such outages could affect customer-facing services, internal automation, or research environments. The resource exhaustion could also be used as a smokescreen for other attacks or to erode trust in AI services. Given the growing adoption of Hugging Face tools in Europe, the threat is material and requires prompt mitigation.
Mitigation Recommendations
European organizations should immediately upgrade huggingface/text-generation-inference to version 3.3.7 or later, where the vulnerability is fixed. Until the upgrade is in place:
- Implement strict network egress controls to restrict outbound HTTP requests from the inference service, preventing arbitrary external image fetching.
- Configure resource limits such as memory caps, CPU quotas, and request timeouts at the container or orchestration level (e.g., Kubernetes resource limits) to contain potential resource exhaustion.
- Enable authentication and authorization mechanisms to restrict access to the inference API, especially if deployed in public or semi-public environments.
- Implement input validation or sanitization to detect and block Markdown image links before processing.
- Monitor system resource usage and network traffic for anomalous spikes indicative of exploitation attempts.
- Consider deploying web application firewalls (WAFs) or API gateways with rate limiting and request inspection to mitigate abuse.
- Maintain an incident response plan to quickly isolate and remediate affected systems if exploitation is suspected.
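The input-validation recommendation above can be sketched as a pre-filter that runs in a gateway or proxy before requests reach the inference service. The regular expression, the function names, and the allowlisted host `images.example.internal` are all assumptions for illustration; the advisory does not specify the exact link syntax the router matches, so a real filter should be checked against the deployed version.

```python
import re
from urllib.parse import urlsplit

# Assumed pattern for Markdown image links of the form ![alt](http(s)://...).
# The exact syntax the router recognizes is not given in the advisory.
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(\s*(https?://[^\s)]+)", re.IGNORECASE)

# Hypothetical allowlist of hosts that images may be fetched from.
ALLOWED_IMAGE_HOSTS = frozenset({"images.example.internal"})


def find_image_urls(text):
    """Return every http(s) URL used as a Markdown image target in text."""
    return MARKDOWN_IMAGE.findall(text)


def disallowed_image_urls(text, allowed_hosts=ALLOWED_IMAGE_HOSTS):
    """Return the image URLs whose host is not on the allowlist."""
    return [
        url
        for url in find_image_urls(text)
        if (urlsplit(url).hostname or "") not in allowed_hosts
    ]


if __name__ == "__main__":
    prompt = (
        "Describe ![x](http://evil.test/huge.bin) and "
        "![y](https://images.example.internal/a.png)"
    )
    blocked = disallowed_image_urls(prompt)
    if blocked:
        print("reject request, external image links:", blocked)
```

A filter like this only reduces exposure. Because it may not match every link form the router accepts, it complements egress controls and the upgrade to 3.3.7 rather than replacing them.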
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Denmark
Technical Details
- Data Version: 5.2
- Assigner Short Name: @huntr_ai
- Date Reserved: 2026-01-05T11:35:41.938Z
- CVSS Version: 3.0
- State: PUBLISHED
Threat ID: 698083b8f9fa50a62f37059e
Added to database: 2/2/2026, 11:00:08 AM
Last enriched: 2/2/2026, 11:14:26 AM
Last updated: 3/19/2026, 3:32:51 PM