CVE-2026-22773: CWE-770: Allocation of Resources Without Limits or Throttling in vllm-project vllm
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.6.4 to before 0.12.0, users can crash the vLLM engine serving multimodal models that use the Idefics3 vision model implementation by sending a specially crafted 1x1 pixel image. This causes a tensor dimension mismatch that results in an unhandled runtime error, leading to complete server termination. This issue has been patched in version 0.12.0.
AI Analysis
Technical Summary
CVE-2026-22773 is a vulnerability classified under CWE-770 (Allocation of Resources Without Limits or Throttling) affecting the vLLM inference engine for large language models. Versions from 0.6.4 up to but not including 0.12.0 are vulnerable. The flaw arises when the engine serves multimodal models that use the Idefics3 vision model implementation: an attacker can submit a specially crafted 1x1 pixel image that triggers a tensor dimension mismatch during inference. Because the resulting runtime error is not handled, a single malformed request terminates the entire vLLM server process, producing a denial of service. The vulnerability requires no user interaction and can be exploited remotely over the network with low privileges, as indicated by the CVSS vector (AV:N/AC:L/PR:L/UI:N). The impact is limited to availability, with no direct compromise of confidentiality or integrity. No exploits are currently known in the wild, but any deployment of a vulnerable vLLM version serving Idefics3-based multimodal models is at risk. The issue was publicly disclosed on January 10, 2026, and patched in vLLM 0.12.0. The root cause is missing validation of image dimensions in the Idefics3 image-processing path: a degenerate image produces tensors whose shapes do not match what downstream code expects, and because nothing intercepts the failure, one crafted request can take down the whole service.
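The crash pattern can be illustrated with a short, self-contained sketch. This is not vLLM's actual Idefics3 code (`embed_image` and `patch_size` are hypothetical names chosen for illustration); it only shows how a patch-extraction step with no dimension check turns an undersized image into an uncaught RuntimeError instead of a clean per-request rejection:

```python
# Illustrative sketch only -- NOT vLLM's Idefics3 implementation.
# embed_image and patch_size are hypothetical names; the point is the
# missing bounds check before a shape-sensitive tensor operation.
import torch

def embed_image(image: torch.Tensor, patch_size: int = 14) -> torch.Tensor:
    """Split a (C, H, W) image into flattened non-overlapping patches."""
    c, h, w = image.shape
    # unfold() requires each spatial dim to be >= patch_size; with no
    # check here, a degenerate image raises instead of being rejected.
    patches = image.unfold(1, patch_size, patch_size).unfold(2, patch_size, patch_size)
    rows, cols = h // patch_size, w // patch_size
    return patches.reshape(rows * cols, c * patch_size * patch_size)

tiny = torch.zeros(3, 1, 1)  # analogous to a crafted 1x1 pixel image
try:
    embed_image(tiny)
except RuntimeError as exc:
    # In a serving loop that does not catch this, the exception
    # propagates and the whole engine process terminates.
    print(f"would crash an unguarded server: {exc}")
```

A robust design validates image dimensions before any tensor construction, so a bad request fails individually with a client error rather than terminating the engine; the 0.12.0 patch ensures such inputs no longer crash the server.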
Potential Impact
For European organizations, the primary impact is denial of service on AI inference services using vulnerable vLLM versions with multimodal vision models. This can disrupt business operations relying on AI-driven image analysis or multimodal data processing, potentially affecting sectors such as healthcare, automotive, manufacturing, and finance where AI inference is integrated. Service outages could lead to operational delays, loss of customer trust, and financial losses. Since the vulnerability does not affect data confidentiality or integrity, the risk of data breach is low. However, repeated exploitation could degrade system availability and increase operational costs due to downtime and recovery efforts. Organizations deploying vLLM in cloud or on-premises environments must consider the risk of remote crashes and the potential cascading effects on dependent systems and services.
Mitigation Recommendations
1. Upgrade all vLLM deployments to version 0.12.0 or later, where the vulnerability is patched.
2. Implement strict input validation on all image inputs, rejecting malformed or suspiciously small images such as 1x1 pixels before processing (a sketch of such a pre-filter follows this list).
3. Deploy rate limiting and throttling mechanisms on inference request endpoints to prevent resource exhaustion from repeated crafted inputs.
4. Monitor inference engine logs and alerts for abnormal crashes or runtime errors indicative of exploitation attempts.
5. Isolate AI inference services in segmented network zones to limit exposure and impact of potential DoS attacks.
6. Conduct regular security assessments and penetration testing focused on AI model serving infrastructure.
7. Maintain an incident response plan that includes recovery procedures for AI service disruptions.
8. Engage with the vLLM project community for updates and security advisories.
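As a concrete illustration of recommendation 2, the sketch below shows a minimal pre-filter at the API boundary. It assumes Pillow is available; `validate_image` and `MIN_DIM` are hypothetical names, and the threshold must be tuned to the deployed model rather than taken as a vLLM default:

```python
# Hedged sketch of an image pre-filter (recommendation 2). MIN_DIM is an
# assumption -- set it to the smallest input your vision model supports.
import io

from PIL import Image

MIN_DIM = 28  # e.g. at least two 14-pixel patches per side

def validate_image(data: bytes) -> None:
    """Raise ValueError for images the inference engine should never see."""
    try:
        with Image.open(io.BytesIO(data)) as img:
            width, height = img.size
    except Exception as exc:
        raise ValueError(f"unparseable image: {exc}") from exc
    if width < MIN_DIM or height < MIN_DIM:
        raise ValueError(f"image too small: {width}x{height}, minimum {MIN_DIM}x{MIN_DIM}")
```

Calling validate_image() in the request handler and mapping ValueError to an HTTP 400 turns a crafted 1x1 image into a rejected request rather than a server crash; combined with rate limiting (recommendation 3), it also blunts repeated probing.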
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland
Technical Details
- Data Version: 5.2
- Assigner Short Name: GitHub_M
- Date Reserved: 2026-01-09T18:27:19.387Z
- CVSS Version: 3.1
- State: PUBLISHED
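For reference, the CVSS 3.1 base score implied by the published vector can be worked out by hand. The sub-vector S:U/C:N/I:N/A:H is an assumption here, inferred from the availability-only impact described in the analysis above, and is not confirmed by the advisory excerpt:

```latex
% Worked CVSS 3.1 base score, assuming S:U/C:N/I:N/A:H on top of the
% published AV:N/AC:L/PR:L/UI:N.
\[
\begin{aligned}
\mathrm{ISS} &= 1 - (1-C)(1-I)(1-A) = 1 - (1)(1)(1-0.56) = 0.56 \\
\mathrm{Impact} &= 6.42 \times \mathrm{ISS} = 6.42 \times 0.56 \approx 3.60 \\
\mathrm{Exploitability} &= 8.22 \times 0.85 \times 0.77 \times 0.62 \times 0.85 \approx 2.84 \\
\mathrm{BaseScore} &= \mathrm{roundup}\!\left(\min(\mathrm{Impact} + \mathrm{Exploitability},\, 10)\right) = \mathrm{roundup}(6.43) = 6.5
\end{aligned}
\]
```

Under those assumptions the base score is 6.5 (Medium), consistent with an availability-only denial of service exploitable remotely with low privileges.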
Threat ID: 6961f7c0c540fa4b5456583e
Added to database: 1/10/2026, 6:54:56 AM
Last enriched: 1/17/2026, 8:00:50 AM
Last updated: 2/7/2026, 2:44:55 PM