CVE-2026-34756: CWE-770: Allocation of Resources Without Limits or Throttling in vllm-project vllm
vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.1.0 up to, but not including, 0.19.0, a Denial of Service vulnerability exists in the vLLM OpenAI-compatible API server. Due to the lack of upper-bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models, an unauthenticated attacker can send a single HTTP request with an astronomically large n value. This completely blocks the Python asyncio event loop and causes immediate Out-Of-Memory crashes by allocating millions of request object copies on the heap before the request even reaches the scheduling queue. This vulnerability is fixed in 0.19.0.
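To illustrate the class of flaw, the sketch below uses a plain dataclass to stand in for the real Pydantic request models; the class name, field names, and the validation logic are illustrative assumptions, not vLLM's actual code. The point is that a lower-bound-only check lets an arbitrarily large n through.

```python
from dataclasses import dataclass

# Hypothetical stand-in for a request model like CompletionRequest.
# The real code uses Pydantic; this stdlib sketch mirrors only the
# missing-upper-bound pattern.
@dataclass
class CompletionRequestSketch:
    prompt: str
    n: int = 1  # number of completions requested per prompt

    def validate(self) -> None:
        if self.n < 1:
            raise ValueError("n must be >= 1")
        # Missing here: any upper-bound check on n, so an arbitrarily
        # large value is accepted before any scheduling or allocation.

# A single request asking for a billion completions passes validation.
req = CompletionRequestSketch(prompt="hello", n=10**9)
req.validate()
print(req.n)  # 1000000000
```

Because the oversized value survives validation, the server proceeds to fan the request out, which is where the per-copy allocations and the event-loop stall occur.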
AI Analysis
Technical Summary
The vLLM inference and serving engine for large language models has a denial of service vulnerability (CWE-770) in versions 0.1.0 up to, but not including, 0.19.0. The vulnerability stems from missing upper-bound validation on the n parameter in the ChatCompletionRequest and CompletionRequest Pydantic models. An attacker can exploit this by sending a single HTTP request with an astronomically large n value, causing the server to allocate millions of request object copies in memory before processing, which blocks the asyncio event loop and results in out-of-memory crashes. This vulnerability is resolved in vLLM version 0.19.0.
Potential Impact
An unauthenticated attacker can cause a denial of service by crashing the vLLM server through excessive memory consumption and blocking the event loop. This results in service unavailability. There is no impact on confidentiality or integrity according to the CVSS vector. The CVSS score is 6.5 (medium severity). No known exploits in the wild have been reported.
Mitigation Recommendations
Upgrade vLLM to version 0.19.0 or later, where this vulnerability is fixed. Applying this official fix is the recommended remediation; no other mitigations are indicated.
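The fix amounts to bounding n at validation time, before any per-request allocation. The sketch below shows the general shape of such a check; MAX_N is a hypothetical cap chosen for illustration, not the limit vLLM actually ships with.

```python
# Hypothetical cap on the number of completions per request.
MAX_N = 128

def validate_n(n: int) -> int:
    """Reject n values outside [1, MAX_N] before any work is scheduled."""
    if not 1 <= n <= MAX_N:
        raise ValueError(f"n must be between 1 and {MAX_N}, got {n}")
    return n

print(validate_n(4))  # 4
try:
    validate_n(10**9)
except ValueError as exc:
    print("rejected:", exc)
```

Rejecting the request during model validation keeps the oversized value from ever reaching the scheduling queue, which is what makes an upper bound an effective fix for this CWE-770 pattern.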
Technical Details
- Data Version: 5.2
- Assigner Short Name: GitHub_M
- Date Reserved: 2026-03-30T19:17:10.225Z
- CVSS Version: 3.1
- State: PUBLISHED
- Remediation Level: null
Threat ID: 69d49831aaed68159aca0f94
Added to database: 4/7/2026, 5:37:53 AM
Last enriched: 4/7/2026, 5:39:22 AM
Last updated: 4/7/2026, 6:38:27 AM