
CVE-2025-62164: CWE-20: Improper Input Validation in vllm-project vllm

Severity: High
Published: Fri Nov 21 2025 (11/21/2025, 01:18:38 UTC)
Source: CVE Database V5
Vendor/Project: vllm-project
Product: vllm

Description

vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.10.2 up to but not including 0.11.1, a memory corruption vulnerability exists in the Completions API endpoint that could lead to a crash (denial of service) and potentially remote code execution (RCE). When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1.
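As an illustrative sketch of the PyTorch behavior involved (not vLLM's actual code): when invariant checking is explicitly requested, a sparse tensor whose indices fall outside its declared size is rejected at construction time instead of reaching to_dense() with out-of-bounds coordinates:

```python
import torch

# Sketch only: with check_invariants=True, PyTorch validates that sparse
# indices stay within the declared size, so a malformed tensor is rejected
# at construction rather than triggering an out-of-bounds write later
# in to_dense().
try:
    bad = torch.sparse_coo_tensor(
        indices=torch.tensor([[0, 10]]),  # index 10 is out of bounds for size (2,)
        values=torch.tensor([1.0, 2.0]),
        size=(2,),
        check_invariants=True,
    )
except RuntimeError as exc:
    print("rejected:", exc)
```

The CVE arises because, as of PyTorch 2.8.0, this kind of validation no longer happens by default for tensors materialized during deserialization.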

AI-Powered Analysis

Last updated: 11/28/2025, 04:41:18 UTC

Technical Analysis

CVE-2025-62164 is a high-severity vulnerability affecting the vLLM inference and serving engine for large language models, specifically versions from 0.10.2 up to but not including 0.11.1. The root cause is improper input validation when processing user-supplied prompt embeddings through the Completions API endpoint, which uses torch.load() to deserialize serialized tensor data without sufficient validation. Starting with PyTorch 2.8.0, sparse tensor integrity checks are disabled by default, which allows maliciously crafted sparse tensors to bypass internal bounds checks. When vLLM calls to_dense() on such a tensor, an out-of-bounds memory write occurs, corrupting memory. This can crash the vLLM process, resulting in denial of service (DoS). More critically, the memory corruption can potentially be exploited to achieve remote code execution (RCE) on the server hosting vLLM, allowing attackers to execute arbitrary code with the privileges of the vLLM process. The vulnerability requires network access and low privileges (PR:L), but no user interaction, making it relatively easy to exploit remotely. The issue has been addressed and patched in vLLM version 0.11.1. No known exploits are currently reported in the wild, but the CVSS score of 8.8 indicates a significant risk. The vulnerability is associated with multiple CWEs: CWE-20 (Improper Input Validation), CWE-123 (Write-what-where Condition), CWE-502 (Deserialization of Untrusted Data), and CWE-787 (Out-of-bounds Write).
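One generic, process-wide hardening the analysis implies (a sketch using a standard PyTorch setting, not the vLLM patch itself; whether it covers every deserialization path is version-dependent, so upgrading remains the actual fix) is to re-enable the invariant checks that PyTorch 2.8.0 turns off by default:

```python
import torch

# Generic hardening sketch: re-enable sparse tensor invariant validation
# process-wide, so sparse tensors constructed anywhere in the process are
# bounds-checked before operations such as to_dense().
torch.sparse.check_sparse_tensor_invariants.enable()
print(torch.sparse.check_sparse_tensor_invariants.is_enabled())  # True
```

The same class can also be used as a context manager to scope the checks to a single code path instead of enabling them globally.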

Potential Impact

For European organizations, the impact of CVE-2025-62164 is substantial, especially for those deploying vLLM as part of AI inference services or internal machine learning workflows. Exploitation can lead to denial-of-service conditions, disrupting critical AI-driven applications and services, which may affect business continuity and operational efficiency. More severely, successful remote code execution could allow attackers to gain control over the server environment, leading to data breaches, lateral movement within networks, and potential compromise of sensitive intellectual property or personal data. This is particularly concerning for sectors relying heavily on AI, such as finance, healthcare, telecommunications, and government agencies. The vulnerability's network accessibility and lack of user interaction requirement increase the attack surface. Given the growing adoption of AI technologies in Europe, the risk to confidentiality, integrity, and availability of systems is high. Additionally, disruption or compromise of AI services could undermine trust in AI deployments and cause regulatory and compliance issues under GDPR and other data protection laws.

Mitigation Recommendations

To mitigate CVE-2025-62164, organizations should immediately upgrade all vLLM deployments to version 0.11.1 or later, where the vulnerability is patched. Until upgrades are complete, restrict network access to the Completions API endpoint to trusted users and systems only, employing network segmentation and firewall rules. Implement strict input validation and sanitization on all user-supplied tensor data before deserialization, potentially adding custom checks to detect malformed or suspicious tensor structures. Monitor logs and network traffic for anomalous tensor payloads or unusual API usage patterns that could indicate exploitation attempts. Employ runtime protections such as containerization or sandboxing of vLLM processes to limit the impact of potential code execution. Regularly audit and update dependencies, including PyTorch, to ensure security features like sparse tensor integrity checks are enabled. Finally, maintain an incident response plan tailored to AI service compromises, including rapid isolation and forensic analysis capabilities.
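The "strict input validation before deserialization" recommendation could be sketched roughly as follows; the function name and the size/dtype limits are hypothetical, not vLLM's implementation:

```python
import io
import torch

# Hypothetical service limits -- illustrative only, not vLLM's actual values.
MAX_SEQ_LEN = 8192
HIDDEN_SIZE = 4096

def load_prompt_embeds(payload: bytes) -> torch.Tensor:
    """Hypothetical hardened loader for user-supplied prompt embeddings."""
    # weights_only=True restricts unpickling to tensor-like types, but it
    # does not by itself reject sparse tensors, so validate the result.
    tensor = torch.load(io.BytesIO(payload), weights_only=True)
    if not isinstance(tensor, torch.Tensor):
        raise ValueError("payload must deserialize to a single tensor")
    if tensor.layout != torch.strided:
        raise ValueError("sparse/non-strided tensors are not accepted")
    if tensor.dtype not in (torch.float16, torch.bfloat16, torch.float32):
        raise ValueError(f"unsupported dtype {tensor.dtype}")
    if tensor.dim() != 2 or tensor.shape[0] > MAX_SEQ_LEN or tensor.shape[1] != HIDDEN_SIZE:
        raise ValueError(f"unexpected shape {tuple(tensor.shape)}")
    return tensor
```

Rejecting any non-strided layout up front sidesteps the vulnerable to_dense() path entirely, since a dense embedding matrix is the only shape of input the endpoint legitimately needs.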


Technical Details

Data Version: 5.2
Assigner Short Name: GitHub_M
Date Reserved: 2025-10-07T16:12:03.425Z
CVSS Version: 3.1
State: PUBLISHED

Threat ID: 691fc3ff70da09562fa7fc8e

Added to database: 11/21/2025, 1:44:31 AM

Last enriched: 11/28/2025, 4:41:18 AM

Last updated: 1/7/2026, 4:48:14 AM

