
CVE-2025-62164: CWE-20: Improper Input Validation in vllm-project vllm

Severity: High
Published: Fri Nov 21 2025 (11/21/2025, 01:18:38 UTC)
Source: CVE Database V5
Vendor/Project: vllm-project
Product: vllm

Description

vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.10.2 up to but not including 0.11.1, a memory corruption vulnerability that could lead to a crash (denial of service) and potentially remote code execution (RCE) exists in the Completions API endpoint. When processing user-supplied prompt embeddings, the endpoint loads serialized tensors using torch.load() without sufficient validation. Due to a change introduced in PyTorch 2.8.0, sparse tensor integrity checks are disabled by default. As a result, maliciously crafted tensors can bypass internal bounds checks and trigger an out-of-bounds memory write during the call to to_dense(). This memory corruption can crash vLLM and potentially lead to code execution on the server hosting vLLM. This issue has been patched in version 0.11.1.
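
A minimal sketch of the vulnerable pattern described above (illustrative only, not vLLM's actual code; the handler name is hypothetical):

```python
# Illustrative sketch of the vulnerable pattern -- not vLLM's actual code.
import base64
import io

import torch


def load_prompt_embeds(encoded: str) -> torch.Tensor:
    """Hypothetical server-side handler for a client-supplied embedding."""
    raw = base64.b64decode(encoded)
    # Restricting the unpickler does not help here: a crafted sparse tensor is
    # still materialized, and PyTorch >= 2.8.0 skips its invariant checks by
    # default, so out-of-range indices are accepted at this point.
    tensor = torch.load(io.BytesIO(raw), weights_only=True)
    # Densifying the malformed sparse tensor is where the out-of-bounds
    # memory write is triggered.
    return tensor.to_dense() if tensor.is_sparse else tensor
```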

AI-Powered Analysis

Last updated: 11/21/2025, 02:00:14 UTC

Technical Analysis

CVE-2025-62164 affects the vLLM project, an inference and serving engine for large language models, specifically versions from 0.10.2 up to but not including 0.11.1. The vulnerability stems from improper input validation and unsafe deserialization of user-supplied prompt embeddings in the Completions API endpoint. The endpoint uses torch.load() to deserialize tensors without sufficient validation. Starting with PyTorch 2.8.0, sparse tensor integrity checks are disabled by default, which allows maliciously crafted sparse tensors to bypass internal bounds checks. When the vulnerable code calls to_dense() on these tensors, an out-of-bounds memory write can occur, leading to memory corruption. This can crash the vLLM process, resulting in denial of service, or potentially enable remote code execution on the host server. The vulnerability is exploitable over the network with low privileges and without user interaction, which increases its risk profile. The issue has been addressed in vLLM version 0.11.1 by restoring proper validation and handling of sparse tensors during deserialization. No exploits have been reported in the wild yet, but the CVSS score of 8.8 (High) reflects the serious impact and ease of exploitation. The vulnerability maps to multiple CWEs: CWE-20 (Improper Input Validation), CWE-123 (Write-What-Where Condition), CWE-502 (Deserialization of Untrusted Data), and CWE-787 (Out-of-bounds Write).
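
As a defensive illustration, the sketch below hardens deserialization of untrusted embeddings by re-enabling PyTorch's sparse tensor invariant checks and rejecting non-dense layouts before any call to to_dense(). This is a sketch under assumptions (raw bytes in, dense embeddings expected), not the actual 0.11.1 patch.

```python
# Hardened deserialization sketch -- not the actual vLLM 0.11.1 fix.
import io

import torch


def safe_load_embeds(raw: bytes, max_bytes: int = 16 * 1024 * 1024) -> torch.Tensor:
    """Deserialize a tensor from untrusted bytes, refusing sparse layouts."""
    if len(raw) > max_bytes:
        raise ValueError("serialized embedding exceeds size limit")
    # Re-enable the sparse invariant validation that PyTorch 2.8.0 no longer
    # performs by default while the payload is being deserialized.
    with torch.sparse.check_sparse_tensor_invariants():
        tensor = torch.load(io.BytesIO(raw), weights_only=True)
    # Reject anything that is not a plain dense (strided) tensor before it
    # can be densified or handed to the model.
    if tensor.layout != torch.strided:
        raise ValueError("sparse or otherwise non-strided embeddings are rejected")
    return tensor
```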

Potential Impact

For European organizations, this vulnerability poses significant risks to the confidentiality, integrity, and availability of AI inference services relying on vLLM. Exploitation could allow attackers to execute arbitrary code on servers hosting vLLM, potentially leading to data breaches, unauthorized access to sensitive AI models or data, and disruption of AI-powered services. Organizations using vLLM in cloud or on-premises environments for critical AI workloads may face service outages or compromise of intellectual property. The vulnerability's network accessibility and lack of user interaction requirements increase the attack surface, especially for publicly exposed API endpoints. Given the growing adoption of AI technologies across Europe in sectors such as finance, healthcare, and manufacturing, the impact could extend to critical infrastructure and sensitive data processing. The absence of known exploits currently provides a window for proactive mitigation, but the high severity demands urgent attention.

Mitigation Recommendations

1. Immediately upgrade all vLLM deployments to version 0.11.1 or later, where the vulnerability is patched.
2. Restrict access to the Completions API endpoint with strong authentication and network segmentation, limiting API calls to trusted users and internal networks only.
3. Apply input validation and sanitization at the application layer to detect and reject malformed or suspicious tensor data before deserialization (a minimal gate is sketched after this list).
4. Monitor API usage logs for anomalous or unexpected requests that could indicate exploitation attempts.
5. If upgrading is not immediately feasible, consider disabling or restricting the use of the Completions API endpoint to reduce exposure.
6. Review and update incident response plans to include detection and mitigation steps for exploitation of deserialization vulnerabilities.
7. Coordinate with cloud providers or third-party vendors to ensure patched versions are deployed in managed environments.
8. Stay informed on threat intelligence updates regarding any emerging exploits targeting this vulnerability.
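
For recommendations 3 and 5, a minimal application-layer gate might look like the sketch below. The "prompt_embeds" field name, the size limit, and the helper name are assumptions made for illustration, not details taken from the vLLM source.

```python
# Hedged sketch of an application-layer gate placed in front of vLLM.
import base64
import binascii

ALLOW_PROMPT_EMBEDS = False            # safest stopgap: refuse the feature entirely
MAX_EMBED_BYTES = 8 * 1024 * 1024      # assumed ceiling for legitimate payloads


def vet_completion_request(body: dict) -> None:
    """Raise ValueError for requests that should not reach an unpatched vLLM."""
    payload = body.get("prompt_embeds")  # field name is an assumption
    if payload is None:
        return  # plain text prompts are unaffected by this issue
    if not ALLOW_PROMPT_EMBEDS:
        raise ValueError("prompt embeddings are disabled until vLLM >= 0.11.1 is deployed")
    if not isinstance(payload, str):
        raise ValueError("prompt_embeds must be a base64-encoded string")
    try:
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("prompt_embeds is not valid base64") from exc
    if len(raw) > MAX_EMBED_BYTES:
        raise ValueError("prompt_embeds payload exceeds the configured size limit")
```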


Technical Details

Data Version: 5.2
Assigner Short Name: GitHub_M
Date Reserved: 2025-10-07T16:12:03.425Z
CVSS Version: 3.1
State: PUBLISHED

Threat ID: 691fc3ff70da09562fa7fc8e

Added to database: 11/21/2025, 1:44:31 AM

Last enriched: 11/21/2025, 2:00:14 AM

Last updated: 11/21/2025, 2:35:48 PM

