CVE-2025-48956: CWE-400: Uncontrolled Resource Consumption in vllm-project vllm

Severity: High
Tags: vulnerability, CVE-2025-48956, CWE-400
Published: Thu Aug 21 2025 (08/21/2025, 14:41:03 UTC)
Source: CVE Database V5
Vendor/Project: vllm-project
Product: vllm

Description

vLLM is an inference and serving engine for large language models (LLMs). In versions from 0.1.0 up to, but not including, 0.10.1.1, a Denial of Service (DoS) vulnerability can be triggered by sending a single HTTP GET request with an extremely large header to an HTTP endpoint. This results in server memory exhaustion, potentially leading to a crash or unresponsiveness. The attack does not require authentication, making it exploitable by any remote user. The vulnerability is fixed in version 0.10.1.1.
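Because remediation hinges on the installed version, a quick way to triage a host is to compare the locally installed vLLM release against the fixed version. The following is a minimal sketch using importlib.metadata and the third-party packaging library; the 0.10.1.1 threshold comes from the advisory above, everything else is illustrative.

    # Minimal sketch: flag installs older than the fixed 0.10.1.1 release.
    # Requires the third-party "packaging" library (pip install packaging).
    from importlib.metadata import version, PackageNotFoundError
    from packaging.version import Version

    FIXED = Version("0.10.1.1")

    try:
        installed = Version(version("vllm"))
    except PackageNotFoundError:
        print("vllm is not installed in this environment")
    else:
        if installed < FIXED:
            print(f"vllm {installed} is affected by CVE-2025-48956; upgrade to {FIXED} or later")
        else:
            print(f"vllm {installed} already includes the fix")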

AI-Powered Analysis

Last updated: 08/21/2025, 15:17:52 UTC

Technical Analysis

CVE-2025-48956 is a high-severity Denial of Service (DoS) vulnerability affecting the vLLM inference and serving engine for large language models (LLMs), specifically versions from 0.1.0 up to but not including 0.10.1.1. The vulnerability arises from uncontrolled resource consumption (CWE-400) triggered by a single HTTP GET request containing an extremely large header sent to an HTTP endpoint exposed by the vLLM server. Such an oversized request causes the server to exhaust its memory, leading to a potential crash or unresponsiveness. The attack vector is network-based (AV:N), requires no authentication (PR:N) and no user interaction (UI:N), making it trivially exploitable by any remote attacker. The vulnerability impacts availability only, with no direct confidentiality or integrity compromise. The issue is addressed in vLLM version 0.10.1.1, and no exploits are currently known in the wild. The CVSS v3.1 base score is 7.5, reflecting the ease of exploitation and the significant impact on availability. Because vLLM is used to serve large language models, often in AI-driven applications and services, this vulnerability could disrupt critical AI inference workloads by causing denial of service, impacting service continuity and reliability.
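Where it is useful to verify how a deployment handles this class of input, a benign probe with a moderately oversized header can be sent to a test instance and the behaviour observed; a hardened front end should reject such a request outright (for example with HTTP 431 Request Header Fields Too Large) rather than buffering it. The sketch below is purely illustrative: the endpoint URL, header name, and header size are hypothetical placeholders, and it should only ever be pointed at lab or staging systems under your own control.

    # Illustrative probe against a staging endpoint you own (assumed here to be
    # http://127.0.0.1:8000/health). Sends one GET with an oversized header and
    # reports how the stack responds. Do not run this against systems you do
    # not operate.
    import requests  # third-party: pip install requests

    STAGING_URL = "http://127.0.0.1:8000/health"  # hypothetical test endpoint
    TEST_HEADER_SIZE = 256 * 1024                 # arbitrary 256 KiB test value

    try:
        resp = requests.get(
            STAGING_URL,
            headers={"X-Oversized-Test": "A" * TEST_HEADER_SIZE},
            timeout=10,
        )
        # A proxy or WAF enforcing header limits typically answers 431 or 400.
        print(f"Server answered {resp.status_code}")
    except requests.RequestException as exc:
        # Connection resets or timeouts may mean the request was dropped, or,
        # on unpatched servers, that the endpoint became unresponsive.
        print(f"Request failed: {exc}")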

Potential Impact

For European organizations deploying vLLM for AI inference and serving, this vulnerability poses a significant risk to service availability. Organizations relying on vLLM for customer-facing AI applications, internal automation, or research could experience service outages or degraded performance if targeted by this DoS attack. The lack of authentication requirement means that any external attacker can attempt exploitation, increasing the attack surface. Disruptions could affect sectors such as finance, healthcare, telecommunications, and government services where AI-driven applications are increasingly integrated. Additionally, the potential for memory exhaustion could lead to cascading failures in multi-tenant or cloud environments, amplifying the impact. The unavailability of AI services could delay critical decision-making processes or customer interactions, resulting in operational and reputational damage. However, since no data confidentiality or integrity is compromised, the primary concern remains service disruption.

Mitigation Recommendations

European organizations should immediately verify their vLLM deployment versions and upgrade to v0.10.1.1 or later to remediate this vulnerability. In environments where immediate patching is not feasible, implementing network-level protections is critical. This includes configuring web application firewalls (WAFs) or reverse proxies to limit the size of HTTP headers accepted by the vLLM endpoints, effectively blocking requests with abnormally large headers. Rate limiting and anomaly detection mechanisms should be employed to identify and block suspicious traffic patterns indicative of DoS attempts. Additionally, isolating vLLM services within segmented network zones and restricting access to trusted IP ranges can reduce exposure. Monitoring system memory usage and setting up alerts for unusual resource consumption can provide early warning signs of exploitation attempts. Finally, organizations should review incident response plans to include scenarios involving AI service unavailability.
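The memory-monitoring recommendation above can be approximated with a small watchdog script. The sketch below assumes the third-party psutil library and that the serving process can be identified by "vllm" appearing in its name or command line; the 32 GiB threshold and 30-second polling interval are placeholder values to tune per deployment, and a production setup would feed such alerts into its existing monitoring stack rather than printing them.

    # Minimal watchdog sketch, assuming psutil is installed and the serving
    # process is recognisable by "vllm" in its name or command line.
    import time
    import psutil

    RSS_LIMIT_BYTES = 32 * 1024**3   # placeholder threshold: 32 GiB
    CHECK_INTERVAL_SECONDS = 30      # placeholder polling interval

    def vllm_processes():
        """Yield processes that look like vLLM serving workers."""
        for proc in psutil.process_iter(["name", "cmdline", "memory_info"]):
            cmdline = " ".join(proc.info["cmdline"] or [])
            if "vllm" in (proc.info["name"] or "") or "vllm" in cmdline:
                yield proc

    while True:
        for proc in vllm_processes():
            mem = proc.info["memory_info"]
            if mem is None:
                continue  # information unavailable for this process
            if mem.rss > RSS_LIMIT_BYTES:
                print(f"ALERT: pid {proc.pid} uses {mem.rss / 1024**3:.1f} GiB resident memory")
        time.sleep(CHECK_INTERVAL_SECONDS)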


Technical Details

Data Version: 5.1
Assigner Short Name: GitHub_M
Date Reserved: 2025-05-28T18:49:07.585Z
CVSS Version: 3.1
State: PUBLISHED

Threat ID: 68a73519ad5a09ad0011fe46

Added to database: 8/21/2025, 3:02:49 PM

Last enriched: 8/21/2025, 3:17:52 PM

Last updated: 8/21/2025, 3:17:52 PM
