CVE-2025-48944: CWE-20: Improper Input Validation in vllm-project vllm
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, the vLLM backend used with the /v1/chat/completions OpenAPI endpoint fails to validate unexpected or malformed input in the "pattern" and "type" fields when the tools functionality is invoked. These inputs are not validated before being compiled or parsed, so a single request can crash the inference worker. The worker remains down until it is restarted. Version 0.9.0 fixes the issue.
AI Analysis
Technical Summary
CVE-2025-48944 is a medium-severity vulnerability in the vLLM project affecting versions from 0.8.0 up to but excluding 0.9.0. vLLM is an inference and serving engine for large language models (LLMs), commonly used to provide AI-driven chat completions via an OpenAI-compatible endpoint (/v1/chat/completions). The vulnerability stems from improper input validation (CWE-20) in the handling of the "pattern" and "type" fields when the tools functionality is invoked: these fields are not validated before being compiled or parsed, so an attacker can submit unexpected or malformed input that crashes the inference worker process handling the request. A single such request causes a denial-of-service (DoS) condition, because the worker remains down until it is manually restarted. The vulnerability does not impact confidentiality or integrity but severely affects availability. Exploitation requires network access (AV:N) and low attack complexity (AC:L), needs only low privileges (PR:L), and involves no user interaction (UI:N); the scope is unchanged (S:U), and the CVSS v3.1 base score is 6.5, reflecting medium severity. No exploits are currently known in the wild, and the issue is fixed in vLLM 0.9.0. This vulnerability highlights the risk that insufficient input validation in AI model-serving infrastructure can be leveraged to disrupt AI services through crafted API requests.
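To make the attack surface concrete, the sketch below shows where the "pattern" and "type" fields sit in an OpenAI-compatible /v1/chat/completions request that invokes the tools functionality. It is an illustration only: the host, model name, tool name, and schema values are hypothetical, the advisory does not publish the exact input that triggers the crash, and a regular expression that fails to compile is just one plausible example of a malformed "pattern" value.

```python
import requests

# Hypothetical illustration of where the unvalidated fields appear in a
# tools-enabled chat completions request. Host, model, and tool names are
# placeholders; this is not a published proof of concept.
payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "Look up order 42"}],
    "tools": [{
        "type": "function",
        "function": {
            "name": "lookup_order",
            "parameters": {
                "type": "object",
                "properties": {
                    "order_id": {
                        # The advisory names "pattern" and "type" as the fields
                        # that reach the worker without validation; a value that
                        # cannot be compiled or parsed is the failure mode it
                        # describes.
                        "type": "string",
                        "pattern": "(unterminated-group",
                    }
                },
            },
        },
    }],
}

resp = requests.post(
    "http://vllm-host:8000/v1/chat/completions",  # placeholder deployment URL
    json=payload,
    timeout=30,
)
print(resp.status_code)
```

Because a single such request can take the worker offline until it is restarted, even a low-privileged API client is enough to cause an outage, which is why the gateway-side filtering described under Mitigation Recommendations is worthwhile until the upgrade is in place.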
Potential Impact
For European organizations deploying vLLM versions 0.8.0 up to but excluding 0.9.0, this vulnerability poses a significant availability risk to AI inference services. Organizations relying on AI-driven chat completions or other LLM-based tools for customer support, automation, or decision-making could experience service outages from worker crashes triggered by malicious or malformed API requests. Such disruption could degrade user experience, reduce operational efficiency, and potentially affect business continuity. While the vulnerability does not expose sensitive data or allow unauthorized code execution, the denial of service could be exploited to cause repeated outages or erode trust in AI services. Given the growing adoption of AI technologies in Europe, especially in sectors such as finance, healthcare, and public services, the impact could be notable wherever the vulnerable versions are in use. Because only low privileges are required, insider threats or compromised internal systems could also exploit this vulnerability to disrupt AI services.
Mitigation Recommendations
European organizations should prioritize upgrading vLLM to version 0.9.0 or later, where this input validation flaw is fixed. Until the upgrade is applied, implement strict input validation and sanitization at the API gateway or proxy level to filter out malformed or unexpected values in the "pattern" and "type" fields. Rate limiting and anomaly detection on the /v1/chat/completions endpoint can help detect and block suspicious requests that might trigger crashes. Monitoring the health of inference workers and automating restart procedures reduces downtime when a crash does occur. Additionally, restricting access to the API endpoint to trusted networks or authenticated users with minimal privileges lowers the exploitation risk. Organizations should also review internal policies to limit who can invoke the tools functionality, and should audit logs for unusual request patterns. Finally, security teams should watch for emerging exploit reports and apply patches promptly.
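As an interim control before the upgrade, a gateway or reverse proxy in front of vLLM can reject requests whose tool parameter schemas contain obviously malformed fields. The following is a minimal sketch, assuming the OpenAI-compatible tools layout (tools[*].function.parameters as a JSON Schema object); the function name, allow-list, and traversal are illustrative and should be adapted to the deployment's own schema and gateway framework.

```python
import re

# JSON Schema primitive types we are willing to forward; anything else is
# treated as suspicious. Adjust for your own schemas (this check is
# deliberately conservative and may produce false positives).
ALLOWED_JSON_TYPES = {"object", "array", "string", "number", "integer", "boolean", "null"}

def validate_tool_schemas(body: dict) -> list[str]:
    """Return a list of problems found in the request's tool parameter schemas."""
    problems: list[str] = []

    def walk_schema(node, path):
        if isinstance(node, dict):
            t = node.get("type")
            if t is not None and (not isinstance(t, str) or t not in ALLOWED_JSON_TYPES):
                problems.append(f"{path}.type has unexpected value {t!r}")
            pattern = node.get("pattern")
            if pattern is not None:
                try:
                    re.compile(pattern)  # reject regexes that will not compile
                except (re.error, TypeError):
                    problems.append(f"{path}.pattern is not a valid regular expression")
            for key, value in node.items():
                walk_schema(value, f"{path}.{key}")
        elif isinstance(node, list):
            for i, item in enumerate(node):
                walk_schema(item, f"{path}[{i}]")

    for i, tool in enumerate(body.get("tools") or []):
        if isinstance(tool, dict):
            params = tool.get("function", {}).get("parameters", {})
            walk_schema(params, f"tools[{i}].function.parameters")
    return problems
```

A gateway would call validate_tool_schemas on the parsed request body and return HTTP 400 when the returned list is non-empty, instead of forwarding the request to the inference worker.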
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Denmark, Ireland
Technical Details
- Data Version: 5.1
- Assigner Short Name: GitHub_M
- Date Reserved: 2025-05-28T18:49:07.582Z
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 6839fc40182aa0cae2bc1f28
Added to database: 5/30/2025, 6:43:12 PM
Last enriched: 7/8/2025, 2:25:56 PM
Last updated: 8/11/2025, 8:47:03 PM
Related Threats
CVE-2025-53948: CWE-415 Double Free in Santesoft Sante PACS Server (High)
CVE-2025-52584: CWE-122 Heap-based Buffer Overflow in Ashlar-Vellum Cobalt (High)
CVE-2025-46269: CWE-122 Heap-based Buffer Overflow in Ashlar-Vellum Cobalt (High)
CVE-2025-54862: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Santesoft Sante PACS Server (Medium)
CVE-2025-54759: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Santesoft Sante PACS Server (Medium)