CVE-2025-48942: CWE-248: Uncaught Exception in vllm-project vllm
vLLM is an inference and serving engine for large language models (LLMs). In versions 0.8.0 up to but excluding 0.9.0, hitting the /v1/completions API with an invalid json_schema as a Guided Param crashes the vLLM server. This vulnerability is similar to GHSA-9hcf-v7m4-6m2j/CVE-2025-48943, which involves a regex instead of a JSON schema. Version 0.9.0 fixes the issue.
AI Analysis
Technical Summary
CVE-2025-48942 is a medium-severity vulnerability affecting the vLLM inference and serving engine for large language models (LLMs), specifically versions 0.8.0 up to but excluding 0.9.0. The vulnerability arises when the /v1/completions API endpoint receives an invalid JSON schema as a Guided Param. The malformed input triggers an uncaught exception, causing the vLLM server to crash and become unavailable. The issue is classified under CWE-248, which covers uncaught exceptions leading to application crashes or denial of service. It is similar to CVE-2025-48943, which involves regex-based input causing a comparable crash, whereas CVE-2025-48942 specifically involves JSON schema inputs. The vulnerability does not impact confidentiality or integrity but results in a denial of service (availability impact). The CVSS 3.1 base score is 6.5, reflecting a network attack vector with low attack complexity, requiring low privileges and no user interaction. The scope is unchanged, meaning the vulnerability affects only the vulnerable component without impacting other components. No known exploits are currently reported in the wild, and the issue is fixed in version 0.9.0 of vLLM. Since vLLM is a specialized engine used for serving large language models, crafted API requests containing invalid JSON schemas could disrupt AI services relying on this software and cause service outages.
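To make the failure mode concrete, the sketch below shows the class of request the advisory describes: a call to /v1/completions carrying a structurally invalid JSON Schema in the guided decoding parameter. This is an illustrative, hypothetical example; the server URL, model name, and the guided_json parameter name (vLLM's guided decoding extension, which may vary by version) are assumptions, and it should only be run against a disposable test instance.

import requests

# Illustrative reproduction sketch only. Assumptions: a vLLM 0.8.x
# OpenAI-compatible server on http://localhost:8000 and a placeholder
# model name. Sending a malformed JSON Schema via the guided_json
# extension is the class of input the advisory describes.
invalid_schema = {"type": "object", "properties": "this-should-be-an-object"}

resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={
        "model": "test-model",   # placeholder, not a real deployment
        "prompt": "Hello",
        "max_tokens": 8,
        "guided_json": invalid_schema,
    },
    timeout=30,
)
print(resp.status_code)
print(resp.text)

On an unpatched 0.8.x instance the advisory indicates this kind of request can bring down the serving process, whereas a patched 0.9.0 server should reject the malformed schema with an error response instead.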
Potential Impact
For European organizations deploying vLLM versions 0.8.0 up to but excluding 0.9.0, this vulnerability poses a risk primarily to service availability. Organizations using vLLM to serve AI or language model inference could experience denial of service if attackers send malformed requests to the /v1/completions API, causing server crashes. This could disrupt AI-driven applications, customer-facing chatbots, or internal automation relying on LLM inference, potentially impacting business operations and user experience. Given the growing adoption of AI technologies in sectors such as finance, healthcare, and public services across Europe, service interruptions could have operational and reputational consequences. However, since the vulnerability does not allow data exfiltration or code execution, the confidentiality and integrity of data remain unaffected. Because exploitation requires only low privileges and no user interaction, internal threat actors or external attackers with some access to the API endpoint could trigger the crash. The absence of known exploits in the wild reduces immediate risk, but patching is recommended to prevent future exploitation.
Mitigation Recommendations
European organizations should upgrade vLLM to version 0.9.0 or later, where this vulnerability is fixed. Until upgrading is possible, organizations can implement strict input validation and filtering at the API gateway or web application firewall (WAF) level to detect and block malformed JSON schema inputs targeting the /v1/completions endpoint. Rate limiting and anomaly detection on API requests can help identify and mitigate potential exploitation attempts. Monitoring server logs for repeated crashes or malformed requests can provide early warning signs. Restricting access to the API endpoint to trusted networks or authenticated users can reduce exposure. Additionally, organizations should conduct regular vulnerability scans and penetration tests focusing on AI inference services to detect similar issues proactively. Maintaining an incident response plan for service outages related to AI inference engines is also advisable.
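As an illustration of the gateway-level input validation suggested above, the following sketch (hypothetical, and not part of vLLM itself) uses the Python jsonschema package to reject structurally invalid schemas before a request reaches the inference server. The function name and the guided_json field are assumptions; in practice this check would live in a reverse proxy, API gateway plugin, or WAF rule, and upgrading to 0.9.0 remains the primary fix.

# Hypothetical gateway-side pre-check (not vLLM code): reject requests
# whose guided decoding schema is not a well-formed JSON Schema before
# they reach the /v1/completions handler.
from jsonschema import Draft202012Validator
from jsonschema.exceptions import SchemaError

def is_well_formed_json_schema(candidate: dict) -> bool:
    """Return True only if `candidate` is a structurally valid JSON Schema."""
    try:
        Draft202012Validator.check_schema(candidate)
        return True
    except SchemaError:
        return False

# Example use inside a (hypothetical) request filter:
body = {"prompt": "Hello", "guided_json": {"type": "objekt"}}  # malformed schema
if "guided_json" in body and not is_well_formed_json_schema(body["guided_json"]):
    # Respond with HTTP 400 instead of forwarding the request upstream.
    print("rejecting request: invalid guided_json schema")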
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Denmark
Technical Details
- Data Version: 5.1
- Assigner Short Name: GitHub_M
- Date Reserved: 2025-05-28T18:49:07.581Z
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 6839fc40182aa0cae2bc1f24
Added to database: 5/30/2025, 6:43:12 PM
Last enriched: 7/8/2025, 1:56:13 PM
Last updated: 8/11/2025, 9:35:45 PM
Related Threats
CVE-2025-53948: CWE-415 Double Free in Santesoft Sante PACS Server (High)
CVE-2025-52584: CWE-122 Heap-based Buffer Overflow in Ashlar-Vellum Cobalt (High)
CVE-2025-46269: CWE-122 Heap-based Buffer Overflow in Ashlar-Vellum Cobalt (High)
CVE-2025-54862: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Santesoft Sante PACS Server (Medium)
CVE-2025-54759: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Santesoft Sante PACS Server (Medium)