
CVE-2025-32444: CWE-502: Deserialization of Untrusted Data in vllm-project vllm

Critical
Tags: cve, cve-2025-32444, cwe-502
Published: Wed Apr 30 2025 (04/30/2025, 00:25:00 UTC)
Source: CVE
Vendor/Project: vllm-project
Product: vllm

Description

vLLM is a high-throughput and memory-efficient inference and serving engine for LLMs. Versions from 0.6.5 up to, but not including, 0.8.5 that use the vLLM integration with mooncake are vulnerable to remote code execution because they use pickle-based serialization over unsecured ZeroMQ sockets. The vulnerable sockets were set to listen on all network interfaces, increasing the likelihood that an attacker can reach them and carry out an attack. vLLM instances that do not make use of the mooncake integration are not vulnerable. This issue has been patched in version 0.8.5.

AI-Powered Analysis

Last updated: 06/25/2025, 05:51:12 UTC

Technical Analysis

CVE-2025-32444 is a critical remote code execution (RCE) vulnerability affecting the vLLM project, specifically versions from 0.6.5 up to but not including 0.8.5, when integrated with the mooncake component. vLLM is a high-throughput, memory-efficient inference and serving engine designed for large language models (LLMs). The vulnerability arises from the use of Python's pickle serialization over unsecured ZeroMQ sockets that are configured to listen on all network interfaces. Pickle is inherently unsafe when deserializing untrusted data because it allows arbitrary code execution during the deserialization process. In this case, the ZeroMQ sockets expose the deserialization endpoint broadly on the network, increasing the attack surface and enabling remote attackers to send maliciously crafted pickle payloads. This can lead to full system compromise without requiring authentication or user interaction. The vulnerability is classified under CWE-502 (Deserialization of Untrusted Data).

The issue has been patched in vLLM version 0.8.5, which presumably replaces or secures the serialization mechanism and/or restricts socket exposure. Notably, vLLM instances that do not use the mooncake integration are not vulnerable, indicating the flaw is specific to that integration layer.

Although no known exploits are currently reported in the wild, the CVSS v3.1 base score is 10.0, reflecting the highest severity due to the network attack vector, no required privileges or user interaction, and complete impact on confidentiality, integrity, and availability with a scope change. This vulnerability poses a significant risk to any organization deploying vulnerable versions of vLLM with mooncake integration, especially in environments where the ZeroMQ sockets are exposed to untrusted networks or the internet.
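
As a minimal sketch of the underlying CWE-502 pattern (illustrative only; the port, socket type, and names below are assumptions, not the actual vLLM or mooncake code), the following shows why unpickling bytes received from a network socket effectively hands the sender code execution: pickle invokes an object's __reduce__ hook during deserialization, and that hook can return an arbitrary callable.

    import os
    import pickle
    import zmq

    def vulnerable_receiver():
        # Illustrative service: binds on ALL interfaces and unpickles raw bytes.
        ctx = zmq.Context()
        sock = ctx.socket(zmq.PULL)
        sock.bind("tcp://*:5555")      # reachable from any network interface
        data = sock.recv()             # untrusted bytes from a remote peer
        return pickle.loads(data)      # attacker-supplied code runs here

    class MaliciousPayload:
        # What an attacker could send: pickling this object embeds an
        # os.system call that executes when the receiver unpickles it.
        def __reduce__(self):
            return (os.system, ("id",))   # any shell command would do

    attacker_bytes = pickle.dumps(MaliciousPayload())

Because nothing stands between a network peer and pickle.loads(), being able to connect to the socket is enough, which is consistent with the CVSS vector's lowest values for privileges required and user interaction.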

Potential Impact

For European organizations, the impact of this vulnerability can be severe, particularly for entities leveraging vLLM for AI inference workloads in production or research environments. Successful exploitation would allow attackers to execute arbitrary code remotely, potentially leading to full system takeover, data theft, manipulation of AI model outputs, or disruption of AI services. This could compromise sensitive intellectual property, customer data, or critical AI-driven business processes. Given the criticality and ease of exploitation, organizations in sectors such as finance, healthcare, telecommunications, and government—where AI inference engines may be integrated into operational workflows—face heightened risks.

Additionally, the exposure of ZeroMQ sockets on all network interfaces increases the likelihood of attacks originating from both internal and external threat actors. The vulnerability could also be leveraged as a foothold for lateral movement within networks or to deploy ransomware or other malware payloads. The lack of required authentication and user interaction further exacerbates the threat, making automated exploitation feasible. The impact extends beyond confidentiality and integrity to availability, as compromised systems may be taken offline or manipulated to produce incorrect AI outputs, undermining trust in AI services.

Mitigation Recommendations

To mitigate this vulnerability, European organizations should:

1) Immediately upgrade all affected vLLM instances with mooncake integration to version 0.8.5 or later, where the vulnerability is patched.
2) If upgrading is not immediately feasible, restrict network exposure of the ZeroMQ sockets by configuring them to bind only to localhost or trusted internal interfaces, preventing remote access from untrusted networks (see the sketch after this list).
3) Implement network-level controls such as firewall rules or segmentation to limit access to the ZeroMQ ports only to authorized systems.
4) Audit existing deployments to identify any instances running vulnerable versions with mooncake integration, using software inventory and network scanning tools.
5) Consider disabling or removing the mooncake integration if it is not essential to operations, thereby eliminating the attack vector.
6) Monitor network traffic for unusual or unexpected ZeroMQ communication patterns that could indicate exploitation attempts.
7) Employ runtime application self-protection (RASP) or endpoint detection and response (EDR) solutions to detect anomalous process behavior indicative of code execution attacks.
8) Educate development and operations teams on the risks of insecure deserialization and the importance of secure serialization methods, especially when exposing services over the network.

These steps go beyond generic patching advice by emphasizing network exposure controls and operational monitoring tailored to the nature of the vulnerability.
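
The sketch below illustrates, under assumptions about how such a socket might be created, what recommendations 2) and 8) look like in code: bind the ZeroMQ socket to the loopback interface rather than all interfaces, and deserialize with a data-only format instead of pickle. The function name, port, and socket type are hypothetical and are not taken from the vLLM or mooncake source.

    import json
    import zmq

    def hardened_receiver():
        # Hypothetical receiver; illustrative only, not the vLLM/mooncake API.
        ctx = zmq.Context()
        sock = ctx.socket(zmq.PULL)
        # Recommendation 2: bind to loopback (or a trusted internal address)
        # instead of "tcp://*:5555", so untrusted networks cannot reach it.
        sock.bind("tcp://127.0.0.1:5555")
        data = sock.recv()
        # Recommendation 8: json.loads() only reconstructs plain data types
        # (dict, list, str, int, ...), so it cannot execute code during
        # deserialization the way pickle.loads() can.
        return json.loads(data)

A data-only serializer such as JSON or msgpack removes the code-execution primitive entirely, so even an attacker who can reach the socket is limited to sending malformed data rather than running commands.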


Technical Details

Data Version: 5.1
Assigner Short Name: GitHub_M
Date Reserved: 2025-04-08T10:54:58.369Z
CISA Enriched: true
CVSS Version: 3.1
State: PUBLISHED

Threat ID: 682d983bc4522896dcbee2fc

Added to database: 5/21/2025, 9:09:15 AM

Last enriched: 6/25/2025, 5:51:12 AM

Last updated: 8/17/2025, 11:14:34 AM

