CVE-2025-47277: CWE-502: Deserialization of Untrusted Data in vllm-project vllm
vLLM, an inference and serving engine for large language models (LLMs), has an issue in versions 0.6.5 through 0.8.4 that ONLY impacts environments using the `PyNcclPipe` KV cache transfer integration with the V0 engine. No other configurations are affected. vLLM supports the use of the `PyNcclPipe` class to establish a peer-to-peer communication domain for data transmission between distributed nodes. GPU-side KV-Cache transmission is implemented through the `PyNcclCommunicator` class, while CPU-side control messages are passed via the `send_obj` and `recv_obj` methods. The intention was that this interface should only be exposed to a private network using the IP address specified by the `--kv-ip` CLI parameter, and the vLLM documentation covers how it must be limited to a secured network. However, the default and intentional behavior of PyTorch is that the `TCPStore` interface listens on ALL interfaces, regardless of what IP address is provided; the given IP address was only used as the client-side address. vLLM was fixed with a workaround that forces the `TCPStore` instance to bind its socket to the specified private interface. As of version 0.8.5, vLLM limits the `TCPStore` socket to the private interface as configured.
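For illustration, the sketch below shows one way, not necessarily the exact vLLM patch, to keep the `TCPStore` server off public interfaces: pre-bind a listening socket to the private address and hand it to the store. It assumes a recent PyTorch build whose `torch.distributed.TCPStore` accepts a pre-bound socket through the `master_listen_fd` keyword argument; the helper name and parameters are illustrative.

```python
# A minimal sketch, NOT the actual vLLM patch: keep the TCPStore server socket
# bound to a single private interface instead of 0.0.0.0. It assumes a recent
# PyTorch build whose torch.distributed.TCPStore accepts a pre-bound listening
# socket via the `master_listen_fd` keyword argument.
import socket
from datetime import timedelta

from torch.distributed import TCPStore


def make_private_tcp_store(private_ip: str, port: int, world_size: int, is_master: bool):
    """Return (store, listener); the caller must keep the listener alive so the
    pre-bound file descriptor is not closed while the store is using it."""
    if not is_master:
        # Clients simply connect to the private address of the master node.
        return TCPStore(host_name=private_ip, port=port, world_size=world_size,
                        is_master=False, timeout=timedelta(seconds=300)), None

    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((private_ip, port))  # bind only to the private interface
    listener.listen()
    store = TCPStore(host_name=private_ip, port=port, world_size=world_size,
                     is_master=True, timeout=timedelta(seconds=300),
                     master_listen_fd=listener.fileno())  # reuse the pre-bound socket
    return store, listener
```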
AI Analysis
Technical Summary
CVE-2025-47277 is a critical vulnerability affecting the vLLM project, specifically versions 0.6.5 through 0.8.4, in environments that utilize the PyNcclPipe KV cache transfer integration with the V0 engine. vLLM is an inference and serving engine designed for large language models (LLMs), which supports distributed GPU-based computation. The vulnerability arises from improper network interface binding in the TCPStore component used for CPU-side control message passing via send_obj and recv_obj methods. Although the vLLM documentation instructs that the PyNcclPipe interface should be restricted to a private network specified by the --kv-ip CLI parameter, the underlying PyTorch TCPStore interface listens on all network interfaces by default, ignoring the intended IP restriction. This misconfiguration exposes the TCPStore socket to potentially untrusted networks, allowing unauthenticated remote attackers to send malicious serialized data to the service. The vulnerability is classified as CWE-502 (Deserialization of Untrusted Data), which can lead to remote code execution, data tampering, or denial of service. The CVSS v3.1 score is 9.8 (critical), reflecting the high impact on confidentiality, integrity, and availability, with no authentication or user interaction required and network-level exploitability. The issue was addressed in vLLM version 0.8.5 by forcing the TCPStore instance to bind explicitly to the specified private interface, mitigating exposure to untrusted networks. No known exploits are currently reported in the wild, but the severity and ease of exploitation make this a significant threat for deployments using the vulnerable versions with PyNcclPipe enabled.
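To make the CWE-502 mechanism concrete, the minimal self-contained snippet below shows why handing attacker-controlled bytes to a pickle-style deserializer amounts to letting the attacker run code in the server process. It is a generic Python illustration, not vLLM's actual wire format and not an exploit for this CVE.

```python
# Generic illustration of CWE-502 with Python's pickle module. This is not
# vLLM code and not an exploit for this CVE; it only demonstrates why
# deserializing attacker-controlled bytes is equivalent to code execution.
import pickle


class Malicious:
    def __reduce__(self):
        # __reduce__ tells the unpickler how to rebuild the object: here it is
        # instructed to call print(...); a real attacker would substitute
        # something like os.system.
        return (print, ("arbitrary code ran during deserialization",))


untrusted_bytes = pickle.dumps(Malicious())  # what an attacker could send
pickle.loads(untrusted_bytes)                # victim deserializes -> callable runs
```

Because the advisory classifies the exposed control channel under CWE-502, any host able to reach the listening socket can feed it serialized objects of this kind.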
Potential Impact
For European organizations leveraging vLLM for large language model inference, particularly those using distributed GPU clusters with the PyNcclPipe KV cache transfer integration, this vulnerability poses a severe risk. Exploitation could allow remote attackers to execute arbitrary code, manipulate inference results, or disrupt service availability without any authentication. This could lead to data breaches, intellectual property theft, or operational downtime. Given the increasing adoption of AI and LLM technologies in sectors such as finance, healthcare, research, and government across Europe, the impact could be substantial. Confidentiality breaches could expose sensitive data processed by LLMs, while integrity violations might corrupt AI model outputs, undermining decision-making processes. Availability impacts could disrupt critical AI-driven services. The vulnerability’s network exposure means that organizations with insufficient network segmentation or those exposing internal AI infrastructure to broader networks are particularly at risk. Furthermore, the complexity of distributed AI environments may delay detection and remediation, increasing the window of opportunity for attackers.
Mitigation Recommendations
European organizations should immediately upgrade affected vLLM deployments to version 0.8.5 or later, where the TCPStore socket binding issue is resolved. Until upgrades are possible, organizations must enforce strict network segmentation to isolate the PyNcclPipe interface on a secured private network, ensuring it is not reachable from untrusted or public networks. Implement firewall rules to restrict access to the TCPStore port to only trusted hosts within the private network. Additionally, monitor network traffic for unexpected connections or serialized data transmissions to the PyNcclPipe interface. Employ runtime application self-protection (RASP) or endpoint detection and response (EDR) solutions capable of detecting anomalous deserialization activities. Review and harden the configuration of PyTorch and related dependencies to ensure no unintended exposure of TCPStore interfaces. Finally, conduct regular audits and penetration testing focused on AI infrastructure to identify and remediate similar misconfigurations proactively.
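As a quick operational check (a sketch, not an official vLLM tool), the script below probes whether the KV-transfer control port is reachable from a given vantage point, for example from a machine outside the private segment. The host and port are placeholders for your own deployment: a node's externally routable address and the port used by the PyNcclPipe `TCPStore`.

```python
# Sketch of a reachability probe, not an official vLLM tool. Run it from a
# machine outside the private segment, pointing at a node's public address and
# the port used by the PyNcclPipe TCPStore in your deployment (both supplied
# as placeholders on the command line).
import socket
import sys


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    host, port = sys.argv[1], int(sys.argv[2])
    if port_is_reachable(host, port):
        print(f"WARNING: {host}:{port} accepts connections; the KV control "
              f"channel may be exposed beyond the private network.")
    else:
        print(f"OK: {host}:{port} was not reachable from this vantage point.")
```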
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Switzerland, Belgium
Technical Details
- Data Version: 5.1
- Assigner Short Name: GitHub_M
- Date Reserved: 2025-05-05T16:53:10.373Z
- CISA Enriched: true
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 682cd0f71484d88663aeacae
Added to database: 5/20/2025, 6:59:03 PM
Last enriched: 7/11/2025, 12:49:15 PM
Last updated: 8/18/2025, 1:08:29 AM
Related Threats
- CVE-2025-53948: CWE-415 Double Free in Santesoft Sante PACS Server (High)
- CVE-2025-52584: CWE-122 Heap-based Buffer Overflow in Ashlar-Vellum Cobalt (High)
- CVE-2025-46269: CWE-122 Heap-based Buffer Overflow in Ashlar-Vellum Cobalt (High)
- CVE-2025-54862: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Santesoft Sante PACS Server (Medium)
- CVE-2025-54759: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Santesoft Sante PACS Server (Medium)