CVE-2025-46570: CWE-208: Observable Timing Discrepancy in vllm-project vllm

Severity: Low
Tags: Vulnerability, CVE-2025-46570, CWE-208
Published: Thu May 29 2025 (05/29/2025, 16:32:42 UTC)
Source: CVE Database V5
Vendor/Project: vllm-project
Product: vllm

Description

vLLM is an inference and serving engine for large language models (LLMs). Prior to version 0.9.0, when a new prompt is processed, if the PagedAttention mechanism finds a matching prefix chunk, the prefill process speeds up, which is reflected in the TTFT (Time to First Token). These timing differences caused by matching chunks are significant enough to be recognized and exploited. This issue has been patched in version 0.9.0.

AI-Powered Analysis

Last updated: 07/07/2025, 23:09:48 UTC

Technical Analysis

CVE-2025-46570 is a timing side-channel vulnerability in the vLLM inference and serving engine for large language models (LLMs), affecting versions prior to 0.9.0. The flaw arises from the PagedAttention mechanism used during prompt processing: when a new prompt shares a prefix chunk with previously processed input, the cached chunk is reused and the prefill phase completes faster, producing a noticeably lower Time to First Token (TTFT). This timing discrepancy is observable by clients and can be exploited by an attacker to infer information about previously processed prompt content or the internal state of the model serving process. The vulnerability is categorized under CWE-208 (Observable Timing Discrepancy), which covers timing differences that can leak sensitive information. The issue has been patched in vLLM version 0.9.0.

The CVSS v3.1 base score is 2.6, indicating low severity. The vector requires network access (AV:N), high attack complexity (AC:H), low privileges (PR:L), and user interaction (UI:R), with an impact limited to confidentiality (C:L) and no impact on integrity or availability. No known exploits are reported in the wild at this time.
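The sketch below illustrates how such a discrepancy could be observed in practice. It is not taken from the advisory: it assumes a local vLLM OpenAI-compatible server at http://localhost:8000 with prefix caching enabled, and the endpoint path, model name, and prompts are illustrative assumptions only.

```python
# Minimal sketch: compare Time to First Token (TTFT) for prompts with and
# without a guessed shared prefix. Server URL, model, and prompts are assumed.
import time
import requests

BASE_URL = "http://localhost:8000/v1/completions"  # assumed vLLM server endpoint
MODEL = "facebook/opt-125m"                        # assumed model name


def measure_ttft(prompt: str) -> float:
    """Return seconds from request start until the first streamed token arrives."""
    start = time.perf_counter()
    with requests.post(
        BASE_URL,
        json={"model": MODEL, "prompt": prompt, "max_tokens": 1, "stream": True},
        stream=True,
        timeout=30,
    ) as resp:
        resp.raise_for_status()
        for line in resp.iter_lines():
            if line:  # first non-empty SSE line marks the first token
                return time.perf_counter() - start
    return float("inf")


# On unpatched (< 0.9.0) deployments with prefix caching, a prompt that shares
# a long prefix with earlier traffic tends to show a measurably lower TTFT.
cold = measure_ttft("an unrelated prompt with no shared prefix " * 20)
warm = measure_ttft("a prefix the attacker guesses was sent before " * 20)
print(f"cold TTFT: {cold:.4f}s, guessed-prefix TTFT: {warm:.4f}s")
```

Repeating such measurements and comparing distributions is what makes the discrepancy "significant enough to be recognized and exploited," as the description puts it, despite the high attack complexity.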

Potential Impact

For European organizations running vLLM versions prior to 0.9.0, this vulnerability could allow attackers to glean partial information about the prompts being processed by observing timing differences in responses. While the confidentiality impact is rated low, in sensitive environments where prompt content may include proprietary, personal, or confidential data, even limited leakage could be problematic. The vulnerability does not affect integrity or availability, so risks of operational disruption or data manipulation are minimal. Nevertheless, organizations deploying LLM inference services in sectors with high data sensitivity, such as finance, healthcare, or government, should take this risk seriously. The requirements for user interaction, some privileges, and high attack complexity reduce the likelihood of widespread exploitation, but targeted attacks remain possible, especially in the multi-tenant or cloud-hosted inference environments common in Europe.

Mitigation Recommendations

European organizations should upgrade all vLLM deployments to version 0.9.0 or later to eliminate this timing side-channel vulnerability. For environments where immediate upgrading is not feasible, consider implementing network-level controls to restrict access to the inference service, limiting exposure to untrusted users. Monitoring and logging of inference request patterns may help detect anomalous probing attempts exploiting timing differences. Additionally, organizations can introduce artificial delays or jitter in response times to obscure timing discrepancies, though this may impact performance. For highly sensitive use cases, isolating inference workloads and employing strict authentication and authorization mechanisms can further reduce risk. Finally, maintain awareness of vendor updates and security advisories related to vLLM and similar LLM serving engines.
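For the "artificial delays or jitter" option mentioned above, a minimal sketch of one possible approach follows. It assumes a small reverse proxy placed in front of the vLLM server; the proxy, upstream URL, and jitter bounds are assumptions and not part of vLLM itself.

```python
# Hypothetical jitter proxy in front of a vLLM OpenAI-compatible server.
# It buffers the upstream response and adds a random delay before replying,
# blurring timing differences at the cost of latency.
import asyncio
import random

import httpx
from fastapi import FastAPI, Request, Response

app = FastAPI()
UPSTREAM = "http://localhost:8000"   # assumed vLLM server behind the proxy
JITTER_RANGE = (0.05, 0.25)          # seconds of random delay to add


@app.post("/v1/completions")
async def proxied_completions(request: Request) -> Response:
    body = await request.body()
    async with httpx.AsyncClient(timeout=60) as client:
        upstream_resp = await client.post(
            f"{UPSTREAM}/v1/completions",
            content=body,
            headers={"content-type": "application/json"},
        )
    # Random jitter obscures whether the prefill hit a cached prefix.
    await asyncio.sleep(random.uniform(*JITTER_RANGE))
    return Response(
        content=upstream_resp.content,
        status_code=upstream_resp.status_code,
        media_type="application/json",
    )
```

Because this simple proxy buffers the upstream response rather than streaming it, clients cannot observe TTFT directly at all, and the added random delay further blurs total latency; the trade-off is reduced responsiveness, which is why upgrading to 0.9.0 remains the preferred fix.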


Technical Details

Data Version: 5.1
Assigner Short Name: GitHub_M
Date Reserved: 2025-04-24T21:10:48.175Z
CVSS Version: 3.1
State: PUBLISHED

Threat ID: 68388f0b182aa0cae285909c

Added to database: 5/29/2025, 4:44:59 PM

Last enriched: 7/7/2025, 11:09:48 PM

Last updated: 8/10/2025, 3:03:35 PM

