CVE-2025-23254: CWE-502 Deserialization of Untrusted Data in NVIDIA TensorRT-LLM
NVIDIA TensorRT-LLM for any platform contains a vulnerability in the Python executor, where an attacker with local access to the TRTLLM server may trigger a data validation issue. A successful exploit of this vulnerability may lead to code execution, information disclosure, and data tampering.
AI Analysis
Technical Summary
CVE-2025-23254 is a high-severity vulnerability affecting NVIDIA's TensorRT-LLM, specifically versions prior to 0.18.2. The flaw is classified under CWE-502, deserialization of untrusted data, and resides in the Python executor component, where insufficient validation of serialized input allows an attacker with local access to the TRTLLM server to exploit it. Deserialization vulnerabilities occur when untrusted input is deserialized without adequate validation, potentially enabling attackers to execute arbitrary code, disclose sensitive information, or tamper with data. In this case the attacker needs local access and at least low privileges (CVSS vector AV:L/PR:L), but no user interaction is required. The vulnerability has a CVSS 3.1 base score of 8.8, reflecting high impact on confidentiality, integrity, and availability, and a changed scope, meaning exploitation can affect resources beyond the initially vulnerable component. Although no exploits have been reported in the wild, the potential for arbitrary code execution and data compromise makes this a pressing concern for organizations using TensorRT-LLM. Because the affected range is bounded at 0.18.2, that release is the fixed version, and upgrading is the primary remediation. TensorRT-LLM is a platform-agnostic AI inference engine optimized for large language models, widely used in AI research, development, and deployment environments.
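To make the CWE-502 pattern concrete, the sketch below contrasts the dangerous pattern (calling pickle.loads on data a local user can influence) with a common hardening for pickle-based Python IPC: authenticating each serialized buffer with an HMAC and refusing to deserialize anything that fails the check, which is broadly the direction the patched release is reported to take. This is an illustrative sketch, not TensorRT-LLM source code; SECRET_KEY, safe_dumps, and safe_loads are hypothetical names.

import hashlib
import hmac
import os
import pickle

# Hypothetical shared secret; a real deployment would provision this per
# session and share it only between trusted executor processes.
SECRET_KEY = os.urandom(32)

def safe_dumps(obj: object) -> bytes:
    """Serialize an object and prepend an HMAC-SHA256 tag over the payload."""
    payload = pickle.dumps(obj)
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return tag + payload

def safe_loads(blob: bytes) -> object:
    """Verify the HMAC tag before deserializing; reject tampered buffers."""
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("HMAC verification failed; refusing to unpickle")
    return pickle.loads(payload)

# The vulnerable pattern is a bare pickle.loads on attacker-influenced data.
# pickle can be made to execute code during deserialization, e.g. via a
# crafted object whose __reduce__ returns (os.system, ("some command",)),
# so validation must happen before deserialization, not after.

As a quick round-trip check, safe_loads(safe_dumps({"ok": True})) returns the original dict, while flipping any byte of the buffer raises ValueError instead of reaching pickle.loads.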
Potential Impact
For European organizations, the impact of this vulnerability can be significant, especially for those leveraging NVIDIA TensorRT-LLM in AI and machine learning workflows. Successful exploitation could lead to unauthorized code execution on servers running TensorRT-LLM, potentially allowing attackers to escalate privileges, move laterally within networks, or exfiltrate sensitive data. This poses risks to intellectual property, proprietary AI models, and any data processed by these systems. Given the high confidentiality, integrity, and availability impacts, critical AI infrastructure could be disrupted, affecting research institutions, technology companies, and industries relying on AI-driven analytics or automation. The requirement for local access limits remote exploitation but does not eliminate risk, as insider threats or compromised internal systems could be leveraged. Additionally, the changed scope of the vulnerability means that the impact could extend beyond the immediate application, potentially affecting other system components or services integrated with TensorRT-LLM. Organizations in sectors such as finance, healthcare, automotive, and defense, which increasingly adopt AI technologies, may face operational disruptions and regulatory compliance challenges if this vulnerability is exploited.
Mitigation Recommendations
To mitigate this vulnerability, European organizations should:
1) Immediately identify and inventory all instances of NVIDIA TensorRT-LLM in their environments, focusing on versions prior to 0.18.2 (a version-check sketch follows this list).
2) Upgrade to TensorRT-LLM 0.18.2 or later; where an upgrade is not yet feasible, consider disabling or restricting access to the Python executor component.
3) Enforce strict access controls to limit local access to TRTLLM servers, ensuring only trusted administrators and processes have permissions.
4) Implement robust monitoring and logging on systems running TensorRT-LLM to detect activity indicative of exploitation attempts, such as unexpected process executions or unusual data access patterns.
5) Use application allowlisting and endpoint protection solutions to prevent unauthorized code execution.
6) Conduct regular security audits and penetration testing of AI infrastructure to identify potential lateral movement paths.
7) Educate internal teams about the risks of deserialization vulnerabilities and the importance of securing local access points.
8) Where possible, isolate AI inference servers in segmented network zones to reduce the attack surface.
These steps go beyond generic advice by emphasizing inventory management, access restriction, and targeted monitoring specific to the TensorRT-LLM environment.
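A minimal sketch of the inventory check in step 1, assuming the package is installed via pip under the distribution name tensorrt_llm; the 0.18.2 threshold comes from the advisory, while the function names and version parsing are illustrative:

from importlib import metadata

# Fixed release per the advisory: versions prior to 0.18.2 are affected.
FIXED_VERSION = (0, 18, 2)

def parse_version(text: str) -> tuple:
    """Parse a dotted version string into a comparable tuple of ints."""
    parts = []
    for token in text.split("."):
        digits = "".join(ch for ch in token if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    return tuple(parts)

def check_tensorrt_llm() -> None:
    """Report whether the locally installed tensorrt_llm is affected."""
    try:
        installed = metadata.version("tensorrt_llm")
    except metadata.PackageNotFoundError:
        print("tensorrt_llm not installed on this host")
        return
    if parse_version(installed) < FIXED_VERSION:
        print(f"VULNERABLE: tensorrt_llm {installed} < 0.18.2 (CVE-2025-23254)")
    else:
        print(f"OK: tensorrt_llm {installed}")

if __name__ == "__main__":
    check_tensorrt_llm()

Run on each inference host, for example via a configuration management tool, and aggregate the output to build the inventory.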
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Switzerland
Technical Details
- Data Version: 5.1
- Assigner Short Name: nvidia
- Date Reserved: 2025-01-14T01:06:22.262Z
- CISA Enriched: true
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 682d9839c4522896dcbecbdc
Added to database: 5/21/2025, 9:09:13 AM
Last enriched: 6/25/2025, 7:45:01 PM
Last updated: 8/16/2025, 5:50:22 PM
Related Threats
CVE-2025-8567 (Medium): CWE-79 Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') in posimyththemes Nexter Blocks – WordPress Gutenberg Blocks & 1000+ Starter Templates
CVE-2025-41689 (Medium): CWE-306 Missing Authentication for Critical Function in Wiesemann & Theis Motherbox 3
CVE-2025-41685 (Medium): CWE-359 Exposure of Private Personal Information to an Unauthorized Actor in SMA ennexos.sunnyportal.com
CVE-2025-8723 (Critical): CWE-94 Improper Control of Generation of Code ('Code Injection') in mecanik Cloudflare Image Resizing – Optimize & Accelerate Your Images
CVE-2025-8622 (Medium): CWE-79 Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') in webaware Flexible Map