
CVE-2024-14021: CWE-502 Deserialization of Untrusted Data in run-llama llama_index

Severity: High
Tags: Vulnerability, CVE-2024-14021, cve, cve-2024-14021, cwe-502
Published: Mon Jan 12 2026 (01/12/2026, 23:04:43 UTC)
Source: CVE Database V5
Vendor/Project: run-llama
Product: llama_index

Description

LlamaIndex (run-llama/llama_index) versions up to and including 0.11.6 contain an unsafe deserialization vulnerability in BGEM3Index.load_from_disk() in llama_index/indices/managed/bge_m3/base.py. The function uses pickle.load() to deserialize multi_embed_store.pkl from a user-supplied persist_dir without validation. An attacker who can provide a crafted persist directory containing a malicious pickle file can trigger arbitrary code execution when the victim loads the index from disk.
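
As a rough aid for checking an environment against the affected range stated above, the following sketch compares a version string with 0.11.6. The helper names (`parse_release`, `is_affected`, `installed_version`) are hypothetical, and the distribution name passed to `installed_version` is an assumption: the advisory does not say which installable package ships the vulnerable module, so confirm that before relying on the result.

```python
import re
from importlib import metadata

# From the advisory: versions "up to and including 0.11.6" are affected.
LAST_AFFECTED = (0, 11, 6)


def parse_release(version: str) -> tuple:
    """Return the leading numeric release as a 3-tuple, e.g. '0.11.6.post1' -> (0, 11, 6).

    Suffixes such as 'rc1' or '.post1' are ignored, which is a simplification.
    """
    match = re.match(r"\d+(?:\.\d+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions such as '0.12'
    return tuple(parts[:3])


def is_affected(version: str) -> bool:
    """True if the version falls inside the range named in the advisory."""
    return parse_release(version) <= LAST_AFFECTED


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent.

    Which distribution name to query is an assumption left to the caller.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
```

A caller might combine the two, for example `v = installed_version("llama-index")` followed by `is_affected(v)` when `v` is not `None`, treating the distribution name as something to verify locally.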

AI-Powered Analysis

Last updated: 01/12/2026, 23:38:32 UTC

Technical Analysis

CVE-2024-14021 is a deserialization of untrusted data vulnerability (CWE-502) in the run-llama project's llama_index library, affecting versions up to and including 0.11.6. The flaw is in BGEM3Index.load_from_disk(), defined in llama_index/indices/managed/bge_m3/base.py, which calls Python's pickle.load() on a file named multi_embed_store.pkl inside a user-supplied persist_dir. A pickle stream can instruct the loader to import and call arbitrary callables, so an attacker who controls the contents of persist_dir can place a crafted pickle file there and gain arbitrary code execution on the victim's system when the index is loaded. Exploitation requires no prior authentication, but the attacker must be able to supply or influence the persist directory contents and the victim must load the index, so some user interaction is needed. The CVSS 4.0 score is 8.4 (high severity), reflecting high confidentiality, integrity, and availability impact from arbitrary code execution. No patch or fix is linked in this record, and no exploitation in the wild has been reported. The vulnerability matters most where llama_index loads persisted embeddings or indexes, as is common in AI and machine learning workflows.
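
The mechanism behind this class of bug can be shown with a deliberately harmless sketch. It does not use llama_index at all; the `Demo` class is hypothetical and stands in for attacker-controlled bytes. The point is that `pickle.loads()` calls whatever callable the stream names, here the built-in `sorted`, instead of merely rebuilding data.

```python
import pickle


class Demo:
    """Hypothetical stand-in for attacker-controlled data (harmless on purpose)."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call sorted([3, 1, 2])".
        # The loader will invoke the named callable with these arguments.
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Demo())

# Loading does not produce a Demo instance; it produces the return value
# of the call recorded in the stream.
result = pickle.loads(payload)
print(type(result).__name__, result)  # list [1, 2, 3]
```

Because the callable is chosen by whoever produced the bytes, the only reliable defence is to avoid unpickling data from sources that are not trusted.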

Potential Impact

For European organizations, this vulnerability poses a significant risk if they use the vulnerable versions of llama_index in their AI, data indexing, or machine learning pipelines. Successful exploitation could lead to arbitrary code execution, allowing attackers to compromise confidentiality by accessing sensitive data, integrity by modifying or corrupting data, and availability by disrupting services or deleting data. This could result in data breaches, operational disruptions, or further lateral movement within networks. Organizations in sectors such as finance, healthcare, research, and critical infrastructure that rely on AI tools and data indexing are particularly at risk. The requirement for supplying a malicious persist directory limits remote exploitation but does not eliminate risk in environments where untrusted data sources or shared storage are used. The absence of known exploits suggests a window for proactive mitigation before active attacks occur.

Mitigation Recommendations

European organizations should immediately audit their use of the llama_index library and identify any deployments running versions up to and including 0.11.6. Until a patch is available, avoid loading indexes from untrusted or user-supplied persist directories. Validate the origin of every directory or file passed to deserialization, for example by accepting only paths under locations the application itself controls. Consider replacing pickle-based deserialization with formats that cannot carry executable instructions, such as JSON. Use application-level controls to restrict which users or processes can supply or modify persist directories. Use endpoint protection and monitoring to detect anomalous file modifications or suspicious process executions related to llama_index. Maintain network segmentation to limit the impact of a successful exploit. Monitor vendor advisories for patches addressing this vulnerability and apply them promptly once released.
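
Two of these recommendations, restricting where a persist directory may live and storing data in a format that cannot carry code, can be sketched as follows. This is an illustration under stated assumptions, not llama_index's API: the file name `multi_embed_store.json`, the functions `resolve_trusted`, `save_store`, and `load_store`, and the dictionary layout of the store are all hypothetical.

```python
import json
from pathlib import Path

# Hypothetical JSON replacement for the pickle file named in the advisory.
STORE_NAME = "multi_embed_store.json"


def resolve_trusted(persist_dir: str, trusted_root: str) -> Path:
    """Resolve persist_dir and reject it unless it lies under trusted_root."""
    root = Path(trusted_root).resolve()
    candidate = Path(persist_dir).resolve()  # collapses '..' and symlinks
    try:
        candidate.relative_to(root)
    except ValueError:
        raise PermissionError(f"{candidate} is outside trusted root {root}") from None
    return candidate


def save_store(store: dict, persist_dir: str) -> None:
    """Write the store as JSON; JSON can only encode plain data."""
    path = Path(persist_dir) / STORE_NAME
    path.write_text(json.dumps(store), encoding="utf-8")


def load_store(persist_dir: str, trusted_root: str) -> dict:
    """Load the JSON store from a directory that passed the location check."""
    directory = resolve_trusted(persist_dir, trusted_root)
    data = json.loads((directory / STORE_NAME).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        raise ValueError("unexpected store layout")
    return data
```

The location check only constrains where data is read from. It does not make a pickle file safe if an attacker can write inside the trusted root, which is why the sketch pairs it with a data-only format.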


Technical Details

Data Version: 5.2
Assigner Short Name: VulnCheck
Date Reserved: 2026-01-09T20:42:56.495Z
CVSS Version: 4.0
State: PUBLISHED

Threat ID: 69658281da2266e838450d16

Added to database: 1/12/2026, 11:23:45 PM

Last enriched: 1/12/2026, 11:38:32 PM

Last updated: 1/13/2026, 1:30:59 AM
