CVE-2026-27940: CWE-122: Heap-based Buffer Overflow in ggml-org llama.cpp
llama.cpp is a C/C++ inference engine for several LLM models. Prior to release b8146, gguf_init_from_file_impl() in gguf.cpp is vulnerable to an integer overflow that leads to an undersized heap allocation; the subsequent fread() then writes 528+ bytes of attacker-controlled data past the buffer boundary. This bypasses the fix for a similar bug in the same file, CVE-2025-53630, which overlooked some code paths. The vulnerability is fixed in b8146.
AI Analysis
Technical Summary
CVE-2026-27940 is a heap-based buffer overflow in llama.cpp, the C/C++ LLM inference project maintained by ggml-org. The vulnerability exists in gguf_init_from_file_impl() in gguf.cpp prior to release b8146. The root cause is an integer overflow (CWE-190) in a size computation that leads to an undersized heap allocation; the subsequent fread() then writes at least 528 bytes of attacker-controlled data beyond the allocated buffer boundary, producing a heap-based buffer overflow (CWE-122). The flaw effectively bypasses the patch for the similar CVE-2025-53630 because that fix did not cover all vulnerable code paths.

Exploitation requires local access (AV:L) and user interaction (UI:R) but no privileges (PR:N); for example, a victim must load a crafted GGUF model file. The impact is severe, potentially allowing arbitrary code execution, memory corruption, or denial of service, affecting confidentiality, integrity, and availability; the CVSS v3.1 base score is 7.8 (High). Although no exploits are currently known in the wild, the vulnerability is significant for all users of llama.cpp versions prior to b8146. The issue was publicly disclosed on March 12, 2026, and fixed in b8146. The incomplete earlier fix underscores the importance of thorough patch validation.
Potential Impact
The impact of CVE-2026-27940 on organizations worldwide is significant due to the widespread adoption of llama.cpp for running LLM inference locally or in enterprise environments. Successful exploitation can lead to arbitrary code execution, enabling attackers to compromise system confidentiality by accessing sensitive data processed by the model, integrity by altering model behavior or outputs, and availability by crashing or destabilizing the application. Since llama.cpp is often integrated into AI workflows, compromised systems could be used as pivot points for lateral movement or data exfiltration. The requirement for local access and user interaction limits remote exploitation but does not eliminate risk, especially in environments where untrusted users have local machine access or where malicious files are opened. Organizations relying on vulnerable versions risk operational disruption, intellectual property theft, and potential regulatory compliance violations if sensitive data is exposed. The incomplete fix from the prior CVE also suggests a need for careful code auditing and patch management to prevent similar oversights.
Mitigation Recommendations
To mitigate CVE-2026-27940:
- Immediately update llama.cpp to build b8146 or later, where the vulnerability is fully patched, and audit deployment pipelines to ensure no legacy versions remain in use.
- Enforce strict local access controls to limit who can execute or interact with llama.cpp binaries, reducing the risk of exploitation via user interaction.
- Employ application whitelisting and endpoint protection to detect anomalous behavior indicative of exploitation attempts.
- Conduct code reviews and fuzz testing on components that parse untrusted input to proactively identify similar integer overflow and buffer overflow issues.
- Sandbox llama.cpp execution environments to contain potential compromises, and monitor for crashes or allocation failures during model loading.
- Educate users about the risks of loading untrusted model files or inputs that could trigger the vulnerability.
- Maintain an up-to-date inventory of AI inference tools and their versions to facilitate rapid response to future vulnerabilities.
Affected Countries
United States, China, Germany, United Kingdom, Japan, South Korea, France, Canada, Australia, India
Technical Details
- Data Version: 5.2
- Assigner Short Name: GitHub_M
- Date Reserved: 2026-02-25T03:11:36.689Z
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 69b30a4f2f860ef943dbd359
Added to database: 3/12/2026, 6:47:43 PM
Last enriched: 3/20/2026, 2:22:00 AM
Last updated: 4/28/2026, 7:22:27 AM