CVE-2026-27940: CWE-122: Heap-based Buffer Overflow in ggml-org llama.cpp
llama.cpp is a C/C++ implementation of inference for several LLM models. Prior to release b8146, gguf_init_from_file_impl() in gguf.cpp is vulnerable to an integer overflow that leads to an undersized heap allocation; the subsequent fread() then writes 528+ bytes of attacker-controlled data past the buffer boundary. This is a bypass of a similar bug in the same file (CVE-2025-53630), whose fix overlooked some code paths. This vulnerability is fixed in b8146.
AI Analysis
Technical Summary
CVE-2026-27940 is a heap-based buffer overflow vulnerability identified in the llama.cpp project maintained by ggml-org, which provides C/C++ implementations for inference of large language models (LLMs). The root cause is an integer overflow in the function gguf_init_from_file_impl() within gguf.cpp. This integer overflow results in an undersized heap allocation, which is insufficient to hold the data read by a subsequent fread() call. Specifically, fread() writes at least 528 bytes of attacker-controlled data beyond the allocated buffer boundary, causing a heap overflow. This vulnerability is a bypass of a previously patched similar bug (CVE-2025-53630), where the fix did not cover all vulnerable code paths. The flaw allows an attacker to corrupt heap memory, potentially leading to arbitrary code execution, privilege escalation, or denial of service. The CVSS v3.1 base score is 7.8 (high), reflecting high impact on confidentiality, integrity, and availability. The attack vector is local (AV:L), requiring user interaction (UI:R) but no privileges (PR:N). The vulnerability affects all versions of llama.cpp prior to release b8146, where the issue was resolved. No public exploits have been observed yet, but the nature of the bug and the widespread use of llama.cpp in AI applications make it a significant risk. The vulnerability is tagged under CWE-122 (Heap-based Buffer Overflow) and CWE-190 (Integer Overflow or Wraparound).
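The integer-overflow-to-undersized-allocation pattern described above can be sketched as follows. This is a minimal illustration of the bug class (CWE-190 leading to CWE-122), not the actual llama.cpp code; the function names are illustrative assumptions:

```cpp
#include <cstdint>
#include <cstddef>

// Vulnerable pattern: an attacker-controlled element count is
// multiplied by an element size. The product can wrap around
// size_t, so the allocation ends up far smaller than the number
// of bytes a later fread() writes into it.
size_t vulnerable_alloc_size(size_t n_elems, size_t elem_size) {
    return n_elems * elem_size;  // may silently wrap to a small value
}

// Overflow-checked variant: reject any count whose byte total
// would wrap, instead of allocating an undersized buffer.
bool checked_alloc_size(size_t n_elems, size_t elem_size, size_t *out) {
    if (elem_size != 0 && n_elems > SIZE_MAX / elem_size) {
        return false;  // n_elems * elem_size would overflow
    }
    *out = n_elems * elem_size;
    return true;
}
```

Because unsigned overflow in C++ wraps modulo 2^N, the vulnerable form returns a small value without any error, which is exactly what makes the subsequent allocation undersized.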
Potential Impact
The vulnerability poses a serious risk to organizations using llama.cpp for LLM inference, especially those integrating it into local or embedded AI systems. Exploitation can lead to arbitrary code execution, allowing attackers to execute malicious payloads with the privileges of the running process. This can compromise confidentiality by leaking sensitive data processed by the model, integrity by altering model outputs or internal state, and availability by crashing or destabilizing the application. Since the attack requires local access and user interaction, insider threats or compromised user accounts are primary concerns. The flaw could also be leveraged in multi-tenant environments or developer machines to escalate privileges or pivot attacks. Given the increasing adoption of llama.cpp in AI research, development, and production, the vulnerability could impact a broad range of sectors including technology companies, research institutions, and cloud providers offering AI services. Failure to patch could result in data breaches, service outages, and reputational damage.
Mitigation Recommendations
Organizations should immediately upgrade llama.cpp to version b8146 or later, where the vulnerability is fixed. If upgrading is not immediately feasible, restrict access to systems running vulnerable versions to trusted users only and monitor for unusual activity. Conduct thorough code reviews and static analysis on any custom modifications of llama.cpp to ensure no similar integer overflow or buffer overflow issues exist. Employ runtime protections such as heap canaries, address space layout randomization (ASLR), and control flow integrity (CFI) to mitigate exploitation impact. Implement strict input validation and sandboxing for processes handling untrusted model files. Regularly audit logs for signs of exploitation attempts and educate developers and users about the risks of opening untrusted files. Finally, maintain an incident response plan tailored to AI infrastructure compromises.
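The "strict input validation" recommendation above can be sketched as a bound check on any length field parsed from an untrusted model file, applied before allocating and reading. The cap value and function names below are illustrative assumptions, not llama.cpp APIs:

```cpp
#include <cstdint>
#include <cstdio>
#include <vector>

// Policy cap on any single blob declared by an untrusted file
// (1 GiB here; the right value is an application decision).
constexpr uint64_t MAX_BLOB_LEN = 1ull << 30;

// Validate the declared length, allocate, then verify the file
// actually contains that many bytes before trusting the buffer.
bool safe_read_blob(FILE *f, uint64_t declared_len, std::vector<uint8_t> &out) {
    if (declared_len == 0 || declared_len > MAX_BLOB_LEN) {
        return false;  // reject absurd lengths before any allocation
    }
    out.resize(static_cast<size_t>(declared_len));  // cannot wrap: capped above
    return fread(out.data(), 1, out.size(), f) == out.size();
}
```

Rejecting the length before allocation also prevents the allocation size from ever being computed from a wrapped value, which is the root cause this CVE describes.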
Affected Countries
United States, China, Germany, United Kingdom, Canada, France, Japan, South Korea, India, Australia
Technical Details
- Data Version: 5.2
- Assigner Short Name: GitHub_M
- Date Reserved: 2026-02-25T03:11:36.689Z
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 69b30a4f2f860ef943dbd359
Added to database: 3/12/2026, 6:47:43 PM
Last enriched: 3/12/2026, 6:50:43 PM
Last updated: 3/13/2026, 3:35:32 PM