CVE-2026-31970: CWE-122: Heap-based Buffer Overflow in samtools htslib
HTSlib is a library for reading and writing bioinformatics file formats. GZI files are used to index block-compressed GZIP (BGZF) files. In the GZI loading function, `bgzf_index_load_hfile()`, it was possible to trigger an integer overflow, leading to an undersized or zero-sized buffer being allocated to store the index. Sixteen zero bytes would then be written to this buffer and, depending on the result of the overflow, the rest of the file might also be loaded into it. If the function did attempt to load the data, it would eventually fail because it could not read the expected number of records, and would then try to free the overflowed heap buffer. Exploiting this bug causes a heap buffer overflow. If a user opens a file crafted to exploit this issue, the program could crash, or data and heap structures could be overwritten in ways the program does not expect. It may be possible to use this to obtain arbitrary code execution. Versions 1.23.1, 1.22.2 and 1.21.1 include fixes for this issue. The easiest workaround is to discard any `.gzi` index files from untrusted sources and use the `bgzip -r` option to recreate them.
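The arithmetic behind the undersized allocation can be illustrated with a small sketch. It assumes, purely for illustration, that the loader sizes its buffer as (entries + 1) × 16 bytes using 64-bit unsigned arithmetic; the actual HTSlib code may use different integer widths, and the function names below are hypothetical, not part of HTSlib.

```python
MASK64 = (1 << 64) - 1   # emulate uint64_t wraparound
ENTRY_SIZE = 16          # assumed record size: two 64-bit offsets


def wrapped_alloc_size(num_entries: int) -> int:
    """Buffer size as unchecked 64-bit unsigned arithmetic would compute it.

    Hypothetical model: one extra slot is reserved for an initial
    all-zero record (the sixteen zero bytes mentioned in the advisory).
    """
    slots = (num_entries + 1) & MASK64
    return (slots * ENTRY_SIZE) & MASK64


def checked_alloc_size(num_entries: int, payload_bytes: int) -> int:
    """Same calculation, but reject counts the file cannot possibly hold.

    payload_bytes is the number of bytes that follow the entry count.
    """
    if num_entries < 0 or num_entries > payload_bytes // ENTRY_SIZE:
        raise ValueError("entry count exceeds data available in the file")
    return (num_entries + 1) * ENTRY_SIZE


if __name__ == "__main__":
    for count in (2, MASK64, (1 << 60) - 1, 1 << 60):
        print(hex(count), "->", wrapped_alloc_size(count), "bytes")
```

In this model, a count of 2^64 − 1 or 2^60 − 1 wraps to a zero-byte request, and a count of 2^60 wraps to a 16-byte request, so the initial zero record already fills or overruns the buffer. Bounding the count by the data actually present in the file is one common way to rule out such values.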
AI Analysis
Technical Summary
CVE-2026-31970 is a heap-based buffer overflow vulnerability in HTSlib, a library widely used by bioinformatics software such as samtools for reading and writing genomic data file formats. The vulnerability lies in the function `bgzf_index_load_hfile()`, which loads the GZI index files used to index block-compressed GZIP (BGZF) files. An integer overflow occurs while calculating the size of the buffer needed to store the index, so an undersized or zero-sized heap buffer is allocated. The function then writes sixteen zero bytes into this buffer and, depending on the result of the overflow, may also attempt to load the rest of the file into it. The result is a heap buffer overflow that corrupts heap structures and may allow arbitrary code execution. The vulnerability affects HTSlib versions earlier than 1.21.1, versions 1.22 up to but not including 1.22.2, and version 1.23. The flaw is triggered by opening a maliciously crafted GZI index file, which is typically an ancillary file accompanying compressed genomic data. Exploitation does not require authentication but does require a user to open the crafted file. No active exploits have been reported, but the potential for arbitrary code execution and data corruption makes this a serious concern for bioinformatics environments. The recommended mitigation is to upgrade to a fixed version (1.21.1, 1.22.2, or 1.23.1), or to discard untrusted `.gzi` files and regenerate them with `bgzip -r`. The vulnerability has a CVSS 4.0 score of 7.1 (High), reflecting a network attack vector, no privileges required, and high impact on integrity.
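Because the trigger is a crafted index file, one possible pre-screening step is to compare the entry count stored in a `.gzi` file with the file's actual size before any HTSlib-based tool opens it. The sketch below assumes the GZI layout described in the bgzip manual (a little-endian 64-bit entry count followed by that many pairs of 64-bit offsets); the function name `gzi_looks_sane` is hypothetical. A passing check does not prove a file is safe, and it is not a substitute for upgrading.

```python
import os
import struct

HEADER_SIZE = 8   # little-endian uint64 entry count
ENTRY_SIZE = 16   # compressed offset + uncompressed offset, 8 bytes each


def gzi_looks_sane(path: str) -> bool:
    """Return True if the declared entry count matches the file size.

    Python integers do not wrap, so the multiplication below cannot
    overflow the way fixed-width C arithmetic can.
    """
    size = os.path.getsize(path)
    if size < HEADER_SIZE:
        return False
    with open(path, "rb") as fh:
        (count,) = struct.unpack("<Q", fh.read(HEADER_SIZE))
    # Strict check: header plus exactly `count` records, nothing else.
    return size == HEADER_SIZE + count * ENTRY_SIZE


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        verdict = "ok" if gzi_looks_sane(name) else "REJECT"
        print(f"{verdict}\t{name}")
```

The exact-size comparison is deliberately strict. Files that fail it are best treated as untrusted, discarded, and regenerated.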
Potential Impact
The impact of CVE-2026-31970 is significant for organizations that process genomic or bioinformatics data using HTSlib and samtools. Successful exploitation can lead to program crashes, denial of service, or arbitrary code execution within the context of the vulnerable application. This could allow attackers to manipulate sensitive genomic data, disrupt research workflows, or pivot to further compromise systems in research or healthcare environments. Because HTSlib is specialized bioinformatics software, the scope is largely limited to organizations in genomics research, clinical diagnostics, pharmaceuticals, and bioinformatics services. However, the sensitive nature of genomic data and the critical role of these tools in healthcare and research amplify the potential damage. In addition, because the vulnerability is triggered by opening a crafted file, supply chain risks arise wherever untrusted or maliciously altered index files can enter a pipeline. The lack of known exploits in the wild reduces immediate risk, but the low barrier to triggering the flaw and its high impact warrant prompt remediation.
Mitigation Recommendations
To mitigate CVE-2026-31970, organizations should:
1) Upgrade HTSlib and samtools to version 1.21.1, 1.22.2, 1.23.1, or later, where the vulnerability is patched.
2) Discard any .gzi index files obtained from untrusted or external sources, so that maliciously crafted indexes are never loaded.
3) Regenerate those .gzi files locally using the bgzip tool with the -r option before use.
4) Apply validation and integrity checks to bioinformatics input files, especially those received from third parties or external collaborators.
5) Restrict user permissions and sandbox bioinformatics tools to limit the impact of any exploitation.
6) Monitor systems for unusual crashes or behavior during file processing that could indicate attempted exploitation.
7) Educate bioinformatics personnel about the risks of opening untrusted genomic data files, and enforce secure data handling policies.
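Steps 2 and 3 above can be sketched as a small script that finds `.gzi` files under a directory, pairs each with its data file, and prepares a `bgzip -r` command to rebuild the index. This is a minimal sketch, not a vetted tool: the helper names are hypothetical, and it assumes each index sits next to its data file under the name `<data file>.gzi` and that `bgzip -r` writes the new index to that same default name.

```python
import subprocess
from pathlib import Path


def plan_reindex(root):
    """List (index_path, command) pairs for every .gzi with a data file.

    Index files whose data file is missing are skipped, since there is
    nothing to rebuild them from.
    """
    plans = []
    for gzi in sorted(Path(root).rglob("*.gzi")):
        data = gzi.with_suffix("")  # "reads.fa.gz.gzi" -> "reads.fa.gz"
        if data.is_file():
            plans.append((gzi, ["bgzip", "-r", str(data)]))
    return plans


def apply_plans(plans):
    """Delete each untrusted index, then rebuild it with bgzip -r."""
    for gzi, cmd in plans:
        gzi.unlink()
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    import sys

    planned = plan_reindex(sys.argv[1] if len(sys.argv) > 1 else ".")
    for gzi, cmd in planned:
        print("would replace", gzi, "via:", " ".join(cmd))
    # Call apply_plans(planned) only after reviewing the output above.
```

Separating planning from execution lets an operator review what would be deleted before anything changes on disk.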
Affected Countries
United States, United Kingdom, Germany, France, Japan, China, South Korea, Canada, Australia, Netherlands, Switzerland, Sweden, Singapore
Technical Details
- Data Version: 5.2
- Assigner Short Name: GitHub_M
- Date Reserved: 2026-03-10T15:40:10.485Z
- CVSS Version: 4.0
- State: PUBLISHED
Threat ID: 69bb03e2771bdb1749c142f4
Added to database: 3/18/2026, 7:58:26 PM
Last enriched: 3/18/2026, 8:13:01 PM
Last updated: 3/18/2026, 8:58:53 PM