CVE-2024-43892: Vulnerability in the Linux Kernel
In the Linux kernel, the following vulnerability has been resolved:
memcg: protect concurrent access to mem_cgroup_idr
Commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") decoupled the memcg IDs from the CSS ID space to fix the cgroup creation failures. It introduced an IDR to maintain the memcg ID space. The IDR depends on external synchronization mechanisms for modifications. For mem_cgroup_idr, idr_alloc() and idr_replace() happen within css callbacks and are thus protected from concurrent modification by cgroup_mutex. However, idr_remove() for mem_cgroup_idr was not protected against concurrency and can run concurrently for different memcgs when their refcnt drops to zero. Fix that.
We have been seeing list_lru-based kernel crashes at a low frequency in our fleet for a long time. These crashes were in different parts of the list_lru code, including list_lru_add(), list_lru_del() and the reparenting code. Upon further inspection, it looked like, for a given object (dentry and inode), the super_block's list_lru didn't have a list_lru_one for the memcg of that object. The initial suspicions were that either the object was not allocated through kmem_cache_alloc_lru() or memcg_list_lru_alloc() somehow failed to allocate a list_lru_one for a memcg but returned success. No evidence was found for either case.
Looking more deeply, we started seeing situations where a valid memcg's ID is not present in mem_cgroup_idr, and in some cases multiple valid memcgs have the same ID and mem_cgroup_idr points to only one of them. So, the most reasonable explanation is that these situations can happen due to a race between multiple idr_remove() calls or a race between idr_alloc()/idr_replace() and idr_remove(). These races cause multiple memcgs to acquire the same ID, and offlining one of them then cleans up the list_lrus on the system for all of them. Later accesses from the other memcgs to the list_lru cause crashes due to the missing list_lru_one.
AI Analysis
Technical Summary
CVE-2024-43892 is a concurrency vulnerability in the Linux kernel's memory control group (memcg) subsystem, in the handling of mem_cgroup_idr, the IDR (the kernel's integer ID allocator, which maps IDs to pointers) that tracks memcg IDs. An IDR relies on its caller for synchronization. Allocation and replacement of IDs (idr_alloc() and idr_replace()) run inside css callbacks and are serialized by cgroup_mutex, but removal (idr_remove()) was not protected, so it could run concurrently with another idr_remove() or with idr_alloc()/idr_replace() whenever the reference counts of different memcgs dropped to zero. These races can leave a valid memcg's ID missing from mem_cgroup_idr, or leave several live memcgs holding the same ID while mem_cgroup_idr points to only one of them. When one of those memcgs is offlined, the list_lru cleanup for that ID also affects the other memcgs sharing it. Later list_lru accesses by the surviving memcgs then crash the kernel because their list_lru_one is missing. The issue was observed in production as low-frequency kernel crashes in list_lru_add(), list_lru_del(), and the reparenting code. The fix protects idr_remove() on mem_cgroup_idr against concurrent modification so that memcg IDs remain unique.
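The locking discipline described above can be illustrated with a small user-space analogy. This is only a sketch, not the kernel patch: `IdTable`, `churn`, and `run_demo` are hypothetical names, a Python dictionary stands in for the IDR, and a `threading.Lock` stands in for whatever lock the kernel fix uses. The point it tries to show is that allocation, lookup, and removal all take the same lock, so two live owners can never hold the same ID.

```python
import threading


class IdTable:
    """Hypothetical stand-in for an ID-to-object map that needs external locking."""

    def __init__(self):
        self._lock = threading.Lock()  # one lock guards every modification
        self._slots = {}               # id -> owner

    def alloc(self, owner):
        # Pick the lowest free ID and publish the owner under the lock.
        with self._lock:
            obj_id = 1
            while obj_id in self._slots:
                obj_id += 1
            self._slots[obj_id] = owner
            return obj_id

    def lookup(self, obj_id):
        with self._lock:
            return self._slots.get(obj_id)

    def remove(self, obj_id):
        # Removal takes the same lock as alloc(); leaving it unlocked is the
        # kind of gap the advisory describes for idr_remove().
        with self._lock:
            return self._slots.pop(obj_id, None)

    def live_ids(self):
        with self._lock:
            return sorted(self._slots)


def churn(table, owner, rounds, errors):
    """Repeatedly create and destroy an entry, recording any ID mix-up."""
    for _ in range(rounds):
        obj_id = table.alloc(owner)
        if table.lookup(obj_id) != owner:
            errors.append((owner, obj_id))
        table.remove(obj_id)


def run_demo(threads=8, rounds=2000):
    """Run several churn workers in parallel; return (errors, remaining IDs)."""
    table = IdTable()
    errors = []
    workers = [
        threading.Thread(target=churn, args=(table, f"memcg-{i}", rounds, errors))
        for i in range(threads)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return errors, table.live_ids()
```

Calling `run_demo()` returns the list of ID mix-ups and the IDs still allocated at the end; with every operation serialized by the same lock, both are expected to be empty. CPython's interpreter lock hides many races that real kernel code would hit, so the sketch shows the discipline rather than reproducing the bug.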
Potential Impact
For European organizations relying on Linux-based systems, especially those using cgroups for resource management in containerized or multi-tenant environments, this vulnerability poses a risk of system instability and unexpected kernel crashes. Such crashes can lead to denial of service conditions, impacting availability of critical services and applications. Environments with heavy use of memory control groups, such as cloud providers, data centers, and enterprises running container orchestration platforms (e.g., Kubernetes) on Linux hosts, are particularly vulnerable. The instability may cause service interruptions, data loss in volatile caches, and increased operational overhead due to unplanned reboots or troubleshooting. While this vulnerability does not directly expose confidentiality or integrity risks, the availability impact can be significant, especially for high-availability systems and infrastructure supporting critical business operations. The lack of known exploits in the wild reduces immediate risk, but the complexity of the bug and its manifestation as sporadic crashes make detection and mitigation challenging without patching.
Mitigation Recommendations
European organizations should prioritize applying the official Linux kernel patches that address this concurrency issue in the memcg subsystem. Specifically, upgrading to kernel versions that include the fix for CVE-2024-43892 is essential. For environments where immediate patching is not feasible, organizations should implement enhanced monitoring of kernel logs for list_lru-related errors and unexpected kernel crashes to detect potential exploitation or manifestation of this bug. Additionally, limiting concurrent creation and destruction of memory cgroups where possible may reduce the likelihood of triggering the race condition. Container orchestration platforms should be configured to gracefully handle node reboots and crashes to minimize service disruption. Testing kernel updates in staging environments before production deployment is recommended to ensure stability. Finally, organizations should maintain up-to-date inventories of Linux kernel versions in use and establish rapid patch management processes for kernel vulnerabilities.
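The log-monitoring recommendation above could be prototyped along the following lines. This is a rough sketch under stated assumptions: the function name `find_list_lru_crashes`, the crash-marker patterns, and the window size are illustrative choices, not an official detection signature. Only the `list_lru` function names come from the advisory text.

```python
import re

# Generic kernel crash markers; this list is an assumption, not an official signature.
CRASH_MARKERS = re.compile(
    r"BUG:|Oops|general protection fault|kernel panic|Call Trace",
    re.IGNORECASE,
)


def find_list_lru_crashes(log_lines, window=30):
    """Return (line_number, text) for lines mentioning list_lru near a crash marker.

    A line is reported only if it falls within `window` lines starting at a
    crash marker, which filters out unrelated mentions of list_lru.
    """
    hits = []
    remaining = 0
    for lineno, line in enumerate(log_lines, start=1):
        if CRASH_MARKERS.search(line):
            remaining = window  # (re)open the inspection window
        if remaining > 0:
            if "list_lru" in line:
                hits.append((lineno, line.strip()))
            remaining -= 1
    return hits
```

The function takes any iterable of log lines, for example the saved output of `dmesg` or `journalctl -k`. A hit indicates a crash worth investigating; it is not proof that this specific bug was triggered.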
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Denmark, Ireland
Technical Details
- Data Version: 5.1
- Assigner Short Name: Linux
- Date Reserved: 2024-08-17T09:11:59.290Z
- CISA Enriched: true
- CVSS Version: null
- State: PUBLISHED
Threat ID: 682d9820c4522896dcbdcd74
Added to database: 5/21/2025, 9:08:48 AM
Last enriched: 6/27/2025, 9:09:44 PM
Last updated: 8/22/2025, 1:32:38 AM
Related Threats
- CVE-2025-41452: CWE-15: External Control of System or Configuration Setting in Danfoss AK-SM8xxA Series (Medium)
- CVE-2025-41451: CWE-77 Improper Neutralization of Special Elements used in a Command ('Command Injection') in Danfoss AK-SM8xxA Series (High)
- CVE-2025-43752: CWE-770 Allocation of Resources Without Limits or Throttling in Liferay Portal (Medium)
- CVE-2025-43753: CWE-79 Improper Neutralization of Input During Web Page Generation (XSS or 'Cross-site Scripting') in Liferay Portal (Low)
- CVE-2025-51606: n/a (Unknown)