CVE-2024-53169: Vulnerability in the Linux Kernel
Description
In the Linux kernel, the following vulnerability has been resolved: nvme-fabrics: fix kernel crash while shutting down controller.

The nvme keep-alive operation, which executes at a periodic interval, could potentially sneak in while shutting down a fabric controller. This may lead to a race between the fabric controller admin queue destroy code path (invoked while shutting down the controller) and the hw/hctx queue dispatcher called from the nvme keep-alive async request queuing operation. This race could lead to the kernel crash shown below:

Call Trace:
  autoremove_wake_function+0x0/0xbc (unreliable)
  __blk_mq_sched_dispatch_requests+0x114/0x24c
  blk_mq_sched_dispatch_requests+0x44/0x84
  blk_mq_run_hw_queue+0x140/0x220
  nvme_keep_alive_work+0xc8/0x19c [nvme_core]
  process_one_work+0x200/0x4e0
  worker_thread+0x340/0x504
  kthread+0x138/0x140
  start_kernel_thread+0x14/0x18

While shutting down the fabric controller, if an nvme keep-alive request sneaks in, it is flushed off. The nvme_keep_alive_end_io function is then invoked to handle the end of the keep-alive operation; it decrements the admin->q_usage_counter, and if this was the last/only request in the admin queue, the counter drops to zero. If that happens, the blk-mq destroy queue operation (blk_mq_destroy_queue()), which could be running simultaneously on another CPU (as this is the controller shutdown code path), makes forward progress and deletes the admin queue. From that point onward the admin queue resources must not be accessed. However, the nvme keep-alive thread running the hw/hctx queue dispatch operation has not yet finished its work, so it can still access the admin queue resources after the admin queue has already been deleted, which causes the crash shown above.

This kernel crash is a regression caused by the changes implemented in commit a54a93d0e359 ("nvme: move stopping keep-alive into nvme_uninit_ctrl()"). Ideally, keep-alive should be stopped before destroying the admin queue and freeing the admin tagset so that it cannot sneak in during the shutdown operation. However, that commit removed the keep-alive stop operation from the beginning of the controller shutdown code path and moved it into nvme_uninit_ctrl(), which executes very late in the shutdown code path, after the admin queue is destroyed and its tagset is removed. This change created the possibility of keep-alive sneaking in, interfering with the shutdown operation, and causing the observed kernel crash.

To fix the observed crash, nvme_stop_keep_alive() is moved from nvme_uninit_ctrl() to nvme_remove_admin_tag_set(). This ensures that the admin queue is not deleted until the keep-alive operation is finished (if it is in flight) or cancelled, which contains the race condition explained above and avoids the crash. Moving nvme_stop_keep_alive() to nvme_remove_admin_tag_set(), instead of adding it back to the beginning of the controller shutdown code path in nvme_stop_ctrl() as was the case before commit a54a93d0e359, also saves one call site of nvme_stop_keep_alive().
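The shape of the fix can be sketched as follows. This is a simplified, illustrative sketch based on the commit description above, not the verbatim upstream function: the field names (ctrl->admin_q, ctrl->admin_tagset) follow common nvme-core naming, and the fabrics connect queue teardown is elided.

    /*
     * Simplified sketch of nvme_remove_admin_tag_set() after the fix
     * (illustrative only, not verbatim kernel code).
     */
    void nvme_remove_admin_tag_set(struct nvme_ctrl *ctrl)
    {
            /*
             * Stop the periodic keep-alive work before tearing down the
             * admin queue and freeing its tagset, so a keep-alive request
             * can no longer be dispatched on a dying queue.
             */
            nvme_stop_keep_alive(ctrl);

            blk_mq_destroy_queue(ctrl->admin_q);
            blk_put_queue(ctrl->admin_q);
            /* ... fabrics connect queue teardown elided ... */
            blk_mq_free_tag_set(ctrl->admin_tagset);
    }

With the keep-alive work stopped (and any in-flight instance completed or cancelled) before blk_mq_destroy_queue(), the dispatch path in the crash trace above can no longer run concurrently with the admin queue teardown.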
AI Analysis
Technical Summary
CVE-2024-53169 is a vulnerability in the Linux kernel's NVMe fabrics subsystem that can cause a kernel crash during the shutdown of an NVMe fabric controller. The root cause is a race condition between the NVMe keep-alive operation and the shutdown sequence of the fabric controller's admin queue. Specifically, the keep-alive operation runs periodically and may sneak in while the controller is shutting down. During shutdown, the admin queue is destroyed and its resources freed. However, if the keep-alive operation is still in progress or starts just before the admin queue is destroyed, it may access the now-deleted admin queue resources. This leads to a use-after-free scenario causing a kernel crash. The vulnerability was introduced by a regression in a prior commit (a54a93d0e359) that moved the stopping of the keep-alive operation to a late stage in the shutdown process, after the admin queue was already destroyed. The fix involves moving the stop operation for the keep-alive to an earlier point in the shutdown sequence (specifically to nvme_remove_admin_tag_set()), ensuring that the keep-alive operation is fully stopped or completed before the admin queue is destroyed, thus preventing the race condition and subsequent crash. This vulnerability affects Linux kernel versions containing the specified commits and impacts systems using NVMe over fabrics, which is a networked storage protocol used in high-performance and enterprise environments. There is no indication of exploitation in the wild, and no CVSS score has been assigned yet.
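The underlying pattern is general: a periodic or delayed work item must be cancelled synchronously before the resource it dispatches on is torn down. The following minimal sketch illustrates that ordering with hypothetical names (my_ctrl, my_ctrl_teardown, ka_work); it is not nvme code, only an illustration of the ordering the fix restores.

    /* Hypothetical illustration of the general pattern; not nvme code. */
    #include <linux/workqueue.h>
    #include <linux/blk-mq.h>
    #include <linux/blkdev.h>

    struct my_ctrl {
            struct delayed_work ka_work;    /* periodic keep-alive work     */
            struct request_queue *admin_q;  /* queue the work dispatches on */
    };

    static void my_ctrl_teardown(struct my_ctrl *ctrl)
    {
            /*
             * Cancel the periodic work and wait for any running instance to
             * finish BEFORE destroying the queue it may still be using.
             * Doing this only after the queue is gone reintroduces the
             * use-after-free described above.
             */
            cancel_delayed_work_sync(&ctrl->ka_work);

            blk_mq_destroy_queue(ctrl->admin_q);
            blk_put_queue(ctrl->admin_q);
    }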
Potential Impact
For European organizations, this vulnerability primarily affects systems running Linux kernels with NVMe over Fabrics support, particularly data centers, cloud providers, and enterprises that use NVMe storage networks for high-performance storage. A kernel crash caused by this race condition can lead to system instability, unexpected reboots, or downtime, potentially disrupting critical services and applications. The impact is essentially a denial of service: the vulnerability does not appear to allow privilege escalation or remote code execution, but the resulting kernel panic causes loss of availability and can lose data in flight. Organizations relying on NVMe fabrics for storage networking in sectors such as finance, telecommunications, healthcare, and critical infrastructure could face operational disruptions. The lack of known exploits reduces immediate risk, but the vulnerability should be addressed promptly to maintain system stability and reliability.
Mitigation Recommendations
European organizations should apply the Linux kernel patches that address this race condition as soon as they are available from their Linux distribution vendors. Specifically, ensure that the running kernel includes the fix that moves the nvme_stop_keep_alive() call into nvme_remove_admin_tag_set(). Until patched, organizations can reduce risk by minimizing shutdowns or reboots of NVMe fabric controllers and avoiding abrupt shutdowns that might trigger the race. Monitoring kernel logs for nvme-related errors or crashes can help detect occurrences of this race condition. Additionally, organizations should review their NVMe fabric controller shutdown procedures to ensure orderly and controlled shutdowns. For critical systems, consider redundancy and failover mechanisms to reduce the impact of potential crashes. Coordination with hardware and storage vendors to confirm compatibility with patched kernels is also recommended.
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Switzerland, Italy
Technical Details
- Data Version: 5.1
- Assigner Short Name: Linux
- Date Reserved: 2024-11-19T17:17:25.005Z
- Cisa Enriched: false
- Cvss Version: null
- State: PUBLISHED
Threat ID: 682d9820c4522896dcbdd056
Added to database: 5/21/2025, 9:08:48 AM
Last enriched: 6/27/2025, 10:27:14 PM
Last updated: 7/26/2025, 10:54:52 AM
Related Threats
- CVE-2025-8285: CWE-862: Missing Authorization in Mattermost Mattermost Confluence Plugin (Medium)
- CVE-2025-54525: CWE-1287: Improper Validation of Specified Type of Input in Mattermost Mattermost Confluence Plugin (High)
- CVE-2025-54478: CWE-306: Missing Authentication for Critical Function in Mattermost Mattermost Confluence Plugin (High)
- CVE-2025-54463: CWE-754: Improper Check for Unusual or Exceptional Conditions in Mattermost Mattermost Confluence Plugin (Medium)
- CVE-2025-54458: CWE-862: Missing Authorization in Mattermost Mattermost Confluence Plugin (Medium)