CVE-2024-40906: Vulnerability in the Linux kernel
In the Linux kernel, the following vulnerability has been resolved:

net/mlx5: Always stop health timer during driver removal

Currently, if teardown_hca fails to execute during driver removal, mlx5 does not stop the health timer. Afterwards, mlx5 continue with driver teardown. This may lead to a UAF bug, which results in page fault Oops[1], since the health timer invokes after resources were freed.

Hence, stop the health monitor even if teardown_hca fails.

[1]
mlx5_core 0000:18:00.0: E-Switch: Unload vfs: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: Disable: mode(LEGACY), nvfs(0), necvfs(0), active vports(0)
mlx5_core 0000:18:00.0: E-Switch: cleanup
mlx5_core 0000:18:00.0: wait_func:1155:(pid 1967079): TEARDOWN_HCA(0x103) timeout. Will cause a leak of a command resource
mlx5_core 0000:18:00.0: mlx5_function_close:1288:(pid 1967079): tear_down_hca failed, skip cleanup
BUG: unable to handle page fault for address: ffffa26487064230
PGD 100c00067 P4D 100c00067 PUD 100e5a067 PMD 105ed7067 PTE 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G OE ------- --- 6.7.0-68.fc38.x86_64 #1
Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0013.121520200651 12/15/2020
RIP: 0010:ioread32be+0x34/0x60
RSP: 0018:ffffa26480003e58 EFLAGS: 00010292
RAX: ffffa26487064200 RBX: ffff9042d08161a0 RCX: ffff904c108222c0
RDX: 000000010bbf1b80 RSI: ffffffffc055ddb0 RDI: ffffa26487064230
RBP: ffff9042d08161a0 R08: 0000000000000022 R09: ffff904c108222e8
R10: 0000000000000004 R11: 0000000000000441 R12: ffffffffc055ddb0
R13: ffffa26487064200 R14: ffffa26480003f00 R15: ffff904c108222c0
FS: 0000000000000000(0000) GS:ffff904c10800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa26487064230 CR3: 00000002c4420006 CR4: 00000000007706f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
 <IRQ>
 ? __die+0x23/0x70
 ? page_fault_oops+0x171/0x4e0
 ? exc_page_fault+0x175/0x180
 ? asm_exc_page_fault+0x26/0x30
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 ? ioread32be+0x34/0x60
 mlx5_health_check_fatal_sensors+0x20/0x100 [mlx5_core]
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 poll_health+0x42/0x230 [mlx5_core]
 ? __next_timer_interrupt+0xbc/0x110
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 call_timer_fn+0x21/0x130
 ? __pfx_poll_health+0x10/0x10 [mlx5_core]
 __run_timers+0x222/0x2c0
 run_timer_softirq+0x1d/0x40
 __do_softirq+0xc9/0x2c8
 __irq_exit_rcu+0xa6/0xc0
 sysvec_apic_timer_interrupt+0x72/0x90
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:cpuidle_enter_state+0xcc/0x440
 ? cpuidle_enter_state+0xbd/0x440
 cpuidle_enter+0x2d/0x40
 do_idle+0x20d/0x270
 cpu_startup_entry+0x2a/0x30
 rest_init+0xd0/0xd0
 arch_call_rest_init+0xe/0x30
 start_kernel+0x709/0xa90
 x86_64_start_reservations+0x18/0x30
 x86_64_start_kernel+0x96/0xa0
 secondary_startup_64_no_verify+0x18f/0x19b
---[ end trace 0000000000000000 ]---
AI Analysis
Technical Summary
CVE-2024-40906 is a vulnerability in the Linux kernel's mlx5 driver (mlx5_core), which manages Mellanox network interface cards (NICs). It affects the driver removal path. During removal the driver sends the TEARDOWN_HCA command to the device. If that command fails, the unpatched driver skips the remaining cleanup for that step, which includes stopping the health timer, yet still continues with driver teardown and frees the resources the timer depends on. When the timer next fires, its callback touches those freed resources, which is a use-after-free (UAF).

The quoted kernel log shows this sequence. TEARDOWN_HCA times out, the driver logs "tear_down_hca failed, skip cleanup", and a later timer softirq runs poll_health, which calls mlx5_health_check_fatal_sensors and faults in ioread32be on an address with no valid page-table entry (PTE 0). The result is a kernel Oops in interrupt context, which can crash the system or leave it unstable, so the practical impact is denial of service (DoS). No exploits are known in the wild. Kernels that include the vulnerable mlx5 code are affected until they carry the fix, which stops the health timer during driver removal whether or not teardown_hca succeeds. The mlx5 driver is widely used in high-performance networking, especially in data centers and enterprise networks built on Mellanox hardware.
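The ordering problem described above can be illustrated with a toy model. This is a minimal Python sketch under stated assumptions, not kernel code: FakeDevice, remove, and the always_stop_timer flag are hypothetical names invented for illustration, and the real driver is written in C with different structure.

```python
"""Toy model of the removal ordering in CVE-2024-40906. Not kernel code."""


class FakeDevice:
    """Hypothetical stand-in for an mlx5 device; every name here is illustrative."""

    def __init__(self, teardown_ok):
        self.teardown_ok = teardown_ok   # whether the modelled teardown_hca succeeds
        self.health_regs = {"fatal": 0}  # stands in for the resources the timer reads
        self.timer_armed = True          # the health timer runs while the driver is loaded

    def poll_health(self):
        """Modelled timer callback: reads the health resources."""
        if self.health_regs is None:
            # In the kernel this is the page fault seen in the Oops.
            raise RuntimeError("health timer touched freed resources")
        return self.health_regs["fatal"]

    def fire_timer_if_armed(self):
        """Simulate one timer tick; returns True if the callback ran."""
        if not self.timer_armed:
            return False
        self.poll_health()
        return True


def remove(dev, always_stop_timer):
    """Model driver removal, without (False) or with (True) the fix."""
    if dev.teardown_ok or always_stop_timer:
        dev.timer_armed = False  # stop the health timer
    # Removal continues either way, so the resources are released.
    dev.health_regs = None
```

With teardown_ok=False and always_stop_timer=False, a later fire_timer_if_armed() raises, which mirrors the use-after-free. With always_stop_timer=True the timer is already stopped and the tick is a no-op.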
Potential Impact
For European organizations, the main exposure is on Linux servers that use Mellanox NICs for high-speed networking, typically in enterprises and data centers. A kernel crash on such a host causes downtime for whatever runs on it, for example cloud infrastructure, financial transaction systems, telecommunications services, or research computing clusters. Work that has not yet been written to stable storage can be lost in the crash, and recovery may require manual intervention or a reboot, which adds operational cost. The vulnerability does not directly expose confidentiality or integrity risks; the risk is to availability, which can degrade business continuity and service level agreements (SLAs). Sectors with stringent uptime requirements, such as finance, healthcare, and critical infrastructure, are the most sensitive to this. No active exploits are known, and the faulty path is reached only when the driver is being removed and teardown_hca fails, so the immediate threat is moderate. It remains a concern wherever mlx5 driver removal is part of routine operations.
Mitigation Recommendations
To mitigate CVE-2024-40906, European organizations should:
1) Apply the Linux kernel updates that fix this vulnerability as soon as they are available from the distribution vendor (e.g., Red Hat, SUSE, Ubuntu).
2) Monitor systems that use the mlx5 driver for kernel Oops messages or crashes around driver removal or hardware teardown.
3) Avoid unnecessary unloading or reloading of the mlx5 driver in production environments, since driver removal is the code path that triggers the bug.
4) Set up monitoring and alerting for early signs of kernel instability, such as TEARDOWN_HCA timeouts or faults in the health timer callback.
5) Where patching is delayed, consider isolating or limiting the use of affected Mellanox NICs in critical systems, or deploying fallback network interfaces.
6) Confirm with hardware and OS vendors that the fix is present in the kernel builds in use, and follow their guidance on driver management.
7) Test kernel upgrades in a staging environment before production deployment.
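The monitoring steps above can be sketched as a small log check. This is a minimal Python example, not a vetted detection rule: the matched strings are copied from the kernel log quoted in this advisory, while the function names classify_log and matches_signature are hypothetical. Message wording can differ between kernel versions, so treat a miss as inconclusive.

```python
import re

# Substrings taken from the kernel log quoted in this advisory.
TEARDOWN_FAILURE_MARKERS = (
    "TEARDOWN_HCA(0x103) timeout",
    "tear_down_hca failed, skip cleanup",
)

# Real stack frames only; the leading \b skips "__pfx_poll_health" entries.
HEALTH_TIMER_FRAMES = re.compile(
    r"\b(poll_health|mlx5_health_check_fatal_sensors)\+0x"
)


def classify_log(lines):
    """Report which parts of the advisory's crash signature appear in the lines."""
    found = {
        "teardown_failed": False,
        "page_fault": False,
        "health_timer_frame": False,
    }
    for line in lines:
        if any(marker in line for marker in TEARDOWN_FAILURE_MARKERS):
            found["teardown_failed"] = True
        if "BUG: unable to handle page fault" in line:
            found["page_fault"] = True
        if HEALTH_TIMER_FRAMES.search(line):
            found["health_timer_frame"] = True
    return found


def matches_signature(lines):
    """True when a page fault and a health-timer stack frame both appear."""
    found = classify_log(lines)
    return found["page_fault"] and found["health_timer_frame"]
```

One way to use it is to pass the lines of saved kernel log output (for example from dmesg or journalctl -k) to matches_signature, and to alert separately on teardown_failed, since a TEARDOWN_HCA failure precedes the crash in the quoted log.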
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Italy
Technical Details
- Data Version: 5.1
- Assigner Short Name: Linux
- Date Reserved: 2024-07-12T12:17:45.580Z
- CISA Enriched: true
- CVSS Version: null
- State: PUBLISHED
Threat ID: 682d9827c4522896dcbe137d
Added to database: 5/21/2025, 9:08:55 AM
Last enriched: 6/29/2025, 2:09:47 AM
Last updated: 8/6/2025, 11:52:34 AM
Related Threats
CVE-2025-8743: Cross Site Scripting in Scada-LTS (Medium)
CVE-2025-8742: Improper Restriction of Excessive Authentication Attempts in macrozheng mall (Medium)
CVE-2025-8741: Cleartext Transmission of Sensitive Information in macrozheng mall (Medium)
CVE-2025-8740: Cross Site Scripting in zhenfeng13 My-Blog (Medium)
CVE-2025-8739: Cross-Site Request Forgery in zhenfeng13 My-Blog (Medium)