CVE-2025-21825: Vulnerability in Linux Linux

High
Published: Thu Mar 06 2025 (03/06/2025, 16:04:31 UTC)
Source: CVE
Vendor/Project: Linux
Product: Linux

Description

In the Linux kernel, the following vulnerability has been resolved:

bpf: Cancel the running bpf_timer through kworker for PREEMPT_RT

During the update procedure, when overwrite element in a pre-allocated htab, the freeing of old_element is protected by the bucket lock. The reason why the bucket lock is necessary is that the old_element has already been stashed in htab->extra_elems after alloc_htab_elem() returns. If freeing the old_element after the bucket lock is unlocked, the stashed element may be reused by concurrent update procedure and the freeing of old_element will run concurrently with the reuse of the old_element. However, the invocation of check_and_free_fields() may acquire a spin-lock which violates the lockdep rule because its caller has already held a raw-spin-lock (bucket lock). The following warning will be reported when such race happens:

  BUG: scheduling while atomic: test_progs/676/0x00000003
  3 locks held by test_progs/676:
  #0: ffffffff864b0240 (rcu_read_lock_trace){....}-{0:0}, at: bpf_prog_test_run_syscall+0x2c0/0x830
  #1: ffff88810e961188 (&htab->lockdep_key){....}-{2:2}, at: htab_map_update_elem+0x306/0x1500
  #2: ffff8881f4eac1b8 (&base->softirq_expiry_lock){....}-{2:2}, at: hrtimer_cancel_wait_running+0xe9/0x1b0
  Modules linked in: bpf_testmod(O)
  Preemption disabled at:
  [<ffffffff817837a3>] htab_map_update_elem+0x293/0x1500
  CPU: 0 UID: 0 PID: 676 Comm: test_progs Tainted: G ... 6.12.0+ #11
  Tainted: [W]=WARN, [O]=OOT_MODULE
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)...
  Call Trace:
  <TASK>
  dump_stack_lvl+0x57/0x70
  dump_stack+0x10/0x20
  __schedule_bug+0x120/0x170
  __schedule+0x300c/0x4800
  schedule_rtlock+0x37/0x60
  rtlock_slowlock_locked+0x6d9/0x54c0
  rt_spin_lock+0x168/0x230
  hrtimer_cancel_wait_running+0xe9/0x1b0
  hrtimer_cancel+0x24/0x30
  bpf_timer_delete_work+0x1d/0x40
  bpf_timer_cancel_and_free+0x5e/0x80
  bpf_obj_free_fields+0x262/0x4a0
  check_and_free_fields+0x1d0/0x280
  htab_map_update_elem+0x7fc/0x1500
  bpf_prog_9f90bc20768e0cb9_overwrite_cb+0x3f/0x43
  bpf_prog_ea601c4649694dbd_overwrite_timer+0x5d/0x7e
  bpf_prog_test_run_syscall+0x322/0x830
  __sys_bpf+0x135d/0x3ca0
  __x64_sys_bpf+0x75/0xb0
  x64_sys_call+0x1b5/0xa10
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x4b/0x53
  ...
  </TASK>

It seems feasible to break the reuse and refill of per-cpu extra_elems into two independent parts: reuse the per-cpu extra_elems with bucket lock being held and refill the old_element as per-cpu extra_elems after the bucket lock is unlocked. However, it will make the concurrent overwrite procedures on the same CPU return unexpected -E2BIG error when the map is full.

Therefore, the patch fixes the lock problem by breaking the cancelling of bpf_timer into two steps for PREEMPT_RT:

1) use hrtimer_try_to_cancel() and check its return value
2) if the timer is running, use hrtimer_cancel() through a kworker to cancel it again

Considering that the current implementation of hrtimer_cancel() will try to acquire a being held softirq_expiry_lock when the current timer is running, these steps above are reasonable. However, it also has downside. When the timer is running, the cancelling of the timer is delayed when releasing the last map uref. The delay is also fixable (e.g., break the cancelling of bpf timer into two parts: one part in locked scope, another one in unlocked scope), it can be revised later if necessary.

It is a bit hard to decide the right fix tag. One reason is that the problem depends on PREEMPT_RT which is enabled in v6.12. Considering the softirq_expiry_lock lock exists since v5.4 and bpf_timer is introduced in v5.15, the bpf_timer commit is used in the fixes tag and an extra depends-on tag is added to state the dependency on PREEMPT_RT.

Depends-on: v6.12+ with PREEMPT_RT enabled

AI-Powered Analysis

Last updated: 06/30/2025, 09:39:49 UTC

Technical Analysis

CVE-2025-21825 affects the timer cancellation path of the Berkeley Packet Filter (BPF) subsystem in Linux kernels built with PREEMPT_RT (real-time preemption). The issue arises when an element of a pre-allocated BPF hash map (htab) is overwritten. The old element is freed while the bucket lock is still held, because after alloc_htab_elem() returns the old element has already been stashed in htab->extra_elems; freeing it after the lock is released would let a concurrent update reuse the element while it is still being freed.

The bucket lock is a raw spin-lock, so code running under it must not sleep. However, check_and_free_fields() can reach hrtimer_cancel() through bpf_timer_cancel_and_free(), and when the timer callback is currently running, hrtimer_cancel() tries to acquire softirq_expiry_lock. Under PREEMPT_RT that lock is a sleeping lock, so taking it inside the raw-spin-lock section violates the lockdep rules and is reported as "BUG: scheduling while atomic". The root cause is therefore a sleeping lock acquired in atomic context, not the element-reuse race itself, which the bucket lock already prevents.

The fix splits the cancellation of the BPF timer into two steps on PREEMPT_RT: it first calls the non-blocking hrtimer_try_to_cancel() and checks the return value, and only if the timer callback is running does it hand a blocking hrtimer_cancel() to a kworker, which runs outside the locked section. The trade-off is that, when the timer is running, its cancellation is delayed when the last map uref is released. The upstream commit notes that this delay can be revised later if necessary.

The problem depends on PREEMPT_RT, which according to the commit message is enabled in v6.12, so the fix carries a "Depends-on: v6.12+ with PREEMPT_RT enabled" tag. The components involved are older: softirq_expiry_lock has existed since v5.4 and bpf_timer was introduced in v5.15, which is why the bpf_timer commit is used in the Fixes tag. No exploits are known in the wild. The demonstrated effect is a kernel warning, and scheduling in atomic context can potentially lead to system instability or denial of service. The dependency on a real-time kernel configuration makes this a niche issue, but an important one for environments that run PREEMPT_RT kernels and use BPF timers.
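The two-step cancellation described above can be illustrated with a small userspace model. This is a sketch of the control flow only, not kernel code: the names ModelTimer, try_to_cancel, cancel_in_locked_context, and run_deferred are hypothetical, and the deque merely stands in for work queued to a kworker. The return convention of try_to_cancel follows that of hrtimer_try_to_cancel() (0 if the timer was not active, 1 if it was cancelled, -1 if its callback is currently running).

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum


class TimerState(Enum):
    INACTIVE = 0  # not queued
    QUEUED = 1    # queued, callback not running
    RUNNING = 2   # callback currently executing


@dataclass
class ModelTimer:
    """Hypothetical stand-in for a timer embedded in a map element."""
    state: TimerState = TimerState.INACTIVE


def try_to_cancel(timer: ModelTimer) -> int:
    """Non-blocking cancel: 0 = was inactive, 1 = cancelled, -1 = callback running."""
    if timer.state is TimerState.RUNNING:
        return -1
    if timer.state is TimerState.QUEUED:
        timer.state = TimerState.INACTIVE
        return 1
    return 0


def cancel_in_locked_context(timer: ModelTimer, deferred: deque) -> bool:
    """Step 1, modelled as running under the bucket lock: never blocks.

    Returns True if the timer is already stopped, False if the blocking
    cancel had to be deferred (the deque models queueing to a kworker).
    """
    if try_to_cancel(timer) < 0:
        deferred.append(timer)
        return False
    return True


def run_deferred(deferred: deque) -> int:
    """Step 2, modelled as the kworker: a blocking cancel is allowed here."""
    handled = 0
    while deferred:
        timer = deferred.popleft()
        # In the kernel this is where hrtimer_cancel() would wait for the
        # running callback; the model simply marks the timer as stopped.
        timer.state = TimerState.INACTIVE
        handled += 1
    return handled
```

In this model, a timer in the QUEUED state is stopped immediately in step 1, while a timer in the RUNNING state is only stopped once run_deferred() executes, which mirrors the delayed cancellation the fix accepts as a trade-off.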

Potential Impact

For European organizations, CVE-2025-21825 mainly concerns systems running Linux kernel 6.12 or newer with PREEMPT_RT enabled. Such kernels are typically used in real-time or low-latency environments such as telecommunications, industrial control systems, automotive systems, and specialized embedded devices. When triggered, the bug produces a "scheduling while atomic" kernel warning and may lead to system instability or crashes, that is, a denial of service. This can disrupt critical infrastructure operations, manufacturing processes, or real-time data processing applications.

Confidentiality and integrity impacts are minimal, because the flaw is a concurrency and locking bug rather than a direct code execution or privilege escalation flaw. The main risk is to availability: organizations relying on real-time Linux kernels for critical operations may experience operational downtime, degraded performance, or increased maintenance overhead. On patched kernels, the delayed timer cancellation introduced by the fix could also affect timing-sensitive applications. Because PREEMPT_RT is a specialized configuration, the impact is concentrated in sectors such as telecommunications, automotive manufacturing, and industrial automation, which are prevalent in Europe and where disruptions could cascade into supply chains and service delivery.
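The exposure condition stated above (kernel 6.12 or newer with PREEMPT_RT enabled) can be turned into a rough triage check. The sketch below is a heuristic under stated assumptions: the function names are hypothetical, it only looks at the release string and at a CONFIG_PREEMPT_RT=y line in the kernel configuration text, and it cannot tell whether a vendor has already backported the fix or whether BPF timers are actually in use.

```python
import re


def parse_release(release: str) -> tuple:
    """Extract (major, minor) from a kernel release string such as '6.12.0-rt'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def config_enables_preempt_rt(config_text: str) -> bool:
    """True if the kernel config text contains the line CONFIG_PREEMPT_RT=y."""
    return any(
        line.strip() == "CONFIG_PREEMPT_RT=y"
        for line in config_text.splitlines()
    )


def may_be_exposed(release: str, config_text: str) -> bool:
    """Heuristic only: v6.12+ and PREEMPT_RT enabled; patch level is NOT checked."""
    return parse_release(release) >= (6, 12) and config_enables_preempt_rt(config_text)
```

On a live host, the release string could come from platform.release() and the configuration text from a file such as /boot/config-<release>; that path is a common distribution convention rather than a guarantee, so treat it as an assumption. A positive result only means the host deserves a closer look at its kernel patch level.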

Mitigation Recommendations

1. Update Linux Kernels: European organizations using PREEMPT_RT-enabled Linux kernels should promptly update to the patched kernel versions where this vulnerability is fixed. Monitor Linux kernel mailing lists and vendor advisories for backported patches if using long-term support kernels.
2. Audit Kernel Configurations: Verify whether PREEMPT_RT is enabled on production systems, especially in real-time or embedded environments. Disable PREEMPT_RT if real-time capabilities are not required, as this reduces exposure.
3. Limit BPF Timer Usage: Where possible, restrict or audit the use of BPF programs that utilize timers, particularly in environments with PREEMPT_RT enabled, to minimize triggering the vulnerable code paths.
4. Implement Kernel Locking Best Practices: For organizations developing or maintaining custom kernel modules or BPF programs, ensure adherence to locking rules and avoid holding conflicting locks simultaneously.
5. Monitor System Logs: Deploy monitoring to detect kernel warnings related to scheduling while atomic or lockdep violations, which may indicate attempts to exploit or trigger the vulnerability.
6. Test Updates in Controlled Environments: Given the complexity of PREEMPT_RT and real-time systems, thoroughly test kernel updates in staging environments to ensure stability and performance before production deployment.
7. Collaborate with Vendors: Engage with Linux distribution vendors and hardware manufacturers to obtain timely patches and guidance tailored to specific hardware and software stacks used in critical infrastructure.
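For the log-monitoring recommendation, a minimal scanner might look like the following. It is a sketch, assuming kernel log text is already available as a string: it flags "BUG: scheduling while atomic" reports whose following lines mention both bpf_timer_cancel_and_free and hrtimer_cancel, the two frames that appear in the call trace in the Description. The function name, the hint tuple, and the 60-line window are illustrative choices, not part of any existing tool.

```python
MARKER = "BUG: scheduling while atomic"
# Frames taken from the call trace quoted in the advisory description.
TRACE_HINTS = ("bpf_timer_cancel_and_free", "hrtimer_cancel")


def find_suspect_reports(log_text: str, window: int = 60) -> list:
    """Return the marker lines whose following `window` lines contain every hint."""
    lines = log_text.splitlines()
    hits = []
    for index, line in enumerate(lines):
        if MARKER not in line:
            continue
        context = lines[index + 1 : index + 1 + window]
        if all(any(hint in entry for entry in context) for hint in TRACE_HINTS):
            hits.append(line.strip())
    return hits
```

A match is only an indication that the vulnerable path was hit; other bugs can produce the same marker line, which is why the scanner also requires the BPF timer frames before reporting.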

Technical Details

Data Version: 5.1
Assigner Short Name: Linux
Date Reserved: 2024-12-29T08:45:45.775Z
CISA Enriched: false
CVSS Version: null
State: PUBLISHED

Threat ID: 682d9832c4522896dcbe8937

Added to database: 5/21/2025, 9:09:06 AM

Last enriched: 6/30/2025, 9:39:49 AM

Last updated: 7/27/2025, 1:03:44 AM
