CVE-2025-21880: Vulnerability in the Linux kernel
In the Linux kernel, the following vulnerability has been resolved:

drm/xe/userptr: fix EFAULT handling

Currently we treat EFAULT from hmm_range_fault() as a non-fatal error when called from xe_vm_userptr_pin() with the idea that we want to avoid killing the entire vm and chucking an error, under the assumption that the user just did an unmap or something, and has no intention of actually touching that memory from the GPU. At this point we have already zapped the PTEs so any access should generate a page fault, and if the pin fails there also it will then become fatal.

However it looks like it's possible for the userptr vma to still be on the rebind list in preempt_rebind_work_func(), if we had to retry the pin again due to something happening in the caller before we did the rebind step, but in the meantime needing to re-validate the userptr and this time hitting the EFAULT.

This explains an internal user report of hitting:

[ 191.738349] WARNING: CPU: 1 PID: 157 at drivers/gpu/drm/xe/xe_res_cursor.h:158 xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738551] Workqueue: xe-ordered-wq preempt_rebind_work_func [xe]
[ 191.738616] RIP: 0010:xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738690] Call Trace:
[ 191.738692] <TASK>
[ 191.738694] ? show_regs+0x69/0x80
[ 191.738698] ? __warn+0x93/0x1a0
[ 191.738703] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738759] ? report_bug+0x18f/0x1a0
[ 191.738764] ? handle_bug+0x63/0xa0
[ 191.738767] ? exc_invalid_op+0x19/0x70
[ 191.738770] ? asm_exc_invalid_op+0x1b/0x20
[ 191.738777] ? xe_pt_stage_bind.constprop.0+0x60a/0x6b0 [xe]
[ 191.738834] ? ret_from_fork_asm+0x1a/0x30
[ 191.738849] bind_op_prepare+0x105/0x7b0 [xe]
[ 191.738906] ? dma_resv_reserve_fences+0x301/0x380
[ 191.738912] xe_pt_update_ops_prepare+0x28c/0x4b0 [xe]
[ 191.738966] ? kmemleak_alloc+0x4b/0x80
[ 191.738973] ops_execute+0x188/0x9d0 [xe]
[ 191.739036] xe_vm_rebind+0x4ce/0x5a0 [xe]
[ 191.739098] ? trace_hardirqs_on+0x4d/0x60
[ 191.739112] preempt_rebind_work_func+0x76f/0xd00 [xe]

Followed by NPD, when running some workload, since the sg was never actually populated but the vma is still marked for rebind when it should be skipped for this special EFAULT case. This is confirmed to fix the user report.

v2 (MattB):
- Move earlier.
v3 (MattB):
- Update the commit message to make it clear that this indeed fixes the issue.

(cherry picked from commit 6b93cb98910c826c2e2004942f8b060311e43618)
AI Analysis
Technical Summary
CVE-2025-21880 is a vulnerability in the Linux kernel's Direct Rendering Manager (DRM) subsystem, specifically in the Intel Xe graphics driver (xe). It concerns how the driver handles EFAULT returned by hmm_range_fault() while pinning user pointer (userptr) memory. When xe_vm_userptr_pin() receives EFAULT, it treats the error as non-fatal, on the assumption that the user has unmapped the memory and does not intend to access it from the GPU. The PTEs have already been zapped at that point, so any later GPU access should raise a page fault, and a failed pin in that path is fatal.

The problem is an ordering issue. If the caller has to retry the pin before the rebind step runs, and the userptr must be revalidated in the meantime and this time hits EFAULT, the userptr virtual memory area (VMA) can still be on the rebind list when preempt_rebind_work_func() processes it. The driver then tries to rebind memory whose scatter-gather list (sg) was never populated. The reported result is a kernel WARNING in xe_pt_stage_bind() followed by a null pointer dereference (NPD), which may crash the system. The fix adjusts the EFAULT handling so that such a VMA is no longer queued for rebind in this special case.

The vulnerability is specific to Linux kernel versions containing the affected commit (521db22a1d70dbc596a07544a738416025b1b63c) and affects systems that use the Intel Xe GPU driver with userptr mappings. It originated from an internal user report; the fix was cherry-picked from commit 6b93cb98910c826c2e2004942f8b060311e43618, and the commit message states that it is confirmed to resolve that report.
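The retry sequence described above can be hard to follow in prose, so here is a minimal Python sketch of it. This is a toy model and not the kernel code: the classes `UserptrVma` and `Vm`, the helper `pin_pages()`, and the `skip_rebind_on_efault` flag are all hypothetical names introduced for illustration, and the list handling only approximates what the driver does.

```python
from dataclasses import dataclass, field
from typing import Optional

EFAULT = 14  # errno value for "Bad address" on Linux


@dataclass
class UserptrVma:
    """Toy stand-in for a userptr VMA; sg stays None until a pin succeeds."""
    name: str
    sg: Optional[list] = None


@dataclass
class Vm:
    """Toy VM holding the two work lists discussed in the advisory."""
    repin_list: list = field(default_factory=list)
    rebind_list: list = field(default_factory=list)


def pin_pages(vma, mapped):
    """Hypothetical stand-in for the hmm_range_fault() step."""
    if not mapped:
        return -EFAULT
    vma.sg = ["page-of-" + vma.name]
    return 0


def userptr_pin(vm, mapped, skip_rebind_on_efault):
    """Model of the pin loop: EFAULT is non-fatal, success queues a rebind."""
    for vma in list(vm.repin_list):
        err = pin_pages(vma, mapped)
        vm.repin_list.remove(vma)
        if err == -EFAULT:
            vma.sg = None
            # Assumed shape of the fix: also drop a stale rebind entry.
            if skip_rebind_on_efault and vma in vm.rebind_list:
                vm.rebind_list.remove(vma)
            continue
        if vma not in vm.rebind_list:
            vm.rebind_list.append(vma)


def rebind(vm):
    """Return names of VMAs that would be bound without backing pages."""
    return [vma.name for vma in vm.rebind_list if vma.sg is None]


def scenario(skip_rebind_on_efault):
    vm = Vm()
    vma = UserptrVma("buf0")
    vm.repin_list.append(vma)
    # First pin succeeds and queues the VMA for rebind.
    userptr_pin(vm, True, skip_rebind_on_efault)
    # The caller retries before rebinding; meanwhile the range was unmapped.
    vma.sg = None
    vm.repin_list.append(vma)
    userptr_pin(vm, False, skip_rebind_on_efault)
    return rebind(vm)
```

Calling `scenario(False)` models the unpatched behaviour, where the VMA stays queued for rebind with no sg, and `scenario(True)` models the patched behaviour, where the rebind list is empty by the time the rebind step runs.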
Potential Impact
For European organizations, the impact of CVE-2025-21880 primarily concerns systems running Linux kernels with the affected Intel Xe graphics driver, especially where workloads use GPU user pointer (userptr) mappings. This includes data centers, cloud providers, research institutions, and enterprises that rely on GPU-accelerated workloads such as AI/ML, scientific computing, or graphics rendering. The vulnerability can lead to kernel warnings, null pointer dereferences, and potentially system crashes (kernel panics), resulting in denial of service (DoS). Such disruptions can affect the availability of critical services, cause loss of unsaved work, and increase operational costs through downtime and recovery effort. The vulnerability does not appear to allow privilege escalation or remote code execution, but a local process with access to the GPU device could potentially trigger the instability to degrade system reliability. In regulated industries or sectors with strict uptime requirements (e.g., finance, healthcare, manufacturing), these disruptions can have significant operational and compliance consequences. The absence of known exploits in the wild reduces the immediate risk but does not eliminate it.
Mitigation Recommendations
European organizations should apply the Linux kernel patches that address CVE-2025-21880 as soon as they are available from their Linux distribution vendors or the upstream kernel maintainers. Specifically, ensure that the running kernel includes the fix cherry-picked from commit 6b93cb98910c826c2e2004942f8b060311e43618. Where immediate patching is not feasible, consider the following mitigations:
1) Disable or restrict use of Intel Xe GPU userptr functionality if it is not required, thereby reducing the attack surface.
2) Apply strict access controls on systems with GPU workloads and monitor them for abnormal kernel warnings or crashes related to DRM or GPU drivers.
3) Use kernel live patching where supported to apply fixes without a full reboot, minimizing downtime.
4) Test GPU workloads thoroughly after patching to confirm that stability and performance are maintained.
5) Maintain up-to-date backups and incident response plans to recover quickly from DoS incidents caused by this vulnerability.
6) Work with hardware and software vendors to receive timely updates and advisories on GPU driver vulnerabilities.
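For the monitoring recommendation, one possible approach is to scan kernel log text for the signature shown in the trace quoted in the description. The sketch below is illustrative only: the function name `matches_cve_2025_21880_signature` is hypothetical, the patterns are derived solely from the trace in this advisory, and a match indicates a symptom consistent with the issue rather than proof of it.

```python
import re

# Patterns derived from the trace quoted in the advisory: a WARNING raised in
# xe_pt_stage_bind (via xe_res_cursor.h) while the preempt_rebind_work_func
# worker is running. The source line number may differ between kernel builds,
# so any number is accepted.
WARNING_RE = re.compile(r"WARNING: .*xe_res_cursor\.h:\d+ xe_pt_stage_bind")
WORKER_RE = re.compile(r"Workqueue: .*preempt_rebind_work_func \[xe\]")


def matches_cve_2025_21880_signature(log_text):
    """Return True if the log text contains both the warning and worker lines."""
    has_warning = WARNING_RE.search(log_text) is not None
    has_worker = WORKER_RE.search(log_text) is not None
    return has_warning and has_worker
```

The function takes plain text, so it can be fed saved kernel log output (for example, the captured output of `dmesg`) from whatever log collection is already in place.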
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Finland, Denmark, Ireland, Belgium, Italy
Technical Details
- Data Version: 5.1
- Assigner Short Name: Linux
- Date Reserved: 2024-12-29T08:45:45.782Z
- CISA Enriched: false
- CVSS Version: null
- State: PUBLISHED
Threat ID: 682d9832c4522896dcbe8ac3
Added to database: 5/21/2025, 9:09:06 AM
Last enriched: 6/30/2025, 10:13:14 AM
Last updated: 8/1/2025, 12:41:11 AM
Related Threats
- CVE-2025-9096: Cross Site Scripting in ExpressGateway express-gateway (Medium)
- CVE-2025-9095: Cross Site Scripting in ExpressGateway express-gateway (Medium)
- CVE-2025-7342: CWE-798 Use of Hard-coded Credentials in Kubernetes Image Builder (High)
- CVE-2025-9094: Improper Neutralization of Special Elements Used in a Template Engine in ThingsBoard (Medium)
- CVE-2025-9093: Improper Export of Android Application Components in BuzzFeed App (Medium)