CVE-2025-6920: Missing Authentication for Critical Function in Red Hat AI Inference Server
A flaw was found in the authentication enforcement mechanism of a model inference API in ai-inference-server. All /v1/* endpoints are expected to enforce API key validation. However, the POST /invocations endpoint failed to do so, resulting in an authentication bypass. This vulnerability allows unauthorized users to access the same inference features available on protected endpoints, potentially exposing sensitive functionality or allowing unintended access to backend resources.
AI Analysis
Technical Summary
CVE-2025-6920 identifies a security flaw in the Red Hat AI Inference Server, specifically within its model inference API. The server exposes multiple endpoints under the /v1/* path, all of which are designed to require API key validation to authenticate and authorize clients. However, the POST /invocations endpoint fails to enforce this critical authentication check, effectively allowing any unauthenticated user to invoke AI model inference operations. This bypass occurs because the authentication enforcement mechanism does not apply to this endpoint, creating a gap in the security model. As a result, attackers can send inference requests without valid credentials, gaining access to AI inference features that should be protected. While the vulnerability does not directly compromise data integrity or availability, it can lead to unauthorized information disclosure or enable attackers to misuse backend AI resources. The flaw is remotely exploitable over the network without any privileges or user interaction, increasing its risk profile. Currently, there are no known exploits in the wild, and no patches have been linked yet. The CVSS v3.1 base score is 5.3, reflecting a medium severity level due to the ease of exploitation and potential confidentiality impact. This vulnerability is particularly relevant for organizations deploying Red Hat AI Inference Server in production environments, especially those exposing inference APIs to external or untrusted networks.
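The root cause described above — an authentication check scoped to the /v1/* path prefix that never fires for /invocations — can be illustrated with a minimal, hypothetical sketch. This is not Red Hat's actual code; the path names and key are illustrative only:

```python
# Hypothetical sketch of a prefix-scoped auth check (NOT the actual
# ai-inference-server implementation). The bug class: authentication is
# keyed off the URL prefix, so any route outside /v1/* is never checked.
API_KEY = "secret-key"  # placeholder credential for illustration


def requires_auth(path: str) -> bool:
    # Flawed policy: only /v1/* paths are subject to API key validation.
    return path.startswith("/v1/")


def handle_request(path: str, api_key) -> int:
    """Return an HTTP status code for a simulated request."""
    if requires_auth(path) and api_key != API_KEY:
        return 401  # rejected: missing or invalid key
    return 200      # served

# A protected endpoint rejects unauthenticated callers...
print(handle_request("/v1/completions", None))   # 401
# ...but /invocations falls outside the checked prefix and is served.
print(handle_request("/invocations", None))      # 200
```

The sketch shows why the bypass grants "the same inference features available on protected endpoints": both routes reach the same backend handler, and only the prefix test separates them.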
Potential Impact
For European organizations, this vulnerability poses a risk of unauthorized access to AI inference services, potentially exposing sensitive model outputs or enabling misuse of AI capabilities. Confidentiality is impacted as attackers can query AI models without authentication, possibly extracting sensitive information embedded in model responses. Although integrity and availability are not directly affected, unauthorized usage could lead to resource exhaustion or indirect service degradation. Organizations relying on AI inference for critical decision-making or handling sensitive data could face compliance and reputational risks if attackers exploit this flaw. The risk is heightened for sectors with AI-driven services such as finance, healthcare, manufacturing, and government agencies. Since the vulnerability requires no authentication or user interaction and is exploitable remotely, it broadens the attack surface, especially if the inference server is exposed beyond trusted internal networks. European entities with cloud or hybrid deployments of Red Hat AI Inference Server should be vigilant, as cloud environments may increase exposure. The absence of known exploits provides a window for proactive mitigation, but the medium severity score indicates that timely remediation is necessary to prevent potential abuse.
Mitigation Recommendations
1. Immediately restrict network access to the Red Hat AI Inference Server, ensuring that the /invocations endpoint is not exposed to untrusted networks or the public internet.
2. Implement network-level controls such as firewalls, VPNs, or zero-trust segmentation to limit access to authorized clients only.
3. Monitor API access logs for unusual or unauthorized requests targeting the /invocations endpoint to detect potential exploitation attempts.
4. Apply vendor patches or updates as soon as Red Hat releases a fix addressing this authentication bypass.
5. If patches are delayed, consider deploying a reverse proxy or API gateway in front of the inference server that enforces API key validation for all endpoints, including /invocations.
6. Conduct a thorough review of AI inference workflows to identify any sensitive data exposure risks and apply additional encryption or access controls as needed.
7. Educate development and operations teams about the vulnerability to ensure secure deployment practices and prompt response to security advisories.
8. Regularly audit and update API authentication mechanisms to prevent similar bypasses in the future.
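The gateway mitigation in recommendation 5 amounts to a deny-by-default policy: check the key on every path rather than an allow-list of prefixes. A minimal sketch of such a check, assuming a single static API key (EXPECTED_KEY is a placeholder, not a real credential):

```python
import hmac

EXPECTED_KEY = "replace-with-real-key"  # placeholder; load from a secret store in practice


def is_authorized(path: str, presented_key) -> bool:
    """Deny-by-default key check: every path, /invocations included, needs a valid key."""
    if presented_key is None:
        return False
    # Constant-time comparison avoids leaking key contents via timing.
    return hmac.compare_digest(presented_key, EXPECTED_KEY)

# The path argument is deliberately ignored for the auth decision:
# no route is exempt, which is what closes the /invocations gap.
print(is_authorized("/invocations", None))                    # False
print(is_authorized("/invocations", "replace-with-real-key"))  # True
```

Wiring this check into a reverse proxy or API gateway in front of the inference server (rather than patching the server itself) keeps the mitigation independent of the vulnerable code until a vendor fix ships.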
Affected Countries
Germany, France, United Kingdom, Netherlands, Sweden, Italy
Technical Details
- Data Version: 5.1
- Assigner Short Name: redhat
- Date Reserved: 2025-06-30T09:05:19.410Z
- CVSS Version: 3.1
- State: PUBLISHED
Threat ID: 6863e18a6f40f0eb728f87d5
Added to database: 7/1/2025, 1:24:26 PM
Last enriched: 11/20/2025, 9:41:53 PM
Last updated: 1/7/2026, 10:24:20 AM
Related Threats
- CVE-2025-68637: CWE-297 Improper Validation of Certificate with Host Mismatch in Apache Software Foundation Apache Uniffle (Unknown)
- CVE-2025-15158: CWE-434 Unrestricted Upload of File with Dangerous Type in eastsidecode WP Enable WebP (High)
- CVE-2025-15018: CWE-639 Authorization Bypass Through User-Controlled Key in djanym Optional Email (Critical)
- CVE-2025-15000: CWE-79 Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting') in tfrommen Page Keys (Medium)
- CVE-2025-14999: CWE-352 Cross-Site Request Forgery (CSRF) in kentothemes Latest Tabs (Medium)