LiteLLM AI Gateway: Active Exploitation via MCP Injection

Authors: Cloud Security Alliance AI Safety Initiative
Published: 2026-06-13

Categories: AI Infrastructure Security, Vulnerability Intelligence, Supply Chain Security

Download PDF

Key Takeaways

CVE-2026-42271 is a high-severity command injection vulnerability (CVSS 8.7) in LiteLLM, a widely deployed open-source AI gateway and proxy server, affecting all versions from 1.74.2 through 1.83.6.
The vulnerability resides in MCP test endpoints that spawn user-supplied subprocess configurations on the proxy host with no command allowlist or administrative role gate, allowing any authenticated holder of a proxy API key to execute arbitrary commands on the underlying system.
Researchers at Horizon3.ai demonstrated that CVE-2026-42271 can be chained with CVE-2026-48710 — a Host header authentication bypass in the Starlette web framework known as BadHost — to achieve unauthenticated remote code execution from any network-reachable host. The combined attack chain carries a CVSS score of 10.0 [1].
CISA added CVE-2026-42271 to its Known Exploited Vulnerabilities (KEV) catalog on June 8, 2026, after confirming active in-the-wild exploitation [2]. Federal civilian agencies are subject to mandatory remediation timelines under Binding Operational Directive 26-04 [14].
The authorized fix requires upgrading LiteLLM to version 1.83.7 and Starlette to version 1.0.1. Organizations running any earlier combination are exposed to the full unauthenticated RCE chain.
LiteLLM’s role as a centralized proxy for LLM API traffic means successful exploitation exposes not only the gateway host but the full set of model provider API keys, usage logs, and downstream AI infrastructure connected through the proxy.

Background

LiteLLM is an open-source Python library and proxy server developed by BerriAI that provides a unified, OpenAI-compatible interface for routing requests across more than one hundred large language model providers, including OpenAI, Anthropic, Google Vertex, Cohere, and Azure [3]. Organizations adopt LiteLLM as an AI gateway to centralize model access, enforce usage policies and spend controls, apply rate limiting, and maintain consistent logging and observability across a heterogeneous portfolio of AI services. LiteLLM’s proxy mode exposes an HTTP API that translates incoming requests to provider-specific formats before forwarding them, and this deployment pattern is the most common in enterprise environments.

The gateway pattern creates a concentrated credential surface. A production LiteLLM deployment typically holds API keys for every LLM provider it routes to, database credentials for its usage logging store, and configuration controlling which teams and services can access which models. This concentration of sensitive material, combined with LiteLLM’s position on the network path between internal AI consumers and external model providers, makes it a high-value target. A compromised LiteLLM instance gives an attacker both the credentials to impersonate legitimate AI workloads across all connected providers and the ability to observe or manipulate every request flowing through the proxy.

LiteLLM has become one of the more widely adopted AI gateway libraries, with reported integrations across AI agent frameworks, orchestration platforms, and enterprise AI infrastructure stacks. The breadth of its deployment amplifies the consequence of any security flaw: a vulnerability in LiteLLM has the potential to affect organizations across many sectors simultaneously, compressing the window between disclosure and mass exploitation.

Separately from the RCE vulnerability addressed in this note, LiteLLM experienced a supply chain compromise in March 2026. Two PyPI releases — versions 1.82.7 and 1.82.8 — were published by a threat actor group known as TeamPCP after the group obtained a maintainer’s PyPI publishing credentials [4]. The malicious releases were live for approximately forty minutes before PyPI quarantined them, but organizations that installed either version during that window may have received a three-stage payload designed to harvest credentials, perform Kubernetes lateral movement, and install a persistent systemd backdoor [5][12]. Datadog Security Labs linked this campaign to a broader series of supply chain attacks that also targeted the Trivy container security scanner and the Checkmarx KICS policy engine [5][12]. While the supply chain incident and CVE-2026-42271 are distinct events, their proximity underscores a sustained pattern of adversarial focus on LiteLLM as a high-value target within AI infrastructure, and organizations assessing their LiteLLM exposure should evaluate both attack vectors.

Security Analysis

CVE-2026-42271: Command Injection via MCP Test Endpoints

The vulnerability resides in two endpoints introduced to support LiteLLM’s Model Context Protocol (MCP) integration: POST /mcp-rest/test/connection and POST /mcp-rest/test/tools/list. These endpoints were designed to allow proxy operators to verify MCP server connectivity by supplying a server configuration inline in the request body. The configuration format mirrors the MCP stdio transport specification, accepting command, args, and env fields that together describe a subprocess to be launched on the proxy host.

The root cause is a combination of missing authorization enforcement and unsafe handling of caller-controlled process configuration. The endpoints pass the supplied configuration directly to a subprocess invocation with no allowlist of permitted commands, no restriction to an administrative role, and no sandbox isolating the spawned process from the host operating system [1][6]. Any user holding a valid LiteLLM proxy API key — a credential commonly distributed across developer teams and automated services to enable model access — can submit an arbitrary executable path and argument array and cause the proxy to execute it. The CVSS 8.7 score reflects the high impact of code execution on the proxy host and the low exploit complexity, with only the initial authentication requirement preventing it from scoring critical [6].

Successful exploitation of CVE-2026-42271 alone allows an authenticated attacker to execute arbitrary commands as the LiteLLM proxy process, read environment variables containing provider API keys injected at container startup, access credential files mounted into the proxy container, and extract LiteLLM’s configuration and database contents. In cloud-native deployments where the proxy runs under a Kubernetes service account, the attack surface extends to the cluster control plane if the service account holds cluster-scoped permissions [1].

The BadHost Amplifier: Unauthenticated RCE via CVE-2026-48710

Standing alone, CVE-2026-42271 requires a valid API key — which limits initial exploitation to insiders and attackers who have already obtained a proxy credential by other means. The threat profile changes fundamentally when CVE-2026-42271 is combined with CVE-2026-48710, a Host header authentication bypass in the Starlette ASGI web framework documented under the BadHost moniker [7][8].

Starlette is the ASGI framework underpinning LiteLLM’s HTTP layer. In versions prior to 1.0.1, Starlette did not validate the HTTP Host header before incorporating it into the request.url object used by middleware to make routing and authentication decisions. A crafted Host header value causes Starlette’s URL reconstruction to produce a path that does not match the middleware’s authentication enforcement pattern, effectively bypassing the authentication check for any endpoint [7]. The vulnerability spans three independent layers: ASGI servers pass the raw Host header without validation, Starlette trusts it for URL construction, and middleware authors assume that request.url.path is safe to use for authentication decisions. The Belgian Centre for Cybersecurity observed that CVE-2026-48710 affects a large swath of Python AI infrastructure that depends on Starlette, including FastAPI services, vLLM inference servers, MCP server frameworks, agent harnesses, and evaluation dashboards [8].

Horizon3.ai researchers validated the full unauthenticated RCE chain in a controlled environment: a forged Host header that bypasses Starlette’s authentication middleware, combined with a POST to the vulnerable MCP test endpoint, triggers arbitrary subprocess execution on the LiteLLM host without any API key [1]. The combined attack chain carries a CVSS 10.0 Critical rating. Exploitation requires only that the attacker be able to send HTTP requests to the LiteLLM proxy — a condition satisfied by any publicly accessible deployment and, in enterprise environments without micro-segmentation, potentially by any host on the same internal network segment.

Disclosure, Patching, and Active Exploitation

LiteLLM released version 1.83.7 on May 8, 2026, introducing authorization controls that restrict the MCP test endpoints to the PROXY_ADMIN role and updating the Starlette dependency to 1.0.1 [6]. CISA confirmed active in-the-wild exploitation and added CVE-2026-42271 to the KEV catalog on June 8, 2026 — approximately five weeks after the patch was published [2][13]. The five-week gap between patch availability and KEV listing suggests threat actors moved to weaponize the advisory within weeks of its publication, though CISA’s confirmation timeline also reflects its internal validation requirements.

Security media reporting confirmed that exploitation activity began shortly after the public advisory was released, consistent with an attacker community that actively monitors vulnerability disclosures for AI infrastructure components and moves quickly to operationalize new findings [9][10]. Observed post-exploitation behavior includes web shell installation for persistent access, credential harvesting from proxy environment variables and configuration files, and — in Kubernetes-hosted deployments — attempted lateral movement using service account tokens accessible from the compromised container [11].

Exposure Scope and Target Profile

Given the breadth of AI program activity in sectors such as technology, finance, healthcare, and research, LiteLLM deployments likely span a wide range of organizations wherever unified LLM gateway patterns have been adopted. The gateway’s role as the routing and policy-enforcement layer for AI workloads means it frequently has network-level access to AI agent frameworks, retrieval-augmented generation pipelines, vector databases, and production data stores that are connected through the proxy. Attackers who successfully compromise a LiteLLM deployment gain not only a foothold on the gateway host but a vantage point into the AI workloads and data pipelines the proxy serves.

Recommendations

Immediate Actions

Organizations running any version of LiteLLM from 1.74.2 through 1.83.6 should treat this as an emergency upgrade. The patch in version 1.83.7 restricts the vulnerable MCP test endpoints to the PROXY_ADMIN role, eliminating the condition that allows non-admin users to trigger subprocess execution on the proxy host [1][6]. Starlette must be upgraded to version 1.0.1 or later in the same remediation cycle; without the Starlette update, the BadHost authentication bypass remains available and the full CVSS 10.0 unauthenticated chain is still exploitable even against the patched LiteLLM code.

For deployments where an immediate upgrade is not operationally feasible, the highest-impact interim control is to remove public internet access to the LiteLLM proxy and restrict network reachability to only the internal services that require it. This substantially reduces the unauthenticated attack surface from external networks while the upgrade is scheduled. Within the constrained internal perimeter, security teams should also audit proxy API key distribution and revoke keys held by users or services that do not have an active operational need, reducing the population of principals who could exploit CVE-2026-42271 without the BadHost bypass.

CISA’s KEV listing creates a mandatory remediation obligation for U.S. federal civilian agencies under BOD 26-04 [14]; non-federal organizations should treat the listing as a high-confidence signal of active exploitation and prioritize accordingly [2]. Any LiteLLM host should be treated as potentially compromised until both the upgrade and a compromise assessment are completed. Indicators of compromise include unexpected processes spawned by the LiteLLM service account, new or modified files in cron directories or systemd unit directories, unfamiliar outbound network connections from the proxy host, unexplained changes to environment variables, and new or modified API key entries in the LiteLLM database.

Short-Term Mitigations

Within thirty days, security teams should conduct a complete inventory of LiteLLM deployments across the organization, including informal instances created by development teams without formal security review. AI gateway software may be adopted by individual teams or projects without formal security oversight, and such deployments may be running older versions without monitoring coverage. All LLM provider API keys held by the LiteLLM proxy should be rotated as part of post-incident response, since the command injection vulnerability allows an attacker to read environment variables containing these credentials.

Network segmentation should ensure that LiteLLM proxies are accessible only to authorized services and users, not to broad internal segments. Security information and event management platforms should be configured to alert on requests to the MCP test endpoints even in patched deployments, as historically, patched vulnerability categories attract follow-on research for bypasses in related functionality. Organizations that used LiteLLM versions 1.82.7 or 1.82.8, installed during the forty-minute March 2026 supply chain window, should audit for the presence of unexpected .pth files in Python site-packages directories, unexplained systemd units, and outbound connections from the LiteLLM host to unfamiliar infrastructure [4][5].

Strategic Considerations

The convergence of multiple critical-severity vulnerabilities in LiteLLM within a single quarter reflects a broader dynamic that security architects should factor into AI infrastructure design. AI gateways and middleware components occupy a structurally privileged position in the AI stack — they hold credentials for every model provider they route to, observe all AI traffic, and often have access to data sources used for retrieval. LiteLLM’s vulnerable MCP test endpoints illustrate a risk pattern worth examining across AI middleware components more broadly: convenience features introduced to accelerate testing or experimentation may not receive the authorization controls appropriate for a gateway with this trust posture.

Security governance programs should establish explicit ownership, patch management obligations, and security review requirements for AI middleware components. The trust placed in an AI gateway warrants a security posture comparable to that applied to identity and access management systems, and procurement and deployment approval processes should evaluate AI middleware vendors accordingly. Evaluation criteria should include secure development lifecycle practices, vulnerability disclosure processes, and response timeliness.

The reach of CVE-2026-48710 beyond LiteLLM to the broader Starlette ecosystem warrants a targeted audit across the organization’s Python AI infrastructure. Any service built on FastAPI, vLLM, or other Starlette-based frameworks should be assessed for their Starlette version and upgraded to 1.0.1 to eliminate the BadHost authentication bypass condition, independent of any direct LiteLLM exposure [7][8].

CSA Resource Alignment

CVE-2026-42271 and the surrounding exploitation campaign map directly to threat categories and control domains within CSA’s published frameworks.

Within the MAESTRO framework for agentic AI threat modeling, the LiteLLM vulnerability engages Layer 1 (Infrastructure Security) and Layer 2 (AI Model and API Security). At the infrastructure layer, the absence of authorization controls on the MCP test endpoints represents a foundational failure of least-privilege access enforcement for infrastructure management operations. At the model and API layer, the gateway’s role as mediator of all LLM API traffic means a compromised proxy can intercept, modify, or exfiltrate requests and responses across every AI workload it serves — a threat scenario that MAESTRO’s Layer 2 analysis addresses directly. Agentic workflows that route through a compromised LiteLLM instance are vulnerable to prompt injection, output manipulation, and credential theft that affects every model interaction in the pipeline.

The AI Controls Matrix (AICM) addresses the control failures that made this vulnerability exploitable. AICM controls on identity and access management — requiring that administrative operations enforce role-based access controls rather than relying on a flat API key model — would have prevented non-admin users from triggering the vulnerable subprocess execution. AICM’s vulnerability management controls, covering patch cadence and exposure monitoring for AI tooling, are relevant to the five-week exploitation window that opened between patch publication and CISA KEV confirmation. AICM’s shared responsibility model applies here as well: the upstream vendor bears responsibility for shipping secure code, but the application operator retains responsibility for deployment configuration, key rotation practices, and network exposure.

CSA’s Zero Trust guidance applies to the exposure dimension of the unauthenticated chain. The BadHost attack succeeded in part because LiteLLM deployments were reachable from broad network segments with no identity verification beyond the Starlette middleware layer. A Zero Trust posture enforcing per-request identity verification at the network layer would likely have significantly constrained the blast radius of the Starlette bypass, reducing the viability of the zero-credential exploitation path for external attackers.

The CSA STAR program provides a mechanism for evaluating AI infrastructure vendors against baseline security controls. Organizations procuring AI middleware should use STAR assessments to verify that vendors maintain secure development lifecycle practices, respond to vulnerability disclosures within defined timelines, and publish clear remediation guidance with each advisory. The LiteLLM supply chain compromise and the subsequent RCE vulnerability illustrate that point-in-time procurement assessments are insufficient; continuous vendor security evaluation is required for components with the trust posture of an AI gateway.

References

[1] Horizon3.ai. “CVE-2026-42271: LiteLLM Unauthenticated RCE.” Horizon3.ai Attack Research, 2026.

[2] CISA. “CISA Adds Two Known Exploited Vulnerabilities to Catalog.” Cybersecurity and Infrastructure Security Agency, June 2026.

[3] LiteLLM. “LiteLLM — Call 100+ LLMs in a consistent format.” BerriAI, 2026.

[4] LiteLLM. “Security Update: Suspected Supply Chain Incident.” LiteLLM Blog, March 2026.

[5] Datadog Security Labs. “LiteLLM and Telnyx compromised on PyPI: Tracing the TeamPCP supply chain campaign.” Datadog Security Labs, 2026.

[6] SentinelOne. “CVE-2026-42271: Litellm Litellm RCE Vulnerability.” SentinelOne Vulnerability Database, 2026.

[7] OSTIF. “Disclosing the BADHOST Vulnerability in Starlette.” Open Source Technology Improvement Fund, 2026.

[8] Centre for Cybersecurity Belgium. “Warning: Vulnerability in Starlette framework and related frameworks like FastAPI exposes millions of servers to authentication bypass.” CCB, 2026.

[9] The Hacker News. “LiteLLM Flaw CVE-2026-42271 Exploited in the Wild, Chains to Unauthenticated RCE.” The Hacker News, June 2026.

[10] Help Net Security. “LiteLLM vulnerability under active attack, CISA warns (CVE-2026-42271).” Help Net Security, June 2026.

[11] Rescana. “Active Exploitation Alert: CVE-2026-42271 and CVE-2026-48710 — Unauthenticated RCE in LiteLLM AI Gateway via Starlette Host Header Bypass.” Rescana, 2026.

[12] Kaspersky. “Trojanization of Trivy, Checkmarx, and LiteLLM solutions.” Kaspersky Blog, 2026.

[13] SOCRadar. “CISA KEV Highlights LiteLLM RCE (CVE-2026-42271) & Check Point VPN Auth Bypass.” SOCRadar, June 2026.

[14] CISA. “BOD 26-04: Prioritizing Security Updates Based on Risk.” CISA, June 2026.

← Back to Research Index