Codex Compromised: npm Supply Chain Steals AI Developer Tokens

Authors: Cloud Security Alliance AI Safety Initiative
Published: 2026-06-06

Categories: Developer Security, Software Supply Chain Security, AI Tool Security, Credential Protection

Key Takeaways

  • Aikido Security disclosed on May 27, 2026 that the npm package codexui-android—a remote web UI for OpenAI Codex with approximately 29,000 weekly downloads—had been silently exfiltrating developer authentication tokens for roughly one month before discovery [1][2].
  • The attack followed an apparent trust-building strategy: the package operated legitimately for its first month of deployment, accumulating a real user base before a subsequent npm update injected the credential-theft payload; the public GitHub repository remained entirely clean throughout, making source-level audits ineffective [1][3].
  • Stolen data included the full OAuth bundle—access token, refresh token, ID token, and account ID—sent encrypted to an attacker-controlled server at sentry.anyclaw.store/startlog; the long-lived refresh token represents the most dangerous component, enabling extended silent impersonation of the victim without triggering new authentication events [1][4].
  • The developer identity “BrutalStrike,” which published the malicious npm package, also distributed similar applications on Google Play with a combined total of more than 60,000 additional installs, broadening the attack surface well beyond npm [2][5].
  • This incident is part of an accelerating pattern of npm supply chain attacks in 2026 that specifically target AI developer toolchains; the TrapDoor campaign alone spread 34 malicious packages in more than 384 versions [7], and related campaigns including Mini Shai-Hulud and the Miasma worm have distributed malicious code across npm, PyPI, GitHub Actions, Docker Hub, and the VS Code Marketplace, with AI service credentials increasingly the primary target [8][15].
  • Developers who installed any version of codexui-android after its initial publication should treat their OpenAI Codex credentials as fully compromised and rotate all associated tokens immediately.

Background

OpenAI Codex is a cloud-hosted AI coding agent platform that allows developers to delegate software engineering tasks—writing, testing, and debugging code—to an autonomous AI system operating in a sandboxed cloud environment. Unlike a passive autocomplete tool, Codex operates with read and write access to a developer’s repositories and executes code on their behalf, meaning that a compromised Codex credential grants an adversary not merely access to conversations but to the full scope of what the developer’s account can do programmatically.

The npm package ecosystem, which hosts more than three million packages and serves billions of weekly downloads [17], has long been a target for supply chain attacks. The traditional approach involved typosquatting—registering packages with names nearly identical to popular libraries—to catch developers who mistype dependency names [6]. A more sophisticated variant emerged with dependency confusion attacks, which exploit the way private package managers resolve package names when the same name exists in both a private registry and the public npm repository [7]. The codexui-android case represents a third and more operationally sophisticated pattern: the deliberate cultivation of legitimacy over time before the introduction of a malicious payload, sometimes called a “trust-then-poison” attack.

The package was published on npm as a functional remote web UI for OpenAI Codex, enabling developers to interact with the Codex agent from a browser rather than the official client. It worked as advertised. The associated GitHub repository showed active development and no evidence of malicious code—because the malicious code was not stored there. The payload was instead pulled at runtime from a remote source each time the tool launched, allowing the GitHub source to remain clean and rendering any audit of the published source meaningless [1][3]. At the time of disclosure, the package had accumulated approximately 29,000 weekly downloads, representing a substantial pool of developers whose credentials may have been compromised for up to thirty days before discovery.

Security Analysis

The Trust-Building Attack Pattern

The observed sequencing—a delay of approximately one month before payload injection—suggests a sophisticated understanding of how developers evaluate npm packages. Community trust in an npm package typically builds through download counts, GitHub star accumulation, issue tracker activity, and the age of the package’s releases—all signals that the codexui-android author apparently cultivated before activating the exfiltration code. A newly published package with malicious functionality is more likely to be flagged by automated scanners that look for known-bad payloads in newly submitted packages; an established package that receives a malicious update within a routine-looking version increment is far less likely to trigger immediate review [1][3].

The decision to keep the GitHub repository clean appears equally deliberate. Many npm package scanning tools and developer security workflows check the package’s linked source repository as a reference point for what should be in the published artifact. By keeping the GitHub code clean and pulling the malicious logic remotely at runtime, the attacker exploited the gap between what a developer sees in the repository and what actually executes on their machine. This technique, a runtime remote fetch, is not reliably detectable by static analysis of the package’s source code or of the npm tarball contents at rest [1][4]; detecting it requires dynamic analysis of the package’s runtime behavior.

Token Exfiltration Mechanics

The technical implementation of the credential theft was both straightforward and effective. When the tool launched, it located the Codex authentication file, typically stored at ~/.codex/auth.json, which is written to disk by the official Codex client during the user’s sign-in flow. The package read this file, XORed its contents with the hardcoded key anyclaw2026 (obfuscation rather than strong encryption, since the key is embedded in the payload itself), base64-encoded the result, and issued an HTTPS POST request to sentry.anyclaw.store/startlog [1][4]. The use of HTTPS and a domain name that superficially resembles a legitimate error-monitoring service (Sentry is a widely used crash reporting and observability platform) was likely intended to disguise the exfiltration traffic in network logs.
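For incident responders who have captured outbound request bodies in proxy logs, the layering described above can be reversed to confirm what left the machine. The following is a minimal Python sketch, not the attacker’s code: it assumes the XOR step is a simple repeating-key XOR and that the body is standard base64, neither of which the disclosure spells out, and the function names are illustrative.

```python
import base64
from itertools import cycle

# Key reported in the disclosure. Treating it as a repeating key is an assumption.
XOR_KEY = b"anyclaw2026"


def xor_bytes(data: bytes, key: bytes = XOR_KEY) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


def decode_captured_body(body_b64: str, key: bytes = XOR_KEY) -> bytes:
    """Undo the base64 layer, then the XOR layer, of a captured request body."""
    return xor_bytes(base64.b64decode(body_b64), key)
```

Because XOR is its own inverse, the same function both obscures and recovers the data, which is why a hardcoded XOR key offers no confidentiality once the key is known.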

The captured auth bundle contained four fields: the access token, which provides direct API access; the ID token, which encodes the user’s identity claims; the account ID; and, most critically, the refresh token. Access tokens are typically short-lived—OpenAI’s Codex access tokens expire after a matter of hours—but refresh tokens are designed to obtain new access tokens without requiring the user to authenticate again. They are long-lived and valid until explicitly revoked by the account holder or authorization server. An adversary holding a valid refresh token can silently request new access tokens on an ongoing basis, accessing the victim’s Codex account, spending their API credits, reading the code and tasks the victim has submitted to the agent, and impersonating the victim in interactions with OpenAI’s services, all without triggering new authentication events on the victim’s devices [1][4][5].

The same exfiltration chain appeared in two Android applications distributed through the Google Play Store under the BrutalStrike developer identity: an app identified by the package name codex.app with more than 10,000 downloads, and a second application called “OpenClaw Codex Claude AI Agent” with over 50,000 downloads [2][5]. The Android delivery vector is notable because mobile application distribution does not share the same provenance and integrity model as npm; app store review processes scan for known malware signatures but are generally unable to detect runtime remote-fetch patterns that pull malicious code after installation. The combined footprint across npm and Google Play suggests the attacker operated a coordinated, multi-platform campaign rather than an opportunistic single-package experiment.

A Broader Surge in AI Developer Credential Targeting

The codexui-android disclosure is the most prominent recent instance of a trend that security researchers have been tracking throughout 2026: adversaries increasingly targeting the credentials and secrets stored on or accessible by AI developer tooling. Developers who work with AI coding agents typically hold unusually broad access rights (to source repositories, cloud environments, CI/CD pipelines, and secrets management systems) because these agents need that access to perform their tasks. Compromising the credentials of a developer who uses an AI coding tool therefore yields access not just to the developer’s own workstation but potentially to the infrastructure their agent is authorized to touch.

The Mini Shai-Hulud campaign, attributed to a threat actor group tracked as TeamPCP and UNC6780, illustrates the systemic character of this threat. Beginning in early 2026 and accelerating through May, the campaign compromised packages across npm, PyPI, GitHub Actions, Docker Hub, and the VS Code Marketplace through techniques including pull-request workflow poisoning, GitHub Actions cache injection, and token memory extraction from CI/CD runners [15]. In May 2026, TanStack, a widely used open-source collection of JavaScript libraries, was compromised as part of this campaign; OpenAI confirmed that two employee devices were impacted by the TanStack compromise and that limited credential material was exfiltrated from internal source code repositories [9]. In a related incident, a compromised VS Code Marketplace extension led to the exfiltration of roughly 3,800 of GitHub’s internal repositories from an employee’s device [16].

Beyond Mini Shai-Hulud, the TrapDoor campaign spread 34 malicious packages across npm, PyPI, and Crates.io in more than 384 versions, targeting developer credentials and establishing persistence [7]. The Miasma worm compromised @redhat-cloud-services npm packages with a self-propagating credential-stealing payload that exploited binding.gyp—a build configuration file that triggers execution during npm install without touching package.json scripts, bypassing the postinstall script audits that many developers treat as a sufficient security check [8]. Across these campaigns, stolen CI/CD tokens enabled downstream supply chain pivoting, stolen npm publish tokens were used to compromise additional packages, and stolen cloud credentials enabled lateral movement into production environments.
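The binding.gyp vector matters because npm falls back to running node-gyp rebuild at install time when a package ships that file and declares no install script of its own, so a review limited to package.json scripts misses it. A minimal Python sketch of a broader inventory follows; the function names and the choice to scan an already-installed node_modules tree are assumptions for illustration, not a method described in the cited reports.

```python
import json
from pathlib import Path

# Lifecycle scripts that npm runs during installation.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_triggers(package_dir: Path) -> list:
    """Return the install-time execution triggers found in one package directory."""
    scripts = {}
    manifest = package_dir / "package.json"
    if manifest.is_file():
        try:
            data = json.loads(manifest.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            data = {}  # unreadable or malformed manifest: treat as no scripts
        if isinstance(data, dict) and isinstance(data.get("scripts"), dict):
            scripts = data["scripts"]
    triggers = [f"script:{hook}" for hook in INSTALL_HOOKS if hook in scripts]
    # A binding.gyp file can trigger a native build even with no scripts declared.
    if (package_dir / "binding.gyp").is_file():
        triggers.append("binding.gyp")
    return triggers


def scan_node_modules(root) -> dict:
    """Map each package directory under node_modules to its install-time triggers."""
    findings = {}
    for manifest in Path(root).rglob("package.json"):
        package_dir = manifest.parent
        if "node_modules" not in package_dir.parts:
            continue
        triggers = install_time_triggers(package_dir)
        if triggers:
            findings[str(package_dir)] = triggers
    return findings
```

The scan over-reports by design: it also visits nested package.json files such as test fixtures, and many legitimate native add-ons ship binding.gyp. Its output is a list for human review, not a verdict.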

Attack Surface and Scope

The aggregate scope of the codexui-android campaign, approximately 29,000 weekly npm downloads plus more than 60,000 Android installs across the two companion applications, represents a meaningful pool of exposed developer credentials. Not every install necessarily resulted in a credential compromise; a developer who installed the package but never signed in to Codex through it, or whose Codex account had no meaningful API credits or repository access, faces a reduced risk profile. However, Codex’s intended use, interacting with repositories and executing tasks on behalf of the developer, suggests that many of the affected users did authenticate and therefore did have their tokens captured.

The one-month window of active exfiltration before disclosure is particularly significant from a risk standpoint. Refresh tokens obtained early in that window have been available to the attacker for weeks. If the attacker retained and actively used those tokens, they have had an extended opportunity to consume API credits, read submitted code, or take actions under the victim’s identity that would not surface as anomalous in typical security monitoring, since such actions would appear as legitimate API calls under the victim’s account ID.

Recommendations

Immediate Actions

Developers who installed codexui-android at any point, or who used either of the associated Android applications, should treat their OpenAI Codex credentials as fully compromised. The appropriate immediate response is to revoke all OpenAI API keys and Codex authentication tokens through the OpenAI account settings panel, sign out of all Codex sessions, and review the OpenAI account’s usage logs for any API calls or Codex invocations that were not initiated by the account holder. Because refresh tokens enable impersonation without triggering new login events, abnormal usage may not be immediately apparent from authentication logs alone; organizations should look for unexplained API credit consumption or code submissions as secondary indicators.

Security teams should also review their software asset inventories for any organizational use of the affected package or its companion Android applications. Developers who used their organizational Codex credentials through these tools—particularly if those credentials were also used for accessing corporate code repositories or shared API key pools—should escalate their exposure assessment to include those downstream resources.
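As a starting point for the inventory review described above, project lockfiles can be searched for the affected package name. The sketch below is a minimal Python example under stated assumptions: it handles the "packages" map used by npm lockfile versions 2 and 3 and the nested "dependencies" tree used by version 1, and the function names are illustrative.

```python
import json
from pathlib import Path

AFFECTED_PACKAGE = "codexui-android"


def find_in_lock_data(data: dict, name: str = AFFECTED_PACKAGE) -> list:
    """Return (location, version) pairs where the package appears in parsed lockfile data."""
    hits = []
    packages = data.get("packages")
    if isinstance(packages, dict):
        # Lockfile v2/v3: flat map keyed by install path, e.g. "node_modules/<name>".
        for install_path, meta in packages.items():
            if install_path == f"node_modules/{name}" or install_path.endswith(
                f"/node_modules/{name}"
            ):
                hits.append((install_path, meta.get("version", "unknown")))
        return hits

    # Lockfile v1: nested "dependencies" tree.
    def walk(deps: dict, trail: list) -> None:
        for dep_name, meta in deps.items():
            here = trail + [dep_name]
            if dep_name == name:
                hits.append((" > ".join(here), meta.get("version", "unknown")))
            walk(meta.get("dependencies", {}), here)

    walk(data.get("dependencies", {}), [])
    return hits


def find_in_lockfile(lock_path, name: str = AFFECTED_PACKAGE) -> list:
    """Load a package-lock.json file and search it for the package."""
    data = json.loads(Path(lock_path).read_text(encoding="utf-8"))
    return find_in_lock_data(data, name)
```

An empty result is not proof of safety. A lockfile records only the currently resolved dependency tree, so a package that was installed and later removed, or installed globally outside any project, will not appear in it.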

Short-Term Mitigations

Organizations should implement dependency integrity controls that go beyond source-code review. Pinning exact package versions and verifying cryptographic integrity, for example by committing package-lock.json with its integrity hashes and installing with npm ci so that the lockfile is enforced, reduces but does not eliminate exposure to trust-then-poison attacks that arrive through normal version updates. Supplementing version locks with a software composition analysis tool capable of behavioral monitoring, not merely static signature matching, provides a more robust detection layer for runtime remote-fetch patterns.
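To make concrete what a lockfile integrity hash does and does not guarantee, the following Python sketch checks a tarball’s bytes against an integrity string of the form used in package-lock.json (an algorithm name, a hyphen, and a base64 digest). It is an illustration of the check rather than npm’s own implementation, and the function name is hypothetical.

```python
import base64
import hashlib

SUPPORTED_ALGORITHMS = ("sha512", "sha384", "sha256", "sha1")


def matches_integrity(tarball_bytes: bytes, integrity: str) -> bool:
    """Return True if the bytes match any supported entry in the integrity string."""
    for entry in integrity.split():
        algorithm, _, expected = entry.partition("-")
        if algorithm not in SUPPORTED_ALGORITHMS:
            continue  # ignore algorithms this sketch does not handle
        digest = hashlib.new(algorithm, tarball_bytes).digest()
        if base64.b64encode(digest).decode("ascii") == expected:
            return True
    return False
```

A passing check proves only that the installed bytes are the same bytes that were locked. It says nothing about whether those bytes are benign, and it cannot see code that a package fetches from a remote server after installation.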

For AI developer tooling specifically, organizations should adopt a principle of minimal credential scope: Codex API keys and developer OAuth tokens used with third-party tooling should be scoped to the minimum necessary permissions and should be kept separate from the credentials used to access sensitive code repositories or production infrastructure. Segregating AI service credentials from code repository access credentials limits the blast radius of any single compromise.

A network-level control—monitoring or blocking outbound connections to unknown endpoints during npm install execution and tool invocation—can catch runtime remote-fetch exfiltration that bypasses static analysis. Threat actors specifically design their payloads to blend into legitimate traffic, as seen with the sentry.anyclaw.store domain name, so domain-based allowlisting is more effective than blocklist-based approaches for development environments with predictable egress patterns.
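The difference between allowlisting and blocklisting can be shown with a small Python sketch of the decision an egress proxy would make. The allowlist entries and function name are hypothetical placeholders; a real environment would derive its list from observed, approved build traffic.

```python
# Hypothetical allowlist for a development environment (illustrative entries only).
ALLOWED_DOMAINS = ("registry.npmjs.org", "github.com", "sentry.io")


def is_allowed(host: str, allowed=ALLOWED_DOMAINS) -> bool:
    """Allow a host only if it equals an allowed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == domain or host.endswith("." + domain) for domain in allowed)


if __name__ == "__main__":
    for candidate in ("registry.npmjs.org", "o1.ingest.sentry.io", "sentry.anyclaw.store"):
        print(candidate, "allowed" if is_allowed(candidate) else "blocked")
```

Matching on the registered domain suffix, rather than on substrings, is what defeats the lookalike: sentry.anyclaw.store contains the word "sentry" but is not a subdomain of any allowed domain, so it is refused without the domain ever needing to appear on a blocklist.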

Strategic Considerations

The codexui-android campaign exemplifies a threat category that is likely to grow as AI developer tooling becomes more deeply embedded in software development workflows. AI coding agents, by design, hold broad credentials and execute code on behalf of developers; any tool that sits between the developer and the agent becomes a high-value interception point. Organizations should build procurement and approval processes for AI developer tools that include dynamic behavioral analysis, not merely source code review or vendor reputation checks.

The multi-platform character of this campaign—npm and Google Play simultaneously—indicates that adversaries are designing their operations for coverage across the developer workflow, not just one attack surface. Security policies governing which npm packages developers may install should be complemented by mobile application management policies that extend similar controls to AI tooling installed on developer mobile devices, which frequently share credentials with desktop development environments.

Finally, the repeated targeting of developer credentials in 2026’s npm supply chain campaigns points toward a structural gap in how the software industry treats developer environments as a security perimeter. In many organizations, developer machines fall outside the endpoint detection and response controls applied to production systems; developer credentials are frequently scoped more broadly than production service accounts because developer workflows require broad access. Closing this gap—by extending endpoint detection to developer machines, applying least-privilege discipline to developer tokens, and implementing behavioral monitoring on AI tool execution—is a security investment that addresses not just this specific campaign but the underlying attack surface that adversaries have identified as productive.

CSA Resource Alignment

The threat model illustrated by the codexui-android campaign maps directly to several CSA frameworks that provide structured guidance for organizations assessing and remediating AI developer toolchain risk.

The CSA MAESTRO framework for agentic AI threat modeling identifies supply chain compromise as a cross-layer risk affecting the deployment infrastructure and external systems integration layers. MAESTRO’s analysis of trust boundaries between human developers, orchestration layers, and AI agents is directly applicable to the codexui-android attack pattern, in which a compromised intermediary tool positioned between the developer and the Codex agent became the point of credential exfiltration. Organizations applying MAESTRO threat models to their AI coding agent deployments should explicitly include the tooling ecosystem—npm packages, IDE extensions, mobile companion applications—within the agent’s trust boundary scope [10].

The CSA AI Controls Matrix (AICM) provides the primary organizational framework for addressing both the non-human identity security issues and the supply chain risks illustrated by this campaign. AICM’s guidance on managing machine identities—including API keys, OAuth tokens, and service account credentials used by AI systems—applies directly to the credential hygiene practices organizations should enforce for Codex and similar AI coding agents [12]. CSA’s research on non-human identities in AI systems, which documents limited organizational visibility into the credentials used by autonomous AI agents, describes the context in which the codexui-android campaign was able to operate undetected for thirty days. AICM builds on and extends the Cloud Controls Matrix (CCM), whose Supply Chain Management, Transparency, and Accountability domain includes controls for vendor assessment, software integrity verification, and third-party component management—precisely the organizational practices that would have detected or prevented the codexui-android compromise [11][12].

CSA’s Zero Trust guidance, particularly the principle of verify-explicitly for all credential usage, supports the network-level egress monitoring and behavioral anomaly detection recommendations in this note. A zero trust network architecture that requires explicit justification for outbound connections from developer environments would have created an additional detection layer for the sentry.anyclaw.store exfiltration traffic [13].

CSA’s Software Transparency: Securing the Digital Supply Chain publication provides a framework for applying SBOMs and software provenance verification to third-party development dependencies. The gap between source repository contents and published npm artifacts that the codexui-android attacker exploited is precisely the gap that software transparency controls—including artifact signing and build provenance attestation—are designed to close [14].

References

[1] Aikido Security. “Legitimate-Looking Codex Remote UI Secretly Steals Your AI Tokens.” Aikido Security Blog, May 27, 2026.

[2] The Hacker News. “OpenAI Codex Authentication Tokens Stolen in codexui-android npm Supply Chain Attack.” The Hacker News, June 2026.

[3] TechRadar. “OpenAI Codex Tool with Over 29,000 Downloads Linked to Malicious npm Supply Chain Attack Stealing Authentication Tokens.” TechRadar, June 2026.

[4] The Cyber Signal. “codexui-android npm Package Is Stealing OpenAI Codex Tokens.” The Cyber Signal, June 2026.

[5] AndroidHeadlines. “OpenAI Codex Users Targeted by Infostealer Malware.” AndroidHeadlines, June 2026.

[6] Microsoft Security Blog. “Typosquatted npm Packages Used to Steal Cloud and CI/CD Secrets.” Microsoft Security, May 28, 2026.

[7] The Hacker News. “TrapDoor Supply Chain Attack Spreads Credential-Stealing Malware via npm, PyPI, and CratesIO.” The Hacker News, May 2026.

[8] Microsoft Security Blog. “Preinstall to Persistence: Inside the Red Hat npm Miasma Credential-Stealing Campaign.” Microsoft Security, June 2, 2026.

[9] OpenAI. “Our Response to the TanStack npm Supply Chain Attack.” OpenAI, 2026.

[10] Cloud Security Alliance. “Agentic AI Threat Modeling Framework: MAESTRO.” CSA Blog, February 6, 2025.

[11] Cloud Security Alliance. “Cloud Controls Matrix and CAIQ v4.1.” CSA, January 2026.

[12] Cloud Security Alliance. “Identity and Access Gaps in the Age of Autonomous AI.” CSA, March 2026.

[13] Cloud Security Alliance. “Zero Trust Guidance for Achieving Operational Resilience.” CSA, April 2026.

[14] Cloud Security Alliance. “Software Transparency: Securing the Digital Supply Chain.” CSA, 2025.

[15] Security Boulevard. “Mini Shai-Hulud: Frequently Asked Questions about the TeamPCP npm and PyPI Supply Chain Campaign.” Security Boulevard, May 2026.

[16] The Hacker News. “GitHub Internal Repositories Breached via Malicious Nx Console VS Code Extension.” The Hacker News, May 2026.

[17] Palo Alto Networks Unit 42. “The npm Threat Landscape: Attack Surface and Mitigations.” Unit 42, June 2026.
