Agentic C2: AI Agents as Command-and-Control Infrastructure

Authors: Cloud Security Alliance AI Safety Initiative
Published: 2026-04-17

Categories: Threat Intelligence, Agentic AI Security, Endpoint Security

Key Takeaways

  • BeyondTrust’s Phantom Labs has published research and a working proof-of-concept demonstrating that computer use agents — AI systems that interact with a desktop environment via screenshots, cursor movement, and input simulation — can function as a fully operational command-and-control (C2) framework, a paradigm the researchers call “agentic C2” [1].

  • The proof-of-concept implant, built in C# using Windows APIs such as SetCursorPos and SendInput, runs a local AI agent loop on the compromised host: it polls for operator instructions, calls the Claude API to determine which desktop actions to take, and executes those tool calls in a continuous loop — without triggering alerts in either Microsoft Defender for Endpoint or CrowdStrike during testing [1].

  • A parallel and related attack vector — prompt injection via malicious web content — was demonstrated earlier by independent researcher Johann Rehberger, who showed that Claude Computer Use can be hijacked through injected HTML instructions to download, chmod, and execute a Sliver C2 implant with no direct operator interaction [3].

  • These techniques exploit a structural property of computer use agents: because the AI reasons over visual desktop state rather than executing static, enumerable commands, host-based endpoint detection tools optimized for process lineage, API call sequences, and shell command signatures are poorly positioned to characterize the agent’s actions as malicious [1].

  • The threat landscape is expanding rapidly: BeyondTrust’s Identity Security Insights data found that enterprise AI agents grew 466.7% year-over-year as of March 2026, with some organizations operating more than 1,000 AI agents without security team awareness — most carrying administrator-equivalent privileges [2].

  • Security teams should treat AI agent runtimes as a new and distinct attack surface requiring dedicated detection controls, strict least-privilege enforcement, and human-in-the-loop oversight for any action that crosses network or file system boundaries.


Background

The convergence of capable large language models with desktop automation capabilities has produced a new category of software that differs architecturally from both traditional RPA (robotic process automation) and conventional shell-based scripting. Computer Use Agents — a term encompassing Anthropic’s Computer Use feature for Claude, Google’s equivalent capabilities, and similar tools from Microsoft and other vendors — allow an AI model to perceive a desktop through screenshots and act on it by specifying cursor coordinates and input events. Anthropic announced expanded Claude computer control capabilities in March 2026 as part of its broader agentic AI push, framing the feature as a productivity multiplier for knowledge workers [5]. Security implications appear underweighted relative to the pace of adoption: the shadow agent inventory data documented in [2] found that the majority of enterprise AI agent deployments occur outside formal security team awareness, and therefore without corresponding security oversight.

Traditional C2 frameworks — Cobalt Strike, Sliver, Havoc, and their predecessors — operate through explicit, enumerable channels: beaconing over HTTP or DNS, spawning known processes, writing to predictable registry keys, and issuing shell commands that security tools have spent years learning to recognize. Each of these behaviors produces artifacts: process trees, API call patterns, network signatures, and file system events that endpoint detection and response (EDR) platforms and network monitoring tools use to distinguish malicious behavior from legitimate system activity. The power of an agentic C2 is precisely that it evades the host-behavioral detection model that EDR platforms are optimized for: the implant produces no process-tree anomalies, no enumerable shell command sequences, and no file-system artifacts beyond its own binary. Network monitoring remains a viable detection surface — the agent loop’s outbound calls to the Claude API will appear in network telemetry — but host-based controls alone are insufficient to characterize the agent’s actions as malicious. From the perspective of a host-based security agent examining system calls, an agentic implant looks like a user sitting at a desk.

BeyondTrust’s Phantom Labs team, led by Senior Research Manager Ryan Hausknecht and Director of Research Fletcher Davis, built and published a working proof-of-concept that operationalizes this observation [1][2]. The resulting research — “Building Agentic C2 with Computer Use Agents” — demonstrates that the technical barrier to constructing such a framework is lower than the security community may have recognized, and that the detection gap it exploits is real and immediate. The research is not a speculative threat projection; it is a documented capability with test results against production security tools.


Security Analysis

The Agentic C2 Architecture

The BeyondTrust proof-of-concept demonstrates a three-component architecture: an operator interface for issuing instructions, a relay or C2 server that queues those instructions, and an implant that runs on the compromised host. The implant’s core function is to run a local AI agent loop: poll the C2 server for instructions, submit the current task along with a screenshot of the desktop to the Claude API, receive tool decisions in response, execute those decisions locally, capture the resulting desktop state, and loop [1]. The tool calls available to the agent mirror the computer use primitive set — take a screenshot, move the cursor to specified coordinates, simulate a mouse click, or simulate keyboard input. This is sufficient to perform nearly any action a human operator could accomplish at a keyboard: opening terminals, launching applications, reading files by opening them visually, transferring data, modifying system configuration, or pivoting to other systems accessible from the compromised host.
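
The loop above can be sketched from the defender's perspective. Everything here is an illustrative reconstruction from the published description, not code from the proof-of-concept: the callable names and the ToolCall shape are assumptions, and the C2 polling, model call, screenshot capture, and input execution are stubbed out as injected functions.

```python
# Illustrative sketch of the agentic C2 loop described in [1].
# All names here are assumptions; external effects are injected as callables.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ToolCall:
    name: str   # e.g. "screenshot", "move", "click", "type"
    args: dict

def agent_step(
    poll_tasking: Callable[[], Optional[str]],        # ask the C2 relay for an instruction
    capture_screen: Callable[[], bytes],              # grab current desktop state
    decide: Callable[[str, bytes], list[ToolCall]],   # model call: task + screenshot -> tool calls
    execute: Callable[[ToolCall], None],              # translate a tool call into input events
) -> list[ToolCall]:
    """One iteration of the poll -> perceive -> decide -> act loop."""
    task = poll_tasking()
    if task is None:
        return []                   # no operator instruction queued; idle this cycle
    screen = capture_screen()
    calls = decide(task, screen)    # the model reasons over the visual desktop state
    for call in calls:
        execute(call)               # cursor moves, clicks, keystrokes
    return calls
```

Structuring the loop around injected callables makes the detection problem concrete: the only externally observable behaviors are a polling request, an API call carrying a screenshot, and synthetic input events, none of which is inherently anomalous to host-based telemetry.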

The implant as described by BeyondTrust was implemented in C#, using Windows platform APIs (SetCursorPos, SendInput) to translate the agent’s coordinate-based tool decisions into actual system events [1]. An alternative Python implementation using the pyautogui package calls the same underlying APIs and produces functionally identical behavior [1]. Neither implementation requires elevated privileges to move the cursor or simulate input; the implant operates at the session level of the logged-in user, which in practice often means administrator-equivalent access given the privilege patterns documented in BeyondTrust’s own enterprise telemetry [2].
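
The coordinate-to-input translation can be sketched in Python. The pyautogui function names (moveTo, click, write) are real package APIs, but the tool-call vocabulary and the dry-run option are illustrative assumptions, not details of the published implant; the import is guarded so the sketch degrades to recording intent when the package or a display is unavailable.

```python
# Illustrative translation layer from model tool decisions to input primitives.
# On Windows, pyautogui's moveTo/click/write bottom out in the same
# SetCursorPos/SendInput APIs used by the C# implant described in [1].
try:
    import pyautogui
except Exception:        # not installed, or no display in this environment
    pyautogui = None

def execute_tool_call(name: str, args: dict, dry_run: bool = False) -> str:
    """Dispatch one tool decision; return a record of the intended action."""
    if not dry_run and pyautogui is not None:
        if name == "move":
            pyautogui.moveTo(args["x"], args["y"])   # synthesized cursor movement
        elif name == "click":
            pyautogui.click()                        # synthesized mouse click
        elif name == "type":
            pyautogui.write(args["text"])            # synthesized keystrokes
    return f"{name}({args})"
```

No privileged API is involved at any point, which is why the activity is indistinguishable, at the system-call level, from an interactive user session.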

The critical finding from BeyondTrust’s testing was the absence of security alerts. Both Microsoft Defender for Endpoint and CrowdStrike failed to generate alerts during the proof-of-concept execution [1]. Both are market-leading EDR platforms deployed across a large fraction of enterprise environments — their failure to alert reflects a structural gap: current EDR architectures are optimized for process-tree and API-call telemetry, not for reasoning about user-interface event sequences. An agent moving a mouse cursor and reading screen pixels does not produce the telemetry these tools are designed to analyze. A natural extension of this capability is visual exfiltration: an agent reading sensitive data displayed on screen — credentials in a terminal, tokens in a browser, documents open in a viewer — could capture and relay that data through the same API loop without writing to disk or triggering file-system-based DLP controls.

The Prompt Injection Attack Vector

A second, related attack vector does not require an implant on the host at all. Independent researcher Johann Rehberger documented what he termed “ZombAIs” in late 2024: the observation that Claude Computer Use, when directed to browse the web as part of an automated task, will read and act on instructions embedded in the HTML content of malicious pages [3]. Because AI agents are trained to be responsive to natural-language instructions, they can fail to distinguish between instructions from their legitimate operator and instructions injected into environmental content — webpage text, documents, or other material they are directed to process — and so treat injected text as legitimate directives. A simple payload instructing Claude to download a file and execute it, framed in natural language within a page the agent was sent to read, bypassed safety behaviors that would likely have led the model to refuse the equivalent direct request from a human operator [3].

Rehberger’s demonstration proceeded to full C2 establishment: the agent read the malicious page, followed the injected instructions to download a file, independently used chmod +x to set the executable bit, and launched the binary — which was a compiled Sliver implant that connected successfully to remote C2 infrastructure [3]. The entire sequence required no operator interaction after the initial agent invocation. This illustrates what researchers describe as a fundamental design tension in LLM-powered agents: the same property that makes them useful — the ability to follow natural-language instructions found in the environment — makes them susceptible to instruction injection from adversarial content in that environment.

The ZombAI attack vector has an unusually broad exposure surface in enterprise deployments because many legitimate AI agent tasks involve browsing external content: researching topics, reading documentation, interacting with SaaS interfaces, or processing attachments. Each such task represents an opportunity for malicious instruction injection if the content sources are not tightly constrained and monitored.

Detection Evasion and Forensic Challenges

The core detection challenge posed by agentic C2 is that it inverts the observable artifact model that security operations centers have built their capabilities around. Conventional intrusion detection relies on correlating a chain of behaviors — initial access, persistence mechanisms, discovery commands, lateral movement techniques, and exfiltration artifacts — each of which leaves traces in logs, process tables, network flows, and registry entries. At the host-behavioral layer, agentic C2 replaces that chain with a single persistent process (the AI agent loop) that expresses all of its effects through user-interface events. There are no named discovery commands, no predictable API-call sequences associated with malicious intent, and no file-system exfiltration artifacts. Network telemetry does remain as an observable surface — the agent’s API calls and C2 beacon traffic are visible — but the host-side artifact model that SOC tooling relies on is largely inapplicable.

Behavioral monitoring of AI agent processes requires security tools to reason about what an agent is doing rather than what system calls it is making, a capability that current generations of EDR are not designed to provide. Additionally, the visual nature of computer use agents introduces a forensic gap: the agent’s actions may not be logged at all if screen recording is not enabled, and even when session logs are captured, reconstructing the agent’s intent from a sequence of cursor coordinates requires human review or specialized analysis tools that are not yet standardized in enterprise security operations.
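
One way to narrow the forensic gap described above is to log each agent action with its motivating task context rather than raw cursor coordinates alone. This is a minimal sketch; the JSON field names and helper functions are illustrative assumptions, not an established schema.

```python
# Structured audit logging for agent tool calls: each action becomes an
# append-only JSON line tied to the task that motivated it, so intent can
# be reconstructed after an incident without replaying screen recordings.

import json
import time

def audit_record(task: str, tool: str, args: dict) -> str:
    """Serialize one agent action as a single JSON log line."""
    entry = {
        "ts": time.time(),   # when the action fired
        "task": task,        # the operator/task context, not just coordinates
        "tool": tool,        # e.g. "click", "type", "screenshot"
        "args": args,
    }
    return json.dumps(entry, sort_keys=True)

def write_audit_line(path: str, line: str) -> None:
    """Append one record; an append-only file keeps a tamper-evident ordering."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

Logging at the tool-call layer, where intent is still expressed in named actions, is considerably cheaper to analyze than reconstructing behavior from pixel-level session recordings.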

Expanding Threat Surface

BeyondTrust’s Identity Security Insights data, analyzed by Phantom Labs in March 2026, quantifies the scale of the exposure. Enterprise AI agents grew 466.7% year-over-year, with machine identities now outnumbering human identities by orders of magnitude in many organizations [2]. Some organizations operate more than 1,000 AI agents without the security team’s knowledge, deployed through low-code platforms, embedded enterprise application features, or individual-employee AI tool adoption that bypasses formal IT governance [2]. These agents commonly carry administrator-equivalent privileges inherited from the service accounts or user sessions under which they run, and they typically use long-lived API keys with no rotation policy or lifecycle controls [2].

This agent population creates an attack surface that is simultaneously large and — given that most of these agents operate outside security team awareness — incompletely inventoried [2]. Each AI agent that has computer use capabilities represents a potential agentic C2 host — a runtime that, if compromised through prompt injection, supply chain attack, or credential theft, can be directed to perform arbitrary desktop actions on the host system. The same properties that make computer use agents productive — autonomy, broad system access, and natural-language reasoning — become liabilities when the agent’s instruction source is compromised or poisoned.

MITRE ATLAS version 5.4, released in February 2026, added a suite of agentic AI-specific techniques that provide a taxonomy for this threat space, including entries for AI agent context poisoning, tool invocation abuse, and malicious agent deployment [4]. Separately, MITRE ATT&CK documents technique T1588.007, “Obtain Capabilities: Artificial Intelligence,” which captures adversary acquisition of AI tools for offensive operations [10]. Case study AML.CS0042 in the ATLAS knowledge base documents SesameOp, a confirmed instance of adversaries using legitimate AI service APIs as covert C2 channels [4]. This demonstrates that adversaries are already operationalizing AI infrastructure for C2 purposes; the BeyondTrust proof-of-concept extends this into the computer use paradigm specifically.


Recommendations

Immediate Actions

Organizations deploying AI agents with computer use capabilities should audit the privilege level of all agent service accounts and API keys immediately. Any agent running with administrator-equivalent access without an explicit operational requirement for that level of privilege represents an unnecessary expansion of the blast radius of a potential compromise. Privilege reduction to the minimum required scope — scoped API keys, constrained service accounts, and blocked access to sensitive directories and applications — should be applied before additional agent deployment is authorized.

Security teams should additionally verify whether their EDR and endpoint monitoring tools have coverage for AI agent processes. Given that BeyondTrust’s testing demonstrated no alerts from two leading platforms, organizations should not assume that existing endpoint controls extend to agentic behavior. Security vendors are beginning to address this gap — practitioners should verify with their EDR vendor whether AI agent process monitoring is available in their current license tier and version, and enable it if so.

Short-Term Mitigations

Human-in-the-loop controls for high-consequence actions should be implemented for any AI agent with computer use access. Actions that cross a defined risk threshold — accessing credential stores, executing binaries, opening outbound network connections to new destinations, or modifying system configuration — should require explicit human approval before execution. This pattern, variously called confirmation gating or agentic action approval, is supported in major agent frameworks such as LangChain and Anthropic’s agent tooling; practitioners should verify support in their specific deployment environment. It directly addresses the “checker out of the loop” vulnerability category documented in CSA’s Agentic AI Red Teaming Guide [8].
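
A minimal sketch of the confirmation-gating pattern, assuming a callback-based approval flow; the high-risk tool names and function signatures below are illustrative, not taken from any particular agent framework.

```python
# Confirmation gating: low-risk actions run directly, while actions crossing
# a defined risk threshold require an explicit human approval callback.

from typing import Callable

HIGH_RISK = {
    "read_credential_store",
    "execute_binary",
    "open_network_connection",
    "modify_system_config",
}

def gated_execute(tool: str, args: dict,
                  run: Callable[[str, dict], object],
                  approve: Callable[[str, dict], bool]):
    """Execute a tool call, requiring human approval for high-risk actions."""
    if tool in HIGH_RISK and not approve(tool, args):
        raise PermissionError(f"action '{tool}' denied by human reviewer")
    return run(tool, args)
```

Placing the gate at the tool-dispatch boundary means the check holds regardless of how the agent arrived at the decision, including via injected instructions.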

To address the prompt injection attack vector specifically, organizations should implement content filtering and sandboxed browsing environments for any agent task that involves reading external web content. Agent sessions should be isolated from host credential stores and sensitive applications unless the task explicitly requires access to them. Input to agents from external content sources should be treated with the same skepticism applied to user input in traditional web application security: never trusted, always validated against the expected task context. Web content that contains instruction-like natural language patterns — particularly imperative commands directed at an AI system — should be flagged for human review before the agent acts on it.
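
The instruction-pattern flagging recommended above can be approximated with a heuristic scanner. The patterns below are illustrative assumptions and are trivially evadable on their own; they are a triage aid for routing content to human review, not a control.

```python
# Heuristic scanner for instruction-like natural language in external content
# an agent is about to process. Matches are surfaced for human review.

import re

SUSPECT_PATTERNS = [
    r"\bignore (all |any )?(previous|prior) instructions\b",
    r"\b(download|fetch) .{0,40}\b(and|then)\b.{0,40}\b(run|execute|chmod)\b",
    r"\bdo not (tell|inform|alert) (the )?(user|operator)\b",
]

def flag_injection_candidates(text: str) -> list[str]:
    """Return the suspect patterns that matched, for analyst triage."""
    lowered = text.lower()
    return [p for p in SUSPECT_PATTERNS if re.search(p, lowered)]
```

A production deployment would pair pattern matching with task-context validation, since adversarial content can always rephrase around any fixed pattern list.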

Network egress controls for AI agent processes deserve particular attention. An agentic implant’s C2 communication uses the Claude API (or whichever model API the agent loop calls), which means traffic flows to legitimate AI service endpoints. Standard domain-based allowlisting will not distinguish legitimate organizational AI traffic from a hijacked agent’s C2 beacon. Process-level attribution of API calls — identifying which application made a given API request — provides a more actionable control point.
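
Process-level attribution can feed a simple egress policy keyed on (process, destination) pairs. The process names and policy table below are hypothetical; in practice, the attributed connection records would come from an EDR or host firewall that maps network flows to originating processes.

```python
# Egress policy evaluation keyed on the originating process, not the domain:
# the same AI service endpoint is legitimate for an approved agent runtime
# and suspect for any other process.

EGRESS_POLICY = {
    # process name -> destinations it is expected to reach (illustrative)
    "approved_agent.exe": {"api.anthropic.com"},
    "chrome.exe": {"*"},     # interactive browsers reach arbitrary hosts
}

def egress_verdict(process: str, destination: str) -> str:
    """Return 'allow' or an alert string for one attributed connection."""
    allowed = EGRESS_POLICY.get(process)
    if allowed is None:
        return f"alert: unknown process contacting {destination}"
    if "*" in allowed or destination in allowed:
        return "allow"
    return f"alert: unexpected destination for {process}"
```

The point of the sketch is the key: domain-based allowlisting alone would mark every connection to the AI service endpoint as legitimate, while process attribution separates sanctioned agent traffic from a hijacked runtime's beacon.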

Strategic Considerations

The organizational governance challenge posed by shadow AI agents — the 1,000-agent organizations with no security team visibility documented by BeyondTrust — requires a structural response. Discovery, inventory, and classification of all AI agents operating in the environment should be treated as a priority initiative equivalent to the asset inventory programs that followed the introduction of cloud workloads and IoT devices. Identity governance frameworks that extend to machine identities must explicitly encompass AI agent identities with enforceable lifecycle policies, including mandatory credential rotation, deprovisioning on task completion, and regular access recertification.

Threat modeling for AI agent deployments should use established frameworks rather than ad hoc assessment. The MAESTRO framework’s seven-layer architecture explicitly models the Deployment Infrastructure and Agent Ecosystem layers where agentic C2 threats manifest, providing a structured approach to identifying where monitoring, access controls, and response capabilities need to be extended [6]. Red team exercises that specifically target AI agent infrastructure — including attempts to inject malicious instructions through content the agent is expected to process — should be conducted before production deployments of computer use agents.

Longer term, the industry requires EDR and behavioral monitoring capabilities purpose-built for AI agent activity: tools that can analyze an agent’s tool call sequence and correlate it against expected task scope, detect anomalous instruction sources, and produce forensic logs sufficient to reconstruct agent behavior after an incident. The current generation of security tooling was built for a threat model in which code executes commands; the emerging threat model features AI that reasons its way to actions. Bridging that gap is a research and product development priority for the security industry.
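
A first step toward the tool-call monitoring described above is correlating an agent's observed calls against its declared task scope. This sketch assumes a declared per-task scope exists; the report fields are illustrative.

```python
# Summarize deviations between an agent's declared task scope and the tool
# calls it actually issued, producing a compact report for analyst triage.

from collections import Counter

def scope_report(task_scope: set[str], observed: list[str]) -> dict:
    """Compare observed tool calls against the declared scope for one task."""
    counts = Counter(observed)
    out = {tool: n for tool, n in counts.items() if tool not in task_scope}
    return {
        "out_of_scope": out,                                   # tools the task never declared
        "total_observed": len(observed),
        "deviation_ratio": sum(out.values()) / max(len(observed), 1),
    }
```

Even this coarse signal would have surfaced the proof-of-concept behavior: an agent tasked with a narrow workflow issuing binary-execution or configuration-change calls is exactly the deviation such a report makes visible.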


CSA Resource Alignment

The threat described in this note maps directly to CSA’s existing agentic AI security frameworks. MAESTRO (Multi-Agent Environment, Security, Threat, Risk, and Outcome), CSA’s seven-layer agentic AI threat modeling framework, addresses the deployment infrastructure and agent ecosystem layers where agentic C2 attacks occur [6]. Specifically, MAESTRO’s agent ecosystem threat categories encompass agent impersonation, tool invocation abuse, and compromised agent registries — all of which are preconditions or components of the agentic C2 scenario. Organizations conducting threat modeling exercises for AI agent deployments should apply MAESTRO to enumerate relevant attack paths before deployment rather than discovering them through incident response.

The CSA AI Controls Matrix (AICM), published in July 2025 and encompassing 243 control objectives across 18 security domains, provides a control framework for assessing AI deployment security against standards including ISO 42001, ISO 27001, and NIST AI RMF 1.0 [7]. Controls in the access management, monitoring, and runtime security domains of the AICM apply directly to the agent privilege and detection gaps identified in this note. Organizations should assess their current AI agent deployments against the AICM’s agentic-specific control objectives and treat gaps as prioritized remediation items.

CSA’s Agentic AI Red Teaming Guide, a 63-page technical framework co-developed with the OWASP AI Exchange, provides step-by-step testing procedures for 12 threat categories specific to autonomous AI agents [8]. The “Agent Authorization and Control Hijacking” and “Checker-Out-of-the-Loop” categories are directly applicable to both the agentic C2 implant model and the prompt injection attack vector described in this note. Security teams that have not yet conducted structured red team exercises against their AI agent infrastructure should treat the red teaming guide as an authoritative starting point.

The CSA AI Organizational Responsibilities publications, particularly the governance, risk management, and compliance volume, address the shadow AI challenge at the organizational policy level [9]. Governance frameworks that fail to account for machine identities and AI agent lifecycles create the inventory gaps that allow shadow agent populations to grow unchecked.


References

[1] BeyondTrust Phantom Labs. “Building Agentic C2 with Computer Use Agents.” BeyondTrust Blog, 2026.

[2] BeyondTrust Phantom Labs. “Phantom Labs Analysis of BeyondTrust’s Identity Security Insights Data Finds Enterprise AI Agents Growing 466.7% Year Over Year.” GlobeNewswire, March 23, 2026.

[3] Rehberger, Johann. “ZombAIs: From Prompt Injection to C2 with Claude Computer Use.” Embrace The Red, 2024.

[4] MITRE. “MITRE ATLAS™ Adversarial Threat Landscape for Artificial-Intelligence Systems.” MITRE, 2026. (Includes v5.4 agentic AI techniques, February 2026.)

[5] CNBC. “Anthropic says Claude can now use your computer to finish tasks for you in AI agent push.” CNBC, March 24, 2026.

[6] Cloud Security Alliance. “Agentic AI Threat Modeling Framework: MAESTRO.” CSA Blog, February 6, 2025.

[7] Cloud Security Alliance. “AI Controls Matrix.” CSA, July 2025.

[8] Cloud Security Alliance. “Agentic AI Red Teaming Guide.” CSA, 2025. (63-page guide with 12 threat categories, co-developed with OWASP AI Exchange.)

[9] Cloud Security Alliance. “AI Organizational Responsibilities: Governance, Risk Management, Compliance and Cultural Aspects.” CSA, 2025.

[10] MITRE ATT&CK. “Obtain Capabilities: Artificial Intelligence — T1588.007.” MITRE ATT&CK Enterprise, 2025.
