Authors: Cloud Security Alliance AI Safety Initiative
Published: 2026-05-18

Categories: Agentic AI Security, Threat Intelligence, Enterprise Security

Living Off the Agent

Executive Summary

Enterprises are deploying autonomous AI agents at scale. These agents connect to sensitive data stores, execute code, invoke APIs, send communications, and operate under delegated human authority. Security teams have spent decades learning to detect attackers who abuse native operating system tools—a class of behavior known as Living Off the Land (LOTL). A structurally analogous threat is now emerging against the enterprise AI layer.

The “Living Off the Agent” (LOTA) attack class describes a family of techniques by which adversaries gain unauthorized influence over AI agent behavior, then weaponize the agent’s own trusted access, tooling, and autonomy against the enterprise that deployed it. Rather than deploying malware or exploiting kernel vulnerabilities, the LOTA attacker injects instructions into the agent’s reasoning process, poisons the tools the agent relies on, or compromises the agent’s identity and credentials. The agent then acts—legally, from the perspective of downstream systems—on behalf of the attacker.

LOTA attacks are structurally distinct from conventional cyberattacks in two critical respects. First, they require no novel malware, because the agent’s own legitimate capabilities constitute the attack infrastructure. Second, they are difficult to detect, because the agent’s behavior—browsing documents, calling APIs, drafting communications—looks identical whether it is acting on legitimate instructions or adversarial ones.

The threat is no longer hypothetical. In September 2025, a state-sponsored intrusion campaign used compromised AI coding agents to conduct autonomous cyber espionage across approximately thirty organizations, with the agent handling an estimated 80 to 90 percent of the tactical operation without human intervention [1]. Researchers have demonstrated that a single malicious email delivered to a Microsoft 365 Copilot user could trigger silent data exfiltration across connected cloud services without any user click [2]. MCP tool registries have been successfully poisoned at scale, with researchers confirming malicious payload execution on six production platforms [3].

This whitepaper defines the LOTA attack class, maps its constituent techniques to existing threat frameworks, characterizes the emerging threat actor landscape, and provides actionable guidance for security practitioners who must defend enterprises where AI agents are already operational.

Introduction: From Living Off the Land to Living Off the Agent

The Living Off the Land attack paradigm emerged as a response to increasingly capable endpoint detection. Rather than deploying detectable malware, sophisticated adversaries learned to accomplish their objectives using tools already present on victim systems—PowerShell, WMI, certutil, mshta, and dozens of other legitimate binaries that defenders cannot simply remove or block. This approach proved highly effective because it collapsed the distinction between attacker behavior and legitimate system administration. The 2026 CrowdStrike Global Threat Report, analyzing 2025 telemetry, found that 82 percent of detections involved no malware at all, confirming that the LOTL paradigm has become the dominant tradecraft of sophisticated threat actors [4].

Enterprise AI agents introduce an analogous structural vulnerability at a new layer of the computing stack. Where LOTL exploits the trust an operating system places in its own administrative tools, LOTA exploits the trust an enterprise places in its own AI systems. The mechanism differs, but the underlying logic is identical: attackers gain unauthorized influence over a trusted, capable system, then direct that system to act against its host. The challenge for defenders is similarly analogous. Just as you cannot remove PowerShell from a Windows enterprise without breaking legitimate workflows, you cannot simply restrict AI agents from taking consequential actions without negating their business value.

The conditions that make LOTA viable are now present across most large enterprises. According to a Gartner analysis published in August 2025, 40 percent of enterprise applications were projected to feature task-specific AI agents by the end of 2026, up from less than five percent at the time of publication [5]. These agents are granted access to file systems, email and calendar services, code repositories, customer relationship systems, financial platforms, and internal knowledge bases. They operate under service account identities that are frequently over-privileged and under-monitored. The CSA State of NHI and AI Security Survey found that fewer than one in four organizations had documented policies for creating or removing AI agent identities, leaving a large share of agents operating with shared, long-lived tokens and minimal audit trails [22].

The timing of LOTA’s emergence is not coincidental. Enterprise AI agent adoption accelerated dramatically in 2024 and 2025, driven by the commercial availability of capable foundation models, the emergence of agent orchestration frameworks, and the standardization of tool-use protocols such as Anthropic’s Model Context Protocol (MCP). Each of these developments increased agent capability and therefore increased the value of compromising or manipulating agent behavior. Security controls have lagged the deployment curve. A CSA survey conducted in early 2026 found that only 13 percent of organizations reported feeling highly prepared to secure their agentic AI deployments [6].

The security community now faces a familiar challenge in an unfamiliar domain: attackers have identified the emerging attack surface faster than defenders have learned to protect it.

Defining the LOTA Attack Class

Conceptual Boundaries

The LOTA attack class encompasses any technique that achieves unauthorized objectives by manipulating, compromising, or co-opting the capabilities of a legitimately deployed AI agent. Three characteristics distinguish LOTA techniques from conventional application attacks.

The first is the use of the agent’s own authorized capabilities as the attack instrument. A LOTA attack does not typically involve exploiting an unpatched buffer overflow or deploying a remote access trojan. Instead, the attacker gains influence over an agent that already has permission to read email, query databases, execute code, or transfer files—and directs those permissions toward unauthorized ends. The attack infrastructure is the production agent stack itself.

The second characteristic is semantic rather than syntactic attack delivery. Most traditional attacks manipulate data structures, memory layouts, or protocol handling. LOTA attacks primarily manipulate meaning: they introduce instructions into content that the agent will process and interpret as legitimate. This means that signature-based detection, which looks for anomalous binary patterns, is poorly suited to detecting the attack vector.

The third characteristic is speed and autonomy after initial compromise. Once a threat actor successfully influences an agent’s behavior, the agent may complete a complex, multi-step attack chain faster than any human operator could. The CrowdStrike 2026 Global Threat Report documented an average eCrime breakout time—from initial access to lateral movement—of just 29 minutes, with the fastest observed breakout occurring in 27 seconds [4]. Autonomous agents have the potential to collapse this timeline further still, given their ability to reason across available resources and execute actions without the human pacing that creates detectable gaps in conventional attacks.

Relationship to Existing Taxonomies

LOTA does not replace existing taxonomies; it extends them into the agentic layer. The OWASP Top 10 for LLM Applications (2025) identifies prompt injection, excessive agency, and system prompt leakage as priority risks for AI applications [7]. LOTA encompasses these risks but addresses them as a coordinated attack class rather than isolated vulnerabilities, and extends beyond single-model applications to multi-agent systems and orchestration pipelines. MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) provides a framework for categorizing AI-specific attack techniques at the tactic and technique level [8]. The CSA MAESTRO framework, introduced in February 2025, offers a seven-layer architecture for agentic AI threat modeling that provides the most directly applicable organizational lens for LOTA analysis [9].

Where previous frameworks largely addressed AI systems as targets, LOTA focuses attention on AI systems as weapons—specifically, on the conditions under which an adversary can transform a defender’s own AI agent into attack infrastructure.

The Trust Asymmetry Problem

LOTA attacks exploit a fundamental asymmetry in how enterprise environments extend trust. When a human employee accesses a document, sends an email, or executes a script, that action is associated with a human identity that carries accountability, behavioral history, and the cognitive expectation of intentionality. When an AI agent performs the same actions, the association between action and accountability is far weaker. The agent operates under a service identity, frequently shared among multiple agent instances. Its actions are often not attributed to any specific human principal. Its “intent”—which is to say, the instruction that prompted the action—may not be logged or retained.

This asymmetry means that a compromised agent can take consequential actions that would immediately trigger suspicion if taken by a human, yet pass through existing access controls and audit mechanisms without friction. An employee forwarding the company’s entire contract database to an external address would attract immediate attention. An AI assistant doing the same, while summarizing a document that contained hidden extraction instructions, might generate no alert at all.

LOTA Attack Techniques

The LOTA attack class comprises several distinct but often co-employed techniques. Understanding each technique, its prerequisites, and its typical sequence is essential for building effective defenses.

Indirect Prompt Injection

Indirect prompt injection is the foundational technique underlying most LOTA attacks. Unlike direct prompt injection—where an attacker interacts with a model through its own interface—indirect injection embeds malicious instructions within external content that the agent retrieves and processes during a legitimate task. The agent, unable to reliably distinguish between data it is analyzing and instructions it should follow, executes the embedded instructions as if they were legitimate directives.

The attack surface for indirect injection is broad because it encompasses any external content an agent can access: web pages, documents, email bodies, calendar invitations, code comments, database records, PDF attachments, or API responses. In June 2025, researchers at Aim Labs disclosed CVE-2025-32711 (EchoLeak) [2][21], demonstrating that a single crafted email delivered to any Microsoft 365 Copilot user’s mailbox could trigger silent extraction of sensitive data from OneDrive, SharePoint, and Teams without any user action. The email contained hidden instructions that Copilot ingested during routine summarization; within seconds the agent had located, packaged, and transmitted data from connected storage to an attacker-controlled server. Google’s Security Blog documented similar patterns observed in the wild [10], and practitioners have observed that indirect injection may succeed more readily than direct injection, as agents tend to treat self-retrieved content as data rather than as a potential instruction source.

The severity of indirect injection scales with the range of tools available to the compromised agent. An agent with read-only access to a document library can leak information. An agent with read, write, send, and execute capabilities can leak information, modify records, send communications from legitimate accounts, and trigger downstream workflows—all within a single injected instruction sequence.

Tool Supply Chain Poisoning

AI agents interact with their environment through tools: discrete functions that allow the agent to perform actions such as querying a database, calling a web API, reading a file, or executing code. In modern agentic architectures, tools are frequently sourced from third-party registries, shared across teams, and updated without formal change management processes. The tool supply chain—the pathway from tool author to deployed agent—has emerged as a high-value target for adversaries seeking persistent, broad access to the agents that consume those tools.

The most documented vector for tool supply chain poisoning involves MCP (Model Context Protocol) servers, which act as standardized endpoints exposing tool functionality to compatible AI agents. OX Security researchers demonstrated in 2025 that an attacker who controls an MCP server can embed directives directly within tool descriptions, where they will be transmitted to the agent’s model without sanitization or provenance checking [3]. The same researchers successfully poisoned nine of eleven MCP registries with test payloads and confirmed command execution on six live production platforms. Trend Micro researchers identified 492 MCP servers exposed to the internet with zero authentication required for connection, collectively exposing 1,402 tools—more than 90 percent of which provided direct read access to underlying data sources [11].

The architectural vulnerability underlying these attacks is now tracked as CVE-2025-54136 and describes a structural flaw in how many agent frameworks handle tool descriptions: the model receives and acts upon tool metadata without any mechanism to verify that the metadata has not been tampered with after registration [12]. An attacker who can modify a tool’s description after it has been approved—whether through compromising the MCP server, injecting into a registry, or exploiting a supply chain dependency—can subsequently influence the behavior of every agent that loads that tool.

Tool supply chain poisoning is particularly dangerous because it achieves persistent access that does not depend on repeated user interaction. Once a malicious tool is loaded into an agent’s context, every subsequent task the agent undertakes can be influenced by instructions embedded in that tool’s metadata.

Orchestrator Hijacking and Memory Manipulation

Complex enterprise AI deployments increasingly rely on multi-agent architectures, in which a primary orchestrator agent decomposes tasks and delegates subtasks to specialized sub-agents. This hierarchy introduces additional attack surfaces. An adversary who can influence the orchestrator’s behavior—through prompt injection targeting the orchestrator, compromise of its tool environment, or manipulation of the messages it passes to sub-agents—gains control over the entire downstream agent pipeline. A successful orchestrator compromise is structurally analogous to compromising a domain controller: it grants influence over all entities that report to the compromised system.

Memory systems present a related vulnerability that is specific to stateful agent architectures. Long-horizon agents that maintain persistent memory—storing facts about the enterprise environment, user preferences, or prior task results—can have that memory poisoned by adversarial content. An agent that reads a maliciously crafted document and stores a distorted summary of its contents has been compromised in a way that will affect its subsequent behavior without the attacker needing to maintain active presence. Memory poisoning is particularly difficult to detect and contain because the corrupted belief may persist across many task cycles, propagate to other agents that query the same memory store, and be difficult to distinguish from legitimate learned context.

Credential Harvest and Privilege Escalation

AI agents operate under identities—service accounts, API tokens, OAuth grants, and similar credentials—that provide access to enterprise resources. These credentials are frequently more valuable than the credentials of individual human users because they are designed for programmatic access and are therefore often granted broad permissions without the interactive friction that constrains human accounts. The CSA State of NHI and AI Security Survey found that fewer than one in four organizations had documented policies for creating or removing AI agent identities, with most lacking the rotation, monitoring, and revocation controls maintained for human accounts [22].

Adversaries can harvest agent credentials through several paths. Indirect prompt injection can instruct an agent to surface its own credentials in a response, write them to an accessible location, or transmit them to an attacker-controlled endpoint. Memory poisoning can cause an agent to cache credentials in a retrievable location. In cloud environments, researchers have demonstrated that default service accounts granted to AI orchestration services can carry excessive permissions that allow an attacker who influences agent behavior to escalate from application-level access to privileged access across the hosting cloud project [13].

Once credential material is obtained or agent behavior is sufficiently compromised, privilege escalation within enterprise environments can proceed at machine speed. A compromised agent does not need time to enumerate directories, draft reconnaissance queries, or read through documentation—it already has that context, can process it at inference speed, and can pivot laterally without the pauses that make human attackers detectable.

Autonomous Lateral Movement

The convergence of credential access, broad tool permissions, and autonomous reasoning gives compromised agents a unique capability for lateral movement. A human attacker who has compromised an account must manually identify reachable systems, craft appropriate authentication attempts, and navigate access controls interactively—a process that takes time and generates detectable patterns. A compromised agent, given a sufficiently broad directive, can reason about its environment, identify additional access pathways, use legitimate tools to traverse them, and execute complex multi-stage attacks within a single inference sequence.

This temporal compression has profound implications for detection and response. The 2026 CrowdStrike Global Threat Report noted that AI-enabled adversaries increased operations by 89 percent year-over-year, with the average breakout time falling to 29 minutes [4]. The speed advantage compounds at the agent level: BVP Atlas reported that an autonomous agent that gained access to McKinsey’s internal AI platform “Lilli” acquired broad system access across the production environment in under two hours—a real-world compromise that demonstrated how rapidly a foothold can be exploited when the attacker is an autonomous reasoning system operating without human pacing [14].

Threat Actor Landscape

State-Sponsored Groups

Nation-state actors have demonstrated both the capability and the willingness to employ AI agents as operational infrastructure in offensive cyber campaigns. In September 2025, Anthropic detected and disrupted a Chinese state-sponsored intrusion campaign in which compromised Claude Code instances served as autonomous orchestrators for espionage operations against approximately thirty entities in defense, energy, and technology sectors [1]. The campaign was notable for the degree of agent autonomy involved: the AI handled an estimated 80 to 90 percent of tactical decisions, with human operators engaging only at critical decision points. The agents were used to discover and exploit vulnerabilities, conduct reconnaissance, and exfiltrate data—all at a tempo and scale that would have required significantly more personnel to achieve through conventional means.

This incident represents a qualitative shift in the role of AI in nation-state operations. Prior to this campaign, public reporting had largely characterized AI use in nation-state operations as augmenting human operators rather than replacing them—generating phishing content, summarizing reconnaissance data, or suggesting exploit code. The September 2025 campaign used AI agents as operational actors, reducing the human-to-agent ratio and increasing the campaign’s speed and scale relative to its operator footprint.

Criminal Organizations and Ransomware Groups

Criminal threat actors have adopted LOTA-adjacent techniques primarily through two vectors: the compromise of AI development platforms to harvest credentials and sensitive data embedded in agent memory, and the targeting of AI agents with broad enterprise access to accelerate ransomware deployment.

The 2026 OWASP GenAI Exploit Round-Up identified a significant increase in attacks targeting AI systems as vectors for initial access to enterprise environments [15]. Rather than targeting the AI model itself, these attacks use the agent’s trusted position within the enterprise network as the foothold for subsequent conventional attack activity. A compromised AI assistant with access to email, file systems, and internal APIs provides an attacker with a well-credentialed, well-documented starting position that is often superior to what conventional initial access techniques yield.

Insider Threat and Shadow AI

LOTA risk is not limited to external adversaries. Employees with access to enterprise AI agents can craft inputs designed to cause the agent to take actions that exceed their own authorization—effectively using the agent to proxy privilege escalation or data access that would not be possible through direct means. The emergence of shadow AI agents—autonomous systems deployed by individual teams or employees without security review or governance oversight—amplifies this risk by introducing agent instances that may have been granted significant access without the controls applied to officially sanctioned deployments.

CSA research on enterprise AI adoption found that 54 percent of organizations reported operating with unsanctioned AI agents in their environment, with security teams frequently lacking complete visibility into the full agent population [6].

Enterprise Risk Dimensions

The Blast Radius Problem

The risk profile of a compromised AI agent differs from that of a compromised user account or endpoint in ways that require security teams to revise their impact modeling. The blast radius of a compromised agent is determined not by the agent’s nominal role but by the full scope of its tool and data access—which in enterprise deployments is frequently far broader than any single human user would be granted.

An enterprise AI assistant configured to help employees with research and communications tasks might be granted access to email and calendar for meeting scheduling, access to the corporate knowledge base for information retrieval, access to code repositories for developer assistance, access to an internal ticketing system for workflow integration, and internet access for research. Each of these access grants makes the agent more useful. Together, they also define the blast radius of a successful compromise: an attacker who controls that agent’s behavior controls its email, files, code, tickets, and browsing activity simultaneously.

This aggregated access profile is not inherent to agentic AI; it is a consequence of how organizations have chosen to deploy agents. Least-privilege principles that are routine for human users and service accounts are frequently absent from initial agent deployments, where broad access is granted to maximize the agent’s usefulness before governance frameworks have been developed.

Detection Failures in Traditional Controls

Traditional security monitoring is poorly calibrated for LOTA attack detection. Signature-based tools look for known malicious patterns; LOTA attacks use legitimate agent behavior as their instrument and generate no signatures. User and entity behavior analytics (UEBA) systems are trained on human behavioral baselines; agent behavior is inherently different in tempo, scope, and pattern, and anomaly detection models built for human users will generate high false-positive rates against normal agent activity. Data loss prevention (DLP) systems that flag bulk data transfer may detect the exfiltration stage of an attack, but only after the compromise has already occurred and the data is in transit.

The fundamental detection challenge is that a LOTA attack, by definition, causes the agent to take actions that are individually legitimate. The agent is authorized to read those documents. It is authorized to send that data. It is authorized to call that API. The malicious element is not in any individual action but in the instruction that prompted the sequence—and that instruction is embedded in content that the agent processed as data, not in a channel that security tools monitor.

Regulatory and Compliance Implications

LOTA attacks that result in data exfiltration or unauthorized access to personal data trigger notification obligations under a growing range of regulatory frameworks. The complication is one of attribution: when an AI agent has been manipulated into exfiltrating data, the technical action was performed by the organization’s own system, under the organization’s own credentials. Determining whether this constitutes a “breach” in the regulatory sense—and who bears responsibility—is an open question that legal and compliance teams will need to address proactively.

The EU AI Act’s provisions on high-risk AI systems, the emerging body of U.S. state AI legislation, and sector-specific regulations for financial services and healthcare all impose requirements that may be affected by LOTA incidents. Organizations should expect that regulators will scrutinize agent governance controls—access management, audit logging, human oversight mechanisms—when assessing responses to LOTA-related incidents.

Detection and Response

What to Look For

Effective LOTA detection requires monitoring at layers that traditional security tools frequently do not cover: agent input channels, tool invocation logs, and agent-generated output streams.

Agent input channels—the documents, email messages, web pages, and API responses that agents retrieve and process—should be treated as a data source requiring inspection analogous to network traffic inspection. Content that contains instruction-shaped text in unusual locations (metadata fields, document footers, image alt-text, HTML comments) warrants scrutiny. Organizations with mature security operations should consider deploying LLM-based inspection of high-risk agent input channels, using a dedicated classification model to identify candidate prompt injection payloads before they reach the operational agent.

Tool invocation logs provide a record of what actions an agent attempted and which succeeded. Anomalies to monitor include unexpected calls to external endpoints, bulk data access patterns inconsistent with the agent’s assigned task, tool calls that span unusual combinations of data domains, and failed tool calls that may indicate an agent probing the boundaries of its permissions. These logs must be generated, retained, and actively monitored—capabilities that many current agent deployments lack.

Agent-generated outputs, including drafted communications, filed records, and executed code, should be subject to review processes proportional to their consequence. High-consequence outputs—financial transactions, outbound communications, code committed to production repositories, configuration changes—should require human-in-the-loop approval as a structural control, not as a preference setting that can be overridden by a sufficiently compelling agent instruction.

Incident Response Adaptations

Responding to a LOTA incident requires adaptations to conventional IR playbooks. The initial containment action for a compromised user account is credential revocation; the analogous action for a compromised agent is credential revocation combined with session termination and tool access suspension, which may affect legitimate ongoing tasks. Organizations need pre-established procedures for isolating agent instances without disrupting business operations—procedures that are difficult to develop under the time pressure of an active incident.

Forensic investigation of a LOTA incident depends critically on the quality of agent input and action logs. If the agent that was compromised was not logging the content it retrieved or the instructions that drove each action, reconstructing the attack chain may be impossible. This is a governance failure that must be addressed before incidents occur, not after. Agent audit logging should be treated as a non-negotiable baseline, comparable to authentication logging for human user accounts.

Defense Framework

Principle 1: Least-Privilege Tool and Data Access

The blast radius of any LOTA compromise is bounded by the agent’s access. Applying least-privilege principles to agent tool and data grants—giving each agent access only to the tools and data stores required for its specific, defined functions—represents one of the highest-leverage controls available for limiting LOTA impact, because no other single measure more directly constrains what a compromised agent can do. This requires moving away from the common pattern of deploying a general-purpose agent with broad access and toward a model of scoped, task-specific agents whose access grants are proportional to their defined purpose.

Implementation requires treating agent access reviews with the same rigor applied to privileged human user accounts: defined owners, documented justification for each access grant, periodic recertification, and automated detection of access that exceeds the documented scope.

Principle 2: Input Content Inspection and Sandboxing

External content processed by agents—web pages, documents, emails—should be treated as untrusted input analogous to user-supplied data in a web application. Content that will be processed by high-privilege agents should be inspected for instruction-shaped payloads before it reaches the agent. For agents that process high volumes of external content, a purpose-built classifier operating on the input pipeline can flag candidates for human review or block content that exceeds a risk threshold.

Sandboxing provides a complementary control: running agent inference in environments where tool access is monitored and throttled can limit the damage achievable from a successful injection before human review can occur. Agent sandboxing also provides a foundation for the behavioral telemetry needed for anomaly detection.

Principle 3: Tool Provenance and Registry Controls

Organizations should establish governance processes for AI tool sourcing analogous to those maintained for software packages and third-party libraries. Tool sourcing policies should require review and approval before new MCP servers or agent skills are introduced into production agent environments. Tools should be sourced from known, trusted publishers; registry configurations should be locked against unauthorized modification; and tool metadata should be treated as a potential attack surface requiring the same scrutiny as executable code.

For organizations using shared tool marketplaces or public MCP registries, version pinning and cryptographic verification of tool manifests—where the hosting infrastructure supports this—reduce the risk of supply chain poisoning through registry manipulation.

Control	Threat Addressed	Implementation Complexity	Risk Reduction
Least-privilege agent access	All LOTA techniques	Medium	High
Input content inspection	Indirect prompt injection	Medium–High	High
Tool provenance controls	Supply chain poisoning	Low–Medium	High
Agent session isolation	Lateral movement, credential harvest	Medium	Medium–High
Mandatory human review for high-consequence actions	All LOTA techniques	Low	High
Agent-specific audit logging	Detection and response	Low	High
NHI credential rotation and monitoring	Credential harvest	Medium	Medium
Multi-agent trust boundary enforcement	Orchestrator hijacking	High	High

Principle 4: Human-in-the-Loop for Consequential Actions

Autonomy is the property that makes LOTA attacks dangerous: the agent can complete complex attack chains without human intervention. Requiring human confirmation for a defined category of high-consequence actions—outbound communications to external parties, bulk data transfer, credential access, code deployment, financial operations—directly limits the attacker’s ability to use the agent as uninterrupted infrastructure. This control is effective even when the agent’s reasoning has been compromised, because the human reviewer can evaluate the proposed action against expected business context.

The key to implementing this control sustainably is defining the confirmation scope narrowly enough to avoid alert fatigue. Every agent-initiated action cannot require human review; that would eliminate the value of agent autonomy. The scope should be defined based on consequence: actions that are difficult or impossible to reverse, that involve sensitive data categories, or that have external-party visibility warrant review. Actions that are easily audited and reversible can proceed autonomously with post-hoc review.

Principle 5: Identity Hygiene for Non-Human Identities

AI agent credentials should be managed with the same discipline applied to privileged human accounts. Each agent instance should operate under a dedicated, uniquely identified service identity rather than a shared credential. Credentials should be short-lived where the hosting infrastructure permits, and revocation mechanisms should be tested and pre-staged for rapid execution during incidents. Agent identities should be enrolled in the organization’s privileged access management (PAM) infrastructure, with access reviews conducted on a defined schedule.

The expansion of non-human identity governance to cover AI agents is a prerequisite for effective LOTA response, because revocation and containment procedures that work against compromised human accounts are ineffective if the equivalent mechanisms do not exist for agent credentials.

Principle 6: Adversarial Testing and Red Teaming

The CSA Agentic AI Red Teaming Guide provides a structured methodology for adversarial testing of agentic AI systems, covering twelve distinct threat categories specific to autonomous agent deployments [16]. Organizations that have deployed AI agents should include agent-specific adversarial testing in their regular security assessment program. Testing scope should include indirect prompt injection via realistic content channels, tool manipulation scenarios, multi-agent trust boundary violations, and credential exfiltration through agent-mediated channels.

Red team findings should drive concrete control improvements, with remediation tracked against defined timelines and re-tested to verify effectiveness. Given the rapid evolution of LOTA techniques, point-in-time assessments should be supplemented with continuous monitoring and defined trigger conditions for ad-hoc testing when new agent capabilities are deployed.

CSA Framework Alignment

The LOTA attack class intersects with several existing CSA frameworks that provide actionable control guidance and governance structure for organizations developing their defenses.

MAESTRO (Multi-Agent Environment, Security, Threat Risk, and Outcome): Published by CSA in February 2025, MAESTRO provides the most directly applicable framework for LOTA threat modeling [9]. Its seven-layer reference architecture—covering foundation models, data operations, agent frameworks, deployment and infrastructure, evaluation and observability, security and compliance, and the agent ecosystem—maps cleanly onto the attack surfaces that LOTA techniques exploit. Layer 3 (Agent Frameworks) and Layer 7 (Agent Ecosystem) are particularly relevant to tool supply chain poisoning and orchestrator hijacking. Layer 2 (Data Operations) and Layer 5 (Evaluation and Observability) address the input content inspection and logging controls that are foundational to LOTA detection. Security teams should use MAESTRO to structure their threat model for deployed agents and to identify control gaps at each layer.

AI Controls Matrix (AICM) v1.0.3: The CSA AICM provides a comprehensive control framework for AI security across eighteen domains, with implementation guidance differentiated by role—model providers, application providers, cloud service providers, and orchestrated service providers [17]. LOTA defenses map most directly to the AICM’s identity and access management, data security, supply chain security, and monitoring and detection domains. Organizations deploying AI agents should assess their AICM posture against these domains specifically, with LOTA scenarios used as test cases for control adequacy.

AI Organizational Responsibilities: CSA’s AI Organizational Responsibilities guidance establishes the governance and accountability structures needed to manage AI security across an enterprise [18]. LOTA risk management requires organizational clarity on who owns agent security policy, who approves agent tool access grants, and who has incident response authority over compromised agent instances. These questions must be resolved before an incident occurs.

Zero Trust Guidance: CSA’s Zero Trust architecture principles apply directly to the agent trust problem [19]. The principle of never implicitly trusting any entity based on network location or identity alone must extend to AI agents: agents should not be trusted to act on their stated instructions without verification, and downstream systems should not grant agents access based solely on their service identity. Implementing zero trust for agentic environments requires treating agent-initiated requests with the same skepticism applied to requests from external parties.

STAR (Security Trust Assurance and Risk): Organizations procuring third-party AI agent capabilities should use the CSA STAR program to assess the security posture of vendors in their agent supply chain, including MCP server operators, tool marketplace providers, and agent orchestration platform vendors [20]. Third-party LOTA risk is a procurement and vendor management issue as much as a technical one.

Conclusions and Recommendations

The Living Off the Agent attack class represents a maturation of AI-specific threats from theoretical concern to operational reality. The transition has been fast. Enterprises that deployed their first AI agents in 2024 are now operating in an environment where state-sponsored actors have demonstrated autonomous agent-mediated espionage, where tool supply chains have been successfully poisoned at scale, and where a single malicious document can trigger a data exfiltration chain without any user involvement.

The security implications of this shift are not primarily technical. The underlying vulnerability—agents with broad access, acting autonomously, in ways that are difficult to distinguish from legitimate operation—is not a bug that can be patched. It is a structural consequence of how agentic AI systems are designed to work. Addressing it requires governance and architecture decisions that most organizations have not yet made.

Security leaders should prioritize the following actions as immediate measures. First, they should inventory all AI agents operating within the enterprise, including shadow deployments, and document the tool access and data grants associated with each. Second, they should apply least-privilege access reviews to each agent, revoking access that is not required for the agent’s defined function. Third, they should implement mandatory human review for a defined set of high-consequence agent actions, beginning with outbound communications, bulk data access, and credential handling. Fourth, they should extend non-human identity governance to AI agent credentials, including dedicated service accounts, rotation schedules, and revocation procedures. Fifth, they should implement agent audit logging as a baseline requirement, ensuring that agent input channels and tool invocation sequences are retained for investigation.

Over the near term, security teams should expand their monitoring capabilities to include agent-specific behavioral baselines, pursue adversarial testing of deployed agents using the methodology provided in the CSA Agentic AI Red Teaming Guide, and establish tool sourcing governance processes for MCP servers and agent skill marketplaces. They should also develop and test incident response playbooks specific to agent compromise scenarios.

Strategically, organizations should recognize that agent security is an architectural challenge, not only an operational one. The governance frameworks, access control models, and monitoring capabilities that enable safe agent deployment must be developed in concert with agent capabilities, not after the fact. The pattern observed in early cloud deployments—where security architecture lagged capability deployment by years—provides a cautionary model: the technical debt accumulated during that period took years to remediate and created substantial exposure in the interim. The enterprise AI agent deployment wave is moving too fast, and the threat landscape is evolving too quickly, for organizations to accept that pattern repeating.

References

[1] Anthropic. “Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign.” Anthropic, November 2025.

[2] Infosecurity Magazine. “M365 Copilot: New Zero-Click AI Flaw Allows Corporate Data Theft.” Infosecurity Magazine, 2025.

[3] OX Security. “The Mother of All AI Supply Chains: Critical, Systemic Vulnerability at the Core of Anthropic’s MCP.” OX Security Blog, 2025.

[4] CrowdStrike. “2026 CrowdStrike Global Threat Report: AI Accelerates Adversaries and Reshapes the Attack Surface.” CrowdStrike, February 2026.

[5] Gartner. “Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026, Up from Less Than 5% in 2025.” Gartner Newsroom, August 2025.

[6] Cloud Security Alliance. “Enterprise AI Security Starts with AI Agents.” CSA AI Safety Initiative, 2026.

[7] OWASP. “LLM01:2025 Prompt Injection.” OWASP GenAI Security Project, 2025.

[8] MITRE. “MITRE ATLAS: Adversarial Threat Landscape for Artificial-Intelligence Systems.” MITRE Corporation, 2025.

[9] Cloud Security Alliance. “Agentic AI Threat Modeling Framework: MAESTRO.” CSA Blog, February 2025.

[10] Google Security. “AI Threats in the Wild: The Current State of Prompt Injections on the Web.” Google Security Blog, April 2026.

[11] Trend Micro. “MCP Security: Network-Exposed Servers Are Backdoors to Your Private Data.” Trend Micro, 2025.

[12] TrueFoundry. “MCP Tool Poisoning (CVE-2025-54136): A Structural Vulnerability in Agent Context.” TrueFoundry Blog, 2025.

[13] Palo Alto Networks Unit 42. “Double Agents: Exposing Security Blind Spots in GCP Vertex AI.” Unit 42 Research, 2025.

[14] Bessemer Venture Partners. “Securing AI Agents: The Defining Cybersecurity Challenge of 2026.” BVP Atlas, 2026.

[15] OWASP GenAI Security Project. “OWASP GenAI Exploit Round-Up Report Q1 2026.” OWASP, April 2026.

[16] Cloud Security Alliance. “Agentic AI Red Teaming Guide.” CSA AI Organizational Responsibilities Working Group, 2025.

[17] Cloud Security Alliance. “AI Controls Matrix (AICM) v1.0.3.” CSA, 2025.

[18] Cloud Security Alliance. “AI Organizational Responsibilities: Core Security Responsibilities.” CSA, 2024.

[19] Cloud Security Alliance. “Zero Trust Guidance for Critical Infrastructure.” CSA, 2024.

[20] Cloud Security Alliance. “STAR: Security Trust Assurance and Risk Program.” CSA, 2025.

[21] National Vulnerability Database. “CVE-2025-32711 Detail.” NVD/NIST, June 2025.

[22] Cloud Security Alliance. “State of NHI and AI Security Survey Report.” CSA, 2026.

← Back to Research Index