Agentic AI Governance Maturity Model

White Paper | 2026-03-27 | Status: draft

Executive Summary

The deployment of agentic AI systems across enterprise environments is accelerating faster than the governance frameworks needed to manage them. A 2025 joint study by the Cloud Security Alliance and Google Cloud found that only 26 percent of organizations reported having comprehensive AI security governance policies in place, while a companion survey focused specifically on agentic deployments found that 84 percent of organizations could not pass a compliance audit focused on agent behavior or access controls, and only 23 percent had a formal agent identity strategy in place [1][2]. These figures describe an industry in the early stages of confronting a governance challenge for which it has not yet developed consistent, scalable practices.

The Agentic AI Governance Maturity Model (AGMM) introduced in this white paper addresses that gap directly. Unlike the high-level AI governance principles that have dominated policy discussion, the AGMM is an operational framework: a five-level progression model that describes what governance capability actually looks like at each stage of organizational development, what distinguishes one level from the next, and what specific investments in controls, processes, tooling, and culture are required to advance. The model draws its structural logic from CMMI’s well-established five-level maturity architecture (Initial through Optimizing), adapts it to the specific control surface of agentic AI systems, and aligns it with the CSA AI Controls Matrix (AICM), STAR for AI, NIST CSF 2.0 implementation tiers, and ISO/IEC 42001 clauses [3][4][5][6].

The five levels of the AGMM are: Level 1 (Ad-Hoc), characterized by the absence of any formal agent governance and reliance on individual judgment; Level 2 (Developing), where basic policies and reactive management practices have been established but remain inconsistent; Level 3 (Defined), where a documented governance framework with proactive controls is consistently applied across the enterprise; Level 4 (Managed), where governance is measured and managed quantitatively with risk data driving decisions; and Level 5 (Optimizing), where continuous improvement, predictive risk management, and automated adaptive controls define the governance posture.

The model assesses maturity across seven dimensions: agent identity governance, runtime behavioral controls, tool and capability management, human oversight mechanisms, incident response readiness, compliance posture, and workforce capability. A 22-question self-assessment questionnaire maps to these dimensions and produces a scored profile that locates an organization within the maturity model. The body of this document provides the definitional content needed to use that questionnaire as more than a checkbox exercise: each level is described in enough depth that organizations can ground their self-assessment in concrete capability evidence rather than subjective perception.
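
As a rough illustration of how a scored profile could be derived from questionnaire answers, the sketch below averages answer scores within each dimension and maps each average to a level. The 1-to-5 answer scale, the floor rounding, and the choice to report the weakest assessed dimension as the overall level are all assumptions made for illustration; they are not the questionnaire's documented scoring rules, and the function name score_profile is hypothetical.

```python
from collections import defaultdict
from math import floor

LEVEL_NAMES = {1: "Ad-Hoc", 2: "Developing", 3: "Defined", 4: "Managed", 5: "Optimizing"}

DIMENSIONS = (
    "agent identity governance",
    "runtime behavioral controls",
    "tool and capability management",
    "human oversight mechanisms",
    "incident response readiness",
    "compliance posture",
    "workforce capability",
)


def score_profile(answers):
    """answers: iterable of (dimension, score) pairs, score in 1..5 (assumed scale)."""
    by_dim = defaultdict(list)
    for dimension, score in answers:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        by_dim[dimension].append(score)
    # Assumption: a dimension's level is the floor of its mean answer score.
    profile = {d: floor(sum(s) / len(s)) for d, s in by_dim.items()}
    # Assumption: overall maturity is capped by the weakest assessed dimension.
    overall = min(profile.values())
    return profile, overall


# Illustrative answers only; a real assessment would cover all 22 questions.
answers = [
    ("agent identity governance", 3),
    ("agent identity governance", 2),
    ("runtime behavioral controls", 2),
    ("human oversight mechanisms", 4),
]
profile, overall = score_profile(answers)
print(profile, LEVEL_NAMES[overall])
```

Taking the minimum rather than the mean is a deliberately conservative choice here: it prevents a strong dimension from masking a weak one, which matches the model's premise that each level builds on the foundations of the previous one.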

The AGMM is calibrated to the current state of the agentic AI landscape. Gartner has projected that 40 percent of enterprise applications will feature task-specific AI agents by 2026, up from less than 5 percent in 2025 [7]. Organizations that have not yet established formal governance are not fringe cases; they are the majority. This model is designed to be useful at Level 1 as much as at Level 4, providing a credible starting point for organizations that are honestly at the beginning of this journey while offering the measurement and improvement machinery that sophisticated governance programs require at scale.


1. Introduction: The Need for a Governance Maturity Model

1.1 The Agentic Inflection Point

The term “agentic AI” refers to AI systems that do not merely respond to queries but pursue goals: they decompose objectives into subtasks, invoke tools, spawn sub-agents, and take actions with real-world consequences across extended workflows that may span hours or days without human review of individual steps. This architectural shift from interactive to autonomous AI creates a governance challenge that differs from conventional AI oversight in several fundamental respects. The action surface of an agentic system extends across every tool and API it can reach. Its decisions compound through multi-step reasoning chains that may be difficult to trace or explain post hoc. Its identity, when interacting with external systems, may be difficult to distinguish from a human operator or a trusted service account. And its failure modes (goal drift, tool misuse, prompt injection, cascading errors across connected services) can propagate faster than human responders can detect them.

These properties do not make agentic AI ungovernable. They do mean that governing it requires capabilities that most organizations have not yet built: agent-specific identity lifecycle management, runtime behavioral monitoring tuned to detect autonomous goal deviation, tool access governance that accounts for dynamic capability extension, and human oversight mechanisms that provide meaningful control without collapsing the operational advantages of automation. The CSA AI Controls Matrix v1.0 provides 243 control objectives across 18 domains that establish the foundations of AI governance, and the AICM’s Agentic AI (AA) domain addresses agent-specific risks directly [5]. What has been missing is a framework for assessing how well organizations have implemented these controls in practice — and for providing structured guidance on what to prioritize next.

1.2 Limitations of Existing Governance Instruments

Existing AI governance frameworks provide excellent guidance on what good governance looks like in a mature state, but offer limited practical help to organizations at the beginning of the journey. ISO/IEC 42001 specifies requirements for an AI Management System, providing auditable clauses covering context, leadership, planning, support, operation, performance evaluation, and improvement [6]. It does not, however, describe the intermediate organizational states through which an enterprise moves in building toward that standard. NIST CSF 2.0’s implementation tiers (Partial, Risk Informed, Repeatable, and Adaptive) describe increasing levels of governance sophistication at the enterprise level but are not calibrated to the specific characteristics of agentic AI systems [4]. The CSA’s STAR for AI program establishes a two-level certification structure where Level 1 requires a self-assessed AI-CAIQ submission and Level 2 requires ISO/IEC 42001 certification combined with third-party audit [3]. These certification thresholds are valuable milestones, but the distance between a nascent governance program and a credible STAR Level 1 self-assessment is considerable and poorly mapped.

The AGMM fills this gap by providing a five-level progression that describes not only the endpoint states (basic transparency at Level 1, formal STAR certification at Level 3 and above) but the intermediate capabilities that organizations must develop to make meaningful progress. It treats governance maturity as a journey with observable waypoints rather than as a binary condition of certified or not-certified.

1.3 Relationship to CMMI

The CMMI (Capability Maturity Model Integration), together with its predecessor the Capability Maturity Model, has been the dominant framework for assessing organizational process maturity for more than three decades [8]. Its five-level structure (Initial, Managed, Defined, Quantitatively Managed, and Optimizing) has proven durable across domains from software engineering to supply chain management because it captures a genuine empirical regularity: organizations do in fact progress through these stages, and the transitions between levels correspond to identifiable investments in process discipline, measurement capability, and organizational learning. The AGMM adopts this five-level architecture and applies it to agentic AI governance. The level names are adapted (Ad-Hoc, Developing, Defined, Managed, Optimizing) to reflect the current state of the field, where most organizations are pre-Level 2 rather than pre-Level 4. The underlying logic is preserved intact: governance capability builds progressively, each level requires the foundations of the previous level, and quantitative management must precede continuous optimization.


2. Five Maturity Levels

Level 1: Ad-Hoc — Unmanaged Agentic Deployment

Organizations at Level 1 deploy agentic AI systems without formal governance structures specifically designed for autonomous agents. AI deployments at this level may have some legacy IT governance applied — change management tickets, general access control policies, incident logging — but these controls were designed for conventional software systems and do not account for the distinctive properties of agents: autonomous goal pursuit, dynamic tool invocation, ephemeral sub-agent creation, and the generation of consequential side effects without real-time human approval. The defining characteristic of Level 1 is not the complete absence of any process, but the reliance on individual judgment and tribal knowledge rather than documented, repeatable governance practice.

At Level 1, agent deployments are typically initiated by individual teams or business units responding to immediate operational needs, without coordination through a central AI governance body. Agents may be deployed under human user credentials rather than dedicated service identities, making attribution and auditing unreliable. Access scoping is coarse: agents are given broad permissions sufficient to perform their primary tasks without detailed analysis of minimum-necessary privilege. There are no runtime behavioral baselines against which to detect anomalous agent behavior; monitoring, where it exists, consists of the general-purpose logging infrastructure applied to all systems. When incidents occur, they are handled as one-off events with no systematic process for determining root cause, extracting lessons, or updating governance practices.

The organizational posture at Level 1 is reactive. Problems are addressed when they surface, not anticipated through structured risk analysis. Governance decisions about agentic AI — what agents can access, what they can do, who is accountable when something goes wrong — are made informally, often by the engineers building and operating the agents rather than by anyone with explicit governance responsibility. Workforce capability with respect to agentic AI risk is low or uneven; most staff lack training on the distinctive threats these systems present, and security teams have limited familiarity with agentic architectures, prompt injection risks, or inter-agent communication vulnerabilities.

Organizations at Level 1 exhibit several observable indicators that distinguish them from Level 2. Agent inventories either do not exist or are incomplete and unmaintained. There is no owner of record for agentic AI governance — no committee, role, or policy that explicitly claims responsibility. Policy documents covering AI use either do not mention autonomous agents or apply only generic software governance language to them. When asked about agent access credentials, teams cannot enumerate what systems individual agents can reach. Incident postmortems for AI-related events do not capture agent-specific causal factors.

The path forward from Level 1 does not require sophisticated tooling or large organizational investments. It requires acknowledgment that agentic AI presents governance requirements that existing IT governance does not adequately cover, and commitment to documenting the agents that exist, the people responsible for them, and the basic policies that should govern their operation. This acknowledgment — and the organizational ownership structure it requires — is the essence of the transition to Level 2.


Level 2: Developing — Basic Policies, Reactive Management

Level 2 organizations have recognized that agentic AI systems require specific governance attention and have begun building the foundational structures that make systematic governance possible. The defining characteristics of Level 2 are the existence of a basic agent inventory, the assignment of governance accountability to identifiable roles or committees, and the presence of written policies — however incomplete — that address agentic AI specifically rather than relying exclusively on general IT governance language applied by analogy.

Agent identity management at Level 2 is partially formalized. Most production agents operate under dedicated service accounts rather than human user credentials, though enforcement is inconsistent and legacy deployments may still use shared credentials. Access scoping follows documented policies requiring minimum-necessary privilege in principle, but policy application is manual and depends on individual teams following guidance rather than on automated enforcement. Agent inventories exist and are maintained through periodic review cycles, typically quarterly or at deployment time, though real-time accuracy cannot be guaranteed. Change management processes require that new agent deployments pass a security review before production promotion, but the review criteria are general and do not consistently evaluate agent-specific risks such as tool access scope, sub-agent spawning policies, or prompt injection exposure.

Human oversight at Level 2 is primarily pre-deployment: organizations review agent designs and access scopes before deployment but have limited visibility into agent behavior during operation. Runtime behavioral monitoring relies on general log analysis and threshold-based alerting rather than agent-specific behavioral baselines. When anomalies occur, they are typically detected through user reports or downstream effects rather than through proactive monitoring. Incident response for AI-related events follows general IT incident response procedures, with post-incident reviews conducted on a case-by-case basis. Lessons learned may be documented but are not systematically fed back into governance policies or risk frameworks.

Compliance posture at Level 2 reflects awareness of relevant regulatory and framework requirements without comprehensive coverage. Organizations have likely reviewed the AICM or NIST AI RMF at a high level and mapped their existing controls against framework requirements, but gaps are significant and closure timelines are not committed. STAR for AI Level 1 certification may be aspirational but has not been formally pursued; if an AI-CAIQ self-assessment has been submitted, it was likely produced with limited systematic evidence collection and reflects acknowledged uncertainty across many control areas.

The transition from Level 2 to Level 3 is the most significant governance transformation in the maturity model. It requires moving from reactive, policy-driven governance to proactive, system-driven governance: replacing manual enforcement with automated controls, replacing periodic review with continuous monitoring, and replacing individual accountability with organizational accountability structures that persist regardless of personnel changes. Organizations at the upper end of Level 2 typically begin this transition by piloting automated agent identity provisioning in one product area and expanding from there.


Level 3: Defined — Documented Framework, Proactive Controls

Level 3 represents the first level at which an organization can credibly claim to have a governance framework for agentic AI rather than a collection of governance activities. The defining characteristic of Level 3 is systematic consistency: controls are documented, consistently applied, technically enforced where possible, and owned by organizational structures that persist independent of the individuals currently operating within them. A Level 3 organization deploying a new agentic AI system follows a defined process that applies consistently regardless of which team performs the deployment.

Agent identity governance at Level 3 is both policy-based and technically enforced. Every production agent has a dedicated, auditable identity that maps to a unique entry in the organization’s agent registry. Identity provisioning follows a documented lifecycle: creation with scoped credentials at deployment, regular credential rotation on defined schedules, and automated deprovisioning when agents are retired. Privilege scoping is enforced at the platform level — agents cannot acquire capabilities beyond their registered scope, and any attempt to do so generates an alert. Sub-agent spawning is governed by a policy that specifies which agents can create sub-agents, what capability inheritance rules apply to spawned agents, and what audit trail is required. The organization maintains a real-time or near-real-time agent inventory that is treated as a security-critical asset and is reconciled against the identity system on a defined schedule.
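
The enforcement behavior described above can be sketched in a few lines. The example below is a minimal in-memory illustration, assuming a hypothetical AgentRegistry class and capability strings such as "invoices:read"; none of these names come from a real identity platform. It shows scope enforcement with alerting, a sub-agent inheritance rule (assumed here to be the intersection of the requested scope and the parent's scope), and deprovisioning. Credential rotation is omitted for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRecord:
    agent_id: str
    scope: frozenset                 # capabilities registered for this agent
    may_spawn: bool = False          # may this agent create sub-agents?
    parent_id: Optional[str] = None  # set for spawned sub-agents (audit trail)


class AgentRegistry:
    """Hypothetical in-memory registry; a real one would sit in the identity platform."""

    def __init__(self):
        self.agents = {}
        self.alerts = []  # stand-in for a real alerting pipeline

    def register(self, agent_id, scope, may_spawn=False):
        self.agents[agent_id] = AgentRecord(agent_id, frozenset(scope), may_spawn)

    def authorize(self, agent_id, capability):
        """Allow only registered capabilities; record an alert otherwise."""
        record = self.agents.get(agent_id)
        if record is None or capability not in record.scope:
            self.alerts.append((agent_id, capability))
            return False
        return True

    def spawn(self, parent_id, child_id, requested_scope):
        """Create a sub-agent whose scope never exceeds its parent's."""
        parent = self.agents[parent_id]
        if not parent.may_spawn:
            raise PermissionError(f"{parent_id} may not spawn sub-agents")
        # Assumed inheritance rule: intersection of requested and parent scope.
        child_scope = frozenset(requested_scope) & parent.scope
        child = AgentRecord(child_id, child_scope, False, parent_id)
        self.agents[child_id] = child
        return child

    def deprovision(self, agent_id):
        """Remove a retired agent so later authorize() calls fail and alert."""
        self.agents.pop(agent_id, None)


registry = AgentRegistry()
registry.register("billing-agent", {"invoices:read", "invoices:write"}, may_spawn=True)
child = registry.spawn("billing-agent", "billing-sub-1", {"invoices:read", "payments:send"})
print(sorted(child.scope))
print(registry.authorize("billing-agent", "payments:send"))
```

In this sketch the sub-agent silently loses the capability its parent does not hold; an organization could equally choose to reject the spawn request outright, which would make over-asking visible earlier.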

Runtime behavioral controls at Level 3 extend beyond basic logging to include agent-specific behavioral monitoring. Behavioral baselines are established for production agents, capturing normal patterns of tool invocation frequency, data volumes accessed, API call sequences, and task completion rates. Monitoring systems generate alerts when agent behavior deviates materially from these baselines, and a defined triage process routes anomaly alerts to the appropriate on-call team. Tool access is governed by a capability register that documents every tool and API accessible to each agent class, with access changes requiring a change management review that evaluates the security implications of the expanded capability surface. Organizations at Level 3 have implemented guardrail controls — input validation, output filtering, rate limiting, and session boundaries — as a standard layer of their agentic deployment architecture rather than as ad-hoc additions to individual deployments.
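
One simple way to turn a behavioral baseline into deviation alerts is a z-score check against historical observations. The sketch below assumes hourly metric samples, a sample standard deviation as the spread measure, and a threshold of three standard deviations; the metric names and the threshold are illustrative choices, not values prescribed by the model.

```python
from statistics import mean, stdev


def build_baseline(history):
    """history: dict of metric name -> list of past hourly observations."""
    return {metric: (mean(values), stdev(values)) for metric, values in history.items()}


def deviations(baseline, observed, z_threshold=3.0):
    """Return the metrics whose z-score magnitude exceeds the threshold."""
    flagged = {}
    for metric, value in observed.items():
        mu, sigma = baseline[metric]
        if sigma == 0:
            # No variation in the baseline: any change counts as a deviation.
            z = 0.0 if value == mu else float("inf")
        else:
            z = (value - mu) / sigma
        if abs(z) > z_threshold:
            flagged[metric] = round(z, 2)
    return flagged


# Illustrative history for one production agent.
history = {
    "tool_calls_per_hour": [40, 42, 38, 41, 39, 40],
    "mb_read_per_hour": [10, 12, 11, 9, 10, 11],
}
baseline = build_baseline(history)
flagged = deviations(baseline, {"tool_calls_per_hour": 41, "mb_read_per_hour": 95})
print(flagged)
```

Here the tool-call rate stays within its normal range while the data volume read is flagged, which is the kind of alert that would be routed to the triage process.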

Human oversight mechanisms at Level 3 are structured and risk-differentiated. The organization maintains documented criteria for determining which agent decisions or actions require human approval before execution — a threshold policy that reflects the risk profile of different action categories (read-only versus write, reversible versus irreversible, low-stakes versus consequential) rather than applying a uniform oversight level to all agents. High-risk actions are routed to human reviewers through a defined workflow with documented escalation paths. Oversight coverage is tracked as a metric: the organization can report what percentage of high-risk agent actions received human review in a given period, and this metric is reviewed by governance owners on a regular basis.
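
A threshold policy of this kind can be expressed as a small decision function over the three risk attributes named above. The sketch below is one possible encoding; the tier names and the specific combination rules are assumptions for illustration, and a real policy would be set by the governance owners.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    writes: bool          # False means read-only
    reversible: bool
    consequential: bool   # for example financial, legal, or customer-facing impact


def oversight_tier(action):
    """Map an action's risk attributes to an oversight requirement (assumed rules)."""
    if not action.writes:
        return "autonomous"                 # read-only: logged, no approval
    if action.consequential and not action.reversible:
        return "human_approval_required"    # held until a reviewer approves
    if action.consequential or not action.reversible:
        return "post_hoc_review"            # executed, then queued for review
    return "autonomous"


print(oversight_tier(Action("read_ledger", writes=False, reversible=True, consequential=False)))
print(oversight_tier(Action("update_draft", writes=True, reversible=True, consequential=False)))
print(oversight_tier(Action("wire_transfer", writes=True, reversible=False, consequential=True)))
```

Keeping the policy as a pure function of declared attributes makes it easy to review, version, and test independently of any particular agent.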

Incident response at Level 3 is agent-aware. The incident response plan has been updated to address agent-specific scenarios — prompt injection attacks, credential theft through tool manipulation, sub-agent spawning anomalies, cascading failures across multi-agent workflows — with documented response playbooks for each category. Post-incident reviews capture agent-specific causal factors and produce formal recommendations that are tracked through to policy or control updates. The organization participates in threat intelligence sharing relevant to agentic AI, monitoring sources such as the OWASP Top 10 for Agentic Applications and MITRE ATLAS for emerging technique categories and updating its threat model accordingly.

Compliance posture at Level 3 is systematic and evidenced. The organization has completed a formal AICM self-assessment, mapped its control coverage across all 18 domains, and is actively remediating high-priority gaps on a tracked timeline. STAR for AI Level 1 certification has been achieved or is in final preparation. ISO/IEC 42001 implementation is underway, with Clauses 4 through 8 substantially addressed, though the performance evaluation and continuous improvement cycles required by Clauses 9 and 10 are still maturing.


Level 4: Managed — Quantitative Risk Management

Level 4 organizations have internalized the governance framework established at Level 3 and layered quantitative measurement on top of it. The defining characteristic of Level 4 is that governance decisions are driven by data rather than by judgment or policy alone. Risk is quantified, controls are measured for effectiveness, and organizational investments in governance capability are prioritized based on metric-driven evidence of where risk exposure is greatest and where controls are underperforming. At Level 4, the governance program generates actionable risk intelligence as a regular operational output rather than as an occasional audit artifact.

Agent identity governance at Level 4 is fully automated and continuously verified. Identity lifecycle management operates through an orchestrated system that provisions agents at deployment using policy-as-code, rotates credentials on automated schedules, detects and flags unregistered agents through continuous discovery scanning, and deprovisions agents immediately upon retirement trigger. Privilege usage is continuously monitored against provisioned scope; unused permissions are automatically flagged for review and revocation on a defined schedule. The organization tracks agent identity metrics as key performance indicators: number of agents with over-provisioned access, mean time to revoke access for retired agents, percentage of agents meeting identity policy requirements without exception. These metrics feed into regular governance reviews and drive specific remediation actions.
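
The three identity metrics named above are straightforward to compute once the underlying records exist. The sketch below assumes a hypothetical record shape (sets of granted and used permissions, a compliance flag, and retirement and revocation timestamps); the field names and sample data are illustrative only.

```python
from datetime import datetime, timedelta


def identity_kpis(agents, retirements):
    """
    agents: list of dicts with 'granted' and 'used' permission sets and a
            'policy_compliant' flag.
    retirements: list of (retired_at, access_revoked_at) datetime pairs.
    """
    # An agent is over-provisioned if any granted permission went unused.
    over_provisioned = sum(1 for a in agents if a["granted"] - a["used"])
    compliant_pct = 100.0 * sum(1 for a in agents if a["policy_compliant"]) / len(agents)
    delays = [revoked - retired for retired, revoked in retirements]
    mean_time_to_revoke = sum(delays, timedelta()) / len(delays)
    return {
        "over_provisioned_agents": over_provisioned,
        "policy_compliant_pct": compliant_pct,
        "mean_time_to_revoke": mean_time_to_revoke,
    }


agents = [
    {"granted": {"crm:read", "crm:write"}, "used": {"crm:read"}, "policy_compliant": True},
    {"granted": {"mail:send"}, "used": {"mail:send"}, "policy_compliant": True},
    {"granted": {"db:read"}, "used": {"db:read"}, "policy_compliant": False},
    {"granted": {"files:read"}, "used": {"files:read"}, "policy_compliant": True},
]
t0 = datetime(2026, 3, 1, 12, 0)
retirements = [(t0, t0 + timedelta(minutes=10)), (t0, t0 + timedelta(minutes=30))]
kpis = identity_kpis(agents, retirements)
print(kpis)
```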

Runtime behavioral controls at Level 4 extend to predictive anomaly detection. In addition to threshold-based alerts, the organization applies machine learning–based behavioral analytics to agent activity streams, using trained models to identify deviations that do not exceed simple thresholds but that represent statistically significant departures from historical behavioral profiles. Security analytics teams treat agent behavioral data as a first-class telemetry stream with dedicated dashboards, SLA-bound alert response targets, and regular model retraining cycles. Tool and capability governance at Level 4 includes automated capability surface tracking that generates alerts when any change to the tool ecosystem expands an agent’s effective access — including changes to APIs that agents call, which may expand the data or actions accessible through those APIs without any change to the agent’s registered tool list.
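
The capability surface tracking described above amounts to comparing what an agent's registered tools can reach before and after a change to the tool ecosystem. The sketch below assumes a hypothetical API catalog that maps each tool name to the operations it currently exposes; the tool and operation names are invented for illustration.

```python
def effective_surface(tool_list, api_catalog):
    """Union of operations reachable through an agent's registered tools."""
    surface = set()
    for tool in tool_list:
        surface |= {f"{tool}:{op}" for op in api_catalog.get(tool, ())}
    return surface


def surface_expansion(tool_list, catalog_before, catalog_after):
    """Operations newly reachable even though the registered tool list is unchanged."""
    before = effective_surface(tool_list, catalog_before)
    after = effective_surface(tool_list, catalog_after)
    return after - before


registered_tools = ["crm_api", "mail_api"]
catalog_before = {"crm_api": {"get_contact"}, "mail_api": {"send"}}
# The CRM API gains a bulk export operation; the agent's tool list does not change.
catalog_after = {"crm_api": {"get_contact", "export_all_contacts"}, "mail_api": {"send"}}

expansion = surface_expansion(registered_tools, catalog_before, catalog_after)
print(expansion)
```

A non-empty result would be the trigger for an alert, since the agent's effective access grew without any change request touching its registration.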

Human oversight at Level 4 is continuously calibrated against outcomes. The organization tracks the decision quality of human overseers relative to the decisions made autonomously by agents, using outcome data to inform whether oversight thresholds are appropriately set. Oversight mechanisms that generate high volumes of alerts with low actionability rates are tuned or redesigned. The organization has established and monitors a Human Oversight Coverage Rate — the fraction of agent actions above a risk threshold that received human review — and treats meaningful deviations from target coverage as a governance incident requiring investigation. For the highest-risk agentic deployments, the organization applies formal assurance testing: regular exercises in which red-team scenarios are run against production agents to verify that oversight mechanisms perform as designed under adversarial conditions.
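
The Human Oversight Coverage Rate can be computed directly from an action log. The sketch below assumes each logged action carries a numeric risk score and a flag recording whether a human reviewed it; the threshold, target, and tolerance values are illustrative assumptions rather than recommended settings.

```python
def oversight_coverage_rate(actions, risk_threshold):
    """Fraction of actions above the risk threshold that received human review."""
    high_risk = [a for a in actions if a["risk"] > risk_threshold]
    if not high_risk:
        return None  # undefined when no action crossed the threshold
    reviewed = sum(1 for a in high_risk if a["human_reviewed"])
    return reviewed / len(high_risk)


def coverage_incident(rate, target, tolerance):
    """True when coverage falls short of the target by more than the tolerance."""
    return rate is not None and (target - rate) > tolerance


actions = [
    {"risk": 0.90, "human_reviewed": True},
    {"risk": 0.80, "human_reviewed": False},
    {"risk": 0.75, "human_reviewed": True},
    {"risk": 0.95, "human_reviewed": True},
    {"risk": 0.20, "human_reviewed": False},  # below threshold, not counted
]
rate = oversight_coverage_rate(actions, risk_threshold=0.7)
print(rate, coverage_incident(rate, target=0.95, tolerance=0.05))
```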

Incident response at Level 4 is proactive and continuously tested. Incident response playbooks for agentic AI scenarios are exercised through tabletop simulations at least quarterly, with full technical exercises for the highest-priority scenarios. Mean time to detect (MTTD) and mean time to respond (MTTR) for agent-related incidents are tracked as program KPIs, with improvement targets set annually. The organization contributes threat intelligence about observed agent-related attack techniques and anomalous behaviors to industry sharing bodies, and it systematically monitors intelligence received from those bodies for applicability to its own agent deployment profile.
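
As a small worked example of the two KPIs, the sketch below computes MTTD and MTTR from incident timestamps. It assumes MTTD is measured from incident start to detection and MTTR from detection to the start of response; organizations define these intervals differently, so the field names and definitions here are assumptions.

```python
from datetime import datetime, timedelta


def mean_delta(pairs):
    """Mean of (later - earlier) over an iterable of datetime pairs."""
    deltas = [later - earlier for earlier, later in pairs]
    return sum(deltas, timedelta()) / len(deltas)


def incident_kpis(incidents):
    mttd = mean_delta((i["started"], i["detected"]) for i in incidents)
    mttr = mean_delta((i["detected"], i["responded"]) for i in incidents)
    return mttd, mttr


incidents = [
    {
        "started": datetime(2026, 1, 5, 9, 0),
        "detected": datetime(2026, 1, 5, 9, 30),
        "responded": datetime(2026, 1, 5, 10, 0),
    },
    {
        "started": datetime(2026, 2, 11, 14, 0),
        "detected": datetime(2026, 2, 11, 14, 10),
        "responded": datetime(2026, 2, 11, 14, 50),
    },
]
mttd, mttr = incident_kpis(incidents)
print(mttd, mttr)
```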

Compliance posture at Level 4 reflects a mature, evidence-based compliance program. STAR for AI Level 2 certification has been achieved, combining the AI-CAIQ self-assessment with ISO/IEC 42001 third-party certification. The organization conducts annual gap assessments against the full AICM control set, tracks gap closure rates as a program metric, and has established a process for evaluating proposed updates to AICM and other relevant frameworks before they are formally released. Board-level reporting on agentic AI risk is provided at a defined cadence, using quantitative metrics rather than qualitative status summaries.


Level 5: Optimizing — Continuous Improvement and Predictive Governance

Level 5 represents the frontier of organizational governance capability. Level 5 organizations have not only implemented and measured a comprehensive governance framework but have institutionalized the mechanisms for continuously improving it in response to new threats, changing deployment patterns, and evolving organizational risk appetite. The defining characteristic of Level 5 is that the governance program improves itself faster than the threat environment evolves — a proactive posture that prevents governance debt from accumulating as agentic AI deployments expand and diversify.

At Level 5, governance capabilities extend beyond controls and measurement to include predictive risk management. The organization uses historical incident data, behavioral analytics, and threat intelligence feeds to model the probability and impact of future agent-related risk events before they occur. These predictive models inform not only technical controls but organizational decisions about which agent deployment types to authorize, what capability limits to impose on newly deployed agent classes, and where to concentrate human oversight resources. Risk forecasts are produced at a defined cadence and reviewed by governance leadership as part of the organization’s strategic risk planning process.

Agent identity governance at Level 5 is adaptive. Rather than applying static provisioning policies, the organization maintains dynamic trust scoring for individual agents based on behavioral history, credential hygiene, anomaly incidence, and capability usage patterns. Agents with high trust scores and clean behavioral histories may qualify for streamlined credential renewal processes; agents exhibiting behavioral anomalies are automatically placed in restricted operating modes pending investigation. This dynamic trust model is operationalized through automated decision logic that integrates agent behavioral data with the identity management system, creating a closed-loop governance mechanism that adjusts access posture in real time based on observed evidence.
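
A closed-loop trust model of this kind could be sketched as a score computed from the inputs listed above, mapped to an access posture. Everything quantitative in the example below is an assumption for illustration: the weights, the 90-day rotation window, the score cutoffs, and the posture names are hypothetical and would need calibration against real behavioral data.

```python
from dataclasses import dataclass


@dataclass
class AgentHistory:
    days_since_credential_rotation: int
    anomalies_last_30d: int
    unused_permission_ratio: float  # 0.0 (all permissions used) to 1.0 (none used)


def trust_score(h, max_rotation_age_days=90):
    """Return a score in [0, 1]; the weights are illustrative assumptions."""
    hygiene = max(0.0, 1.0 - h.days_since_credential_rotation / max_rotation_age_days)
    anomaly = 1.0 / (1 + h.anomalies_last_30d)
    usage = 1.0 - h.unused_permission_ratio
    return round(0.3 * hygiene + 0.5 * anomaly + 0.2 * usage, 3)


def access_posture(score, restricted_below=0.5, streamlined_above=0.8):
    """Map a trust score to an operating mode (assumed cutoffs)."""
    if score < restricted_below:
        return "restricted"      # limited operation pending investigation
    if score >= streamlined_above:
        return "streamlined"     # eligible for streamlined credential renewal
    return "standard"


clean = AgentHistory(days_since_credential_rotation=9, anomalies_last_30d=0,
                     unused_permission_ratio=0.1)
noisy = AgentHistory(days_since_credential_rotation=60, anomalies_last_30d=4,
                     unused_permission_ratio=0.5)
print(access_posture(trust_score(clean)), access_posture(trust_score(noisy)))
```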

Runtime controls at Level 5 include continuous automated red-teaming. Adversarial test scenarios derived from MITRE ATLAS techniques and OWASP ASI risk categories are executed automatically against production agents at regular intervals, testing the resilience of behavioral guardrails, input validation, and oversight mechanisms without requiring manual exercise scheduling. Results feed directly into control effectiveness metrics, and control updates are automatically generated and staged for review when test outcomes indicate degraded defensive effectiveness. The organization contributes to the public knowledge base on agentic AI threats through disclosure of attack techniques observed against its systems, participation in CSA working groups, and publication of governance innovations that may benefit the broader community.

Workforce capability at Level 5 is embedded in organizational culture rather than maintained through training programs alone. Agentic AI governance competency is integrated into performance evaluation frameworks for relevant roles, career development paths for AI security and governance practitioners are formalized, and the organization invests in original research on emerging agentic AI risk categories through partnerships with academic institutions and standards bodies. The organization treats governance capability as a competitive differentiator and manages its investment in workforce development accordingly.

The transition from Level 4 to Level 5 is not a discrete event but an ongoing commitment. Level 5 organizations do not graduate from the maturity model; they remain at Level 5 through active maintenance of the capabilities that define it. The risk of regression — through organizational restructuring, rapid expansion of agentic deployments that outpaces governance scaling, or complacency during quiet periods — is real, and Level 5 organizations explicitly manage regression risk as part of their governance programs.


3. Assessment Dimensions

The AGMM evaluates governance capability across seven dimensions that collectively span the full lifecycle of agentic AI governance. Each dimension is defined below and followed by a table that describes what maturity looks like at each of the five levels.

Dimension 1: Agent Identity Governance

Agent identity governance addresses how organizations manage the identities, credentials, and access privileges of autonomous agents throughout their operational lifecycle, from provisioning through deprovisioning.

Level | Agent Identity Governance Characteristics
1 (Ad-Hoc) | Agents operate under human user credentials or shared service accounts. No dedicated agent identity infrastructure exists. Access scopes are not formally documented. No agent inventory maintained.
2 (Developing) | Most production agents have dedicated service accounts. Agent inventory maintained through periodic reviews. Minimum-privilege policy documented but manually enforced. Legacy shared-credential deployments persist. Credential rotation not systematized.
3 (Defined) | Every agent has a dedicated, auditable identity in a maintained registry. Lifecycle management is documented and consistently followed. Credential rotation automated on defined schedules. Privilege scoping enforced at the platform level with alerts for scope-excess attempts. Sub-agent spawning governed by explicit inheritance policy.
4 (Managed) | Full lifecycle automation through policy-as-code. Continuous discovery scanning detects unregistered agents. Unused permissions automatically flagged and revoked. Identity metrics tracked as KPIs (over-provisioning rate, mean time to revoke, policy compliance rate). Board-level visibility into agent identity posture.
5 (Optimizing) | Dynamic trust scoring adjusts agent access posture in real time based on behavioral evidence. Predictive models forecast identity-related risk events. Automated restricted-mode enforcement for anomalous agents. Governance innovations shared with the community and reflected in standards contributions.

Dimension 2: Runtime Behavioral Controls

Runtime behavioral controls address the mechanisms organizations use to detect, constrain, and respond to agent behavior during operation — including anomaly detection, guardrail implementation, and behavioral monitoring.

Level Runtime Behavioral Controls Characteristics
1 — Ad-Hoc General-purpose logging applied to agents without agent-specific analysis. No behavioral baselines. Anomalies detected only through user reports or downstream effects. No guardrail architecture.
2 — Developing Basic threshold alerting on agent activity metrics (call volumes, error rates). Manual log review for high-risk agents. Some input validation and rate limiting implemented ad hoc. No systematic behavioral baseline process.
3 — Defined Behavioral baselines established for all production agents. Deviation alerts routed through defined triage process. Guardrails (input validation, output filtering, session boundaries, rate limiting) implemented as standard deployment architecture layer. Tool invocation sequences and data access patterns monitored against baselines.
4 — Managed ML-based behavioral analytics applied to agent activity streams. Dedicated dashboards with SLA-bound response targets. Tool ecosystem changes that expand effective agent access surface tracked automatically. Behavioral metrics tracked as security KPIs with improvement targets. Regular model retraining cycles.
5 — Optimizing Continuous automated red-teaming derived from MITRE ATLAS and OWASP ASI. Control effectiveness measured against adversarial test outcomes. Adaptive guardrails that adjust in response to observed attack patterns. Original research contributions on novel agent behavioral threat categories.
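
The Level 3 practice of monitoring agents against behavioral baselines can be sketched with a simple statistical check. The example below assumes a single metric (hourly tool-call counts) and a z-score threshold of 3.0; both are illustrative choices, and a production deployment would likely baseline several metrics and tune thresholds to limit alert fatigue.

```python
from statistics import mean, stdev


def build_baseline(samples: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of historical hourly tool-call counts."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to build a baseline")
    return mean(samples), stdev(samples)


def deviates(observed: float, baseline: tuple[float, float], z_threshold: float = 3.0) -> bool:
    """Flag an observation whose z-score against the baseline exceeds the threshold."""
    mu, sigma = baseline
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > z_threshold


# Illustrative history for one agent (hypothetical values).
history = [40, 42, 38, 41, 39, 40, 43, 37]
baseline = build_baseline(history)
```

With this history the baseline has a mean of 40 and a standard deviation of 2, so an hour with 41 calls passes while an hour with 120 calls would be routed to triage.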

Dimension 3: Tool and Capability Management

Tool and capability management addresses how organizations govern the set of tools, APIs, and external services that agents can invoke, including authorization, change control, and dynamic capability extension.

Level Tool and Capability Management Characteristics
1 — Ad-Hoc Tool access determined by what credentials permit rather than explicit policy. No capability register. No review process for tool additions. No governance of MCP server usage or third-party tool integrations.
2 — Developing Basic documentation of tools available to major agent deployments. New tool integrations require informal approval. MCP server and third-party API usage informally tracked. No automated enforcement of tool access boundaries.
3 — Defined Capability register maintained for all production agents, documenting every accessible tool and API. Tool access changes require change management review evaluating security implications. Policies govern third-party tool integration, MCP server authorization, and dynamic capability extension at runtime.
4 — Managed Automated tracking detects changes to API capabilities that expand effective tool access without formal change requests. Tool risk scoring applied to capability register entries. Tool access metrics tracked as KPIs. Capability surface growth rate monitored as a risk indicator.
5 — Optimizing Predictive capability risk modeling forecasts risk implications of planned tool ecosystem changes before deployment. Automated tool access governance integrates with CI/CD pipeline to enforce capability policies at build time. Contributions to community standards on tool access governance.
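
The capability register described at Level 3 and the automated drift detection described at Level 4 can be illustrated by comparing registered tools against tools observed in use. This is a sketch under stated assumptions: the register and observation structures (agent ID mapped to a set of tool names), the tool names, and the growth-rate formula are hypothetical and are not drawn from any specific platform.

```python
def capability_drift(
    register: dict[str, set[str]], observed: dict[str, set[str]]
) -> dict[str, set[str]]:
    """Return, per agent, tools observed in use that are absent from the capability register."""
    drift: dict[str, set[str]] = {}
    for agent_id, tools in observed.items():
        unapproved = tools - register.get(agent_id, set())
        if unapproved:
            drift[agent_id] = unapproved
    return drift


def surface_growth_rate(previous_count: int, current_count: int) -> float:
    """Assumed risk indicator: relative change in the number of registered capabilities."""
    if previous_count <= 0:
        raise ValueError("previous_count must be positive")
    return (current_count - previous_count) / previous_count


# Illustrative data only.
register = {"support-agent": {"crm.read", "kb.search"}, "ops-agent": {"deploy.run"}}
observed = {
    "support-agent": {"crm.read", "crm.export"},
    "ops-agent": {"deploy.run"},
    "new-agent": {"mail.send"},  # agent with no register entry at all
}
drift = capability_drift(register, observed)
```

Here `drift` reports `crm.export` for `support-agent` and `mail.send` for the unregistered `new-agent`, which are the cases a change management review would need to examine.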

Dimension 4: Human Oversight Mechanisms

Human oversight mechanisms address how organizations ensure meaningful human control over consequential agent decisions and actions, including oversight threshold policies, escalation workflows, and oversight effectiveness measurement.

Level Human Oversight Mechanisms Characteristics
1 — Ad-Hoc No formal human-in-the-loop policy. Individual operators apply ad hoc judgment about when to review agent actions. No oversight escalation workflow. High-consequence actions may execute without human review.
2 — Developing Informal norms about when human review is expected. Some high-risk action categories routed for approval, based on team-level guidelines rather than enterprise policy. Oversight coverage not measured.
3 — Defined Documented risk-differentiated oversight threshold policy. High-risk action categories defined and routed through formal approval workflows with documented escalation paths. Human Oversight Coverage Rate tracked as a governance metric. Oversight policy reviewed and updated on a defined schedule.
4 — Managed Oversight calibration driven by outcome data — agent decision quality versus overseer decision quality tracked over time. Oversight mechanisms generating low-signal alerts redesigned. Red-team exercises verify that oversight mechanisms perform under adversarial conditions. Oversight Coverage Rate maintained at defined target with deviations treated as governance incidents.
5 — Optimizing Predictive risk models inform oversight resource allocation. Human oversight thresholds adjusted dynamically based on behavioral risk indicators. Oversight effectiveness contributes to broader industry research on human-AI teaming in high-stakes contexts.
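
A risk-differentiated oversight threshold policy of the kind described at Level 3 can be sketched as a mapping from action category to risk tier, with approval required at or above a threshold. The action names and tiers below are hypothetical, unknown actions are assumed to fail closed to the highest tier, and the Human Oversight Coverage Rate is assumed here to mean the share of approval-required actions that actually received human review; the model may define the metric differently.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical policy table: action category -> risk tier.
ACTION_RISK = {
    "send_external_email": Risk.HIGH,
    "modify_financial_record": Risk.HIGH,
    "update_ticket": Risk.MEDIUM,
    "summarize_document": Risk.LOW,
}


def requires_human_approval(action: str, threshold: Risk = Risk.HIGH) -> bool:
    """Actions at or above the threshold need approval; unknown actions fail closed to HIGH."""
    return ACTION_RISK.get(action, Risk.HIGH) >= threshold


def oversight_coverage_rate(log: list[tuple[str, bool]]) -> float:
    """Assumed definition: reviewed approval-required actions / all approval-required actions."""
    required = [reviewed for action, reviewed in log if requires_human_approval(action)]
    if not required:
        return 1.0  # nothing required review, so nothing was missed
    return sum(required) / len(required)


# Illustrative execution log: (action, was_reviewed_by_human).
log = [
    ("send_external_email", True),
    ("modify_financial_record", False),
    ("summarize_document", False),
]
```

In this log, one of the two high-risk actions executed without review, so the coverage rate is 0.5 and, at Level 4, the shortfall would be handled as a governance incident.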

Dimension 5: Incident Response Readiness

Incident response readiness addresses whether organizations have developed the specific capabilities required to detect, contain, investigate, and recover from agentic AI security incidents.

Level Incident Response Readiness Characteristics
1 — Ad-Hoc AI-related incidents handled under general IT incident response procedures with no agent-specific adaptations. Causal factors specific to agentic architecture not systematically captured in postmortems. No threat intelligence monitoring for agentic attack techniques.
2 — Developing AI incidents recognized as a distinct category in the incident management system. Some agent-specific scenarios included in IR planning discussions. Post-incident reviews conducted but not systematically producing governance improvements. Basic monitoring of OWASP and MITRE ATLAS publications for emerging threats.
3 — Defined Agent-specific IR playbooks developed for key scenario categories (prompt injection, credential theft, sub-agent anomaly, cascading failure). Playbooks reviewed and updated at defined intervals. MTTD and MTTR for agent incidents tracked. Post-incident review process produces formal recommendations tracked through to governance updates. Participation in AI threat intelligence sharing communities.
4 — Managed IR playbooks exercised through quarterly tabletop simulations and annual technical exercises. MTTD/MTTR tracked as program KPIs with improvement targets. Intelligence contributions made to sharing communities as well as consumed from them. IR effectiveness assessed through simulation outcomes and used to drive playbook updates.
5 — Optimizing Continuous IR capability testing through automated adversarial simulations. Predictive modeling identifies which IR scenarios are most likely to be triggered based on current threat intelligence. Novel attack technique disclosures contributed to public knowledge bases.
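
The MTTD and MTTR metrics tracked from Level 3 onward can be computed directly from incident timestamps. The sketch below assumes MTTD is measured from the start of the anomalous behavior to detection and MTTR from detection to resolution; organizations define these intervals differently, and the `Incident` record and its sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """Hypothetical agent incident record."""
    started: datetime   # when the anomalous agent behavior began
    detected: datetime  # when monitoring or a person first flagged it
    resolved: datetime  # when containment and recovery were complete


def _mean(deltas: list[timedelta]) -> timedelta:
    if not deltas:
        raise ValueError("no incidents recorded")
    return sum(deltas, timedelta()) / len(deltas)


def mttd(incidents: list[Incident]) -> timedelta:
    """Mean time to detect, assumed here as detected minus started."""
    return _mean([i.detected - i.started for i in incidents])


def mttr(incidents: list[Incident]) -> timedelta:
    """Mean time to resolve, assumed here as resolved minus detected."""
    return _mean([i.resolved - i.detected for i in incidents])


# Illustrative incidents only.
incidents = [
    Incident(datetime(2026, 1, 10, 8), datetime(2026, 1, 10, 9), datetime(2026, 1, 10, 13)),
    Incident(datetime(2026, 2, 2, 10), datetime(2026, 2, 2, 13), datetime(2026, 2, 2, 19)),
]
```

For these two incidents the detection intervals are one and three hours and the resolution intervals are four and six hours, giving an MTTD of two hours and an MTTR of five hours.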

Dimension 6: Compliance Posture

Compliance posture addresses the organization’s relationship to external standards, regulatory requirements, and certification programs relevant to agentic AI governance.

Level Compliance Posture Characteristics
1 — Ad-Hoc No formal mapping to AICM, NIST AI RMF, ISO 42001, or other relevant frameworks. Compliance activities, if any, apply only general IT compliance language to AI systems. No STAR for AI participation.
2 — Developing High-level review of relevant frameworks completed. Significant gaps acknowledged but not formally tracked or remediated on committed timelines. STAR for AI Level 1 aspirational. AI-CAIQ self-assessment in preparation or early draft stage. ISO 42001 awareness but no formal implementation underway.
3 — Defined Formal AICM self-assessment completed with gaps mapped and remediation tracked. STAR for AI Level 1 achieved. ISO 42001 implementation underway through Clause 8. Regulatory requirements (EU AI Act, applicable sector-specific requirements) mapped to internal controls. Compliance status reported to governance committee quarterly.
4 — Managed STAR for AI Level 2 achieved (ISO 42001 certified plus Valid-AI-ted AI-CAIQ). AICM coverage tracked as a dashboard metric with annual gap assessments. Framework change monitoring process established for AICM, ISO 42001, and NIST AI RMF updates. Quantitative compliance metrics reported to board.
5 — Optimizing Contributions to framework development through CSA working groups, standards bodies, and public comment processes. Predictive compliance gap analysis identifies likely exposure to forthcoming regulatory requirements. Continuous certification readiness maintained as an operational state rather than a periodic event.
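
The difference between Level 2 (gaps acknowledged but not tracked) and Level 3 (gaps mapped with remediation tracked) can be made concrete with a small gap-tracking sketch. The control identifiers below are placeholders and are not real AICM control IDs; the coverage formula (implemented controls divided by assessed controls) is likewise an assumption for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ControlStatus:
    """One row of a hypothetical self-assessment; IDs are placeholders."""
    control_id: str
    implemented: bool
    remediation_due: Optional[date] = None  # committed date for closing the gap, if any


def coverage(controls: list[ControlStatus]) -> float:
    """Assumed dashboard metric: implemented controls / assessed controls."""
    if not controls:
        return 0.0
    return sum(c.implemented for c in controls) / len(controls)


def untracked_gaps(controls: list[ControlStatus]) -> list[str]:
    """Gaps with no committed remediation date (the Level 2 pattern)."""
    return [c.control_id for c in controls if not c.implemented and c.remediation_due is None]


def overdue_gaps(controls: list[ControlStatus], today: date) -> list[str]:
    """Gaps whose committed remediation date has passed."""
    return [
        c.control_id
        for c in controls
        if not c.implemented and c.remediation_due is not None and c.remediation_due < today
    ]


# Illustrative assessment only.
assessment = [
    ControlStatus("CTRL-A", True),
    ControlStatus("CTRL-B", True),
    ControlStatus("CTRL-C", False, remediation_due=date(2026, 1, 31)),
    ControlStatus("CTRL-D", False),
]
```

In this sample, coverage is 0.5, `CTRL-D` has no remediation commitment, and `CTRL-C` would show as overdue on any date after 31 January 2026.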

Dimension 7: Workforce Capability

Workforce capability addresses whether the organization has the human knowledge, skills, and cultural understanding required to govern agentic AI systems effectively.

Level Workforce Capability Characteristics
1 — Ad-Hoc No structured training on agentic AI risks. Security teams lack familiarity with agent-specific threat categories. Governance responsibilities concentrated in a small number of individuals with informal expertise. No succession planning for agentic AI governance roles.
2 — Developing Basic awareness training on AI security risks delivered to security and AI operations teams. Some individuals with specialized knowledge identified as internal resources. Governance role definitions emerging but not formally established. External expertise engaged on a project basis.
3 — Defined Role-based training program covering agentic AI security and governance delivered to all relevant roles (security engineers, AI operations, compliance, legal, product management). Governance roles formally defined with documented responsibilities. Certification or competency verification requirements established for key governance roles. Training program reviewed and updated annually.
4 — Managed Training effectiveness assessed through competency testing and applied capability demonstration. Skills gap analysis conducted at defined intervals and used to drive training investment priorities. Workforce capability metrics tracked as program KPIs. External benchmark comparisons used to calibrate training content.
5 — Optimizing Agentic AI governance competency embedded in performance evaluation frameworks. Career development paths formalized for AI security and governance practitioners. Original research contributions. Investment in academic and standards partnerships. Governance capability treated as strategic competitive differentiator.

4. Self-Assessment Questionnaire

The AGMM self-assessment is designed to produce a scored maturity profile across all seven dimensions, supplemented by two questions on organizational governance structures. Each dimension is assessed through two or three questions, yielding a response set that can be mapped to a level designation through the scoring guidance that follows. The assessment is most valuable when completed collaboratively across stakeholder roles (security engineering, AI operations, compliance, legal, and executive governance owners) rather than by a single individual, as the distributed perspective tends to surface gaps that any single vantage point would miss.

Before completing the questionnaire, organizations should assemble basic reference materials: an agent inventory (if one exists), any AI governance policy documents, results of prior security assessments or AICM self-assessments, and access to the teams responsible for operating each major agentic deployment. The quality of the assessment depends on the quality of the evidence used to answer each question. Honest uncertainty is preferable to optimistic overestimation; the purpose of the assessment is to identify where to invest, not to generate a favorable score.

Scoring instructions: For each question, select the response that best describes the organization’s current practice. Assign a score of 1 to 5 corresponding to the level described (1 = Level 1/Ad-Hoc, 5 = Level 5/Optimizing); the table gives anchor responses for scores 1, 3, and 5, so assign 2 or 4 when current practice falls between two anchors. Average the scores within each dimension to produce a dimension score, then average the dimension scores to produce the overall maturity level estimate. A dimension score should be treated as the assessed level for that dimension; the overall level is an unweighted average and should be read alongside the dimension profile, because the lowest dimension scores indicate where governance investment will produce the highest risk reduction.

# | Question | Level 1 Response (Score 1) | Level 3 Response (Score 3) | Level 5 Response (Score 5)
Agent Identity Governance
1 | How are agents identified and authenticated to systems they access? | Agents use human user credentials or shared service accounts | Every production agent has a dedicated service identity; lifecycle is documented and followed | Dynamic trust scoring adjusts agent access posture in real time based on behavioral evidence
2 | How is agent access privilege scoped and enforced? | Access is whatever the credential permits; no formal scoping policy | Minimum-privilege policy enforced at the platform level; attempts to exceed scope generate alerts | Unused permissions are automatically revoked; dynamic privilege adjustment based on behavioral signals
3 | How is the agent inventory maintained and verified? | No agent inventory exists, or it is incomplete and unmaintained | Real-time or near-real-time inventory reconciled against the identity system on a defined schedule | Continuous automated discovery with immediate alerting on unregistered agent detection
Runtime Behavioral Controls
4 | What behavioral monitoring exists for agents during operation? | General-purpose logging; anomalies detected only through user reports | Behavioral baselines established; deviation alerts routed through defined triage process | Continuous automated red-teaming; adaptive guardrails that update in response to observed attack patterns
5 | How are guardrails (input validation, output filtering, rate limiting) implemented? | Ad hoc per deployment, if at all | Standard deployment architecture layer applied consistently to all production agents | Automated guardrail effectiveness testing; predictive adjustment based on threat intelligence
6 | How are tool invocation patterns and data access volumes monitored? | Not monitored or monitored only through general system logs | Monitored against established baselines; deviations generate alerts reviewed by security team | ML-based behavioral analytics with dedicated dashboards and SLA-bound response targets
Tool and Capability Management
7 | How is the set of tools and APIs available to each agent governed? | No formal capability register; agents access whatever their credentials permit | Capability register maintained; tool access changes require change management review | Automated capability surface tracking; tool risk scoring; predictive risk modeling for planned changes
8 | How are third-party tool integrations and MCP server connections authorized? | Informally, or not at all | Explicit authorization policy; third-party integrations require security review before deployment | Automated policy enforcement at CI/CD build time; continuous capability surface monitoring
Human Oversight Mechanisms
9 | What policy governs which agent actions require human review before execution? | No formal policy; operators use individual judgment | Documented risk-differentiated threshold policy; high-risk categories routed through formal approval workflows | Dynamically calibrated oversight thresholds driven by behavioral risk indicators and outcome data
10 | How is oversight coverage measured and enforced? | Not measured | Human Oversight Coverage Rate tracked as a governance metric; deviations investigated | Predictive resource allocation; oversight effectiveness contributes to published industry research
11 | How are oversight mechanisms validated against adversarial scenarios? | Not validated | Defined process for reviewing and updating oversight policies; oversight failures captured in IR reviews | Continuous automated adversarial testing of oversight mechanisms; results drive real-time control adjustments
Incident Response Readiness
12 | Do incident response playbooks address agent-specific scenarios? | No; general IT IR procedures applied to all AI incidents | Yes; playbooks for major agent incident categories reviewed and updated on defined schedule | Automated adversarial simulations continuously test playbook effectiveness
13 | How are post-incident reviews used to improve agent governance? | Lessons informally noted; governance rarely updated as a result | Formal recommendations produced and tracked through to policy or control updates | Predictive modeling identifies likely future IR scenarios based on current threat intelligence
14 | Does the organization participate in agentic AI threat intelligence sharing? | No | Consumes threat intelligence from relevant communities (OWASP, MITRE ATLAS) | Both contributes to and consumes from sharing communities; discloses novel attack techniques observed
Compliance Posture
15 | What is the organization’s relationship to the AICM? | No formal AICM assessment completed | Formal self-assessment completed; gaps tracked with remediation timelines | Contributions to AICM development through CSA working groups; continuous certification readiness
16 | What is the organization’s STAR for AI status? | No participation | STAR for AI Level 1 achieved or in final preparation | STAR for AI Level 2 achieved; Level 2 maintained as an ongoing operational state
17 | How does the organization track and manage compliance with ISO 42001? | No formal ISO 42001 engagement | Implementation underway through Clause 8; performance evaluation cycle maturing | Certified; ongoing; contributes to standard updates through ISO participation
Workforce Capability
18 | What agentic AI security and governance training exists for relevant staff? | No structured training; individual knowledge informal and uneven | Role-based training covering agentic risks delivered to all relevant roles; updated annually | Agentic AI governance competency embedded in performance evaluation; career paths formalized
19 | How are governance roles and responsibilities defined and maintained? | No formal role definitions for agentic AI governance | Governance roles formally defined with documented responsibilities and succession plans | Governance capability treated as strategic competitive differentiator; external benchmark comparisons
20 | How does the organization develop and retain specialized expertise? | Informally; dependent on individual initiative | External certification or competency verification required for key roles; training effectiveness assessed | Academic and standards partnerships; original research contributions; recognized external expertise
Organizational Governance Structures
21 | Does a formal governance body (committee, role, or policy owner) have explicit accountability for agentic AI governance? | No; accountability diffuse or absent | Yes; governance committee meets on defined schedule; reports to senior leadership; owns the governance framework | Governance accountability linked to quantitative risk metrics; board-level visibility with predictive risk forecasting
22 | How is the governance framework kept current with the evolving agentic AI threat landscape? | It is not; updates occur only when a problem surfaces | Framework reviewed on a defined schedule; external framework changes assessed for applicability | Continuous framework monitoring; predictive compliance gap analysis; governance program improves faster than threats evolve
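
The scoring procedure for the questionnaire can be sketched in a few lines. The example assumes that each dimension score is the mean of its question scores, so that it stays on the 1-to-5 level scale, and that the overall estimate is the unweighted mean of the dimension scores; the function names and the sample responses are illustrative.

```python
from statistics import mean


def dimension_levels(responses: dict[str, list[int]]) -> dict[str, float]:
    """Average the 1-5 question scores within each dimension."""
    levels: dict[str, float] = {}
    for dimension, scores in responses.items():
        if not scores or any(s < 1 or s > 5 for s in scores):
            raise ValueError(f"scores for {dimension!r} must be non-empty and between 1 and 5")
        levels[dimension] = mean(scores)
    return levels


def overall_level(levels: dict[str, float]) -> float:
    """Unweighted mean of the dimension scores (assumed aggregation)."""
    return mean(levels.values())


def weakest_dimensions(levels: dict[str, float], n: int = 2) -> list[str]:
    """Dimensions with the lowest scores, i.e. the first candidates for investment."""
    return sorted(levels, key=levels.get)[:n]


# Illustrative responses for three of the dimensions.
responses = {
    "Agent Identity Governance": [3, 3, 3],
    "Runtime Behavioral Controls": [2, 2, 2],
    "Tool and Capability Management": [1, 3],
}
levels = dimension_levels(responses)
```

Reporting the per-dimension profile next to the overall figure matters because an average can hide a single weak dimension; `weakest_dimensions` makes that profile explicit.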

5. Progression Roadmap

The transitions between AGMM levels are not automatic consequences of time or effort — they require deliberate investment in specific capabilities and, in several cases, significant cultural change. This section describes the characteristic investments required for each transition, the enabling conditions that make the transition achievable, and the indicators that signal a transition has been completed.

Level 1 to Level 2: Establishing Visibility and Accountability

The Level 1-to-2 transition is fundamentally about making the invisible visible. Organizations at Level 1 typically do not know what agents they have, who is responsible for them, or what systems those agents can access. The first investment required is agent discovery: a deliberate enumeration of all agentic AI systems currently in operation, regardless of formality or scale. This includes shadow deployments in business units that operate outside central IT oversight, experimental agents that have quietly moved to production, and automation scripts or LLM-based workflows that have agent-like properties without being classified as AI systems.

The second investment is ownership assignment. Every discovered agent needs an identified owner — a person or team accountable for its governance — and an identified risk profile. This ownership structure is the prerequisite for every subsequent governance capability. The third investment is a basic written policy: a document that articulates minimum expectations for agent deployment, defines what kinds of deployments require security review, and establishes that agents should operate under dedicated credentials rather than human user accounts. This policy need not be comprehensive; it needs to be real, communicated, and enforced for new deployments going forward.

Organizations should expect the Level 1-to-2 transition to take two to four months for a mid-sized enterprise. The primary risk is discovery fatigue: the agent enumeration process often reveals more deployments than anticipated, and organizations may struggle to complete it without executive sponsorship that communicates the importance of the exercise. A successful transition is evidenced by a maintained agent inventory, at least one named governance owner, and a published (even if incomplete) agent governance policy.
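
The discovery and ownership steps described above lend themselves to a simple inventory check. The following sketch assumes a hypothetical `InventoryEntry` record with illustrative field names; it reports which discovered agents still lack an owner or a risk profile and which still run under human or shared credentials.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InventoryEntry:
    """Hypothetical agent inventory row produced by the discovery exercise."""
    agent_id: str
    owner: Optional[str] = None          # accountable person or team
    risk_profile: Optional[str] = None   # e.g. "low", "medium", "high"
    dedicated_credentials: bool = False  # False means human or shared credentials


def ownership_gaps(inventory: list[InventoryEntry]) -> list[str]:
    """Agents that lack either an owner or a risk profile."""
    return [e.agent_id for e in inventory if e.owner is None or e.risk_profile is None]


def on_shared_credentials(inventory: list[InventoryEntry]) -> list[str]:
    """Agents not yet operating under dedicated credentials."""
    return [e.agent_id for e in inventory if not e.dedicated_credentials]


# Illustrative inventory only.
inventory = [
    InventoryEntry("hr-summarizer", owner="people-ops", risk_profile="medium", dedicated_credentials=True),
    InventoryEntry("sales-drafter", owner="sales-eng"),
    InventoryEntry("shadow-script"),
]
```

An empty result from both functions would be one piece of supporting evidence for the transition; it does not replace the named governance owner or the published policy.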

Level 2 to Level 3: Building Systematic Controls

The Level 2-to-3 transition is the most technically demanding and organizationally significant in the model. It requires moving from reactive, manually enforced governance to proactive, systematically applied governance. Three capability investments are essential. The first is agent identity infrastructure: automated provisioning, rotation, and deprovisioning of agent credentials through an identity system that treats agents as first-class principals. This typically requires integration between the organization’s identity and access management platform and its AI deployment pipeline, and it may require procuring or building tooling that does not exist in the current technology stack.

The second capability investment is behavioral monitoring. Organizations need to establish behavioral baselines for production agents and deploy monitoring that generates actionable alerts when agents deviate from those baselines. This requires telemetry collection from agent runtimes, a data store for behavioral history, and alerting logic calibrated to generate meaningful signals without alert fatigue. The third investment is human oversight formalization: developing and deploying the risk-differentiated threshold policy, building the approval workflows for high-risk actions, and establishing the oversight metrics that will be tracked going forward.
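
The credential lifecycle automation named as the first investment can be illustrated with a scheduled check for overdue rotations and for credentials that outlive their agents. This is a sketch only: the 30-day rotation window, the agent names, and the data structures are assumptions, and a real implementation would query the organization’s identity and access management platform rather than an in-memory dictionary.

```python
from datetime import date, timedelta


def overdue_for_rotation(
    last_rotated: dict[str, date], max_age: timedelta, today: date
) -> list[str]:
    """Agent IDs whose credential is older than the rotation schedule allows."""
    return sorted(agent for agent, rotated in last_rotated.items() if today - rotated > max_age)


def orphaned_credentials(last_rotated: dict[str, date], active_agents: set[str]) -> list[str]:
    """Agent IDs that still hold a credential although the agent is no longer active."""
    return sorted(set(last_rotated) - active_agents)


# Illustrative data only.
last_rotated = {
    "invoice-bot": date(2026, 1, 1),
    "triage-bot": date(2026, 3, 1),
    "retired-bot": date(2025, 6, 1),
}
today = date(2026, 3, 15)
overdue = overdue_for_rotation(last_rotated, timedelta(days=30), today)
orphans = orphaned_credentials(last_rotated, {"invoice-bot", "triage-bot"})
```

Here `invoice-bot` and `retired-bot` are overdue for rotation, and `retired-bot` is additionally flagged for deprovisioning because it no longer appears among the active agents.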

The cultural change required for this transition should not be underestimated. Product and engineering teams accustomed to deploying agents quickly and iterating freely will encounter new governance requirements (mandatory security reviews, capability register updates, credential lifecycle processes) that add friction to their workflows. Executive sponsorship is essential to communicate that this friction is the cost of operating at scale with systems that carry real organizational risk. Organizations should expect this transition to take six to twelve months for a typical enterprise and to require substantial investment in compliance automation tooling. The transition is complete when an independent audit confirms consistent application of the Level 3 control set across all production agent deployments.

Level 3 to Level 4: Instrumentation and Quantification

The Level 3-to-4 transition requires transforming governance from a process discipline into a measurement system. Organizations at the upper end of Level 3 have the controls in place; what they lack is the instrumentation to know whether those controls are performing as intended, where they are underperforming, and where risk exposure is greatest. The primary investment for this transition is a governance metrics program: defining the key performance indicators, building the data collection and reporting infrastructure to populate them, and establishing the governance review processes that use metrics to drive decisions.

The analytics investments required for Level 4 are substantial. ML-based behavioral analytics for agent activity streams requires training data, model development infrastructure, and ongoing model maintenance. Automated capability surface tracking requires integration across multiple systems and ongoing monitoring operations. Red-team exercises require either internal adversarial testing capability or external red-team partners with agentic AI expertise. Organizations at Level 3 planning this transition should allocate 12 to 18 months for full capability development, with interim milestones at the six- and twelve-month marks to validate progress. The transition is evidenced by the existence of a governance dashboard with live metrics, a documented measurement program with defined targets and review cadences, and achievement of STAR for AI Level 2 certification.

Level 4 to Level 5: Institutionalizing Continuous Improvement

The Level 4-to-5 transition is less a technical challenge than a cultural and strategic one. Organizations at Level 4 already have sophisticated governance capability; what differentiates Level 5 is the institutionalization of mechanisms that make the governance program self-improving rather than dependent on periodic planned investments. This requires embedding governance innovation into the organization’s operating model: dedicating staff time to original research on emerging agentic AI risks, establishing academic and standards body partnerships, creating processes for systematic knowledge contribution to the broader community, and linking governance competency to performance evaluation and career progression for relevant roles.

The dynamic trust scoring and predictive risk modeling capabilities that characterize Level 5 identity and behavioral governance require significant data science investment and ongoing model development operations. Organizations considering this transition should evaluate whether the risk profile of their agentic AI deployments justifies the investment, or whether a high-functioning Level 4 program is the appropriate target state for their organization. Not every enterprise needs to operate at Level 5; the model is descriptive of the frontier, not prescriptive for all organizations. What Level 5 does require, for those who pursue it, is a sustained commitment to governance as a strategic capability rather than a compliance cost.


6. Enterprise Adoption Guidelines

Governance requirements for agentic AI deployments vary significantly across deployment types. This section describes the governance considerations specific to four major deployment categories, with guidance on which AGMM dimensions and level targets are most relevant to each.

6.1 Internal Productivity Agents

Internal productivity agents — AI systems that assist employees with tasks such as document drafting, code generation, meeting summarization, knowledge retrieval, and workflow automation — represent the most common initial agentic deployment type. They typically operate within organizational boundaries, accessing internal systems such as email, calendar, document repositories, project management tools, and code bases. Their failure modes are primarily information disclosure (exposure of sensitive internal data through excessive tool access or prompt injection), action errors (incorrect or unauthorized modifications to documents or workflows), and behavioral drift (agents that expand their operational scope beyond their original mandate as their goal decomposition logic evolves).

For internal productivity agents, governance priority should be placed on agent identity governance (ensuring that individual agents are not able to access data or systems irrelevant to their stated function), tool and capability management (controlling which internal systems each agent class can reach and auditing that access regularly), and human oversight for high-stakes actions (ensuring that consequential automations — sending communications, committing code, modifying financial records — require explicit human approval). Organizations deploying internal productivity agents at scale should target Level 3 governance maturity across these three dimensions before expanding deployment. The human oversight mechanism is particularly important in this category, as the familiarity and apparent benignness of productivity tools can lead organizations to underestimate the risk of agents operating with excessive autonomy over communications and documents.

6.2 Customer-Facing Agents

Customer-facing agents — AI systems that interact directly with external customers in the context of support, sales, onboarding, advisory, or transactional services — carry governance requirements that extend beyond the organization’s internal risk posture to include customer trust, regulatory compliance, and reputational risk. These agents operate in an adversarial environment: customers may deliberately attempt to manipulate agent behavior through prompt injection, social engineering, or exploitation of agent logic to obtain unauthorized access, commitments, or information. The failure consequences extend to legal liability, regulatory enforcement, and irreversible customer trust damage.

Governance for customer-facing agents should prioritize runtime behavioral controls (guardrails that prevent agents from being manipulated into unauthorized disclosures or commitments), human oversight mechanisms (clear escalation paths to human agents for sensitive or high-stakes interactions, with appropriate logging of escalation criteria), and compliance posture (alignment with applicable regulatory requirements for automated decision-making, including EU AI Act Article 50 transparency obligations for AI-human interactions). Organizations deploying customer-facing agents in regulated industries should target Level 3 or above across all seven AGMM dimensions and should treat STAR for AI Level 1 as a minimum credibility threshold for customer-facing deployments. STAR for AI Level 2 certification should be pursued as a risk management measure for high-stakes customer-facing deployments in sectors such as financial services, healthcare, and legal services.

6.3 Operational Automation Agents

Operational automation agents — AI systems that execute business processes autonomously across connected enterprise systems, such as supply chain management, financial operations, infrastructure management, or security orchestration — carry the highest inherent risk profile of the four deployment types. These agents take consequential actions with direct financial, operational, or security implications. Their tool access spans core enterprise systems. Their failure modes include autonomous actions that are expensive or impossible to reverse, cascading errors that propagate across connected systems, and exploitation by adversaries who use the agent’s privileged access as a pivot point for lateral movement or data exfiltration.

Organizations deploying operational automation agents should set Level 4 governance maturity as the minimum target before production deployment at scale. The most critical governance capabilities for this deployment type are agent identity governance (agents in this category should operate under the most tightly scoped, continuously monitored identities in the enterprise), human oversight mechanisms (every irreversible or high-impact action should require explicit human approval, regardless of operational efficiency cost), and incident response readiness (agents with broad system access that become compromised present incident response challenges of a different order than conventional system compromises). The organization should treat a compromise of an operational automation agent as a potential maximum-severity incident and should plan and exercise accordingly.
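
The level targets stated in Sections 6.1 through 6.3 can be encoded as a lookup and compared against an assessed profile. The sketch below makes several assumptions that the text does not settle: the short dimension keys are invented for the example, the Level 4 target for operational automation agents is applied to all seven dimensions, and a dimension missing from the assessment is treated as Level 1.

```python
# Short keys for the seven AGMM dimensions (names are illustrative).
DIMENSIONS = (
    "identity", "runtime", "tools", "oversight",
    "incident_response", "compliance", "workforce",
)

# Minimum target level per dimension for each deployment type.
TARGETS = {
    "internal_productivity": {"identity": 3, "tools": 3, "oversight": 3},
    "customer_facing_regulated": {d: 3 for d in DIMENSIONS},
    # Assumption: the Level 4 minimum is applied to every dimension.
    "operational_automation": {d: 4 for d in DIMENSIONS},
}


def maturity_gaps(deployment_type: str, assessed: dict[str, int]) -> dict[str, int]:
    """Return the shortfall in levels for each dimension below its target."""
    gaps: dict[str, int] = {}
    for dimension, target in TARGETS[deployment_type].items():
        current = assessed.get(dimension, 1)  # unassessed dimensions count as Level 1
        if current < target:
            gaps[dimension] = target - current
    return gaps


# Illustrative assessed profile.
assessed = {
    "identity": 3, "runtime": 2, "tools": 2, "oversight": 4,
    "incident_response": 3, "compliance": 3, "workforce": 3,
}
```

For this profile, an internal productivity deployment is one level short on tool and capability management only, while an operational automation deployment is short on six of the seven dimensions.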

6.4 Multi-Organization Agent Ecosystems

Multi-organization agent ecosystems — deployments in which agents from multiple organizations interact, delegate tasks to each other, share data, and cooperate within networked workflows — present governance challenges that no single organization can address unilaterally. These ecosystems are characteristic of supply chain automation, financial market infrastructure, healthcare data exchange, and agentic AI marketplaces. The governance challenges include inter-agent trust establishment (how does an agent in Organization A verify the identity and integrity of an agent from Organization B before accepting a task or sharing data?), liability attribution across organizational boundaries (when a multi-agent workflow produces a harmful outcome, which organization’s agent is responsible?), and cascading failure management (a governance failure in one organization’s agent can propagate across the ecosystem before any individual organization detects it).

Organizations participating in multi-organization agent ecosystems should recognize that their internal governance maturity is a necessary but not sufficient condition for ecosystem security. The ecosystem as a whole requires shared governance standards, interoperable identity attestation mechanisms, and agreed-upon protocols for anomaly reporting and incident coordination across organizational boundaries. The CSA STAR for AI program, the AIUC-1 compliance standard, and the emerging Agentic Trust Framework provide candidate foundations for multi-organization governance alignment, but organizations should not assume that counterparty organizations have matched their governance maturity. Due diligence on counterparty governance posture — including review of STAR for AI registration status and AI-CAIQ submissions — should be incorporated into third-party risk management programs as a precondition for production integration with external agents.
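As a rough illustration of treating counterparty due diligence as a precondition for integration, the sketch below collects the reasons an integration should be held back. The `CounterpartyPosture` record, the `integration_blockers` function, and the `min_star_level` threshold are all assumptions made for this example; the appropriate threshold is a policy decision for each organization, not something this paper specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterpartyPosture:
    """Due-diligence findings for an external organization (hypothetical)."""
    name: str
    star_for_ai_level: int   # 0 = no STAR for AI registration, else 1 or 2
    ai_caiq_reviewed: bool   # True once the AI-CAIQ submission has been reviewed


def integration_blockers(
    posture: CounterpartyPosture, min_star_level: int = 1
) -> list[str]:
    """Return reasons to withhold production integration; empty means none found."""
    blockers = []
    if posture.star_for_ai_level < min_star_level:
        blockers.append(
            f"{posture.name}: STAR for AI level {posture.star_for_ai_level} "
            f"is below the required level {min_star_level}"
        )
    if not posture.ai_caiq_reviewed:
        blockers.append(f"{posture.name}: AI-CAIQ submission not yet reviewed")
    return blockers
```

An empty list means only that these two checks passed. It does not establish that the counterparty's governance maturity matches the reviewing organization's own.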


7. Relationship to Other Frameworks

The AGMM does not replace any existing governance framework; it provides a maturity navigation layer on top of them. Understanding how the AGMM relates to CMMI, NIST CSF 2.0, STAR for AI, and ISO 42001 allows organizations to use AGMM maturity assessments to inform their adoption of and progress toward these frameworks.

CMMI’s five-level structure is the direct architectural antecedent of the AGMM [8]. The level correspondence is close but not identical. In terms of the process discipline it requires, CMMI Level 2 (Managed) corresponds more closely to AGMM Level 3 (Defined) than to AGMM Level 2, and it should not be confused with AGMM Level 4, which shares the name. CMMI Managed denotes project-level management processes that are planned and tracked, which in the AGMM context corresponds to a defined governance framework that is consistently applied. The AGMM’s explicit Level 2 (Developing) captures an intermediate state, in which policies exist but enforcement is inconsistent, that CMMI’s model does not distinguish as clearly. Organizations familiar with CMMI will find the AGMM’s levels somewhat more granular in the middle range, which reflects the empirical reality that most current agentic AI governance programs are navigating the Level 2-to-3 transition rather than the Level 4-to-5 transition.

NIST CSF 2.0’s four implementation tiers, Partial (Tier 1), Risk Informed (Tier 2), Repeatable (Tier 3), and Adaptive (Tier 4), map approximately to AGMM Levels 1, 2, 3, and 4-5 respectively, consistent with the alignment table in Section 8 [4]. The CSF explicitly notes that tiers are not a maturity model to climb sequentially; they describe the degree of rigor an organization applies to cybersecurity risk management. The AGMM takes a more prescriptive position appropriate to the specific context of agentic AI governance, where the absence of defined progression pathways is itself a risk. The NIST CSF Govern function, added in CSF 2.0 and described as the central coordinating function that informs all other risk management activities, has direct analogs in AGMM’s agent identity governance, compliance posture, and organizational governance structure dimensions.

ISO/IEC 42001 provides the most structurally complete external framework against which to map AGMM levels [6]. Its requirement clauses, which run from Context (Clause 4) and Leadership (Clause 5) through Operation (Clause 8), Performance Evaluation (Clause 9), and Improvement (Clause 10), correspond to organizational capabilities that appear at different AGMM levels. Clause 4 context establishment and Clause 5 leadership commitment are foundational requirements that a Level 2 organization is beginning to address. Clauses 6 through 8 (planning, support, and operation) are substantially addressed by Level 3. Clauses 9 (Performance Evaluation) and 10 (Improvement) correspond to the measurement and continuous improvement capabilities of Levels 4 and 5. ISO 42001 certification is therefore most naturally aligned with Level 4 governance in the AGMM, consistent with STAR for AI Level 2’s requirement that both certification and third-party audit of the AI-CAIQ be completed.

The CSA STAR for AI program provides the most operationally direct mapping to AGMM levels [3]. STAR Level 1, which requires submission of a self-assessed AI-CAIQ, is achievable at AGMM Level 3 when the organization has completed a systematic AICM assessment and can provide evidenced responses to AI-CAIQ questions. STAR Level 2, which requires both ISO/IEC 42001 certification and a Valid-AI-ted AI-CAIQ assessment by a CSA-approved assessor, requires the measurement capabilities and formal compliance program of AGMM Level 4. Organizations using the AGMM as a planning tool can treat STAR Level 1 as an interim milestone within the Level 3 development arc and STAR Level 2 as the destination marker for Level 4 completion.


8. Framework Alignment Table

The following table maps each AGMM maturity level to the corresponding AICM control coverage profile, STAR for AI level, NIST CSF 2.0 tier, and ISO 42001 clause alignment. This table is intended as a planning reference for organizations using multiple frameworks simultaneously, not as a precise technical equivalence mapping.

| AGMM Level | AICM Control Coverage | STAR for AI Level | NIST CSF 2.0 Tier | ISO 42001 Clause Alignment |
|---|---|---|---|---|
| Level 1 (Ad-Hoc) | Minimal coverage; general cloud controls from CCM heritage may apply but AI-specific controls across AA, MS, and AIS domains are largely unaddressed. Significant gaps across all 18 domains. | Not applicable; no AI-CAIQ submission. | Tier 1 (Partial): security activity ad hoc; inconsistent across the organization. | No formal AIMS. Clause 4 (Context) not formally addressed. Leadership commitment (Clause 5) absent for AI governance specifically. |
| Level 2 (Developing) | GRC, IAM, and LOG domains partially addressed through general IT governance. AICM AA and MS domains acknowledged but with limited formal control implementation. Estimated coverage: 20–35% of relevant control objectives. | AI-CAIQ draft in preparation; not submitted. No STAR registration. | Tier 2 (Risk Informed): risk-informed practices exist but vary across business units; not enterprise-wide. | Clause 4 context assessment begun; Clause 5 leadership engagement emerging. Clause 6 (Planning) in early stages. No formal AIMS established. |
| Level 3 (Defined) | Systematic self-assessment completed across all 18 domains. Priority domains (AA, IAM, MS, LOG, GRC) substantially addressed. Estimated coverage: 55–70% of AICM control objectives. Active gap remediation tracked. | STAR for AI Level 1 achieved: AI-CAIQ submitted to CSA registry. Self-assessment reflects evidenced responses. | Tier 3 (Repeatable): standardized practices implemented enterprise-wide; consistent and repeatable governance processes. | Clauses 4–8 substantially addressed. AI impact assessment (Clause 6) completed. AI lifecycle management (Clause 8) operational. Performance evaluation (Clause 9) maturing. |
| Level 4 (Managed) | Comprehensive coverage across all 18 domains with quantitative tracking of coverage rate and gap closure velocity. Estimated coverage: 80–90% of AICM control objectives. Annual formal reassessment. | STAR for AI Level 2 achieved: ISO/IEC 42001 certification plus Valid-AI-ted AI-CAIQ by CSA-approved assessor. | Tier 4 (Adaptive): proactive, continuously improving approach; cybersecurity risk management integrated into organizational strategy. | Full ISO/IEC 42001 certification. All clauses implemented. Performance evaluation (Clause 9) generating actionable improvement data. Clause 10 (Improvement) operational with tracked corrective action cycles. |
| Level 5 (Optimizing) | Dynamic, predictive coverage management. Continuous assessment against AICM and emerging frameworks. Contributions to AICM development through CSA working groups. Coverage maintained at >90% with continuous monitoring. | STAR for AI Level 2 maintained; active participation in CSA AI working groups contributing to program evolution. | Tier 4 (Adaptive, advanced implementation): governance program adapts in advance of threat evolution; predictive risk management; community contributions. | ISO/IEC 42001 certified with continuous improvement program. Active participation in ISO AI standards development. Clause 10 improvement cycles driven by quantitative evidence and predictive analytics. |
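
For organizations that want to use the alignment programmatically, for example in a planning dashboard, the STAR for AI and NIST CSF columns of the table above can be encoded as a simple lookup. The sketch below is one possible encoding, not part of the AGMM itself: the names `Alignment`, `AGMM_ALIGNMENT`, and `minimum_agmm_level_for_star` are illustrative assumptions, and the values are copied from the table rows as a planning reference rather than a precise technical equivalence.

```python
from typing import NamedTuple, Optional


class Alignment(NamedTuple):
    """One row of the alignment table, reduced to two columns (illustrative)."""
    name: str
    star_for_ai_level: Optional[int]  # None = no STAR for AI level achieved
    nist_csf_tier: int


# Values copied from the alignment table; keys are AGMM levels.
AGMM_ALIGNMENT = {
    1: Alignment("Ad-Hoc", None, 1),
    2: Alignment("Developing", None, 2),
    3: Alignment("Defined", 1, 3),
    4: Alignment("Managed", 2, 4),
    5: Alignment("Optimizing", 2, 4),
}


def minimum_agmm_level_for_star(star_level: int) -> int:
    """Return the lowest AGMM level whose row lists the given STAR for AI level."""
    candidates = [
        level
        for level, row in AGMM_ALIGNMENT.items()
        if row.star_for_ai_level is not None
        and row.star_for_ai_level >= star_level
    ]
    if not candidates:
        raise ValueError(f"No AGMM level maps to STAR for AI level {star_level}")
    return min(candidates)
```

Called with 1 or 2, the helper returns AGMM Level 3 or Level 4 respectively, matching the milestone reading of STAR Level 1 and STAR Level 2 given in Section 7.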

References

[1] OWASP GenAI Security Project. OWASP Top 10 for Agentic Applications 2025. December 2025. https://owasp.org/www-project-top-10-for-large-language-model-applications/

[2] Cloud Security Alliance. The Agentic Trust Framework: Zero Trust Governance for AI Agents. February 2, 2026. https://cloudsecurityalliance.org/blog/2026/02/02/the-agentic-trust-framework-zero-trust-governance-for-ai-agents

[3] Cloud Security Alliance. STAR for AI. https://cloudsecurityalliance.org/star/ai/; Cloud Security Alliance Press Release. Cloud Security Alliance Announces Availability of STAR for AI Level 2 and Valid-AI-ted for AI. November 20, 2025. https://cloudsecurityalliance.org/press-releases/2025/11/20/cloud-security-alliance-announces-availability-of-star-for-ai-level-2-and-valid-ai-ted-for-ai

[4] National Institute of Standards and Technology. Cybersecurity Framework Version 2.0. NIST CSWP 29. February 2024. https://nvlpubs.nist.gov/nistpubs/CSWP/NIST.CSWP.29.pdf

[5] Cloud Security Alliance. AI Controls Matrix (AICM) v1.0. July 10, 2025. https://cloudsecurityalliance.org/artifacts/ai-controls-matrix; Cloud Security Alliance. Introducing the CSA AI Controls Matrix. July 10, 2025. https://cloudsecurityalliance.org/blog/2025/07/10/introducing-the-csa-ai-controls-matrix-a-comprehensive-framework-for-trustworthy-ai

[6] International Organization for Standardization. ISO/IEC 42001:2023 — Artificial Intelligence — Management System. 2023. https://www.iso.org/standard/42001

[7] Gartner. Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026. August 26, 2025. https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025

[8] CMMI Institute. CMMI Levels of Capability and Performance. https://cmmiinstitute.com/learning/appraisals/levels; Carnegie Mellon University Software Engineering Institute. Capability Maturity Model Integration (CMMI). https://www.sei.cmu.edu/our-work/cmmi/

[9] Cloud Security Alliance and Google Cloud. The State of AI Security and Governance. December 2025. https://cloudsecurityalliance.org/artifacts/the-state-of-ai-security-and-governance; Cloud Security Alliance Press Release. Governance Maturity Is Strongest Predictor of AI Readiness. December 18, 2025. https://cloudsecurityalliance.org/press-releases/2025/12/18/csa-and-google-cloud-study-finds-governance-maturity-is-strongest-predictor-of-ai-readiness

[10] Cloud Security Alliance. Autonomy Levels for Agentic AI. January 28, 2026. https://cloudsecurityalliance.org/blog/2026/01/28/levels-of-autonomy

[11] KPMG. AI Governance for the Agentic AI Era. 2025. https://kpmg.com/us/en/articles/2025/ai-governance-for-the-agentic-ai-era.html

[12] Cloud Security Alliance. AI Governance: A Maturity Multiplier. December 18, 2025. https://cloudsecurityalliance.org/blog/2025/12/18/ai-security-governance-your-maturity-multiplier

[13] Cloud Security Alliance. CSAI Foundation Launch: Securing the Agentic Control Plane. March 23, 2026. https://cloudsecurityalliance.org/press-releases/2026/03/23/csa-securing-the-agentic-control-plane

[14] Cloud Security Alliance. Introductory Guidance to AICM. https://cloudsecurityalliance.org/artifacts/introductory-guidance-to-aicm

[15] MITRE. ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems). https://atlas.mitre.org/