White Paper | 2026-03-27 | Status: draft

AI Risk Observatory Telemetry Architecture

Executive Summary

The agentic AI deployment wave of 2025–2026 has outpaced the security monitoring infrastructure that enterprises rely on to detect, investigate, and respond to incidents. Where a traditional application failure leaves a discrete error in a log file, an agentic system failure can manifest as a cascade of subtly incorrect tool invocations, a delegation chain that exceeded intended scope, or a behavioral drift that produces dangerous outcomes without triggering any single alerting threshold. The AI Incident Database recorded a surge in agentic-specific incidents through the end of 2025 and into early 2026, yet the security community has lacked a shared vocabulary—let alone a shared infrastructure—for collecting and analyzing the telemetry those incidents generate. [1]

The CSAI AI Risk Observatory is the Cloud Security Alliance’s response to this visibility gap. Its core premise is that agentic risk intelligence requires a dedicated telemetry architecture, not an extension of existing SIEM or APM tooling. Agentic events have structural properties—nested tool calls, probabilistic intent inference, session-scoped delegation chains, behavioral deviation from learned baselines—that do not map onto the flat event models of traditional security logging. Building the observatory on top of generic telemetry infrastructure would require so many field mappings, inference steps, and schema translations that the result would be more brittle than a purpose-built solution.

This white paper specifies the complete architecture of the CSAI AI Risk Observatory telemetry pipeline. It covers the five data source categories that feed the system; the standardized Agentic Telemetry Event (ATE) schema and its JSON representation; the AI-assisted processing layer responsible for incident clustering and trend forecasting; the output artifact taxonomy, which comprises dashboards, weekly digests, and automated alerts; and the Unified AI Risk Record (UAIR) identifier scheme that complements the CVE program for agentic-specific risk tracking. The document also addresses privacy and data minimization requirements and concludes with a full framework alignment table mapping observatory components to AICM LOG domain controls, NIST AI RMF functions, MAESTRO layers, and EU AI Act Article 12 obligations.

The target audience for this document is security architects designing observability programs for agentic AI deployments, CSAI working group members contributing to the observatory implementation, and enterprise practitioners evaluating how to integrate agentic telemetry into existing security operations.

1. Introduction: The Agentic Visibility Gap

Security operations teams spend a significant portion of their working hours managing data they did not ask for. Modern SIEM platforms ingest terabytes of network flows, authentication events, DNS queries, and endpoint telemetry per day, and the primary challenge is not data scarcity but signal extraction. Against this backdrop, the claim that agentic AI systems introduce a genuine visibility gap may seem counterintuitive. The gap, however, is structural rather than volumetric: it arises from a mismatch between the event model that existing monitoring infrastructure assumes and the event model that agentic systems actually produce.

A conventional application is, from a monitoring perspective, a relatively well-bounded actor. Its interactions with external systems are mediated through a fixed set of API calls, database queries, and network connections, and its failure modes tend to produce discrete, attributable error signals. An agentic AI system operates on a fundamentally different model. It reasons across context windows that may span hours of session history, dispatches tool calls whose parameters are inferred from that context rather than hard-coded in application logic, delegates subtasks to other agents through chains that may be three or four hops long, and accumulates permissions through scope escalation patterns that are invisible to tools designed to monitor individual API calls in isolation. The risk events that matter most in an agentic deployment—prompt injection leading to unauthorized data exfiltration, a delegation chain that exceeds intended authorization scope, a behavioral drift that causes the agent to pursue a subtly misaligned objective—are not atomic events that trigger a threshold. They are patterns that emerge across a sequence of individually innocuous actions, visible only to an analysis layer that understands the agentic context within which those actions occurred. [2]

The research community has documented the scale of this gap with increasing precision. NIST’s Center for AI Standards and Innovation conducted three practitioner workshops in 2025 and an in-depth literature review, culminating in a March 2026 report on challenges to the monitoring of deployed AI systems. The report found that while post-deployment monitoring is widely acknowledged as critical to responsible AI adoption, most organizations lack the instrumentation, standards, and institutional processes to execute it effectively. [3] The OpenTelemetry community’s parallel effort to develop semantic conventions for agentic systems—specifying attributes for tasks, actions, agents, teams, artifacts, and memory—reflects the same recognition from the infrastructure side: the telemetry vocabulary needed to monitor agents does not yet exist as a stable standard. [4] The observatory architecture specified in this document is designed to be implementable with current tooling while remaining compatible with the emerging OpenTelemetry GenAI semantic conventions as they mature.

The MITRE ATLAS framework’s October 2025 expansion to include 14 new agentic and generative AI techniques provides the threat taxonomy that motivates the observatory’s design priorities. Techniques including AML.T0096 (AI Service API Abuse), AML.T0098 (AI Agent Tool Credential Harvesting), and AML.T0101 (Data Destruction via AI Agent Tool Invocation) represent precisely the class of threats that produce patterns in agentic telemetry before they produce discrete alerts in conventional security tooling. [5] The observatory’s event schema, processing layer, and identifier scheme are all calibrated to make those patterns detectable.

2. Data Sources

The AI Risk Observatory ingests telemetry from five source categories, each contributing a distinct visibility dimension. No single source provides complete coverage; the observatory’s analytical value emerges from the correlation of events across sources. The collection architecture is designed to support both real-time streaming ingestion for runtime telemetry and batch ingestion for periodic scan results and community reports.

RiskRubric Scans. RiskRubric is the CSAI Foundation’s automated security assessment tool for agentic deployments. It performs scheduled scans of agent configurations, permission profiles, tool inventories, and deployment manifests, evaluating them against the AICM v1.0 control framework and a continuously updated rule library. RiskRubric scans produce structured findings documents that capture control gaps, misconfigurations, and permission anomalies at the deployment level. As an observatory data source, RiskRubric contributes both baseline configuration data—establishing what an agent is supposed to do—and drift detection data when a rescan reveals that configurations have changed from the last known-good state. The observatory correlates RiskRubric findings with runtime telemetry to determine whether configuration anomalies have been accompanied by behavioral anomalies in production. RiskRubric scan results are ingested as ATE events with source_type: "riskscan" and linked to the relevant deployment identifier.
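
The drift detection idea can be illustrated with a minimal sketch. It assumes that a scan result can be reduced to a flat dictionary of configuration keys and values; the function name config_drift and the example keys are hypothetical and are not part of RiskRubric.

```python
def config_drift(known_good: dict, current: dict) -> dict:
    """Compare a rescan against the last known-good snapshot (flat dicts assumed)."""
    shared = known_good.keys() & current.keys()
    return {
        "added": sorted(current.keys() - known_good.keys()),
        "removed": sorted(known_good.keys() - current.keys()),
        "changed": sorted(k for k in shared if known_good[k] != current[k]),
    }


# Hypothetical snapshots of one agent deployment.
baseline = {"tools": ["search"], "scopes": ["read"], "model": "m-1"}
rescan = {"tools": ["search", "shell"], "scopes": ["read"], "sandbox": False}
drift = config_drift(baseline, rescan)
```

Any non-empty list in the result would be a candidate for an ATE event with source_type "riskscan"; how findings are actually scored is outside what this sketch covers.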

MCP Server Behavioral Logs. The Model Context Protocol (MCP) has become the dominant standard for connecting agentic AI systems to external tools and data sources since its introduction by Anthropic in 2024 and subsequent broad adoption by major AI platforms through 2025. [6] The November 2025 MCP specification introduced structured audit logging capabilities, but the security community has documented that most production MCP deployments do not yet implement standardized audit trails. [7] The observatory addresses this gap by providing a reference log collector for MCP server behavioral events—capturing tool call requests, parameter values, server responses, and timing metadata—and translating them into the observatory’s ATE schema. MCP behavioral logs are the highest-value source for detecting tool misuse patterns, as they capture the ground-truth record of what tools an agent actually invoked, with what parameters, and what the server returned. The collector implements redaction profiles to prevent sensitive parameter values from being transmitted to the observatory in cleartext, as specified in Section 7.
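
As an illustration of the collector's translate-and-redact step, the following sketch maps one raw tool-call record onto an entry of the ATE tools_invoked array. The raw record keys (tool, server, args, response_text), the list of sensitive parameter names, and the "[REDACTED]" placeholder are all assumptions made for this example; the actual redaction profiles are those specified in Section 7.

```python
from typing import Any

# Hypothetical redaction profile: parameter names whose values are never sent in cleartext.
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "secret"}


def redact(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of params with sensitive values masked, recursing into nested objects."""
    out: dict[str, Any] = {}
    for key, value in params.items():
        if key.lower() in SENSITIVE_KEYS:
            out[key] = "[REDACTED]"
        elif isinstance(value, dict):
            out[key] = redact(value)
        else:
            out[key] = value
    return out


def to_tool_call(raw: dict[str, Any]) -> dict[str, Any]:
    """Map a raw MCP log record (assumed shape) onto one ATE tools_invoked entry."""
    return {
        "tool_name": raw["tool"],
        "server_id": raw["server"],
        "parameters": redact(raw.get("args", {})),
        # ATE stores a summary rather than the full response body (maxLength 300).
        "result_summary": str(raw.get("response_text", ""))[:300],
    }
```

A name-based deny list is the simplest possible profile; it does not catch secrets embedded in free-text values, which a production profile would need to address.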

Enterprise Agent Deployment Logs. This source category encompasses the operational logs produced by enterprise agent orchestration platforms, including agent lifecycle events such as creation, configuration changes, and termination; session initiation and delegation events; resource utilization telemetry; and exception and error events from agent runtime environments. The specific schema varies by platform. The observatory provides ingestion adapters for the major enterprise agent orchestration platforms—including those using NemoClaw OpenShell, LangChain, AutoGen, and Amazon Bedrock Agents—and a generic adapter that accepts logs conforming to any schema that can be mapped to the ATE fields. NVIDIA NemoClaw’s OpenShell runtime is the highest-fidelity source in this category, as its out-of-process policy engine captures kernel-level system call telemetry that reveals behaviors invisible to application-layer logging. [8] The observatory’s NemoClaw adapter consumes OpenShell sandbox events and maps them to ATE action_taken and permissions_used fields.

Community-Submitted Incident Reports. The AI Incident Database (AIID), operated by the Responsible AI Collaborative, maintains a curated database of real-world AI incidents and has documented an increasing volume of agentic-specific cases through late 2025 and early 2026. [1] The observatory complements the AIID’s retrospective incident catalog with a structured incident submission portal designed for security professionals who have observed anomalous agent behaviors in production environments. Submissions are validated by the observatory’s AI processing layer for internal consistency before being added to the incident corpus, and confirmed incidents are assigned UAIR identifiers as described in Section 6. Community submissions are the primary source for incidents that occur in enterprise environments that do not have observatory telemetry connectors deployed, extending the observatory’s coverage beyond its direct instrumentation footprint.

Runtime Telemetry from NemoClaw OpenShell and Other Hardened Runtimes. This source category captures the high-fidelity, continuous behavioral telemetry that hardened agent runtimes produce during normal operation—distinct from the lifecycle and error events captured under enterprise deployment logs. NemoClaw OpenShell’s Landlock and seccomp event streams produce a millisecond-resolution record of every system call the sandboxed agent process attempts, whether permitted or blocked. [8] This stream is the most resource-intensive of the five sources and is processed with aggressive filtering and sampling before ingest; the observatory’s collection agent applies the same privacy-protective redaction profiles used for MCP logs before transmission. Other hardened runtimes contributing to this source category include Cisco’s DefenseClaw, which publishes network-level detection events for OpenClaw-derived agent deployments, and emerging runtime attestation frameworks that produce cryptographically signed event chains.

The following table summarizes the five source categories, their primary event types, their ingest mode, and the ATE fields they most directly populate.

Source Category	Primary Event Types	Ingest Mode	Primary ATE Fields Populated
RiskRubric Scans	Configuration findings, drift alerts, control gaps	Batch (scheduled)	`agent_identity`, `anomaly_indicators`, `permissions_used`
MCP Server Behavioral Logs	Tool call requests/responses, timing, errors	Stream or batch	`tools_invoked`, `action_taken`, `outcome`
Enterprise Deployment Logs	Lifecycle, session, delegation, error events	Stream	`session_context`, `agent_identity`, `outcome`
Community Incident Reports	Structured incident narratives	Batch (on-demand)	All fields (human-curated)
Runtime Telemetry (Hardened Runtimes)	Syscall events, policy decisions, network flows	Stream (sampled)	`action_taken`, `permissions_used`, `anomaly_indicators`

3. Standardized Agentic Telemetry Event Format

The Agentic Telemetry Event (ATE) schema is the observatory’s canonical data model. Every event that enters the observatory—regardless of source—is normalized into ATE format before being processed by the analysis layer. This normalization step is the fundamental design decision that enables cross-source correlation: an MCP tool call event and a NemoClaw sandbox event and a community incident report can all be represented in the same structure, queried with the same logic, and compared by the same anomaly detection algorithms.

The ATE schema is designed around seven top-level fields: agent_identity, action_taken, tools_invoked, permissions_used, outcome, anomaly_indicators, and session_context. Four envelope fields (ate_version, event_id, timestamp, and source_type) accompany them and identify the event itself. Each of the seven fields is a structured object with typed sub-fields, except tools_invoked, which is an array of such objects. The schema is versioned using semantic versioning, and the observatory maintains backward compatibility across minor version increments. Breaking changes require a major version increment and a migration period during which both schema versions are accepted. The complete schema is defined as a JSON Schema document and maintained in the observatory’s public schema repository.

The agent_identity field captures the information needed to uniquely identify the agent instance that generated the event. It includes an agent_id sub-field containing a UUID assigned at agent instantiation time and persistent across the agent’s operational lifetime; an agent_type sub-field drawn from a controlled vocabulary that includes values such as "orchestrator", "subagent", "retrieval", "code_execution", and "tool_wrapper"; an owning_org sub-field containing the organizational identifier of the entity responsible for the agent’s deployment; and a version sub-field, an object recording the agent framework version and model version in use. The owning_org field is pseudonymized before transmission to the observatory unless the submitting organization has explicitly opted into identified reporting, as described in Section 7.
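
One common way to pseudonymize an identifier such as owning_org is a keyed hash. The sketch below uses HMAC-SHA256 from the Python standard library. It is only an illustration: this section does not state which pseudonymization method the observatory uses, and the function name, the "org-" prefix, and the 16-character truncation are assumptions.

```python
import hashlib
import hmac


def pseudonymize_org(org_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym for org_id using a key held only by the submitter."""
    digest = hmac.new(secret_key, org_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncation keeps the value short; it is an illustrative choice, not a requirement.
    return f"org-{digest[:16]}"
```

Because the key never leaves the submitting organization, the same organization always maps to the same pseudonym (which preserves cross-event correlation), while the observatory cannot reverse the mapping by hashing candidate names.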

The action_taken field describes the action that generated the event. Its type sub-field is drawn from a controlled vocabulary that aligns with MITRE ATLAS tactic categories, including values such as "tool_invocation", "delegation", "memory_write", "credential_access", "external_api_call", "code_execution", and "data_read". The description sub-field contains a natural-language description of the action, subject to length limits and redaction of sensitive content. The intent sub-field captures the agent’s declared or inferred intent for the action, expressed as a short string; for actions generated by language model reasoning, this field is populated from the model’s intermediate reasoning output when that output is available. The intent field is explicitly probabilistic: the ATE schema includes an intent_confidence sub-field expressed as a float between 0 and 1, acknowledging that intent attribution in agentic systems is inherently uncertain.

The tools_invoked field is an array of tool call objects, each capturing a single tool invocation. Each tool call object includes the tool_name, tool_version, server_id (the identifier of the MCP server or tool provider), parameters (a JSON object of call parameters, subject to redaction), and result_summary (a structured summary of the tool’s response, not the full response body). The array structure accommodates multi-tool calls in which an agent invokes several tools in a single reasoning step, which is increasingly common in modern agentic architectures.

The permissions_used field captures the authorization context of the action. It includes a scopes array listing the permission scopes exercised; a credential_types array indicating the types of credentials used, such as "oauth_token", "api_key", "service_account", or "delegated_assertion"; and an elevation_from_baseline boolean that indicates whether the permissions used in this event exceed the permissions typically used by this agent in comparable contexts, as determined by the observatory’s behavioral baseline model.

The outcome field captures what the action produced. The status sub-field uses a three-value controlled vocabulary: "success", "failure", and "partial". The side_effects array captures consequential state changes produced by the action—for example, "file_written", "record_deleted", "email_sent", "external_api_modified"—using a controlled vocabulary maintained by the observatory. The error_code sub-field is populated for failure outcomes and maps to the observatory’s error taxonomy.

The anomaly_indicators field is populated by the observatory’s analysis layer rather than by the source collector, making it the one top-level field that is not derived directly from the source event. It contains a deviation_score float representing the degree to which this event deviates from the agent’s learned behavioral baseline, normalized to a 0–100 scale; a rule_matches array identifying any observatory-maintained detection rules that matched this event; and a cluster_id string linking the event to an incident cluster if the analysis layer has grouped it with related events.

The session_context field captures the multi-agent and session-level context in which the event occurred. The session_id sub-field is a UUID identifying the agent session; the parent_agent_id sub-field identifies the orchestrating agent if this agent is a subagent; the delegation_chain array captures the full chain of agent identifiers from the original human principal through each intermediate agent to the current agent, in order; and the delegation_depth integer captures the number of delegation hops from the human principal, which is an input to the observatory’s risk scoring model. Research has documented that incidents involving deep delegation chains are systematically harder to investigate and attribute, making delegation depth a meaningful risk signal even in the absence of other anomaly indicators. [9]
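
The relationship between delegation_chain, delegation_depth, and parent_agent_id can be made concrete with a short consistency check. The sketch assumes that the chain lists the human principal first and the current agent last, so that depth equals the chain length minus one, and that a subagent's parent is the entry immediately before it. Both conventions are readings of the description above rather than normative rules, and the function names are hypothetical.

```python
def delegation_depth(chain: list[str]) -> int:
    """Hops from the human principal to the current agent (chain includes both ends)."""
    if not chain:
        raise ValueError("delegation_chain must contain at least the principal")
    return len(chain) - 1


def check_session_context(ctx: dict) -> list[str]:
    """Return a list of inconsistencies between the chain and the derived fields."""
    problems = []
    chain = ctx.get("delegation_chain", [])
    if chain and ctx.get("delegation_depth") != delegation_depth(chain):
        problems.append("delegation_depth does not match delegation_chain length")
    parent = ctx.get("parent_agent_id")
    # Assumption: a parent exists only when at least one agent sits between
    # the principal and the current agent, and it is the penultimate entry.
    if parent is not None and (len(chain) < 3 or chain[-2] != parent):
        problems.append("parent_agent_id is not the penultimate chain entry")
    return problems
```

For a chain of principal, orchestrator, and subagent, the derived depth is 2 and the expected parent is the orchestrator.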

The complete JSON Schema for ATE version 1.0 is as follows:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://observatory.csai.org/schemas/ate/v1.0.0",
  "title": "AgenticTelemetryEvent",
  "type": "object",
  "required": [
    "ate_version", "event_id", "timestamp",
    "agent_identity", "action_taken", "tools_invoked",
    "permissions_used", "outcome", "anomaly_indicators",
    "session_context"
  ],
  "properties": {
    "ate_version": { "type": "string", "const": "1.0.0" },
    "event_id": { "type": "string", "format": "uuid" },
    "timestamp": { "type": "string", "format": "date-time" },
    "source_type": {
      "type": "string",
      "enum": ["riskscan", "mcp_log", "deployment_log", "community_report", "runtime_telemetry"]
    },
    "agent_identity": {
      "type": "object",
      "required": ["agent_id", "agent_type", "owning_org", "version"],
      "properties": {
        "agent_id":    { "type": "string", "format": "uuid" },
        "agent_type":  { "type": "string",
                         "enum": ["orchestrator", "subagent", "retrieval",
                                  "code_execution", "tool_wrapper", "unknown"] },
        "owning_org":  { "type": "string", "description": "Pseudonymized org identifier" },
        "version":     {
          "type": "object",
          "properties": {
            "framework": { "type": "string" },
            "model":     { "type": "string" }
          }
        }
      }
    },
    "action_taken": {
      "type": "object",
      "required": ["type", "description"],
      "properties": {
        "type": {
          "type": "string",
          "enum": ["tool_invocation", "delegation", "memory_write", "credential_access",
                   "external_api_call", "code_execution", "data_read", "other"]
        },
        "description":        { "type": "string", "maxLength": 500 },
        "intent":             { "type": "string", "maxLength": 200 },
        "intent_confidence":  { "type": "number", "minimum": 0, "maximum": 1 }
      }
    },
    "tools_invoked": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["tool_name", "server_id"],
        "properties": {
          "tool_name":      { "type": "string" },
          "tool_version":   { "type": "string" },
          "server_id":      { "type": "string" },
          "parameters":     { "type": "object", "description": "Redacted call parameters" },
          "result_summary": { "type": "string", "maxLength": 300 }
        }
      }
    },
    "permissions_used": {
      "type": "object",
      "properties": {
        "scopes":                { "type": "array", "items": { "type": "string" } },
        "credential_types":      { "type": "array", "items": { "type": "string" } },
        "elevation_from_baseline": { "type": "boolean" }
      }
    },
    "outcome": {
      "type": "object",
      "required": ["status"],
      "properties": {
        "status":      { "type": "string", "enum": ["success", "failure", "partial"] },
        "side_effects": { "type": "array", "items": { "type": "string" } },
        "error_code":  { "type": "string" }
      }
    },
    "anomaly_indicators": {
      "type": "object",
      "properties": {
        "deviation_score": { "type": "number", "minimum": 0, "maximum": 100 },
        "rule_matches":    { "type": "array", "items": { "type": "string" } },
        "cluster_id":      { "type": "string" }
      }
    },
    "session_context": {
      "type": "object",
      "required": ["session_id"],
      "properties": {
        "session_id":       { "type": "string", "format": "uuid" },
        "parent_agent_id":  { "type": "string", "format": "uuid" },
        "delegation_chain": {
          "type": "array",
          "items": { "type": "string", "format": "uuid" }
        },
        "delegation_depth": { "type": "integer", "minimum": 0 }
      }
    }
  }
}
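
To show what a conforming event looks like, the following sketch builds a minimal ATE event and checks it for the required fields listed in the schema above. It is a presence check only, written against the standard library; a real deployment would presumably validate against the full JSON Schema with a schema validator. The helper names and all example values are hypothetical. Note that the schema requires anomaly_indicators but none of its sub-fields, so a collector can plausibly submit an empty object for the analysis layer to fill in.

```python
import uuid
from datetime import datetime, timezone

# Required keys copied from the ATE v1.0.0 schema; "" denotes the top level.
REQUIRED = {
    "": ["ate_version", "event_id", "timestamp", "agent_identity", "action_taken",
         "tools_invoked", "permissions_used", "outcome", "anomaly_indicators",
         "session_context"],
    "agent_identity": ["agent_id", "agent_type", "owning_org", "version"],
    "action_taken": ["type", "description"],
    "outcome": ["status"],
    "session_context": ["session_id"],
}


def missing_required(event: dict) -> list[str]:
    """List dotted paths of required fields absent from event (presence check only)."""
    missing = [k for k in REQUIRED[""] if k not in event]
    for section, keys in REQUIRED.items():
        if section and isinstance(event.get(section), dict):
            missing += [f"{section}.{k}" for k in keys if k not in event[section]]
    for i, call in enumerate(event.get("tools_invoked", [])):
        missing += [f"tools_invoked[{i}].{k}"
                    for k in ("tool_name", "server_id") if k not in call]
    return missing


def example_event() -> dict:
    """Build a minimal event as a collector might emit it (all values illustrative)."""
    return {
        "ate_version": "1.0.0",
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source_type": "mcp_log",
        "agent_identity": {
            "agent_id": str(uuid.uuid4()),
            "agent_type": "subagent",
            "owning_org": "org-0123456789abcdef",
            "version": {"framework": "example-framework 0.1", "model": "example-model"},
        },
        "action_taken": {"type": "tool_invocation", "description": "Queried a ticket system"},
        "tools_invoked": [{"tool_name": "ticket_search", "server_id": "mcp-example"}],
        "permissions_used": {"scopes": ["tickets:read"], "credential_types": ["oauth_token"]},
        "outcome": {"status": "success", "side_effects": []},
        "anomaly_indicators": {},  # left empty for the analysis layer to populate
        "session_context": {"session_id": str(uuid.uuid4()), "delegation_depth": 1},
    }
```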

The schema’s controlled vocabularies—particularly the action_taken.type enum and the side_effects array values—are maintained by the observatory’s schema working group and published as separate normative documents that can be updated independently of the core schema. This design allows the vocabulary to grow in response to new threat patterns without requiring a major version increment of the schema itself.

4. AI-Assisted Processing Layer

The observatory’s AI-assisted processing layer sits between ingestion and output. It receives normalized ATE events, performs behavioral baseline maintenance and deviation scoring, executes incident clustering, generates trend forecasts, and identifies emerging risk patterns. The layer is itself an AI system with significant autonomy over observatory outputs, which creates a governance requirement that is addressed explicitly at the end of this section: the AI processing layer must be governed and secured according to the same standards it applies to the agentic systems it monitors.

Behavioral Baseline Maintenance and Deviation Scoring. For each agent identity tracked by the observatory, the processing layer maintains a behavioral baseline model derived from that agent’s historical ATE events. The baseline captures distributional statistics over the agent’s typical action types, tool invocation patterns, permission scope usage, outcome rates, and session context characteristics. Deviation scoring compares an incoming event against the baseline and produces the anomaly_indicators.deviation_score value that is written back into the event before it is stored or routed to downstream outputs. The baseline model uses an exponential decay function that weights recent events more heavily than older ones. This allows it to adapt to legitimate changes in agent behavior over time (such as a model version upgrade that changes the agent’s tool call patterns) while still flagging abrupt deviations that are more likely to indicate an incident. Agents with fewer than 1,000 historical events receive a prior-informed baseline derived from the anonymized aggregate of similar agent types within the observatory corpus; events scored against such a baseline are clearly flagged in the anomaly_indicators metadata as using a synthetic baseline.
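
The exponential decay idea can be sketched for a single baseline dimension, the distribution of action types. The class below is a deliberately simplified illustration: the decay factor, the scoring formula (100 times one minus the decayed share of the observed action type), and all names are assumptions made for this example, not the observatory's actual model, which covers several more dimensions.

```python
from collections import defaultdict


class ActionBaseline:
    """Exponentially decayed frequency of action types for one agent."""

    def __init__(self, decay: float = 0.99, min_events: int = 1000):
        self.decay = decay            # multiplier applied to old weight per new event
        self.min_events = min_events  # below this, a prior-informed baseline is needed
        self.weights = defaultdict(float)
        self.events_seen = 0

    def update(self, action_type: str) -> None:
        """Age every existing weight, then credit the newly observed action type."""
        for key in self.weights:
            self.weights[key] *= self.decay
        self.weights[action_type] += 1.0
        self.events_seen += 1

    def deviation_score(self, action_type: str) -> float:
        """Score on a 0-100 scale; rarely seen action types score close to 100."""
        total = sum(self.weights.values())
        if total == 0:
            return 100.0
        share = self.weights.get(action_type, 0.0) / total
        return round(100.0 * (1.0 - share), 2)

    @property
    def needs_prior(self) -> bool:
        """True while the agent has too little history for its own baseline."""
        return self.events_seen < self.min_events
```

With a decay factor below 1, an agent that changes behavior after an upgrade sees its scores for the new pattern fall steadily as the old pattern's weight shrinks, which is the adaptation property described above.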

Incident Clustering Methodology. Individual high-deviation-score events are useful signals, but many of the most significant agentic incidents are not represented by a single anomalous event. They are represented by a cluster of individually marginal events that share a causal pattern. The observatory’s clustering algorithm operates on a rolling 72-hour event window and applies a two-stage grouping approach. The first stage performs session-aware grouping, using the session_context fields to identify events that share session identifiers or delegation chain members. The second stage applies semantic similarity clustering to the action_taken.description and action_taken.intent fields using an embedding model, grouping events from different sessions that describe structurally similar actions across different agent instances. A cluster qualifies for investigation when it contains at least three events, has a mean deviation score above 40, and exhibits either a common rule match or a semantic similarity score above the cluster quality threshold. Qualified clusters are assigned cluster_id values and surfaced to the output layer for dashboard display and alert generation.
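
The qualification rule in the paragraph above can be expressed as a small predicate. In this sketch, "a common rule match" is read as at least one detection rule matched by every event in the cluster, and the similarity threshold of 0.8 is a placeholder because the cluster quality threshold is not given a value here. The dataclass and its field names are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ClusterCandidate:
    deviation_scores: list[float]  # one score per event in the cluster
    rule_matches: list[set[str]]   # matched rule identifiers, one set per event
    similarity: float              # semantic similarity score for the cluster


def qualifies(c: ClusterCandidate, similarity_threshold: float = 0.8) -> bool:
    """Apply the three qualification criteria: size, mean deviation, shared signal."""
    if len(c.deviation_scores) < 3:
        return False
    if mean(c.deviation_scores) <= 40:
        return False
    # Rules matched by every event in the cluster (empty if there are no events).
    common_rules = set.intersection(*c.rule_matches) if c.rule_matches else set()
    return bool(common_rules) or c.similarity > similarity_threshold
```

Under this reading, three events scoring 35, 45, and 50 that all match the same rule qualify even though no single event is strongly anomalous.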

Trend Detection Algorithms. The trend detection component operates on the full corpus of stored ATE events rather than the rolling window used for clustering. It applies time-series decomposition to the aggregate rate of specific event types, rule match patterns, and UAIR-classified risk categories, separating secular trends from cyclical patterns and identifying statistically significant inflection points. A trend alert is generated when the rate of a monitored event category exceeds its predicted value by more than two standard deviations for three consecutive weekly periods. Trend detection results feed directly into the weekly risk digest described in Section 5.
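
The trend alert rule reduces to a run-length test over weekly values. The sketch below assumes that the predicted values come from the time-series decomposition and that sigma is the standard deviation of the forecast residuals; the text does not say which standard deviation is meant, so that reading is an assumption, as are the function and parameter names.

```python
from typing import Sequence


def trend_alert(observed: Sequence[float], predicted: Sequence[float],
                sigma: float, k: float = 2.0, run: int = 3) -> bool:
    """True if observed exceeds predicted by more than k*sigma for `run` consecutive weeks."""
    streak = 0
    for obs, pred in zip(observed, predicted):
        # A week that falls back within bounds resets the streak.
        streak = streak + 1 if obs > pred + k * sigma else 0
        if streak >= run:
            return True
    return False
```

Requiring consecutive exceedances trades detection latency for fewer false alarms: a single noisy week never triggers an alert, but a genuine shift is confirmed only after three weeks.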

Emerging Risk Pattern Forecasting. Pattern forecasting extends trend detection to identify risk categories that are not yet represented in significant volumes but whose early signals are consistent with the precursors of previously observed incident categories. The forecasting model is trained on the historical sequence of observatory events in the months preceding documented agentic incidents, and it uses that historical pattern data to assign probability scores to emerging risk categories. A forecast with probability above 0.65 for a novel risk pattern triggers the observatory’s emerging risk review workflow, in which a human analyst reviews the underlying events before any external communication is published. This human-in-the-loop requirement for emerging risk pattern forecasts is a deliberate governance control, not an operational efficiency choice, and is maintained even when the volume of candidate forecasts creates backlog pressure.

Governance and Security of the AI Processing Layer. Because the processing layer is itself an AI system with significant autonomy over risk intelligence that will influence enterprise security decisions, it is subject to explicit accountability controls. The processing layer’s models are retrained on a monthly schedule, with human review of performance metrics before any updated model is promoted to production. All model predictions that flow into output artifacts (deviation scores, cluster assignments, trend alerts, and emerging risk forecasts) are stored with the model version and confidence score that produced them, enabling retrospective audit of how any published risk finding was derived. The processing layer’s own action logs are ingested into the observatory as a separate event stream, applying the same ATE schema with agent_type: "tool_wrapper" and owning_org: "csai-observatory", so that anomalous processing layer behavior is subject to the same detection logic as any other agent in the system. Adversarial testing of the processing layer, including attempts to inject misleading events to manipulate cluster assignments and trend scores, is conducted quarterly by the observatory’s red team. [10]

5. Output Artifacts

The observatory produces three categories of output artifacts: real-time dashboards, weekly risk digests, and automated alerts. Each artifact category is designed for a distinct audience with distinct operational needs, and the access controls governing them are calibrated to balance transparency with the confidentiality requirements of contributing organizations.

Real-Time Dashboards. The observatory’s operational dashboard provides a live view of the agentic risk landscape as seen by the current event corpus. The primary panel displays a rolling 24-hour plot of event volume broken down by source type and action category, with deviation score distribution overlaid to give instant visibility into whether current activity is within normal bounds. A secondary panel displays active incident clusters—those clusters that have qualified for investigation and not yet been resolved—with a heat map representation of the MAESTRO layer taxonomy showing which layers are implicated in the most active clusters. A third panel displays the current state of UAIR records: identifiers that have been issued, their confidence scores, and their lifecycle status from initial triage through validated risk to archived record.

Access to the real-time dashboard is tiered. A public-facing summary view is available to any registered observatory member and displays aggregate statistics without organization-specific or event-specific detail. An analyst view, available to organizations that contribute telemetry to the observatory, provides access to events and clusters associated with their own agent deployments. A full operational view is restricted to CSAI Foundation staff and designated observatory operators, and provides access to the complete event corpus and processing layer diagnostics.

Weekly Risk Digests. The weekly digest is the observatory’s primary medium for communicating risk intelligence to the broader practitioner community. Each digest covers the seven-day period ending at midnight UTC on the preceding Friday. Its structure is fixed to enable comparison across editions: an executive summary paragraph describing the week’s most significant developments; a trend metrics table showing week-over-week changes in event volume, cluster count, and UAIR issuance rate; a feature section examining one or two risk patterns in depth; a UAIR listing of all new identifiers issued during the period; and a framework alignment note connecting the week’s observations to relevant AICM control domains and MITRE ATLAS techniques.
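
The digest's coverage window can be computed with simple date arithmetic. The sketch assumes that "midnight UTC on the preceding Friday" means 00:00 UTC at the start of the most recent Friday on or before the publication time; if the intended boundary is the end of Friday, the window shifts by one day. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def digest_period(published: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) of the seven-day window covered by a digest.

    `published` must be timezone-aware; it is converted to UTC first.
    """
    published = published.astimezone(timezone.utc)
    midnight = published.replace(hour=0, minute=0, second=0, microsecond=0)
    days_since_friday = (midnight.weekday() - 4) % 7  # Monday is 0, Friday is 4
    end = midnight - timedelta(days=days_since_friday)
    return end - timedelta(days=7), end
```

For a digest published on Monday 2026-03-30, this yields a window from 2026-03-20 00:00 UTC to 2026-03-27 00:00 UTC.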

The digest is distributed to registered observatory members by email and published in the observatory’s public archive with a two-week delay. The delay allows contributing organizations to review whether their submitted data has been adequately anonymized before public publication. Organizations can request removal of specific events from the public corpus through the observatory’s data rights process, subject to the limitations described in Section 7.

Automated Alerts for Novel Vulnerability Patterns. Automated alerts are the observatory’s highest-urgency output channel. An automated alert is generated when any of four trigger conditions is met: a new incident cluster achieves a severity score above 80 and is assigned an initial UAIR record; a trend detection event identifies a statistically significant rate increase in a high-severity event category; an emerging risk forecast exceeds the 0.65 probability threshold and is confirmed by human analyst review; or a UAIR record’s confidence level is upgraded from probable to confirmed following validation. Alerts are distributed through three channels: a webhook API that allows registered organizations to integrate observatory alerts into their SIEM or SOAR platforms; an email distribution list with daily digest batching for lower-urgency alerts; and an RSS feed updated in real time for high-urgency alerts that are classified as time-sensitive. All automated alerts include the UAIR identifier associated with the triggering event, enabling recipients to query the observatory’s UAIR API for full record details.
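The four trigger conditions can be expressed as a single predicate. The following is a minimal sketch, assuming a hypothetical `AlertSignal` structure that summarizes one notification from the processing layer; the two thresholds come from the text, while every field name is an illustrative assumption.

```python
from dataclasses import dataclass
from typing import Optional

SEVERITY_THRESHOLD = 80      # "severity score above 80"
FORECAST_THRESHOLD = 0.65    # "exceeds the 0.65 probability threshold"


@dataclass
class AlertSignal:
    """Hypothetical summary of one processing-layer notification."""
    kind: str                              # "cluster", "trend", "forecast", "confidence_change"
    uair_id: Optional[str] = None
    severity: float = 0.0
    significant_increase: bool = False     # output of the trend detection test
    high_severity_category: bool = False
    forecast_probability: float = 0.0
    analyst_confirmed: bool = False
    old_level: Optional[str] = None
    new_level: Optional[str] = None


def should_alert(sig: AlertSignal) -> bool:
    """Return True if the signal meets one of the four trigger conditions."""
    if sig.kind == "cluster":
        return sig.severity > SEVERITY_THRESHOLD and sig.uair_id is not None
    if sig.kind == "trend":
        return sig.significant_increase and sig.high_severity_category
    if sig.kind == "forecast":
        return sig.forecast_probability > FORECAST_THRESHOLD and sig.analyst_confirmed
    if sig.kind == "confidence_change":
        return sig.old_level == "probable" and sig.new_level == "confirmed"
    return False
```

Note that the forecast trigger is gated on analyst confirmation, so a forecast above the threshold does not alert on its own.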

6. Unified AI Risk Record Identifier Scheme

The Common Vulnerabilities and Exposures (CVE) program has provided the security community with an indispensable shared vocabulary for discrete, patchable software vulnerabilities since 1999. It was designed for a world in which vulnerabilities are properties of specific software components at specific versions, and it has served that purpose effectively. Agentic AI systems introduce a class of risk that does not fit the CVE model: behavioral risks arising from the interaction between agent architecture, training data, model alignment, tool configuration, and deployment context, which are not patchable in the CVE sense and which may manifest probabilistically rather than deterministically. [11]

The Unified AI Risk Record (UAIR) identifier scheme is the observatory’s complement to CVE for this class of risk. A UAIR identifier is assigned to a risk pattern—not a vulnerability in a specific component—when the observatory has collected sufficient evidence from the event corpus to characterize it with a defined minimum confidence level. The identifier format is:

UAIR-[YEAR]-[SEQUENCE]-[RISK_CLASS]

where [YEAR] is the four-digit year in which the identifier was first issued, [SEQUENCE] is a five-digit zero-padded sequence number unique within the year, and [RISK_CLASS] is a two-letter code drawn from the observatory’s risk class taxonomy: AG (agentic goal manipulation), DP (delegation and privilege), TC (tool chain exploitation), MB (memory and context), SC (supply chain), BV (behavioral violation), and MS (multi-system cascade). An example identifier is UAIR-2026-00047-TC, which would designate a tool chain exploitation risk pattern carrying sequence number 47 among the identifiers issued in 2026. Because the sequence is unique within the year rather than within a risk class, the number does not count tool chain exploitation patterns alone.
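A short sketch of how a consumer might validate and construct identifiers in this format follows. The regular expression and the risk class codes are taken from the format description above; the function and type names are hypothetical, and the accepted sequence range (0 to 99999) is an assumption, since the text does not say whether numbering starts at 0 or 1.

```python
import re
from typing import NamedTuple

RISK_CLASSES = {
    "AG": "agentic goal manipulation",
    "DP": "delegation and privilege",
    "TC": "tool chain exploitation",
    "MB": "memory and context",
    "SC": "supply chain",
    "BV": "behavioral violation",
    "MS": "multi-system cascade",
}

# UAIR-[YEAR]-[SEQUENCE]-[RISK_CLASS]: 4-digit year, 5-digit sequence, 2-letter class
_UAIR_RE = re.compile(r"UAIR-([0-9]{4})-([0-9]{5})-([A-Z]{2})")


class UairId(NamedTuple):
    year: int
    sequence: int
    risk_class: str


def parse_uair(identifier: str) -> UairId:
    """Parse an identifier, rejecting malformed strings and unknown classes."""
    match = _UAIR_RE.fullmatch(identifier)
    if match is None or match.group(3) not in RISK_CLASSES:
        raise ValueError(f"not a valid UAIR identifier: {identifier!r}")
    return UairId(int(match.group(1)), int(match.group(2)), match.group(3))


def format_uair(year: int, sequence: int, risk_class: str) -> str:
    """Build an identifier string with a zero-padded five-digit sequence."""
    if risk_class not in RISK_CLASSES:
        raise ValueError(f"unknown risk class: {risk_class!r}")
    if not 0 <= sequence <= 99999:   # assumed range; the lower bound is not specified
        raise ValueError("sequence must fit in five digits")
    return f"UAIR-{year:04d}-{sequence:05d}-{risk_class}"
```

For example, `parse_uair("UAIR-2026-00047-TC")` yields year 2026, sequence 47, and class `TC`, while a string that omits the sequence is rejected.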

Each UAIR record contains a structured set of fields maintained by the observatory and updated as new evidence accumulates. The description field provides a natural-language characterization of the risk pattern. The affected_architectures field lists the agent framework types and deployment patterns associated with the risk. The evidence_events field contains a count and summary of the ATE events that contributed to the record’s confidence assessment, with direct event identifiers accessible to contributing organizations for their own deployments. The atlas_techniques field maps the risk pattern to the relevant MITRE ATLAS technique identifiers. The aicm_controls field identifies the AICM control domains whose implementation would mitigate the risk. The mitigation_guidance field contains the observatory’s current best-practice recommendations for addressing the pattern.

Confidence scoring is a first-class property of the UAIR scheme, reflecting the inherently probabilistic nature of agentic risk assessment. Each UAIR record carries one of four confidence levels: indicative (early-signal events suggest a pattern but evidence is limited), probable (a qualified incident cluster with multiple contributing events meets the issuance threshold), confirmed (the pattern has been validated through independent incident reports or structured analysis), and superseded (the record has been merged into or replaced by another UAIR). Confidence levels are not permanent: a record may be downgraded from probable to indicative if the contributing cluster is subsequently explained by a benign cause, or upgraded from probable to confirmed when additional evidence accrues. Every confidence-level change is logged with a timestamp, the identifier of the analyst or processing layer component that made the change, and the evidence that justified the change.
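The confidence lifecycle and its audit trail can be sketched as a small state machine. Only two transitions are stated explicitly in the text (probable to indicative, and probable to confirmed); the promotion from indicative to probable and the rule that any live record may become superseded are assumptions made here for illustration, as are all class and field names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Confidence(Enum):
    INDICATIVE = "indicative"
    PROBABLE = "probable"
    CONFIRMED = "confirmed"
    SUPERSEDED = "superseded"


ALLOWED = {
    (Confidence.PROBABLE, Confidence.INDICATIVE),  # downgrade described in the text
    (Confidence.PROBABLE, Confidence.CONFIRMED),   # upgrade described in the text
    (Confidence.INDICATIVE, Confidence.PROBABLE),  # assumption: promotion on new evidence
}


@dataclass(frozen=True)
class ConfidenceChange:
    timestamp: datetime
    changed_by: str      # analyst or processing-layer component identifier
    old: Confidence
    new: Confidence
    evidence: str        # justification for the change


@dataclass
class UairConfidence:
    level: Confidence
    history: list = field(default_factory=list)

    def change(self, new: Confidence, changed_by: str, evidence: str) -> None:
        """Apply a permitted transition and append it to the audit log."""
        to_superseded = (new is Confidence.SUPERSEDED
                         and self.level is not Confidence.SUPERSEDED)  # assumption
        if (self.level, new) not in ALLOWED and not to_superseded:
            raise ValueError(f"transition {self.level.value} -> {new.value} not permitted")
        self.history.append(ConfidenceChange(
            datetime.now(timezone.utc), changed_by, self.level, new, evidence))
        self.level = new
```

Keeping the history as immutable entries mirrors the requirement that every change carry a timestamp, an actor, and the supporting evidence.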

Telemetry-driven updates are the mechanism by which UAIR records remain current in a rapidly evolving threat landscape. When new ATE events match the evidence profile of an existing UAIR record—as determined by the processing layer’s cluster association logic—the record’s evidence_events count is updated automatically. When new events suggest that an existing UAIR record’s mitigation guidance is incomplete or that the risk pattern has expanded to new architectural contexts, a human analyst review is triggered before the record is updated. This combination of automated evidence accumulation and human-gated content changes reflects the observatory’s governance model: the AI processing layer handles volume, and humans handle judgment.
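The split between automated evidence accumulation and human-gated content changes might look like the sketch below. The `UairRecord` fields loosely follow the record fields named in this section, but the `pending_review` queue, the `guidance_gap` flag, and the function name are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class UairRecord:
    uair_id: str
    evidence_event_count: int = 0
    affected_architectures: set = field(default_factory=set)
    mitigation_guidance: str = ""
    pending_review: list = field(default_factory=list)  # hypothetical analyst queue


def apply_matched_event(record: UairRecord, event_architecture: str,
                        guidance_gap: bool = False) -> bool:
    """Count the event automatically; queue content changes for an analyst.

    Returns True if a human review was queued.
    """
    record.evidence_event_count += 1          # automated, no review needed
    reasons = []
    if event_architecture not in record.affected_architectures:
        reasons.append("new_architecture")    # pattern may have expanded
    if guidance_gap:
        reasons.append("guidance_incomplete")
    if reasons:
        # Content fields are deliberately NOT modified here; an analyst decides.
        record.pending_review.append(
            {"architecture": event_architecture, "reasons": reasons})
    return bool(reasons)
```

The function never writes to `affected_architectures` or `mitigation_guidance`, which is the point of the governance model: volume is handled automatically, judgment is not.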

The relationship between UAIR identifiers and the CVE program is one of complementarity rather than competition. Many agentic incidents involve both a UAIR-trackable behavioral risk pattern and one or more CVEs in specific software components. For example, the tool chain exploitation patterns documented by the observatory in connection with OpenClaw’s CVE-2026-25253 would carry both the CVE identifier for the specific authentication flaw and a UAIR identifier in the TC risk class for the broader behavioral exploitation pattern that the CVE enabled. The observatory’s UAIR records explicitly reference associated CVE identifiers in a related_cves field, and the observatory publishes guidance on how to use the two identifier schemes together in the context of an incident investigation. Similarly, the UAIR scheme is designed to be compatible with emerging efforts in the OWASP Agentic Security Initiative and the AI Vulnerability Database to create shared vocabularies for agentic risk, with the expectation that identifier namespaces may be harmonized through standards processes as the field matures. [12]

7. Privacy and Data Handling

The observatory collects data about the behavior of AI systems that, in many cases, involves processing information derived from sensitive human activities. An MCP log capturing a healthcare agent’s tool invocations may contain protected health information in parameter values. A deployment log capturing an HR agent’s actions may contain employee data. A community incident report submitted by a financial services organization may inadvertently include customer account details. The observatory’s data handling architecture is designed around the principle that risk intelligence does not require retaining sensitive personal or organizational data, and that the intelligence value of any event record is concentrated in its structural and behavioral properties rather than in the content of the data it processed.

The observatory applies a mandatory redaction pipeline to all ingested events before they are stored in the event corpus. Parameter values in tools_invoked.parameters are processed by a content classifier that identifies and removes values matching patterns associated with personal information (names, email addresses, phone numbers, account numbers), authentication material (API keys, tokens, passwords), and proprietary organizational data (internal project names, system identifiers). Redacted fields are replaced with typed placeholders—for example, "[REDACTED: email_address]" or "[REDACTED: api_key]"—that preserve the structural information about what type of data was present without retaining the data itself. The redaction pipeline is run at the collection agent, before any event data leaves the contributing organization’s environment, so that sensitive content never transits the network to the observatory.
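To make the typed-placeholder idea concrete, the sketch below substitutes simple regular expressions for the content classifier described above. This is a deliberate simplification: the two patterns (and especially the assumed `sk_`/`key_`/`tok_` key prefix) are illustrative stand-ins, not the observatory's detection logic. Only the placeholder format is taken from the text.

```python
import re

# Illustrative patterns only; a production classifier would be far broader.
_PATTERNS = [
    ("api_key", re.compile(r"\b(?:sk|key|tok)_[A-Za-z0-9]{16,}\b")),  # assumed prefixes
    ("email_address", re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")),
]


def redact_value(value: str) -> str:
    """Replace each sensitive match with a typed placeholder."""
    for label, pattern in _PATTERNS:
        value = pattern.sub(f"[REDACTED: {label}]", value)
    return value


def redact_parameters(parameters: dict) -> dict:
    """Redact string values of a tools_invoked.parameters mapping.

    Non-string values pass through unchanged (an assumption of this sketch).
    """
    return {key: redact_value(val) if isinstance(val, str) else val
            for key, val in parameters.items()}
```

A call such as `redact_parameters({"to": "alice@example.com", "limit": 5})` keeps the parameter names and the fact that an email address was present, while the address itself never leaves the contributing environment.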

Data minimization extends beyond redaction to schema design. The ATE schema was deliberately designed to collect behavioral signals rather than content. The action_taken.description field has a 500-character limit and is subject to content filtering. The tools_invoked.result_summary field captures a structured summary rather than the full response body. The outcome.side_effects field uses a controlled vocabulary of effect categories rather than free-form description of what was affected. These design choices reflect a principle articulated in the EU AI Act’s Article 12 guidance: logging capabilities for high-risk AI systems should record what is necessary to identify risk-relevant events without creating a comprehensive surveillance record of system operation. [13]

Consent and contribution are based on an explicit organizational opt-in model. Organizations must register with the observatory and accept the data contribution agreement before any telemetry collection is enabled in their environment. The collection agent cannot be activated without a valid, current registration token. The data contribution agreement specifies what data categories will be collected, how they will be processed and retained, and what rights the contributing organization has over its submitted data. Contributing organizations may submit telemetry under identified reporting, in which their organization identifier is associated with their submitted events and they receive full access to those events through the analyst dashboard, or under pseudonymized reporting, in which their organization identifier is replaced with a stable pseudonym before storage and they lose the ability to query their own events by organization.
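The text requires only that the pseudonym be stable; it does not say how it is derived. One common way to obtain a stable, non-reversible pseudonym is a keyed hash, sketched below with HMAC-SHA256. The key handling, the `org-` prefix, and the 16-character truncation are all assumptions of this example.

```python
import hashlib
import hmac


def pseudonymize_org(org_id: str, secret_key: bytes) -> str:
    """Derive a stable pseudonym for an organization identifier.

    The same (org_id, secret_key) pair always yields the same pseudonym,
    so events from one contributor can still be correlated with each other.
    """
    digest = hmac.new(secret_key, org_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"org-{digest[:16]}"   # truncated for readability (assumption)
```

Using a keyed hash rather than a plain hash means that someone who knows a candidate organization name cannot confirm it without the key, which is consistent with contributors losing the ability to query their own events by organization.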

Retention periods are calibrated to the minimum necessary for the processing layer’s analytical functions. Raw ATE events are retained for 90 days at full fidelity, after which they are aggregated into statistical summaries and the individual event records are deleted. Incident cluster records—which do not contain individual event payloads but do reference event identifiers—are retained for two years. UAIR records are retained indefinitely, as they constitute the observatory’s institutional knowledge of historical risk patterns and are not event-level records. Contributing organizations may request deletion of their submitted events within the 90-day retention window through the observatory’s data rights portal, subject to the constraint that events that have been incorporated into a confirmed UAIR record are retained in anonymized form because their removal would degrade the integrity of the risk record.
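The retention rules and the deletion-request constraint can be summarized in a few lines. This is a sketch under stated assumptions: the record type names and outcome strings are hypothetical, two years is approximated as 730 days, and an event aged exactly 90 days is treated as still inside the window.

```python
from datetime import timedelta

# Retention per record type; None means retained indefinitely.
RETENTION = {
    "raw_event": timedelta(days=90),
    "incident_cluster": timedelta(days=730),  # "two years", approximated
    "uair_record": None,
}


def deletion_request_outcome(event_age: timedelta, in_confirmed_uair: bool) -> str:
    """Decide how a contributor's deletion request for one event is handled."""
    if event_age > RETENTION["raw_event"]:
        return "already_aggregated"   # individual record no longer exists
    if in_confirmed_uair:
        return "retain_anonymized"    # kept to preserve the risk record's integrity
    return "delete"
```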

Jurisdictional considerations are addressed through a data residency architecture that allows contributing organizations to direct their event data to regional observatory nodes. The observatory operates primary nodes in the European Union and the United States, with data residency boundaries enforced at the collection agent level. EU-resident events are processed exclusively on EU infrastructure and are never replicated to non-EU nodes. This is intended to keep any personal data that remains in event records despite the redaction pipeline outside the scope of the third-country transfer restrictions in GDPR Article 44. Organizations in regulated industries (healthcare, financial services, critical infrastructure) are advised to conduct their own data classification review of the ATE fields that will be populated by their agent deployments before activating telemetry collection.

8. Architecture Diagram Description

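The observatory’s data architecture can be understood as a stack of six horizontal tiers (data sources, collection agents, ingest, processing, storage, and output) with a vertical UAIR management pipeline running alongside the upper tiers. It is described here as if tracing the path of a single event from origin to output.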

At the bottom of the architecture are the five data source categories, each represented as a distinct collection zone. On the left edge of the source layer sits the RiskRubric scan engine, which operates on a scheduled cadence and produces batch finding documents. Adjacent to it is the MCP behavioral log collector, which maintains persistent connections to MCP servers and captures a continuous stream of tool interaction events. In the center of the source layer are the enterprise deployment log connectors, one per supported agent orchestration platform, each implementing a platform-specific ingestion adapter. To their right is the NemoClaw OpenShell telemetry receiver, which accepts the high-frequency syscall and policy event stream from instrumented agent runtimes. At the far right of the source layer is the community incident portal, an asynchronous web interface through which practitioners submit structured incident reports.

From each collection zone, events flow upward into the collection agent tier—a distributed set of lightweight agents deployed at or near the source that perform three functions before any data leaves the source environment: schema normalization (translating the source’s native event format into the ATE structure), content redaction (applying the mandatory PII and secret-material redaction pipeline), and transmission (forwarding the normalized, redacted ATE event to the observatory’s ingest endpoint over a mutually authenticated TLS connection). The collection agent tier is the only component in the architecture that has access to unredacted event data, and it is explicitly designed to be deployable within the contributing organization’s security perimeter.

Above the collection agent tier is the ingest layer, which receives ATE events from collection agents worldwide, validates them against the ATE JSON Schema, assigns event identifiers, and routes them to the processing layer’s event bus. The ingest layer implements back-pressure and rate limiting to prevent high-volume runtime telemetry sources from starving the processing capacity available to community incident reports. A deduplication filter identifies and discards duplicate submissions of the same event—which can arise when a contributing organization’s collection agent retransmits events after a transient network failure—by comparing event content hashes against a 24-hour deduplication window.
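The deduplication filter can be sketched as a hash table with time-based eviction. The example below is a minimal in-memory version under stated assumptions: the event is hashed as a canonical JSON serialization of the payload as received (before the ingest layer assigns an event identifier), and the class and method names are hypothetical. A production filter would need a shared, persistent store.

```python
import hashlib
import json
from datetime import datetime, timedelta


class Deduplicator:
    """Discard events whose content hash was already seen within the window."""

    def __init__(self, window: timedelta = timedelta(hours=24)):
        self.window = window
        self._seen = {}   # content hash -> time first seen

    @staticmethod
    def content_hash(event: dict) -> str:
        # Sorted keys make the hash independent of field order.
        canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def is_duplicate(self, event: dict, received_at: datetime) -> bool:
        cutoff = received_at - self.window
        # Evict hashes that have aged out of the deduplication window.
        self._seen = {h: t for h, t in self._seen.items() if t > cutoff}
        digest = self.content_hash(event)
        if digest in self._seen:
            return True
        self._seen[digest] = received_at
        return False
```

With this design, a retransmission one hour after the original is dropped, while an identical payload arriving more than 24 hours later is accepted as a new event.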

The processing layer sits at the center of the architecture and is the most computationally intensive component. It receives events from the ingest layer’s event bus and executes four processing pipelines in sequence. The baseline maintenance pipeline updates the behavioral baseline model for the event’s agent_identity and computes the anomaly_indicators.deviation_score. The rule matching pipeline evaluates the event against the observatory’s library of detection rules and populates the anomaly_indicators.rule_matches field. The cluster assignment pipeline queries the active cluster index and either assigns the event to an existing cluster or initiates a new cluster candidate if the event’s characteristics suggest a new pattern. The UAIR correlation pipeline checks whether the event’s characteristics match the evidence profile of any existing indicative or probable UAIR records and, if so, updates those records’ evidence counts. Events exit the processing layer fully annotated with all anomaly_indicators sub-fields populated and any UAIR correlations recorded.

Processed events flow upward into the storage layer, which maintains two stores: the hot store—a time-series database optimized for the 90-day full-fidelity retention window and for the real-time dashboard’s query patterns—and the warm store—an analytical database that holds aggregated statistical summaries beyond the 90-day retention boundary and supports the trend detection and forecasting algorithms’ historical analysis queries.

On the right side of the architecture, rising vertically from the processing layer through the storage layer to the output layer, is the UAIR management pipeline. This pipeline handles the lifecycle of UAIR records: their initial creation by the processing layer’s cluster analysis, their confidence scoring and updates, the human analyst review workflow for content changes, and the API that exposes UAIR records to external consumers. The UAIR pipeline is the only component that writes to the UAIR record store, an append-only datastore kept separate from the event corpus.

At the top of the architecture is the output layer, which queries the storage layer and the UAIR record store to produce the three artifact categories described in Section 5. The real-time dashboard queries the hot store on a 30-second refresh cycle. The weekly digest generation process runs as a scheduled batch job each Saturday, drawing on the preceding seven days of the hot store and the current state of all UAIR records. The automated alert engine is event-driven: it subscribes to notifications from the processing layer’s cluster pipeline and the UAIR management pipeline, generating alert payloads whenever the trigger conditions described in Section 5 are met.

9. Framework Alignment

The observatory’s components align with four external governance and technical frameworks: the CSA AI Controls Matrix (AICM) v1.0, the NIST Artificial Intelligence Risk Management Framework (AI RMF) 1.0, the CSA MAESTRO threat modeling framework, and the EU AI Act’s Article 12 automatic logging requirements. The table below maps each major observatory component to its primary alignment points across all four frameworks.

Understanding the alignment table requires brief context on each framework dimension. The AICM LOG domain comprises control objectives governing the design, implementation, and review of logging and monitoring capabilities for AI systems, making it the primary AICM domain for the observatory overall. [14] The NIST AI RMF’s MEASURE function encompasses the analytical activities that characterize and quantify AI risks, while the MANAGE function covers the actions taken in response to those risks—together they bracket the observatory’s primary operational scope. [15] MAESTRO Layer 5 (Evaluation and Observability) is the layer most directly relevant to telemetry architecture, while Layer 6 (Security and Compliance) captures the governance and regulatory alignment activities. [16] The EU AI Act’s Article 12 imposes on providers of high-risk AI systems a requirement to ensure that their systems can automatically record events relevant to identifying risk conditions and supporting post-market monitoring—an obligation that the observatory’s standardized ATE schema and UAIR scheme are directly designed to satisfy for agentic systems operating in high-risk deployment contexts. [13]

Observatory Component	AICM LOG Controls	NIST AI RMF Function	MAESTRO Layer	EU AI Act Alignment
ATE Schema (Section 3)	LOG-01 (Log Format Standards), LOG-02 (Event Coverage), LOG-05 (Field Completeness)	MEASURE 2.5 (Monitoring scope), MEASURE 2.6 (Testing/evaluation)	Layer 5: Evaluation and Observability	Art. 12(1): automatic recording of relevant events; Art. 12(2)(a): risk identification logging
Data Source Integration (Section 2)	LOG-03 (Source Coverage), LOG-04 (Collection Integrity), STA-06 (Supply Chain Telemetry)	MEASURE 2.7 (AI risk evaluation data), MAP 5.1 (Likelihood determination)	Layer 5: Evaluation and Observability; Layer 7: Agent Ecosystem	Art. 12(2)(b): post-market monitoring support; Art. 72: post-market monitoring obligations
AI-Assisted Processing Layer (Section 4)	LOG-06 (Anomaly Detection), LOG-07 (Automated Analysis), GRC-04 (AI Governance Controls)	MEASURE 2.8 (Risk tracking), MANAGE 1.3 (Response prioritization)	Layer 5: Evaluation and Observability; Layer 6: Security and Compliance	Art. 9(7): risk management system requirements; Art. 14(2): human oversight facilitation
Real-Time Dashboards (Section 5)	LOG-08 (Monitoring Interfaces), LOG-09 (Operational Visibility)	MANAGE 2.2 (Risk communication), MANAGE 3.1 (Risk response implementation)	Layer 6: Security and Compliance	Art. 26(5): deployer monitoring obligations
Weekly Risk Digests (Section 5)	LOG-10 (Periodic Reporting), GRC-07 (Risk Intelligence Sharing)	GOVERN 6.1 (Risk awareness), MANAGE 4.1 (Risk knowledge update)	Layer 6: Security and Compliance; Layer 7: Agent Ecosystem	Art. 9(8): updating risk management measures
Automated Alerts (Section 5)	LOG-11 (Alert Conditions), TVM-05 (Vulnerability Notification)	MANAGE 1.1 (Incident detection), MANAGE 2.4 (Incident response)	Layer 5: Evaluation and Observability	Art. 12(2)(a): risk identification; Art. 73: serious incident reporting
UAIR Identifier Scheme (Section 6)	TVM-01 (Vulnerability Tracking), TVM-06 (Risk Record Management)	MEASURE 2.10 (Risk characterization), MANAGE 4.2 (Risk knowledge management)	Layer 6: Security and Compliance	Art. 9(1): risk management system; Art. 72: post-market monitoring
Privacy and Data Handling (Section 7)	DSP-03 (Data Minimization), DSP-06 (Retention Policies), GRC-08 (Regulatory Compliance)	GOVERN 1.6 (Privacy risk), MAP 5.2 (Privacy impact)	Layer 2: Data Operations; Layer 6: Security and Compliance	Art. 10 (Data governance); GDPR Art. 5 (Data minimization); Art. 12(1) (Necessary logging)

The framework alignment table is intended as a navigation tool for practitioners implementing the observatory within the context of existing compliance programs. An organization subject to the EU AI Act’s high-risk AI provisions can use the table to identify which observatory components and configuration choices are directly relevant to their Article 12 compliance obligations. A security architect implementing the observatory as part of a NIST AI RMF program can use the table to identify which MEASURE and MANAGE sub-categories each observatory component addresses and where coverage gaps may exist. A CSAI working group member reviewing the observatory’s AICM control coverage can use the table to identify LOG domain controls that require further implementation detail in future versions of this specification.

References

[1] Responsible AI Collaborative. “AI Incident Roundup — November and December 2025 and January 2026.” AI Incident Database. January 2026. https://incidentdatabase.ai/blog/incident-report-2025-november-december-2026-january/
[2] OWASP GenAI Security Project. “OWASP Top 10 for Agentic Applications.” Version 1.0 / 2026 Edition. December 2025. https://genai.owasp.org/
[3] National Institute of Standards and Technology, Center for AI Standards and Innovation. “New Report: Challenges to the Monitoring of Deployed AI Systems.” NIST, March 2026. https://www.nist.gov/news-events/news/2026/03/new-report-challenges-monitoring-deployed-ai-systems
[4] OpenTelemetry Community. “Semantic Conventions for Generative AI Agentic Systems (gen_ai.*).” GitHub Issue #2664, open-telemetry/semantic-conventions. 2025. https://github.com/open-telemetry/semantic-conventions/issues/2664
[5] MITRE ATLAS. “MITRE ATLAS v5.1 — Adversarial Threat Landscape for Artificial-Intelligence Systems.” MITRE Corporation. November 2025. https://atlas.mitre.org/
[6] Anthropic. “Model Context Protocol Specification.” November 2025 revision. https://modelcontextprotocol.io/specification/2025-11-25
[7] Schwanke, Axel. “Compliance under the EU AI Act: Best Practices for Monitoring and Logging.” Medium. 2025. https://medium.com/@axel.schwanke/compliance-under-the-eu-ai-act-best-practices-for-monitoring-and-logging-e098a3d6fe9d
[8] NVIDIA Corporation. “NVIDIA NemoClaw and OpenShell Runtime Security Architecture.” NVIDIA GTC 2026 Technical Session. March 16, 2026.
[9] Databahn AI. “AI Agents Security Incidents and Related CVEs for Enterprise Security Teams.” 2026. https://www.databahn.ai/blog/ai-agents-security-incidents-and-related-cves-for-enterprise-security-teams
[10] MITRE ATLAS. “MITRE SAFE-AI: A Framework for Securing AI Systems.” MITRE Work Product MP250397. 2025. https://atlas.mitre.org/pdf-files/SAFEAI_Full_Report.pdf
[11] CVE Program. “CVE ID Assignment and CVE Record Publication for AI-Related Vulnerabilities.” CVE Program Blog, Medium. 2025. https://medium.com/@cve_program/cve-id-assignment-and-cve-record-publication-for-ai-related-vulnerabilities-78a649bda815
[12] OWASP GenAI Security Project. “Securing AI’s New Frontier: The Power of Open Collaboration on MCP Security.” April 2025. https://genai.owasp.org/2025/04/22/securing-ais-new-frontier-the-power-of-open-collaboration-on-mcp-security/
[13] European Parliament and Council of the European Union. “Regulation (EU) 2024/1689 — Artificial Intelligence Act, Article 12: Record-Keeping.” August 2024. https://artificialintelligenceact.eu/article/12/
[14] Cloud Security Alliance. “AI Controls Matrix (AICM) v1.0.” July 2025.
[15] National Institute of Standards and Technology. “Artificial Intelligence Risk Management Framework (AI RMF 1.0).” NIST AI 100-1. January 2023. https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf
[16] Cloud Security Alliance. “MAESTRO: Multi-Agent Environment, Security, Threat, Risk, and Outcome.” CSA, February 2025. https://cloudsecurityalliance.org/blog/2025/02/06/agentic-ai-threat-modeling-framework-maestro
[17] National Institute of Standards and Technology. “Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations.” NIST AI 100-2e2025. 2025. https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.600-1.pdf
[18] OpenTelemetry. “AI Agent Observability — Evolving Standards and Best Practices.” OpenTelemetry Blog. 2025. https://opentelemetry.io/blog/2025/ai-agent-observability/