Open-source runtime security governance for autonomous AI agents. Seven independently installable packages that intercept every agent action and enforce policy at sub-millisecond latency. Covers all 10 OWASP Agentic AI risks.
This is not a native Azure service
The Agent Governance Toolkit is a separate open-source project (MIT license) you install alongside your Azure stack. It integrates with Azure AI Foundry, AKS, and Container Apps, but it is not built into any Azure service. You pip/npm install it and deploy it as middleware or a sidecar.
Agent OS — Policy engine; intercepts and evaluates every action
Agent Mesh — Cryptographic identity and dynamic trust scoring
Agent Runtime — Privilege rings, resource limits, kill switch
Agent SRE — SLOs, circuit breakers, chaos engineering
Agent Compliance — Regulatory grading and signed attestation
Agent Marketplace — Plugin signing and supply chain verification
Agent Lightning — Governance during RL training
01
Request Flow
How a single agent action traverses the full governance stack
Every agent action passes through multiple checkpoints before execution. Each checkpoint can independently block the request. Every decision logs to an immutable audit chain.
User / Application
Request initiated — tool call, query, or action
Agent OS
Policy check — is this action allowed? Pattern match + semantic intent classifier evaluate against rules
CAN DENY
Agent Mesh
Identity check — verify agent DID, confirm trust score meets threshold for this operation
CAN DENY
Agent Marketplace
Supply chain check — if action uses a plugin or MCP server, verify Ed25519 signature and manifest hash
CAN DENY
Agent Runtime
Privilege check — does the agent's ring allow this? Are resource limits (time, memory, rate) within budget?
CAN DENY
Agent SRE
Reliability check — is the circuit breaker closed? Is there error budget remaining?
CAN DENY
Tool / MCP Execution
Action executes in sandboxed environment with scoped, short-lived token
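The checkpoint chain above can be sketched as a short pipeline in which any stage can independently deny. Everything here (the `Decision` type, the toy check functions) is illustrative, not the toolkit's real API:

```python
# Illustrative sketch of the checkpoint chain; names are hypothetical.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Decision:
    allowed: bool
    denied_by: Optional[str] = None

def evaluate(action: dict, checks) -> Decision:
    """Run the action through each checkpoint in order; any one can deny."""
    for name, check in checks:
        if not check(action):
            return Decision(allowed=False, denied_by=name)
    return Decision(allowed=True)

# Toy stand-ins for the Agent OS, Agent Mesh, and Agent Runtime checks.
CHECKS = [
    ("agent_os",   lambda a: a["tool"] != "shell.exec"),  # policy check
    ("agent_mesh", lambda a: a["trust_score"] >= 400),    # identity/trust check
    ("runtime",    lambda a: a["ring"] <= 2),             # privilege check
]
```

The first checkpoint to deny short-circuits the chain, which is why the fast pattern matcher runs before the heavier checks.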
02
Agent OS
The policy kernel — intercepts every action at sub-millisecond latency
Agent OS is the core of the toolkit. It sits in the execution path of every agent action and evaluates it against configurable rules. Stateless by design — deploy as an AKS sidecar, behind a load balancer, or serverless. Two evaluation layers: fast pattern matching and a semantic intent classifier that catches dangerous goals regardless of how they're phrased.
→ Action Arrives
Any tool call, API request, or agent decision
⚡ Pattern Matcher
Fast rule evaluation against YAML / OPA Rego / Cedar policies
🧠 Semantic Intent Classifier
ML classifier detects dangerous goals regardless of phrasing — catches what pattern matching misses
⚖ Conflict Resolver
When multiple rules match: deny-overrides, allow-overrides, priority-first, or most-specific-wins
↓ Decision
ALLOW
Proceed normally
DENY
Block + audit log
REQUIRE_APPROVAL
Human-in-the-loop
MASK
Proceed, redact sensitive fields
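When multiple rules match, the conflict resolver picks one of the four outcomes. A minimal sketch of the deny-overrides strategy (the precedence ordering and default are assumptions; the real engine also supports allow-overrides, priority-first, and most-specific-wins):

```python
# Hypothetical deny-overrides conflict resolution over the four outcomes.
# Most restrictive first: DENY beats REQUIRE_APPROVAL beats MASK beats ALLOW.
PRECEDENCE = ["DENY", "REQUIRE_APPROVAL", "MASK", "ALLOW"]

def resolve(matched_decisions):
    """Deny-overrides: the most restrictive decision among matched rules wins."""
    if not matched_decisions:
        return "ALLOW"  # assumed default when no rule matches
    return min(matched_decisions, key=PRECEDENCE.index)
```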
Policy language options:
YAML
Simple rules
OPA Rego
Complex conditional
Cedar
Fine-grained RBAC/ABAC
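For the simple-rules case, a YAML policy might look like the following. The field names and schema here are invented for illustration, not the toolkit's actual policy format:

```yaml
# Hypothetical YAML policy sketch — field names are illustrative.
policies:
  - id: block-shell-exec
    match:
      tool: "shell.*"
    decision: DENY
  - id: mask-customer-pii
    match:
      tool: "crm.read"
    decision: MASK
    redact_fields: [email, phone]
conflict_resolution: deny-overrides
```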
03
Agent Mesh
Cryptographic identity and trust scoring for agents
Agent Mesh answers: “Is this agent who it claims to be, and should it be trusted right now?” Every agent gets a cryptographic identity. Trust isn't binary — it's a score that changes based on behavior.
STEP 1 — IDENTITY CREATION
When an agent is created, it gets a Decentralized Identifier (DID) — think of it as an SSL certificate for agents. The DID is backed by an Ed25519 keypair, so the agent can cryptographically prove it is who it claims to be. No central authority needed.
DID Created
Unique identifier generated
Ed25519 Keypair
Public/private key for signing
Human Sponsor
A person accountable for this agent
STEP 2 — TRUST SCORING
Every agent has a trust score from 0 to 1000 that works like a battery — it drains over time without positive signals. Good behavior recharges it. Bad behavior (policy violations, anomalies) drains it faster. The score determines what the agent can do.
Untrusted
0–399
Standard
400–699
Elevated
700–899
Full Trust
900–1000
Score decays over time without positive signals. New agents start low and earn trust.
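The battery model can be sketched as a periodic update: passive decay, recharge for good behavior, heavier drain for violations. The tier thresholds come from the table above; the drain and recharge constants are invented for the sketch:

```python
# Illustrative "battery" model of trust scoring; constants are made up.
def tick(score: float, good_signals: int = 0, violations: int = 0) -> float:
    """One scoring period: passive decay, recharge for compliant actions,
    heavier drain for policy violations. Clamped to the 0-1000 range."""
    score = score - 5 + 10 * good_signals - 50 * violations
    return max(0.0, min(1000.0, score))

def tier(score: float) -> str:
    if score >= 900: return "Full Trust"
    if score >= 700: return "Elevated"
    if score >= 400: return "Standard"
    return "Untrusted"
```

A single violation can push a borderline agent down a tier, which is exactly the point: trust is continuously earned, not granted once.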
STEP 3 — AGENT-TO-AGENT COMMUNICATION (IATP)
When Agent A wants to talk to Agent B, it doesn't just “call” it. The Inter-Agent Trust Protocol (IATP) handles a verification handshake:
1
Agent A presents its DID (digital passport) to Agent B
2
Agent B verifies the DID signature cryptographically — is this really Agent A?
3
Agent B checks Agent A's trust score against its threshold — is A trusted enough?
4
If both pass → communication proceeds. If either fails → request denied.
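The four handshake steps above can be sketched as follows. HMAC-SHA256 stands in for the Ed25519 signatures the protocol actually uses (to keep the sketch stdlib-only), and all names are illustrative:

```python
# IATP handshake sketch; HMAC-SHA256 stands in for Ed25519 signatures.
import hashlib
import hmac

def sign(private_key: bytes, did: str) -> str:
    return hmac.new(private_key, did.encode(), hashlib.sha256).hexdigest()

def handshake(did: str, signature: str, key: bytes,
              trust_score: int, threshold: int = 400):
    # Step 2: is this really the agent the DID claims?
    if not hmac.compare_digest(signature, sign(key, did)):
        return (False, "signature mismatch")
    # Step 3: is the peer trusted enough for this operation?
    if trust_score < threshold:
        return (False, "trust score below threshold")
    # Step 4: both checks passed — communication proceeds.
    return (True, "ok")
```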
STEP 4 — DELEGATION CHAINS
When an agent delegates work to another agent, scope always narrows — never widens. A parent with read+write can only give a child read. Revoking a parent auto-revokes all children.
Agent A
read + write
→
Agent B
read only
→
Agent C
read (subset)
Scope narrows at each delegation. Revoking Agent A auto-revokes B and C.
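The two delegation invariants (scope only narrows; revocation cascades) fit in a few lines. The class and method names are hypothetical, not the library's API:

```python
# Minimal delegation-chain sketch; all names are hypothetical.
class Delegation:
    def __init__(self, agent: str, scope: set, parent: "Delegation" = None):
        if parent is not None and not scope <= parent.scope:
            raise ValueError("delegated scope may only narrow, never widen")
        self.agent, self.scope, self.parent = agent, set(scope), parent
        self.children = []
        self.revoked = False
        if parent is not None:
            parent.children.append(self)

    def revoke(self):
        """Revoking a delegation cascades to every child delegation."""
        self.revoked = True
        for child in self.children:
            child.revoke()
```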
How is this implemented?
Agent Mesh is a library (pip package / npm module) you integrate into your agent framework. It handles the crypto, trust protocol, and delegation logic under the hood. Your agent code calls mesh.verify_peer() before communicating — the library does the DID verification, trust check, and scope enforcement automatically.
04
Agent Runtime
CPU-inspired privilege isolation for agents
Borrowed directly from how CPUs isolate processes. Four privilege rings with increasing access. New agents start at the outermost ring (most restricted) and earn their way inward through compliant behavior. Each ring enforces resource limits: max execution time, memory caps, CPU throttling, request rate.
NEW AGENTS
start here
Ring 0: Kernel
Ring 1: Supervisor
Ring 2: User
Ring 3: Untrusted
Ring 0 · Kernel (900–1000)
Full access. Can modify policies, terminate other agents.
Ring 1 · Supervisor (700–899)
Cross-agent communication, elevated tools.
Ring 2 · User (400–699)
Standard tools within own scope.
Ring 3 · Untrusted (0–399)
Read-only, fully sandboxed. Where new agents start.
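The score-to-ring mapping mirrors the ranges above; the per-ring resource limits below are invented for the sketch, since the document only says limits exist per ring:

```python
# Illustrative trust-score -> ring mapping with made-up per-ring limits.
RING_LIMITS = {
    0: {"max_seconds": 300, "max_requests_per_min": 600},  # Kernel
    1: {"max_seconds": 120, "max_requests_per_min": 240},  # Supervisor
    2: {"max_seconds": 60,  "max_requests_per_min": 60},   # User
    3: {"max_seconds": 10,  "max_requests_per_min": 10},   # Untrusted
}

def ring_for(trust_score: int) -> int:
    if trust_score >= 900: return 0
    if trust_score >= 700: return 1
    if trust_score >= 400: return 2
    return 3  # new agents start here, fully sandboxed
```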
SAGA ORCHESTRATION — MULTI-STEP ROLLBACK
When an agent runs a sequence of steps (draft email → send → update CRM) and a later step fails, compensating actions fire in reverse to undo earlier steps. Like a database transaction, but for agent actions.
Draft
✓
→
Send
✓
→
Update CRM
✗ FAIL
← Compensate: unsend email ← delete draft
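The saga pattern above can be sketched as: execute steps in order, record each completed step's compensator, and on failure fire the compensators in reverse. The email steps are the toy example; the function shape is illustrative:

```python
# Saga sketch: on failure, compensating actions fire in reverse order.
def run_saga(steps, log):
    completed = []  # (name, compensator) pairs, in execution order
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
        except Exception:
            log.append(f"failed:{name}")
            for done_name, comp in reversed(completed):
                comp()
                log.append(f"compensated:{done_name}")
            return False
    return True

def fail():
    raise RuntimeError("CRM unavailable")

log = []
steps = [
    ("draft",      lambda: None, lambda: None),
    ("send",       lambda: None, lambda: None),
    ("update_crm", fail,         lambda: None),
]
ok = run_saga(steps, log)
```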
Kill Switch
Ring 0 agents can terminate any other agent immediately. This is the emergency brake — if an agent goes rogue, it can be killed mid-execution without waiting for policy evaluation.
05
Agent SRE
Production reliability patterns applied to agent behavior
Agents are treated like production services. The same error-budget model that SRE teams use for infrastructure is applied to agent behavior. When an agent burns through its error budget, it doesn't crash — its capabilities progressively restrict until it demonstrates recovery.
SLO FEEDBACK LOOP
Each agent has a Service Level Objective — for example, “99% of actions must comply with policy.” The 1% is the error budget. When violations burn through the budget, the agent's capabilities auto-restrict until it recovers.
📊
SLO Target
99% policy compliance (configurable per agent)
🔥
Error Budget Burns
Each policy violation consumes budget — tracked in real-time
🚫
Budget Exhausted
Agent capabilities auto-restrict — fewer tools, lower ring
🔄
Recovery
Compliant behavior over time restores budget and capabilities
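The loop above reduces to a small budget counter. The 99% SLO and 1000-action window below are illustrative defaults, and the one-for-one recovery rule is an assumption:

```python
# Error-budget sketch; SLO, window, and recovery rate are illustrative.
class ErrorBudget:
    def __init__(self, slo: float = 0.99, window: int = 1000):
        # e.g. a 99% SLO over 1000 actions allows 10 violations
        self.budget = int(round((1 - slo) * window))
        self.violations = 0

    def record_violation(self):
        self.violations += 1

    def record_recovery(self):
        # compliant behavior over time restores budget
        self.violations = max(0, self.violations - 1)

    @property
    def exhausted(self) -> bool:
        return self.violations >= self.budget
```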
CIRCUIT BREAKER STATES
Same pattern as microservices: consecutive failures trip the breaker, stopping the agent from hammering a failing resource. After a cooldown, a test request probes recovery.
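A minimal breaker with the closed, open, and half-open states looks like this. The clock is injected so the cooldown is testable; the threshold and cooldown values are illustrative:

```python
# Circuit-breaker sketch with an injected clock; thresholds are illustrative.
class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, cooldown: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def allow(self, now: float) -> bool:
        if self.opened_at is None:
            return True
        if now - self.opened_at >= self.cooldown:
            return True   # half-open: let one probe request through
        return False      # open: stop hammering the failing resource

    def record(self, success: bool, now: float):
        if success:
            self.failures, self.opened_at = 0, None  # probe succeeded: close
        else:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = now                 # trip the breaker
```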
Canary policy rollouts — 5% of agents first, validate, then expand to the fleet
Cost Ceiling
Cap total API spend per session or agent. Prevents runaway token consumption.
Blast Radius Cap
Limit how many systems one agent failure can affect. Contain the damage zone.
06
Agent Compliance
Automated governance verification with signed attestation
Automated grading against regulatory frameworks. Evidence collection across all 10 OWASP categories. Signed attestation output for auditors or CI/CD pipeline gates. Designed to run in your deployment pipeline — deploy only if all controls pass.
Scan
Evaluate all 10 OWASP ASI risk categories automatically
Grade
Score against regulatory frameworks — EU AI Act, HIPAA, SOC2, NIST AI RMF
Collect Evidence
Gather proof artifacts for each control across all packages
Sign Attestation
Cryptographically signed report for auditors or CI pipeline gates
Pass / Fail Gate
Binary decision — deploy or block. Machine-readable JSON output.
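A CI gate over the machine-readable output is a one-liner. The report shape used here is an assumption, not the toolkit's real attestation schema:

```python
# CI gate sketch over an assumed attestation-report JSON shape.
import json

def gate(report_json: str) -> bool:
    """Binary deploy decision: every control must pass."""
    report = json.loads(report_json)
    return all(control["passed"] for control in report["controls"])
```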
Bootstrap Integrity
On startup, the compliance module hashes 15 governance modules and 4 critical enforcement functions to verify the governance layer itself hasn't been tampered with. The governance code governs itself.
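The self-verification idea can be sketched by hashing critical functions and comparing against a recorded baseline. The real module hashes source files on disk; bytecode hashing is used here only to keep the sketch self-contained:

```python
# Bootstrap-integrity sketch: hash critical functions against a baseline.
import hashlib

def fingerprint(fn) -> str:
    """Hash a function's compiled bytecode (stand-in for hashing module files)."""
    return hashlib.sha256(fn.__code__.co_code).hexdigest()

def verify_integrity(functions, baseline: dict) -> bool:
    """True only if every critical function matches its recorded hash."""
    return all(fingerprint(fn) == baseline[fn.__name__] for fn in functions)
```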
07
Agent Marketplace
Supply chain security for agent plugins and tools
Marketplace provides mandatory code signing and trust-tiered capability gating for the plugins, tools, and MCP servers that agents compose at runtime.
Think of it like an app store with mandatory code signing. Agents can't just use any plugin — every plugin must be signed, verified, and the agent's trust level determines what it can access.
Developer Signs Plugin
Ed25519 cryptographic signature applied to plugin package + manifest
Manifest Published
Declares capabilities, required trust tier, dependencies, and hash
Agent Discovers Plugin
Agent finds plugin at runtime — dynamic composition, not static config
Signature Verified
Marketplace verifies Ed25519 signature against manifest hash — tamper-proof
Trust Tier Unlocks Capabilities
Low-trust agents get read-only access. High-trust agents unlock full capabilities.
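The two-part verification (package hash against the manifest, then the signature over the manifest) can be sketched as below. HMAC-SHA256 stands in for Ed25519 to keep the sketch stdlib-only, and the manifest fields are illustrative:

```python
# Plugin verification sketch; HMAC-SHA256 stands in for Ed25519 signatures.
import hashlib
import hmac
import json

def manifest_digest(manifest: dict) -> str:
    return hashlib.sha256(
        json.dumps(manifest, sort_keys=True).encode()).hexdigest()

def verify_plugin(manifest: dict, package: bytes,
                  signature: str, signing_key: bytes) -> bool:
    # 1. Package bytes must match the hash declared in the manifest.
    if hashlib.sha256(package).hexdigest() != manifest["hash"]:
        return False
    # 2. Signature must cover the manifest digest (tamper-evident).
    expected = hmac.new(signing_key, manifest_digest(manifest).encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(signature, expected)
```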
Why this matters
Agents discover components at runtime — this is a dynamic supply chain, not a static dependency list. Without signing and verification, a compromised MCP server or swapped model endpoint could inject malicious behavior mid-execution.
08
Agent Lightning
Governance for reinforcement learning training
Most governance focuses on production. Lightning extends it to training time — ensuring agents can't learn that policy violations are rewarding strategies during reinforcement learning.
During reinforcement learning training, agents explore actions to maximize reward. Without governance, they can learn that policy-violating strategies are effective. Lightning puts a policy fence around the training loop.
RL TRAINING WITH GOVERNANCE FENCE
1
Agent Explores
RL agent proposes an action during training
2
Policy Engine Intercepts
Agent OS evaluates the proposed action against policies — same engine as production
3
Violation Blocked
If the action violates policy, it's blocked before executing — the agent never learns it works
4
Reward Shaped
Negative reward signal for policy-violating attempts. Agent learns compliant strategies are optimal.
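The four steps above amount to wrapping one training step in a policy gate. The function names and the penalty value are illustrative:

```python
# Sketch of the governance fence around one RL training step.
def governed_step(action, policy_allows, execute, penalty: float = -1.0) -> float:
    """Evaluate before executing: blocked actions never run, and return a
    negative reward so the agent learns violations are not worth attempting."""
    if not policy_allows(action):
        return penalty      # blocked pre-execution; no violation ever occurs
    return execute(action)  # compliant action runs and earns its real reward
```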
Zero policy violations during training
The key insight: if an agent never successfully executes a policy-violating action during training, it can't learn that violations are rewarding. Governance becomes part of the agent's learned world model.
09
OWASP Agentic Top 10
The risk taxonomy the toolkit is built against
Published December 2025 by 100+ experts. The first formal taxonomy of autonomous agent risks. Every risk maps to a specific package in the toolkit.
ASI01 · Goal Hijack — Agent OS
ASI02 · Tool Misuse — Agent OS
ASI03 · Identity Abuse — Agent Mesh
ASI04 · Supply Chain — Marketplace
ASI05 · Data Leakage — Agent OS
ASI06 · Memory Poison — Agent OS
ASI07 · Multi-Agent — Agent Mesh
ASI08 · Cascading Fail — Agent SRE
ASI09 · Trust Exploit — Agent Runtime
ASI10 · Rogue Agents — Agent Runtime
Core principle: Least Agency
Only grant agents the minimum autonomy, tool access, and credential scope required — and no more. Agentic failures aren't “bad output” — they're bad outcomes with real system consequences.
10
Deploy on Azure
Three paths to production
The toolkit is framework-agnostic but designed for Azure. Three deployment options.
☸ AKS Sidecar
Deploy the policy engine as a sidecar container alongside your agents. Transparent governance — agents don't even know it's there. Best for teams already running Kubernetes.
🏗 Foundry Agent Service
Built-in middleware integration for agents built on Azure AI Foundry. Zero infrastructure setup — governance plugs into the managed runtime.
📦 Container Apps
Run governance-enabled agents in a serverless container environment. No cluster management — Azure handles scaling. Best for smaller teams.