AI Governance

How Do You Prove Your Agents Are Governed?

Regulators and auditors want proof our agents are governed. What do we actually hand them?

7 min · Updated June 2026

Part 5 of 6 — “Governing AI Agents in the Enterprise: A Practical Architecture Guide”

This article covers accountability governance: audit trails, compliance evidence, runtime content guardrails, and how to draw a clean line between tools so nothing falls through the gaps.

5.1 - Regulators and auditors want proof our agents are governed. What do we actually hand them?

The pattern: Tamper-evident audit and compliance evidence.

Diagram of a tamper-evident audit chain — cryptographically chained and signed append-only log entries mapping agent actions to compliance controls

Having controls is not the same as being able to prove you had them. When a regulator, an auditor, or a court asks “what did this agent do, who authorized it, and what controls were in place?”, the answer must be evidence that is complete, attributable, and demonstrably untampered. Ordinary application logs and even observability traces are not built for this: they can be altered, and they are not mapped to control frameworks.

The Microsoft Agent Governance Toolkit produces an append-only audit log in which each entry is cryptographically chained and signed, so any later alteration or deletion is detectable. Every policy decision, identity assertion, and significant action is recorded. The toolkit’s compliance module verifies coverage against the OWASP Agentic Security Initiative (ASI) Top 10 and maps controls to regulatory frameworks including the EU AI Act, HIPAA, and SOC 2, emitting signed attestations that can be produced as evidence.

A verification step runs in CI on every release, so the compliance evidence is continuously refreshed rather than assembled in a panic before an audit.

What “cryptographically chained” actually means

Ordinary application logs are mutable: a row can be deleted or overwritten without a trace. A chained audit log makes any such tampering detectable because each entry includes the SHA-256 hash of the previous entry. Edit or remove one entry and every subsequent hash in the chain becomes invalid — the break is provable.

audit_chain.py — tamper-evident append-only log

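import hashlib
import hmac
import json

def append(entry: dict, prev_hash: str, key: bytes) -> dict:
    """Return a copy of `entry` linked to `prev_hash`, hashed, and HMAC-signed."""
    record = {**entry, "prev": prev_hash}               # copy, so the caller's dict is not mutated
    body = json.dumps(record, sort_keys=True).encode()  # sorted keys give a stable byte form
    record["hash"] = hashlib.sha256(prev_hash.encode() + body).hexdigest()
    # HMAC is a shared-key MAC: anyone holding `key` can produce a valid "sig".
    record["sig"] = hmac.new(key, record["hash"].encode(), hashlib.sha256).hexdigest()
    return record    # any later edit breaks the chain from that point forward

# verify: recompute each hash and signature, and confirm entry['prev'] == previous entry['hash']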
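The verification step described in the last comment can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: `verify_chain` and the all-zero `GENESIS` value are hypothetical names introduced here, and `append` is repeated only so the example runs on its own.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry

def append(entry: dict, prev_hash: str, key: bytes) -> dict:
    # Same logic as the listing above, repeated so this sketch is standalone.
    record = {**entry, "prev": prev_hash}
    body = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(prev_hash.encode() + body).hexdigest()
    record["sig"] = hmac.new(key, record["hash"].encode(), hashlib.sha256).hexdigest()
    return record

def verify_chain(entries: list, key: bytes, genesis: str = GENESIS):
    """Return the index of the first invalid entry, or None if the chain is intact."""
    prev_hash = genesis
    for i, entry in enumerate(entries):
        # Rebuild exactly what was hashed: every field except "hash" and "sig".
        payload = {k: v for k, v in entry.items() if k not in ("hash", "sig")}
        body = json.dumps(payload, sort_keys=True).encode()
        expected_hash = hashlib.sha256(prev_hash.encode() + body).hexdigest()
        expected_sig = hmac.new(key, expected_hash.encode(), hashlib.sha256).hexdigest()
        linked = entry.get("prev") == prev_hash
        hash_ok = hmac.compare_digest(str(entry.get("hash", "")), expected_hash)
        sig_ok = hmac.compare_digest(str(entry.get("sig", "")), expected_sig)
        if not (linked and hash_ok and sig_ok):
            return i
        prev_hash = entry["hash"]
    return None
```

Returning the index of the first broken entry, rather than a bare boolean, lets a reviewer see where the record stops being trustworthy. Note that chaining alone cannot reveal entries removed from the end of the log; detecting that requires the latest hash to be anchored somewhere outside the log.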

What you hand an auditor — four artifacts

#	Artifact	What it proves
1	Signed attestations	Controls exist and map to the required framework (EU AI Act, HIPAA, SOC 2)
2	Chained audit log	Every agent action was recorded and the record has not been altered
3	Control-to-framework map	Coverage against the OWASP ASI Top 10 and each applicable regulatory clause
4	CI verification run	Compliance was checked continuously, not assembled in a panic before the audit
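To make artifacts 1, 3, and 4 concrete, here is a rough sketch of a coverage check that refuses to issue an attestation while any required clause is unmapped. The control IDs, the `CONTROL_MAP` structure, and the function names are all hypothetical, the clause references are illustrative, and the HMAC signature stands in for whatever signing scheme the real toolkit uses.

```python
import hashlib
import hmac
import json

# Hypothetical control IDs mapped to illustrative framework clauses.
CONTROL_MAP = {
    "audit-log-chained": ["EU AI Act Art. 12", "HIPAA 164.312(b)"],
    "human-signoff": ["EU AI Act Art. 14"],
}
REQUIRED = ["EU AI Act Art. 12", "EU AI Act Art. 14", "HIPAA 164.312(b)"]

def uncovered(control_map: dict, required: list) -> list:
    """Return the required clauses that no control maps to, in the order given."""
    covered = {clause for clauses in control_map.values() for clause in clauses}
    return [clause for clause in required if clause not in covered]

def attest(control_map: dict, required: list, key: bytes) -> dict:
    """Build a signed attestation; raise if any required clause is uncovered."""
    gaps = uncovered(control_map, required)
    if gaps:
        raise ValueError(f"uncovered clauses: {gaps}")
    body = {"controls": control_map, "required": sorted(required)}
    payload = json.dumps(body, sort_keys=True).encode()
    return {"body": body, "sig": hmac.new(key, payload, hashlib.sha256).hexdigest()}

def check(attestation: dict, key: bytes) -> bool:
    """Recompute the signature over the body and compare in constant time."""
    payload = json.dumps(attestation["body"], sort_keys=True).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(attestation["sig"], expected)
```

Run in CI, a raised `ValueError` fails the build, so an attestation only exists for releases whose control map is complete.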

It is important to keep roles distinct: observability traces are for debugging and analysis — they are not tamper-evident. The governance audit log is the system of record for compliance. The two serve complementary purposes and must not be confused.

Diagram comparing observability traces for debugging versus tamper-evident governance audit logs as the system of record for compliance

Real-world examples

Banking under the EU AI Act -- credit-decisioning support agent

The regulator asks the bank to demonstrate governance of an AI system that influences credit decisions. The bank produces signed attestations mapping its controls to the EU AI Act, plus a cryptographically verifiable audit trail showing, for each decision, which policy applied and where a human signed off.

Healthcare under HIPAA -- clinical-support agent

During a compliance review, the provider must show that every access to protected health information by the agent was authorized and logged. The cryptographically chained audit log provides a complete, tamper-evident record, and the compliance mapping demonstrates alignment with HIPAA safeguards.

Public company under SOC 2 -- internal operations agent

An external SOC 2 auditor needs evidence that automated agents operate within defined controls. The continuously generated, signed compliance attestations and the verifiable audit log become direct audit artifacts — shortening the audit and reducing findings.

Diagram of the compliance evidence pipeline — from agent actions through tamper-evident audit logs to signed attestations for regulators and auditors

5.2 - What about prompt injection, jailbreaks, and PII leaking out of the model -- do these tools cover that?

The pattern: Closing the runtime content-guardrails gap — a known limitation.

Diagram illustrating the runtime content-guardrails gap — the area not covered by evaluation, observability, and governance tools alone

This question is included because honesty about architecture limits is part of good design. The three core tools in this stack — evaluation, observability, governance — do not fully cover runtime content guardrails: inspecting the actual text flowing in and out of the model on every live request to block prompt injection, detect jailbreak attempts, or filter personally identifiable information in real time.

DeepEval can test for these weaknesses before release.
The policy engine can block actions triggered by these attacks.
But neither inspects and filters live model content on every turn.

This gap cannot be left open. It must be filled explicitly, in two parts:

Pre-production (red-teaming). A red-teaming capability runs adversarial attack suites — prompt-injection, jailbreak, data-exfiltration probes — as part of CI. This catches systematic weaknesses before release. The governance toolkit also offers red-team scanning of prompts.

ci_redteam.py — adversarial suite as a CI release gate (pip install deepteam)

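from deepteam import red_team
from deepteam.vulnerabilities import PromptLeakage, PIILeakage
from deepteam.attacks.single_turn import PromptInjection
from deepteam.attacks.multi_turn import LinearJailbreaking

async def my_agent_callback(input: str) -> str:
    # Placeholder: call your agent here and return its reply as a string.
    raise NotImplementedError

risk = red_team(
    model_callback=my_agent_callback,
    vulnerabilities=[PromptLeakage(), PIILeakage()],
    attacks=[PromptInjection(), LinearJailbreaking()],
)
# Assumption: per-vulnerability pass rates are exposed under risk.overview;
# attribute names may differ between deepteam versions, so check your installed release.
pass_rates = [r.pass_rate for r in risk.overview.vulnerability_type_results]
assert min(pass_rates) >= 0.9   # fail the build if any vulnerability type is too easily exploited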

Runtime (a fourth, lightweight component). Add a dedicated content-guardrail component — for example, an open-source guardrails library or a safety classifier model — at the agent framework’s model callbacks. It inspects each prompt before it reaches the model and each response before it reaches the user or a tool, blocking or redacting as needed.

guardrail_callbacks.py — PII redaction + injection block at the model callback

from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
analyzer, anonymizer = AnalyzerEngine(), AnonymizerEngine()

INJECTION = ("ignore previous", "disregard instructions", "reveal system prompt")

def before_model_callback(prompt: str) -> str:
    if any(p in prompt.lower() for p in INJECTION):
        raise ValueError("Prompt-injection pattern blocked")   # or route to Llama Guard
    return prompt

def after_model_callback(response: str) -> str:
    findings = analyzer.analyze(text=response, language="en")
    return anonymizer.anonymize(text=response, analyzer_results=findings).text  # redact PII
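How the two callbacks sit around a model call can be sketched like this. It is an illustration under stated assumptions rather than any framework's actual hook API: `guarded_call` is a hypothetical wrapper, the model is any callable from string to string, and a simple email regex stands in for Presidio so the example runs without extra dependencies.

```python
import re
from typing import Callable

INJECTION = ("ignore previous", "disregard instructions", "reveal system prompt")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")  # stand-in for a real PII detector

def before_model(prompt: str) -> str:
    """Input check: reject prompts containing a known injection phrase."""
    if any(p in prompt.lower() for p in INJECTION):
        raise ValueError("Prompt-injection pattern blocked")
    return prompt

def after_model(response: str) -> str:
    """Output filter: redact email addresses before the text leaves the agent."""
    return EMAIL.sub("<EMAIL>", response)

def guarded_call(model: Callable[[str], str], prompt: str) -> str:
    """Run the input check, then the model, then the output filter, in that order."""
    try:
        safe_prompt = before_model(prompt)
    except ValueError:
        return "Request blocked by content guardrail."  # the model is never invoked
    return after_model(model(safe_prompt))
```

The wrapper fails closed: a blocked prompt never reaches the model, and no response reaches the caller without passing through the output filter. Substring matching is easy to evade, so in practice the input check would delegate to a classifier or one of the guardrail libraries listed below.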

Libraries and frameworks — reference

Library / Framework	Role
Microsoft AGT (audit + compliance)	Append-only signed log; maps controls to EU AI Act, HIPAA, SOC 2; OWASP ASI Top 10 coverage
Microsoft Presidio	PII detection and redaction at the model callback
NeMo Guardrails, Guardrails AI, LLM Guard, Llama Guard	Input/output filtering, jailbreak and topic control — pick one as the “fourth component”
deepteam (DeepEval)	Red-teaming: injection, jailbreak, data-exfiltration probes in CI
OWASP Agentic Security Initiative (ASI) Top 10	The coverage checklist to map controls against

Comparison of DeepEval and Langfuse roles — DeepEval for pre-production evaluation and red-teaming, Langfuse for production observability and prompt management

Real-world examples

Banking -- customer-service agent

A customer message contains hidden instructions attempting to make the agent reveal another account’s details. The runtime guardrail detects the injection pattern and blocks the message before the model ever processes it. The pre-production red-team suite ensures this class of attack was tested against every release.

Healthcare -- symptom-checker agent

Before any model response is shown to a patient, the runtime guardrail scans for and redacts any PII that should not appear, and blocks responses that drift into definitive diagnosis the agent is not permitted to give.

Retail -- conversational shopping agent

Adversarial users try jailbreak prompts to extract internal pricing logic or discount rules. Red-team testing in CI quantifies the agent’s resistance release over release; the runtime guardrail blocks live jailbreak attempts in production.

5.3 - DeepEval does evaluation and so does Langfuse. We don't want two teams building the same thing. Who owns what?

The pattern: Separation of concerns — the evaluation engine versus the data plane.

Both DeepEval and Langfuse offer evaluation features. Left unmanaged, two teams build overlapping pipelines, ownership blurs, and dashboards disagree. The pattern is to draw a hard boundary based on what each tool does best, and enforce it.

DeepEval owns	Langfuse owns
Research-grade metric library	Trace storage and dashboards
Multi-turn simulation	Prompt management
Synthetic dataset generation	Canonical datasets and human annotation queues
Red-teaming suites	Lightweight online evaluation (LLM-as-judge on production traffic)
Pre-production CI gate — build / no-build decisions	Production trace store — single pane for all scores

They cooperate through a documented interface: DeepEval pushes its metric scores onto corresponding Langfuse traces through Langfuse’s score API. The result is rich, metric-specific scores from the purpose-built evaluation engine, surfaced inside the single observability pane teams already use.
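The hand-off can be sketched as a thin function that forwards metric scores to whatever object exposes a score-creation call. `ScoreSink` and `push_scores` are hypothetical names for this illustration; the keyword arguments are chosen to match `create_score` in the Langfuse Python SDK v3 (older SDK versions expose a similarly shaped `score` method), and the 0 to 1 range check assumes DeepEval-style normalized scores.

```python
from typing import Mapping, Protocol

class ScoreSink(Protocol):
    """Anything that can attach a named score to a trace (hypothetical interface)."""
    def create_score(self, *, trace_id: str, name: str, value: float, comment: str = "") -> None: ...

def push_scores(sink: ScoreSink, trace_id: str, scores: Mapping[str, float],
                source: str = "deepeval-ci") -> int:
    """Attach each metric score to the trace; return how many were sent."""
    # Validate everything first so a bad value never leaves a half-written trace.
    for name, value in scores.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}: score {value} is outside [0, 1]")
    for name, value in scores.items():
        sink.create_score(trace_id=trace_id, name=name, value=float(value), comment=source)
    return len(scores)

# Wiring sketch (not executed here; assumes Langfuse credentials in the environment
# and a DeepEval metric that has already been measured):
#   from langfuse import Langfuse
#   push_scores(Langfuse(), trace_id, {"answer_relevancy": metric.score})
```

Keeping the push behind one small function gives both teams a single place where the DeepEval-to-Langfuse contract is defined and tested.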

The rule of thumb: the CI quality gate is DeepEval; the production trace store and dashboards are Langfuse. Do not let either drift into the other’s lane.

Real-world examples

B2B SaaS platform company

The platform team mandates the split in internal golden-path documentation: feature teams write DeepEval tests for their agents (CI gate) and instrument with Langfuse for tracing. No team builds a bespoke evaluation pipeline, and every agent’s scores appear in the same Langfuse dashboards.

Media and publishing

An editorial-assistant agent is evaluated for factual accuracy and style. Deep, expensive accuracy checks run in CI via DeepEval; cheap, broad style-and-tone checks run continuously on production traffic via Langfuse. The two layers are complementary, not duplicative.

Fintech scale-up

As the team grows, the documented boundary prevents the classic failure where the platform team and a product team independently build evaluation tooling. One interface — DeepEval scores flowing into Langfuse — keeps a single source of truth.

5.4 - Leadership said 'use Microsoft's agent governance,' but we're multi-cloud and not on Azure. Is that even possible?

The pattern: Distinguish the product from the framework — and choose the cloud-agnostic one.

This is a common and consequential source of confusion. “Microsoft agent governance” refers to two different things from two different parts of Microsoft, and conflating them leads to an architecture that violates a multi-cloud or sovereignty requirement.

Microsoft Entra Agent ID and Microsoft Agent 365 are commercial SaaS offerings. They are directory- and platform-bound to Microsoft Entra and Microsoft 365. They are excellent if your organization lives in Microsoft 365 — but they are not cloud-agnostic.
The Microsoft Agent Governance Toolkit (AGT) is a separate, open-source (MIT-licensed) project: a set of governance libraries — policy engine, identity, runtime sandboxing, reliability, compliance — that install into your application and run anywhere: any cloud, on-premises, hybrid, air-gapped. It is framework-agnostic by design.

For a cloud-agnostic mandate, adopt AGT and decline Entra Agent ID and Agent 365. One caveat: some of AGT’s documentation tutorials default to Azure deployment examples. Use the framework-agnostic APIs, which carry no cloud dependency.

Real-world examples

Multi-cloud regulated bank

The bank runs workloads across two clouds plus on-premises, and forbids new single-cloud dependencies. AGT satisfies the governance mandate without violating the policy; Entra Agent ID would have introduced exactly the cloud lock-in the architecture board prohibits.

Government agency with data-sovereignty requirements

The agency must run agents in a sovereign or air-gapped environment where Microsoft 365 SaaS is not available. AGT, being self-hostable libraries, runs inside the sovereign boundary.

Hybrid healthcare network

Clinical systems remain on-premises while other workloads are in a public cloud. AGT provides one consistent governance layer across both — whereas a cloud-bound SaaS governance product would leave the on-premises clinical agents ungoverned.

The picture so far

Across the last two articles, the governance layer has addressed:

Capability	Question
Preventive action enforcement	5 — Policy engine
Agent attribution and trust	6 — Cryptographic identity
Execution containment	7 — Sandboxing
Cascading failure and emergency stop	8 — Reliability module
Tamper-evident accountability	9 — Audit and compliance evidence
Live content threats	10 — Runtime guardrails (fourth component)
Tool ownership boundaries	11 — DeepEval vs Langfuse split
Cloud-agnostic governance	12 — AGT vs Entra / Agent 365

The final article shows how to connect all three disciplines into a single system — and gives a sequenced adoption plan so you know what to build first.
