GitHub - DeadpxlStudio/ModelActionProtocol: MAP (Model Action Protocol) — Cryptographic provenance, self-healing critic, and state rollback for autonomous AI agents. 2.5k lines of TypeScript, 60+ tests, MIT.

Cryptographic provenance, self-healing critique, and state rollback for autonomous AI agents.

MCP gave Claude the hands. MAP gives Claude the receipt.

MAP v0.1 spec is frozen. Wire format, JCS canonicalization, and conformance fixtures are immutable at spec v0.1.0. Two reference implementations conform: TypeScript (@model-action-protocol/core, npm) and Python (model-action-protocol, PyPI). A ledger written in either language verifies byte-identical in the other — pinned by 6 frozen conformance fixtures.

Try it in 30 seconds

Paste your agent's ledger into the MAP Verifier and see what happened — every action, every critic verdict, every state diff, with chain integrity verified end-to-end.

Web verifier: verify.modelactionprotocol.org

CLI:

npx -p @model-action-protocol/cli map verify ./agent.ledger.json
# or pipe a ledger from your agent:
your-agent | jq '.ledger' | npx -p @model-action-protocol/cli map verify -

Don't have a ledger yet? Try the conformance sample — or a tampered version (can you spot the corruption?).

Repo layout

Workspace monorepo. Two reference implementations of the spec, plus the verifier wedge:

Path	What it is	Published as
`spec/`	Frozen v0.1 wire-format spec + 6 conformance fixtures	— (canonical)
`src/` (root)	TS reference impl — `MAP` class, ledger, critic, tool builders	`@model-action-protocol/core` (npm)
`python/`	Python reference impl — `Map` orchestrator, ledger, critic, reverser registry	`model-action-protocol` (PyPI)
`packages/cli/`	`map verify` CLI (and, soon, `map mcp wrap`)	`@model-action-protocol/cli` (npm)
`apps/verifier/`	Hosted web verifier (Next.js)	runs at `verify.modelactionprotocol.org`
`examples/tools-stripe/`	Reference tool integration	—
`scripts/generate-fixtures.ts`	Regenerates the v0.1 conformance fixtures (run once at freeze)	—

# install TS workspaces (core + cli + verifier)
npm install

# build core + cli
npm run build && npm run build:cli

# run the verifier locally on http://localhost:4040
npm run dev:verifier

The Problem

AI agents are entering Phase 3: autonomous execution. They schedule reasoning, call tools, execute multi-step processes, and verify their own results — all without a human in the loop.

But there's no liability shield.

Phase	Era	Who's Responsible?
Phase 1: Chatbots (Nov 2022)	Model generates answers from prompts	Human — they read and act on the output
Phase 2: Reasoning (Sep 2024)	Model reasons before answering, reduces errors	Human — still driving step by step
Phase 3: Agents (2025-2026)	Agent executes autonomously, verifies its own work	Nobody — the human is abstracted away

When an agent deletes a production database, sends the wrong email, or misconfigures infrastructure — who's accountable? How do you prove what happened? How do you undo it?

And when fleets of agents work simultaneously — who authorized what, who spawned whom, and which agent broke which thing?

MAP is the OS-level liability shield for autonomous agents.

What It Does

1. Cryptographic Provenance Ledger

Every agent action is logged to an append-only, SHA-256 hash-chained ledger:

Full state snapshots before and after every action
Tamper-evident — change one entry, all subsequent hashes break
Exportable as JSON for audit compliance
Chain verification in one call

2. Self-Healing Critic Loop

After every action, a fast critic model reviews the result:

PASS — Action is correct, continue
CORRECTED — Error detected, auto-fixed, both logged
FLAGGED — Dangerous action, execution halts for human review

Uses tiered model routing: expensive model executes, cheap model critiques.

3. Reversal Schema (COMPENSATE / RESTORE / ESCALATE)

Every MAP-compliant tool declares how its actions can be reversed:

Strategy	When	How Rollback Works	Limitations
COMPENSATE	Systems that don't allow hard deletes (ERPs, accounting)	Dispatch a compensating action (e.g., credit memo for duplicate invoice)	Matches how regulated industries already work (banks post reversing entries, ERPs issue credit memos). The strongest strategy.
RESTORE	CRUD APIs with GET + PUT	Captures state before write via `tool.capture()`, pushes original state back on rollback via `tool.restore()`	Concurrent modification risk: if another process modifies the same record between action and rollback, restore blindly overwrites their changes (last-write-wins). Best for single-writer environments.
ESCALATE	Irreversible actions (wire transfers, emails, deploys)	Intercepts before execution — the tool never runs without human approval	Not a rollback strategy. This is a prevention gate. The correct answer for actions that genuinely cannot be undone.

What rollback can't do:

Side effects that left the system. An email was sent and read. A Slack message was delivered. No rollback fixes that — ESCALATE is the right strategy for these actions.
Distributed state across multiple services. If an agent updated Stripe AND Salesforce AND sent a notification, rolling back one without the others leaves inconsistent state. Coordinated multi-service rollback is a v0.2 problem.
Time-sensitive operations. A stock was sold at $100. By the time rollback runs, the price is $87. COMPENSATE can issue a reversing trade, but the economic outcome is different.

MAP makes rollback possible and structured for the majority of agent actions that are API CRUD operations. For the rest, ESCALATE gates them before execution.

4. State Rollback

Revert to any prior point in the ledger:

The rollback itself is logged to the provenance chain
Rollback doesn't delete history — it preserves the full chain and adds a revert entry
For RESTORE tools, rollback calls tool.restore() against the external system — not just in-memory snapshot restoration

5. Multi-Agent Provenance (KYA — Know Your Agent)

When fleets of agents work simultaneously, MAP tracks everything:

Agent Identity — every agent has a cryptographic identity:

{
  agentId: string;
  ownerId: string;           // org/user that owns this agent
  ownerDomain: string;       // e.g., "customer.com"
  capabilities: string[];    // what this agent can do
  credentialHash: string;    // SHA-256 of auth credential
}

Authorization Grants — cross-boundary trust:

{
  grantor: AgentIdentity;    // Agent A (requesting)
  grantee: AgentIdentity;    // Agent B (executing)
  scope: string[];           // specific actions authorized
  constraints: {};           // e.g., max amount, time window
  expiresAt?: string;        // when this grant expires
  parentGrantId?: string;    // delegation chain
  hash: string;              // tamper-evident
}

Ephemeral Agent Lifecycle — spawn tree tracking:

{
  agentId: string;
  parentAgentId?: string;    // who spawned this agent
  spawnedAt: string;
  terminatedAt?: string;     // null if still alive
  purpose: string;           // why this agent exists
  isEphemeral: boolean;      // auto-terminate when done
  childAgentIds: string[];   // sub-agents spawned
}

Every ledger entry carries agentId, parentEntryId, and lineage[] — the full chain from root agent to the action.

6. Human-on-the-Loop Approval

Corrections can require human sign-off before proceeding:

pending → action awaits review
approved → human confirmed, logged to chain
rejected → human rejected, rollback required

Approval is a separate concern from entry status — clean separation.

MCP + MAP: The Complete Picture

	MCP	MAP
Direction	Input — what the agent can see	Output — what the agent did
Purpose	Capability	Accountability
Analogy	Git (version control)	GitHub (collaboration + audit)

MCP defines how agents read the world. MAP defines how agents safely write to it. Together they complete the picture for enterprise-grade autonomous agents.

Installation

TypeScript

npm install @model-action-protocol/core

Requirements: Node.js 20+, TypeScript 5.7+. Current version: 0.2.0 (breaking change from 0.1.x — JCS canonicalization adopted; see CHANGELOG).

Python

pip install model-action-protocol
# or with extras:
pip install "model-action-protocol[anthropic,sqlite,postgres,fastapi]"

Requirements: Python 3.10+. Current version: 0.1.0. See python/README.md and python/DESIGN.md. Walkthrough: python/examples/quickstart.ipynb. HTTP demo: python/examples/fastapi_app/.

Specification

The wire format is defined in spec/SPEC.md. Both implementations conform. Conformance fixtures live at spec/fixtures/v0.1/ — immutable, version-bumped only via a new spec release.

Persistence (Optional)

By default, the ledger lives in memory. For production, MAP ships two pluggable storage adapters. Both are optional peer dependencies — install whichever you need.

PostgreSQL

import { MAP } from '@model-action-protocol/core';
import { PostgresLedgerStore } from '@model-action-protocol/core/postgres';

const store = new PostgresLedgerStore({
  connectionString: process.env.DATABASE_URL,
  tableName: 'ledger_entries', // optional, default: 'ledger_entries'
  sessionId: 'default',        // optional, for multi-tenant isolation
});

const map = await MAP.load({ ...config, store }, critic);

Connection pooling, JSONB entries, concurrent-write retry logic. Contributed by @mel-cell.

SQLite

npm install better-sqlite3

import { MAP } from '@model-action-protocol/core';
import { SQLiteLedgerStore } from '@model-action-protocol/core/sqlite';

const store = new SQLiteLedgerStore('./ledger.db');

const map = await MAP.load({ ...config, store }, critic);

WAL mode, prepared-statement caching, atomic transactions. Good for single-node deployments.

Use MAP.load() instead of new MAP() when using a persistent store — it reads any existing entries on startup. new MAP() skips that step.

Quick Start

import { MAP, createRuleCritic } from '@model-action-protocol/core';
import { z } from 'zod';

// Your state
const database: Record<string, any> = {
  acme:   { id: "acme", name: "Acme Corp", price: 500 },
  globex: { id: "globex", name: "Globex Inc", price: 500 },
};

// 1. Create a critic
const critic = createRuleCritic([
  {
    name: 'no-zero-prices',
    check: ({ stateAfter }) => {
      const state = stateAfter as Record<string, any>;
      const bad = Object.values(state).find((c) => c.price === 0);
      if (bad) {
        return {
          verdict: 'CORRECTED',
          reason: `${bad.name} price was set to $0`,
          correction: { tool: 'updatePrice', input: { customerId: bad.id, price: 299 } },
        };
      }
      return null;
    },
  },
]);

// 2. Initialize MAP
const map = new MAP(
  { executor: 'claude-sonnet-4.6', critic: 'claude-haiku-4.5' },
  critic
);

// 3. Register tools
map.registerTool(
  'updatePrice', 'Update a customer price',
  z.object({ customerId: z.string(), price: z.number() }),
  async ({ customerId, price }) => {
    database[customerId].price = price;
    return { updated: customerId, newPrice: price };
  }
);

// 4. Connect state
map.connectState(
  () => JSON.parse(JSON.stringify(database)),
  (state) => Object.assign(database, state),
);

// 5. Execute with full provenance
await map.execute('Migrate pricing', 'updatePrice', { customerId: 'acme', price: 299 });

// 6. Rollback if needed
const ledger = map.getLedger();
await map.rollbackTo(ledger[0].id);

// 7. Export for audit
const audit = map.exportLedger();
// → { protocol: 'map', version: '0.1.0', entries: [...], stats: {...} }

// 8. Verify chain integrity
map.verifyIntegrity(); // → { valid: true }

Python — same scenario

from map import Action, CriticResult, Map, rule_critic, verify_chain

# Your state — a tiny "orders database" stand-in for a real backend.
ORDERS: dict[str, dict] = {}

def place_order(item_id: str, quantity: int) -> dict:
    order_id = f"O-{item_id}-{quantity}-{len(ORDERS)}"
    record = {"orderId": order_id, "item_id": item_id, "quantity": quantity, "status": "open"}
    ORDERS[order_id] = record
    return record

def cancel_order(action: Action, output) -> dict:
    ORDERS[output["orderId"]]["status"] = "cancelled"
    return {"orderId": output["orderId"], "cancelled": True}

# 1. Critic — flag any order over 100 units
def disallow_huge(action, sb, sa):
    if action.input.get("quantity", 0) > 100:
        return CriticResult(verdict="FLAGGED", reason="quantity over 100 needs approval")
    return None

# 2. Wire up MAP with critic + reverser
m = Map()
m.set_critic(rule_critic([disallow_huge]))
decorated = m.reversible(reverser=cancel_order)(place_order)

# 3. Execute — every call is a verifiable ledger entry
output = decorated(item_id="SKU-A", quantity=2)
entry = m.execute(Action(tool="place_order", input={"item_id": "SKU-A", "quantity": 2}, output=output))
# → entry.critic.verdict == "PASS"

# 4. Roll back — reverser fires, world state flips
m.rollback_to(entry.id)
assert ORDERS["O-SKU-A-2-0"]["status"] == "cancelled"

# 5. Audit export + chain verification
chain = [e.model_dump(by_alias=True, exclude_none=True) for e in m.get_entries()]
assert verify_chain(chain) == {"valid": True}

For a full narrated walkthrough — including LearningEngine, persistent stores, and the Anthropic SDK integration — see python/examples/quickstart.ipynb. For an HTTP service template, see python/examples/fastapi_app/.

Cross-language conformance

MAP's claim isn't "we have a TypeScript library and a Python library." It's that the wire format is the spec, both implementations conform, and ledgers are byte-identical across them. The proof is in spec/SPEC.md §6.4 — a worked example with a hand-computed SHA-256 that an automated test on each side asserts equality against.

// TypeScript — Open Source/src/snapshot.ts
import { computeEntryHash, sha256, serializeState } from "@model-action-protocol/core";

const stateHash = sha256(serializeState(null));
const entryHash = computeEntryHash(
  0,
  { tool: "ping", input: {}, output: "pong" },
  stateHash, stateHash,
  "0".repeat(64),
  { verdict: "PASS", reason: "ok" }
);
// → "25d29bc25a183ebdb29b70b6a03ed2ad8d31033d1fb6347f656b21d7e9efb650"

# Python — same inputs, same output, byte-identical
from map import GENESIS_HASH, compute_entry_hash, state_hash

null_hash = state_hash(None)
entry_hash = compute_entry_hash(
    sequence=0,
    action={"tool": "ping", "input": {}, "output": "pong"},
    state_before=null_hash, state_after=null_hash,
    parent_hash=GENESIS_HASH,
    critic={"verdict": "PASS", "reason": "ok"},
)
# → "25d29bc25a183ebdb29b70b6a03ed2ad8d31033d1fb6347f656b21d7e9efb650"

Six frozen v0.1 fixtures (in spec/fixtures/v0.1/) cover the cases that historically break cross-language hash protocols: unicode (NFC vs NFD), deep nesting, empty payloads, large payloads, integer-valued floats (RFC 8785 §3.2.2.3 — JS's JSON.stringify(1.0) is "1", Python's default json.dumps(1.0) is "1.0"; we bridge it). Both impls verify all six, and a ledger written in either language verifies in the other.

The Paved Path: Pre-Built Tool Packages

Instead of writing reversal schemas from scratch, use pre-built MAP-compliant tools. The first example ships in this repo at examples/tools-stripe — drop it directly into your project while the npm packages get published:

// Example pattern — see examples/tools-stripe in this repo for the full implementation
import { stripeTools } from './tools-stripe';
stripeTools.forEach(tool => map.addTool(tool));

Build tools with typed reversal strategies:

import { defineRestoreTool, defineCompensateTool, defineEscalateTool } from '@model-action-protocol/core';

// RESTORE: auto-capture state before write, restore on rollback
const updateCustomer = defineRestoreTool({
  name: 'updateCustomer', description: 'Update customer record',
  inputSchema: z.object({ id: z.string(), email: z.string() }),
  execute: async (input) => api.updateCustomer(input),
  capture: async (input) => api.getCustomer(input.id),
  restore: async (captured) => api.updateCustomer(captured),
});

// COMPENSATE: map forward action to compensating action
const chargeCard = defineCompensateTool({
  name: 'chargeCard', description: 'Charge a credit card',
  inputSchema: z.object({ amount: z.number() }),
  execute: async (input) => stripe.charges.create(input),
  compensate: async (input, output) => stripe.refunds.create({ charge: output.id }),
});

// ESCALATE: require human approval for irreversible actions
const wireTransfer = defineEscalateTool({
  name: 'wireTransfer', description: 'Send a wire transfer',
  inputSchema: z.object({ amount: z.number(), to: z.string() }),
  execute: async (input) => bank.sendWire(input),
  approver: 'treasury@company.com',
});

Planned tool packages:

@model-action-protocol/tools-stripe — payments, refunds, subscriptions (example included)
@model-action-protocol/tools-salesforce — CRM operations
@model-action-protocol/tools-netsuite — ERP/GL operations
@model-action-protocol/tools-hubspot — marketing automation
@model-action-protocol/tools-aws — infrastructure operations

Using an LLM Critic (Production)

import { MAP, createLLMCritic } from '@model-action-protocol/core';
import { generateText } from 'ai';

const critic = createLLMCritic({
  model: 'claude-haiku-4.5',
  generateText,
});

const map = new MAP(
  { executor: 'claude-sonnet-4.6', critic: 'claude-haiku-4.5' },
  critic
);

Learning Engine — The Ledger IS the Training Data

Every CORRECTED verdict, every FLAGGED action, every human Approve/Reject decision is permanently logged with full context. Over time, this becomes a dataset of "mistakes this organization's agents make" and "how humans want them corrected."

Level 1: Rule Extraction

After N identical corrections, the system proposes a new deterministic rule. No LLM needed for that check anymore — it becomes a microsecond gate.

import { LearningEngine } from '@model-action-protocol/core';

const engine = new LearningEngine();

// Analyze the ledger for repeated correction patterns
const patterns = engine.analyzePatterns(map.getLedger());
// → [{ tool: "reclassifyTransaction", count: 5, summary: "CORRECTED: SOX violation..." }]

// Propose rules from patterns observed 3+ times
const proposals = engine.proposeRules(map.getLedger(), 3);
// → [{ id: "rule_corrected:reclassify...", description: "Auto-proposed: ...", approved: false }]

// Human reviews and approves the rule
proposals.forEach(r => engine.addProposedRule(r));
engine.approveRule(proposals[0].id);

// Use learned rules as the fast tier in the tiered critic
const learnedCritic = engine.toRuleCritic();

// Plug into tiered critic — learned rules run first (microseconds),
// LLM only fires for patterns the rules haven't seen yet
import { createTieredCritic } from '@model-action-protocol/core';

const tieredCritic = createTieredCritic({
  low: learnedCritic,                                                  // μs — learned rules
  medium: createLLMCritic({ model: 'claude-haiku-4.5', generateText }), // 200ms
  high: createLLMCritic({ model: 'claude-sonnet-4.6', generateText }),  // 1-2s
});

The system gets cheaper and faster over time. Every correction that becomes a rule is one fewer LLM call.

Level 2: Critic Fine-Tuning

Export the corpus of corrections with human decisions as structured training data:

const trainingData = engine.exportFineTuningData(map.getLedger());
// → [{
//   input: { action, stateBefore, stateAfter },
//   output: { verdict: "CORRECTED", reason: "SOX violation...", correction: {...} },
//   humanApproval: "approved"
// }]

Fine-tune the Critic model on your organization's specific error patterns. The Critic doesn't just know general compliance — it knows YOUR compliance.

Level 3: Agent Self-Improvement

Give agents their own correction history so they stop repeating mistakes:

const memory = engine.exportAgentMemory(map.getLedger(), 'agent-compliance-checker');
// → [{
//   tool: "closeAccount",
//   whatHappened: "Called closeAccount with { accountId: '1200-004' }",
//   verdict: "FLAGGED",
//   lesson: "This action was FLAGGED and required human review: regulatory hold violation.
//            Do not attempt this without explicit approval."
// }]

// Inject into agent's system prompt as learned context
const agentPrompt = `
  You are a compliance agent. Here are lessons from your past actions:
  ${memory.map(m => `- ${m.lesson}`).join('\n')}
`;

Key design principle: The learning engine reads from the ledger, never modifies it. Proposed rules require human approval before activating. The human stays on the loop even for the learning system.

Data Privacy

All learning is local to your organization. A trust protocol cannot undermine trust.

Level	Where Data Lives	Shared Across Orgs?	Used for Base Model Training?
Rule extraction	Your environment	No	No
Critic fine-tuning	Your private fine-tuned model	No	No
Agent memory	Your agent's prompt context	No	No

Level 2 fine-tuning is explicitly opt-in — you export the data and fine-tune on your terms
Fine-tuned models are scoped to your organization — never cross-pollinated
MAP does not transmit, aggregate, or share learning data between organizations

Real-Time Events

map.on((event) => {
  switch (event.type) {
    case 'action:start':       // Before tool execution
    case 'action:complete':    // After execution + logging
    case 'critic:verdict':     // After critic review
    case 'correction:applied': // After auto-correction
    case 'flagged':            // Dangerous action detected
    case 'rollback:start':     // Before rollback
    case 'rollback:complete':  // After rollback
    case 'session:complete':   // Sequence finished
    case 'agent:spawned':      // New agent in the fleet
    case 'agent:terminated':   // Agent finished its work
    case 'authorization:granted': // KYA grant issued
    case 'authorization:revoked': // KYA grant revoked
    case 'error':              // Unrecoverable error
  }
});

API Reference

`new MAP(config, critic)`

Config	Type	Default	Description
`executor`	`string`	required	Model for the executor agent
`critic`	`string`	required	Model for the critic (cheap, fast)
`maxActions`	`number`	`50`	Max actions before force-stop
`autoCorrect`	`boolean`	`true`	Auto-apply CORRECTED fixes
`pauseOnFlag`	`boolean`	`true`	Halt execution on FLAGGED
`serializeState`	`fn`	`JSON.stringify`	Custom state serializer
`tags`	`string[]`	`[]`	AI Gateway cost attribution tags

Methods

Method	Description
`registerTool(name, desc, schema, fn)`	Register a tool with Zod schema
`addTool(tool)`	Register a pre-built MAPTool
`connectState(getState, setState)`	Connect to your environment
`execute(goal, tool, input)`	Execute one action with full provenance
`run(goal, actions[])`	Execute a sequence
`await rollbackTo(entryId)`	Revert to a specific point (async — calls tool.restore() for RESTORE tools)
`await rollbackToSafe()`	Revert to last known good state
`getLedger()`	Get all entries
`exportLedger()`	Export audit-ready JSON
`verifyIntegrity()`	Verify hash chain
`getStats()`	Session statistics
`on(handler)`	Subscribe to events

Ledger Entry Format

{
  id: string;                    // UUID
  sequence: number;               // Position in chain
  timestamp: string;              // ISO 8601
  action: {
    tool: string;
    input: Record<string, unknown>;
    output: unknown;
    reversalStrategy?: "COMPENSATE" | "RESTORE" | "ESCALATE";
  };
  stateBefore: string;            // SHA-256 hash
  stateAfter: string;             // SHA-256 hash
  snapshots: { before, after };   // Full serialized state
  parentHash: string;             // Previous entry's hash
  hash: string;                   // SHA-256 of this entry
  critic: {
    verdict: "PASS" | "CORRECTED" | "FLAGGED";
    reason: string;
    correction?: { tool, input };
    cost?: { inputTokens, outputTokens, model, latencyMs, costUsd };
  };
  status: "ACTIVE" | "ROLLED_BACK";
  approval?: "pending" | "approved" | "rejected";
  // Multi-agent provenance
  agentId?: string;               // Which agent acted
  parentEntryId?: string;         // Upstream agent's entry
  lineage?: string[];             // Root → current agent chain
  stateVersion?: number;          // Optimistic concurrency
}

Architecture

Human Supervisor (one person, many agents)
    │
    ▼
┌──────────────────────────────────────────────────┐
│               @model-action-protocol/core                 │
│                                                  │
│  ┌──────────┐  ┌────────┐  ┌────────────────┐   │
│  │ Executor │→ │ Critic │→ │    Ledger      │   │
│  │ Harness  │  │ (fast) │  │ (SHA-256 chain)│   │
│  └──────────┘  └────────┘  └────────────────┘   │
│       │                          │               │
│  ┌──────────┐  ┌────────────────┐│               │
│  │ Rollback │  │ KYA (Know Your ││               │
│  │ Engine   │  │    Agent)      ││               │
│  └──────────┘  └────────────────┘│               │
│                                  │               │
│  ┌──────────────────────────────┐│               │
│  │  Agent Lifecycle Tracking    ││               │
│  │  (spawn trees, ephemeral)    ││               │
│  └──────────────────────────────┘│               │
└──────────────────────────────────────────────────┘
    │              │              │
    ▼              ▼              ▼
  Agent A      Agent B       Agent C
 (Stripe)    (Salesforce)   (NetSuite)

Design Principles

Principle	How MAP Applies It
Messages as state	The ledger IS the execution state
Errors as feedback	Critic failures feed back, never crash
Schema-driven tools	Zod schemas validate before execution
Tiered model routing	Expensive execution + cheap critique
Append-only history	Rollback adds a revert entry, never deletes
Sub-agents as tool calls	Agent spawns tracked with full lineage

The MAP Protocol

MAP (Model Action Protocol) is an open standard for agent action provenance.

MCP standardized how agents use tools (inputs). MAP standardizes how agents prove what they did (outputs).

The strategy:

Open-source the protocol → every agent framework adopts it
Hand it to regulatory agencies → system of record for agent provenance
Commoditize the trust layer → build native zero-latency rollback into agent frameworks

Testing

# TypeScript (146 tests)
npm test
npm run test:fixtures                    # spec/fixtures/v0.1/ conformance
npm run test:python-output-conformance   # TS verifies Python-written ledgers

# Python (108 tests + 3 skipped on missing env vars)
cd python
pip install -e ".[dev]"
pytest                                   # everything except gated suites
pytest -m postgres                       # requires DB_HOST
pytest -m live_api                       # requires ANTHROPIC_API_KEY

254 tests across both implementations cover: ledger chaining and tamper detection, hash-chain verification, critic integration (PASS/CORRECTED/FLAGGED), auto-correction, RESTORE/COMPENSATE/ESCALATE reversal lifecycle, atomic stop-on-failure rollback semantics, audit export, event emission, sequence execution, tiered critic routing, custom risk classifiers, LearningEngine pattern fingerprints, tool builders / decorators, JCS canonicalization edge cases (unicode NFC vs NFD, integer-valued floats, deep nesting, empty/large payloads), cross-language conformance both directions, and SDK integration with mocked + live Anthropic clients.

Contributing

See CONTRIBUTING.md for guidelines on how to contribute to this project.

License

MIT — by deadpxl