ACP
Agent Control Protocol
An open protocol for AI agents to operate existing application user interfaces.
The Missing Protocol
| Protocol | Origin | Connects |
|---|---|---|
| MCP | Anthropic | LLM ↔ Data / Tools |
| A2A | Google | Agent ↔ Agent |
| AG-UI | CopilotKit | Agent → Frontend streaming |
| A2UI | Google | Agent → Generated UI |
| ACP | | Agent ↔ Existing Application UI |
Existing protocols let agents access data, coordinate with other agents, stream events to frontends, and generate new UI components. None of them allow an agent to operate an existing application's interface. ACP fills this gap.
Three Steps
Step 1
Describe
Your app sends a manifest describing its screens, fields, actions, and modals. This is the agent's map of the interface.
Step 2
Converse
The user sends natural language. The agent interprets intent using the manifest, knowing exactly what fields and actions are available.
Step 3
Execute
The agent sends commands such as `set_field`, `click`, and `navigate`. The SDK executes them against the live UI and reports results back.
See It In Action
Three agents (Gemini, DeepSeek, Haiku) completing a full task in parallel using ACP.
1. Application sends manifest
```json
{
  "type": "manifest",
  "app": "contact-portal",
  "currentScreen": "contact",
  "screens": {
    "contact": {
      "id": "contact",
      "label": "Contact Form",
      "fields": [
        { "id": "name", "type": "text", "label": "Full Name", "required": true },
        { "id": "email", "type": "email", "label": "Email", "required": true },
        { "id": "message", "type": "textarea", "label": "Message" }
      ],
      "actions": [
        { "id": "submit", "label": "Send Message" }
      ]
    }
  }
}
```
2. Agent responds with commands
```json
{
  "type": "command",
  "seq": 1,
  "actions": [
    { "do": "set_field", "field": "name", "value": "Alice Park" },
    { "do": "set_field", "field": "email", "value": "[email protected]" },
    { "do": "set_field", "field": "message", "value": "Hello, I need help resetting my account." },
    { "do": "click", "action": "submit" }
  ]
}
```
3. SDK reports results
```json
{
  "type": "result",
  "seq": 1,
  "results": [
    { "index": 0, "success": true },
    { "index": 1, "success": true },
    { "index": 2, "success": true },
    { "index": 3, "success": true }
  ]
}
```
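To make the exchange above concrete, here is a sketch of how an SDK could execute a command batch and build the per-action result message. The names (`executeBatch`, `UIAdapter`) are illustrative assumptions, not part of the ACP spec; a real SDK would bind the adapter to its platform's widget layer.

```typescript
// Message shapes mirroring the protocol examples above.
type Action =
  | { do: "set_field"; field: string; value: string }
  | { do: "click"; action: string }
  | { do: "navigate"; screen: string };

interface Command { type: "command"; seq: number; actions: Action[] }
interface ActionResult { index: number; success: boolean; error?: string }
interface Result { type: "result"; seq: number; results: ActionResult[] }

// Hypothetical per-platform adapter the SDK would supply (web, Flutter, ...).
interface UIAdapter {
  setField(field: string, value: string): void;
  click(action: string): void;
  navigate(screen: string): void;
}

// Execute each action in order; report success or failure per index.
function executeBatch(cmd: Command, ui: UIAdapter): Result {
  const results: ActionResult[] = cmd.actions.map((a, index) => {
    try {
      switch (a.do) {
        case "set_field": ui.setField(a.field, a.value); break;
        case "click": ui.click(a.action); break;
        case "navigate": ui.navigate(a.screen); break;
      }
      return { index, success: true };
    } catch (e) {
      return { index, success: false, error: String(e) };
    }
  });
  return { type: "result", seq: cmd.seq, results };
}
```

Because every action reports individually, the agent can see exactly which step of a batch failed and retry or re-plan from that point.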
Protocol At A Glance
8 UI Actions
`navigate`, `set_field`, `clear`, `click`, `show_toast`, `ask_confirm`, `open_modal`, `close_modal`
15 Field Types
`text`, `number`, `currency`, `date`, `datetime`, `email`, `phone`, `masked`, `select`, `autocomplete`, `checkbox`, `radio`, `textarea`, `file`, `hidden`
Manifest Structure
Screens, fields, actions, modals -- everything the agent needs to understand the application's UI and its current state.
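The manifest's structure can be sketched as TypeScript types inferred from the example above. These shapes (and the `requiredFields` helper) are illustrative, not normative; consult the spec for the full schema, including modals.

```typescript
// Field types as listed in the protocol summary above.
type FieldType =
  | "text" | "number" | "currency" | "date" | "datetime" | "email"
  | "phone" | "masked" | "select" | "autocomplete" | "checkbox"
  | "radio" | "textarea" | "file" | "hidden";

interface FieldSpec {
  id: string;
  type: FieldType;
  label: string;
  required?: boolean;
}

interface ActionSpec { id: string; label: string }

interface ScreenSpec {
  id: string;
  label: string;
  fields: FieldSpec[];
  actions: ActionSpec[];
}

interface Manifest {
  type: "manifest";
  app: string;
  currentScreen: string;
  screens: Record<string, ScreenSpec>;
}

// Illustrative helper: which fields must be filled on the current screen?
function requiredFields(m: Manifest): string[] {
  return m.screens[m.currentScreen].fields
    .filter(f => f.required)
    .map(f => f.id);
}
```

Typed shapes like these are what let the agent plan with certainty: it can check that a field exists and has the right type before ever sending a command.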
Command-Result Loop
The agent sends commands with sequence IDs. The SDK reports success or failure per action, enabling reliable multi-step workflows.
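The sequence-ID correlation described above can be sketched from the agent side as a small dispatcher that matches each incoming result to the command that produced it. `CommandLoop` and its transport callback are hypothetical names, assumed for illustration.

```typescript
interface Result {
  type: "result";
  seq: number;
  results: { index: number; success: boolean }[];
}

// Hypothetical agent-side helper: correlates results to commands by seq.
class CommandLoop {
  private seq = 0;
  private pending = new Map<number, (r: Result) => void>();

  // `send` is the outbound transport (e.g. a WebSocket send), assumed here.
  constructor(private send: (msg: object) => void) {}

  // Send a batch of actions; resolves when the matching result arrives.
  dispatch(actions: object[]): Promise<Result> {
    const seq = ++this.seq;
    this.send({ type: "command", seq, actions });
    return new Promise(resolve => this.pending.set(seq, resolve));
  }

  // Feed incoming result messages from the SDK here.
  onMessage(msg: Result): void {
    const resolve = this.pending.get(msg.seq);
    if (resolve) {
      this.pending.delete(msg.seq);
      resolve(msg);
    }
  }
}
```

Keying pending commands by `seq` means the agent can have several batches in flight and still attribute each success or failure to the right step of its plan.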
Structured, Not Scraped
- **Vision:** Screenshot analysis is slow, expensive in tokens, and fragile across resolutions and themes. A single UI redesign breaks everything.
- **DOM scraping:** Couples the agent to implementation details that change with every deploy, and does not work at all on native mobile or desktop.
- **RPA:** Heavyweight, enterprise-only, batch-oriented. Not designed for real-time conversational interaction.
- **ACP:** The application declares its own structure. The agent operates with certainty, not heuristics, and works on any platform (web, mobile, desktop) because the SDK mediates between the protocol and the native UI layer.
Implementations
| Implementation | Type | Platform | Status |
|---|---|---|---|
| Vocall Engine by Primoia | Server | Go | Production |
| vocall_sdk by Primoia | SDK | Flutter | Production |
| vocall-react by Primoia | SDK | React / Next.js | Production |
The spec and conformance tests are all you need to build your own implementation. List yours here.