Five Agents, One Browser: Werewolf on Quack + DuckDB


In this post I'll show a working demo of multi-agent information asymmetry enforced at the database layer. A small fleet of language models plays Werewolf in your browser tab. Each agent holds a private DuckDB-WASM database inside its own Web Worker. A gateway worker is the only thing that can read across agents, and only with the right token.

In local-model mode, nothing leaves the tab and there is no server-side inference. If you paste an Anthropic, OpenAI, or OpenRouter key, the browser sends prompts directly to that provider with your key. Either way, the page loads, agents start plotting, and you can watch the federation log fill up with the SQL the gateway sent.

The architectural claim is one sentence: information asymmetry can be enforced by the schema, not by the application code. Same query, different result, depending on what the database knows about itself.

Agents play Werewolf

Pick a setup and a model, then hit play. The local option runs Qwen3.5-2B-q4f16_1-MLC on your GPU through WebGPU. That is the 2B-parameter Qwen3.5 model at 4-bit weight quantization with 16-bit activations, about 1.2 GB on disk and cached in IndexedDB after the first load. If you have an Anthropic, OpenAI, or OpenRouter key, paste it and the agents use a hosted model instead.

On phones: the in-browser model is desktop-only in this demo. The 1.2 GB WebGPU model reliably trips mobile-tab memory budgets and reloads the page mid-load, so the local option is disabled there. DuckDB-WASM, the gateway, and the per-player workers all run fine on mobile. Pick a hosted-API option on a phone and the rest of the architecture still works end to end.

The scopes

| Scope | What it can do |
| --- | --- |
| own-db-full | A player worker on its own DB. Any table, any column, read and write. |
| gateway-public-only | Gateway pulling public-safe columns during play. Can read intents.public_text. Cannot read intents.rationale. |
| wolf-team-read | Gateway pulling wolf channel data. Authorized at the gateway, then row-filtered locally. Non-wolf workers return zero rows. |
| gateway-post-game | Gateway pulling everything after the game finishes. |
| main-admin | Orchestrator/bootstrap path for seeding roles, gateway state, and action rows. |

When the gateway federates a query, it mints a fresh per-node token, sends the same SQL fragment to every attached player in parallel, ingests the returned rows into a local temp table, and runs a final SQL statement against that temp table. Open the Federation log panel during or after a game and you'll see every call: the per-node SQL it sent, the temp-table DDL, the gateway-local final SQL, the row count, and the elapsed time.
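The fan-out described above can be sketched in a few lines of TypeScript. This is an illustration under stated assumptions, not the demo's code: PlayerNode, NodeRequest, federate, mintToken, and finalize are hypothetical names, the in-memory array stands in for the gateway's temp table, and the finalize callback stands in for the gateway-local final SQL.

```typescript
type Row = Record<string, unknown>;

// Hypothetical message shapes; the demo's real postMessage format is not shown in the post.
interface NodeRequest {
  token: string;
  sql: string;
}

interface PlayerNode {
  id: string;
  handle(req: NodeRequest): Promise<Row[]>;
}

interface FederationLogEntry {
  name: string;
  perNodeSql: string;
  rowCount: number; // assumption: counts rows returned by the final step
  elapsedMs: number;
}

async function federate(
  name: string,
  nodes: PlayerNode[],
  perNodeSql: string,
  mintToken: (nodeId: string) => string,
  finalize: (ingested: Row[]) => Row[],
  log: FederationLogEntry[],
): Promise<Row[]> {
  const started = Date.now();

  // Same SQL fragment to every node, in parallel, each with its own fresh token.
  const perNode = await Promise.all(
    nodes.map((node) => node.handle({ token: mintToken(node.id), sql: perNodeSql })),
  );

  // Stand-in for ingesting the returned rows into a gateway-local temp table.
  const ingested = ([] as Row[]).concat(...perNode);

  // Stand-in for the final SQL that runs against that temp table.
  const result = finalize(ingested);

  log.push({ name, perNodeSql, rowCount: result.length, elapsedMs: Date.now() - started });
  return result;
}
```

A node that returns no rows contributes nothing to the ingested set, so the gateway never needs to know why a node stayed silent.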

The boundary lives in the columns

Every agent turn writes one row to that agent's intents table:

INSERT INTO intents (round, agent_id, action, target, rationale, public_text)
VALUES (...);

rationale is the model's private reasoning. public_text is what the agent says out loud, or NULL for a night action. Both come from the same LLM call. The model produces them together in one JSON object.
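As a rough sketch of that split, the function below parses one model response and produces the values for the INSERT. It assumes the JSON object uses the same field names as the table columns, which the post does not confirm; AgentTurn, parseAgentTurn, and toInsertParams are hypothetical names.

```typescript
// Hypothetical shape of one agent turn; field names mirror the intents columns.
interface AgentTurn {
  action: string;
  target: string | null;
  rationale: string; // private reasoning
  public_text: string | null; // spoken text, or null for a night action
}

// Parse the model's raw JSON output and validate the fields the row needs.
function parseAgentTurn(raw: string): AgentTurn {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("model output is not a JSON object");
  }
  const o = data as Record<string, unknown>;
  const action = o["action"];
  const rationale = o["rationale"];
  if (typeof action !== "string" || typeof rationale !== "string") {
    throw new Error("model output is missing action or rationale");
  }
  const target = o["target"];
  const publicText = o["public_text"];
  return {
    action,
    target: typeof target === "string" ? target : null,
    rationale,
    public_text: typeof publicText === "string" ? publicText : null,
  };
}

// Values in the same order as the INSERT column list:
// round, agent_id, action, target, rationale, public_text.
function toInsertParams(
  round: number,
  agentId: string,
  turn: AgentTurn,
): Array<string | number | null> {
  return [round, agentId, turn.action, turn.target, turn.rationale, turn.public_text];
}
```

Both columns land in the same row from the same parsed object, so nothing downstream has to reassemble them.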

During play, the gateway pulls only public_text. Its token does not permit the rationale column. Each player worker parses the incoming SELECT and rejects any reference to a column the scope does not allow, including references in filters and ordering clauses. A gateway-public-only token asking for rationale, or trying to filter by it, gets back QK_COLUMN_FORBIDDEN. The Try denied scope button after a game finishes will fire that denial and log it in the federation panel.
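To make the column check concrete, here is a deliberately naive sketch. The demo's workers parse the SELECT; this version only scans identifier tokens, which is enough to show why a forbidden column is rejected even when it appears only in a WHERE or ORDER BY clause. The allow-list contents and the names INTENTS_COLUMNS, PUBLIC_ONLY_ALLOWED, and checkColumns are assumptions for illustration.

```typescript
// Columns of the intents table: the INSERT column list plus decided_at.
const INTENTS_COLUMNS = [
  "round", "agent_id", "action", "target", "rationale", "public_text", "decided_at",
];

// Assumed allow-list for gateway-public-only: everything except rationale.
const PUBLIC_ONLY_ALLOWED = new Set([
  "round", "agent_id", "action", "target", "public_text", "decided_at",
]);

type CheckResult =
  | { ok: true }
  | { ok: false; code: "QK_COLUMN_FORBIDDEN"; column: string };

function checkColumns(sql: string, allowed: Set<string>): CheckResult {
  // Blank out string literals so a value like 'rationale' is not read as a column.
  const withoutStrings = sql.replace(/'(?:[^']|'')*'/g, "''");

  // Conservative rule: any * could expand to a hidden column, so reject it
  // whenever the scope hides at least one column.
  if (withoutStrings.includes("*")) {
    const hidden = INTENTS_COLUMNS.find((c) => !allowed.has(c));
    if (hidden !== undefined) {
      return { ok: false, code: "QK_COLUMN_FORBIDDEN", column: hidden };
    }
  }

  // Scan every identifier, wherever it appears: select list, filter, or ordering.
  const tokens = withoutStrings.toLowerCase().match(/[a-z_][a-z0-9_]*/g) ?? [];
  for (const token of tokens) {
    if (INTENTS_COLUMNS.includes(token) && !allowed.has(token)) {
      return { ok: false, code: "QK_COLUMN_FORBIDDEN", column: token };
    }
  }
  return { ok: true };
}
```

A token scan like this over-rejects (an alias named rationale would be refused), which is the safer direction for a denial check, but a real implementation should work from a parsed statement.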

After the game ends, the orchestrator mints a gateway-post-game token. The same federation pattern returns the same row set with rationale included. The X-ray panel renders both columns side by side.

The boundary is in the column, and that is the entire pattern.

The wolf channel: row-level federation

Column scoping is one axis. Row-level filtering is the other. The Two wolves preset exercises it.

In a Werewolf game with multiple wolves, the wolves coordinate. The demo handles that with a multi-rotation channel. Each rotation, every wolf gets a turn. The gateway re-federates the channel, the latest state goes into each wolf's prompt as "channel so far this round", and the rotations end when every wolf signals done. Day discussion works the same way. Every alive player gets multiple chances to speak. The discussion ends when nobody has anything new to add.
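The rotation loop can be sketched as follows. This is a simplified model, not the demo's orchestrator: the channel is a plain array rather than a re-federated query result, takeTurn is a hypothetical callback standing in for one LLM call, and the maxRotations cap is an assumption added so the loop always terminates.

```typescript
interface WolfTurn {
  proposal: string | null; // null when the wolf has nothing to add
  done: boolean; // true when the wolf signals it is finished
}

// Hypothetical callback: receives the channel so far and returns this wolf's move.
type TakeTurn = (wolfId: string, channelSoFar: string[]) => WolfTurn;

function runWolfChannel(wolves: string[], takeTurn: TakeTurn, maxRotations = 5): string[] {
  const channel: string[] = [];
  for (let rotation = 0; rotation < maxRotations; rotation++) {
    let allDone = true;
    for (const wolf of wolves) {
      // Each wolf sees the latest state, including turns taken earlier in this rotation.
      const turn = takeTurn(wolf, [...channel]);
      if (turn.proposal !== null) {
        channel.push(`${wolf}: ${turn.proposal}`);
      }
      if (!turn.done) {
        allDone = false;
      }
    }
    // Stop only after a full rotation in which every wolf signalled done.
    if (allDone) break;
  }
  return channel;
}
```

The same loop shape would serve day discussion by swapping the wolf list for the list of alive players.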

Before each wolf turn the orchestrator mints a wolf-team-read token and federates a named query:

SELECT round, agent_id, action, target, rationale,
       CAST(decided_at AS VARCHAR) AS decided_at
FROM intents
WHERE action = 'wolf-kill'
ORDER BY round, decided_at;

The gateway fans this out to every attached player, including the seer, the doctor, and every villager. The row-level filter is declared on the wolf-team-read scope in PLAYER_POLICY as a SQL predicate:

"wolf-team-read": {
  tables: ["intents", "self"],
  columns: { intents: [...], self: ["agent_id", "role"] },
  statements: ["SELECT"],
  rowPredicate: "SELECT (SELECT role FROM self LIMIT 1) = 'wolf' AS ok",
}

Before running the requested SELECT, the worker evaluates policy.rowPredicate against its own database. If the first column of the first row is false, it returns zero rows without ever running the caller's SQL. A non-wolf worker accepts the request, evaluates the predicate against its self table, gets false, and returns {rows: []}. Wolves get true and return their proposals. The gateway ingests them, runs the final query, and hands the transcript back to the orchestrator, which threads it into the next wolf's prompt.
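The predicate gate can be sketched without a real database. In the snippet below, PlayerDb is a hypothetical stand-in for a worker's DuckDB-WASM connection, and handleSelect is an illustrative name. One deliberate choice: the sketch fails closed, so anything other than a literal true in the first column of the first row returns zero rows.

```typescript
type Row = Record<string, unknown>;

// Hypothetical minimal interface over a per-player database connection.
interface PlayerDb {
  query(sql: string): Row[];
}

interface ScopePolicy {
  rowPredicate?: string;
}

function handleSelect(db: PlayerDb, policy: ScopePolicy, callerSql: string): { rows: Row[] } {
  if (policy.rowPredicate !== undefined) {
    // Evaluate the predicate against this node's own tables first.
    const first = db.query(policy.rowPredicate)[0];
    // Only the first column of the first row is inspected.
    const ok = first !== undefined && Object.values(first)[0] === true;
    if (!ok) {
      // The caller's SQL is never executed on this node.
      return { rows: [] };
    }
  }
  return { rows: db.query(callerSql) };
}
```

Because the gate runs before the caller's statement, a non-wolf node does no work on the query at all, and the empty response looks the same to the gateway as a wolf with no proposals yet.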

The gateway does not know which workers are wolves. The schema does. Same query, different result, decided locally by each player based on a column it owns and the caller cannot see.

In the federation log you can see every wolf_channel call the gateway made during play. The row count grows over time because every wolf proposal writes a new row. In one three-round Two wolves game, the gateway federated the wolf channel about 18 times, and the post-game full_log query returned every intent every player wrote across the whole game.

The real Quack path

Quack is DuckDB's remote protocol. One DuckDB process can serve a database with quack_serve(...), and another DuckDB process can query it with quack_query(...) or attach it as a remote catalog. The DuckDB reference documents the server functions, client functions, ATTACH options, and Quack logs. The security guide matters even more for this post because a Quack server exposes SQL, then relies on token authentication and authorization callbacks to decide what each caller may run.

The Quack use case here is database-owned agent federation. Each agent owns a DuckDB database. The gateway does not ask an application object for a filtered copy of memory. It asks a remote DuckDB node for a SQL result. That node decides, using its own tables and Quack authorization path, which rows and columns are allowed to leave.

The browser demo keeps that model portable by using a Quack-shaped postMessage shim for the server side. One browser tab can load the real Quack extension as a client with DuckDB-WASM, which DuckDB documents in its WASM setup guide, but one tab cannot host five native Quack server processes plus a gateway server. So each player worker behaves like a Quack node: it owns a DuckDB-WASM database, accepts request/response messages, authenticates a bearer token, authorizes SQL by scope, and returns rows.

Because the transport boundary is the interesting part, I also built a companion Werewolf Quack Lab. It starts native DuckDB Quack servers from a JSON player config and queries them from a gateway DuckDB client. The latest version also asks each player container to take one local agent action. In the default test mode the agent is deterministic, and in OpenAI-compatible mode the model call happens from inside the player container before the proposed action is validated and written to that player's DuckDB process. The lab treats model output as a proposal: phase actions are constrained, wolf targets are forced to valid non-partners, and wolf actions cannot publish text. That path can point at a hosted API or at a local oMLX server. The lab includes an optional omlx smoke test that checks /v1/models, handles API-key auth, generates a three-player config, asks the container-local agents to act through the local model, and then runs the same Quack gateway assertions.

The lab smoke test proves the shape end to end: generated players, container-local action writes, whoami() calls across every node, public-log federation that hides rationale, wolf-channel federation that returns only the wolf nodes, a closed post-game view while the game is still live, and a real Quack authorization denial when the gateway tries to read the private intents table directly.

What this demo gets wrong on purpose

The same JS process owns the HMAC signing key, every player worker, and the gateway. A browser extension could read all of it. The boundary I'm demonstrating is architectural. The same architecture deployed across separate processes on separate hosts would be enforced by the OS, the network, and TLS.

The local model, Qwen3.5-2B-q4f16_1-MLC, sometimes gives up. When it does, the rationale row reads (LLM fallback) and the agent passes. The model dropdown also accepts an Anthropic, OpenAI, or OpenRouter key. Haiku 4.5, gpt-4o-mini, and Grok 4 play coherently. The architecture stays the same regardless of which model is driving the agents.

The API-key path keeps your key in the browser. Anthropic's own header name for this is anthropic-dangerous-direct-browser-access. The warning is real. It works for a bring-your-own-key demo against your own account. It is not a deployment pattern.
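For reference, a bring-your-own-key request to Anthropic's Messages API from a browser looks roughly like the sketch below. The endpoint, the x-api-key and anthropic-version headers, and the opt-in header come from Anthropic's public API; buildAnthropicRequest and the max_tokens value are illustrative choices, and the model id is left to the caller rather than guessed here.

```typescript
interface BrowserRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

// Builds the request only; nothing is sent until the caller passes it to fetch.
function buildAnthropicRequest(apiKey: string, model: string, prompt: string): BrowserRequest {
  return {
    url: "https://api.anthropic.com/v1/messages",
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        "x-api-key": apiKey, // the user's own key, held only in the tab
        "anthropic-version": "2023-06-01",
        // Explicit opt-in to calling the API directly from browser code.
        "anthropic-dangerous-direct-browser-access": "true",
      },
      body: JSON.stringify({
        model,
        max_tokens: 512, // arbitrary illustrative limit
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}
```

In a page, the result would be passed to fetch as fetch(req.url, req.init). Anyone who can run script in the tab can read the key, which is why this shape only suits a demo against your own account.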

Why this matters for your multi-agent system

If you are building a product where multiple agents share an execution substrate and must not share information, the boundary has to live somewhere. There are three places it usually lives:

  • The application layer. Every read is a function call, every function call checks the caller.
  • The model layer. Separate fine-tunes, separate runtimes.
  • The data layer. Columns and rows of the tables the agent already queries, with signed bearer tokens that name what the holder is allowed to look at, and predicates the database evaluates against its own state.

In the data-layer version, the agent gets exactly the slice of data its scope allows. The forbidden column is absent from the result. The forbidden row is filtered before the connection sees it. The federation reached N nodes, and N minus k of them returned zero rows.

That pattern shows up in clinical assistants seeing different patients, customer-isolated copilots, on-device specialist fleets, skill marketplaces owned by different parties, and supervisor-subordinate agent chains where the supervisor should not see the subordinate's internal reasoning.

If you are hitting that wall, book a call and we can talk through your architecture.