SETI@Home, but for AI. A peer-to-peer compute grid for your containerized agents.
Zero-config P2P networking. Hardware-aware routing. OpenAI-compatible API. Live artifact streaming.
One-Line Install (macOS & Linux)
curl -fsSL https://api.agentfm.net/install.sh | bashWhat is AgentFM
A peer-to-peer compute grid that turns idle hardware into a decentralized AI supercomputer. Package your agent as a Podman container, advertise it on a libp2p mesh, and any client (your Next.js app, a LangChain script, a curl one-liner) can dispatch tasks over an end-to-end encrypted tunnel. No cloud accounts, no API keys, no data egress.
Three roles: a Worker runs your agent in a Podman sandbox; a Boss orchestrates and dispatches tasks (TUI or HTTP gateway); a Relay helps peers discover each other and punch through NAT. All you need to start is a laptop with Podman.
Three things make it interesting:
- OpenAI-compatible — point any OpenAI SDK at your local mesh and it just works.
- Hardware-aware — workers broadcast live CPU / GPU / queue state; the matcher picks the least-loaded peer for every request.
- Reputation-driven trust mesh (v1.3.1) — every rating is a signed receipt on a tamper-evident Merkle log; bosses earn worker reputation through hourly aggregate outcomes; equivocators are caught by witnesses and floored at
-1.0permanently; bad actors auto-reject below-0.5honesty. No allow-lists, no central authority, no blockchain. See Trust & Verification.
Hello World
Boot a worker that runs a local Llama 3.2 model, then dispatch tasks to it.
# 1. Prereqs (macOS shown; apt for Ubuntu) brew install podman && podman machine init && podman machine start curl -fsSL https://ollama.com/install.sh | sh ollama run llama3.2 # 2. Clone and start a worker git clone https://github.com/Agent-FM/agentfm-core.git && cd agentfm-core agentfm -mode worker \ -agentdir "./agent-example/sick-leave-generator/agent" \ -image "agentfm-sick-leave:v1" \ -model "llama3.2" -agent "HR Specialist" \ -maxtasks 10 -maxcpu 60 -maxgpu 70 # 3. In another terminal, start the API gateway and hit it agentfm -mode api -apiport 8080 & curl http://127.0.0.1:8080/v1/chat/completions -H 'Content-Type: application/json' \ -d '{"model":"llama3.2","messages":[{"role":"user","content":"Draft a sick-leave email"}]}'
That's it. Files the agent drops into /tmp/output get zipped and shipped back to ./agentfm_artifacts/<task_id>.zip.
Want the interactive radar? Skip step 3 and run
agentfm -mode bossfor the live TUI.
Features
- OpenAI-compatible API on
/v1/chat/completions,/v1/completions,/v1/models. Drop-in for LangChain, LlamaIndex, LiteLLM, Continue, Open WebUI, the official OpenAI SDKs, anything. - Hardware-aware routing. Workers broadcast live CPU / GPU / RAM / queue every 2s. The matcher picks the least-loaded peer per request. No central scheduler.
- End-to-end encrypted P2P. libp2p Noise streams between Boss and Worker. The Relay sees discovery metadata only; never prompt content.
- Bearer-token auth.
AGENTFM_API_KEYSenables per-request bearer validation, constant-time comparison, per-IP rate limiting on failed attempts. Refuses to start with public bind + no keys. - Public mesh, private swarms, or solo-dev. Same binary. Toggle PSK mode for fully isolated darknet meshes invisible to the public network.
- Witness replicas.
agentfm -mode witnessruns a ledger-only daemon that persists every gossiped rating + comment to disk and serves them via/agentfm/inbox-fetch/1.0.0. A new boss joining a mesh where every boss has gone offline still recovers the ledger from any witness it can dial. No API, no podman; cheap to run alongside a relay on a VPS. - Container sandboxing. Every task runs in a fresh Podman container. SIGKILL'd the instant the stream dies. Resource budgets stop a noisy task from hurting its operator.
- Live artifact streaming. Anything an agent writes to
/tmp/outputis auto-zipped, transferred, and extracted client-side. Zip-slip + zip-bomb defense baked in. - Observability built in. Prometheus metrics on every node (
/metrics), structured slog JSON logs ready for Loki / ELK / Datadog,/healthendpoint for load balancers. - Async + webhook callbacks. Fire-and-forget submission with HMAC-signed webhook delivery on completion. SSRF-guarded against private network attacks.
- Cross-platform. Single statically-linked binary for Linux, macOS, Windows, FreeBSD across amd64, arm64, armv7, 386, and RISC-V.
Join the Public Mesh in 30 seconds (v1.3.1)
The public mesh has no allow-list. Push your agent image anywhere, point a worker at the public lighthouse, and you're in. Reputation accumulates from honest behaviour over time.
# 1. Build + push your image to any registry. podman build -t ghcr.io/yourorg/myagent:v1 ./my-agent podman push ghcr.io/yourorg/myagent:v1 # 2. Run a worker. It joins the mesh immediately — # no PR, no maintainer review, no allow-list. agentfm -mode worker \ -agentdir ./my-agent \ -image ghcr.io/yourorg/myagent:v1 \ -agent "My Agent" \ -capability "research-assistant" \ -model "llama3.2" # 3. Watch your reputation accumulate via the boss-side TUI: # arrow to your worker → ENTER → "View ratings & feedback" agentfm -mode boss
v1.3.1 uses reputation-driven trust by default. Operators can tighten the dispatch gate via --reputation-floor=-0.3 (stricter than the default -0.5) or effectively disable it via --reputation-floor=-1.0. No allow-list file, no maintainer review required. See Trust & Verification for the full model.
Want full network isolation? That's the private-swarm path — same binary, --swarmkey plus your own --genesis-seeds files.
Python SDK
Typed sync and async clients with full OpenAI-compatible namespace, scatter-gather batch dispatch, signed webhook callbacks, and strict mypy compliance.
from agentfm import AgentFMClient with AgentFMClient(gateway_url="http://127.0.0.1:8080") as client: # workers.list(model=...) is a discovery FILTER: "show me workers # whose advertised engine string equals 'llama3.2'" workers = client.workers.list(model="llama3.2", available_only=True) # tasks.run dispatches to a specific machine by its cryptographic peer_id result = client.tasks.run(worker_id=workers[0].peer_id, prompt="Draft a leave policy.") print(result.text) print(result.artifacts) # list[Path] auto-extracted
A note on the word "model"
The SDK uses model for two different things:
workers.list(model=...)is a discovery filter — exact-match against what each worker advertised at startup (-model llama3.2).openai.chat.completions.create(model=...)is a routing identifier — the gateway accepts three kinds of values here, matched in priority order: apeer_id(most specific, cryptographically verifiable), an agent name, or an engine name.
Both forms work for the OpenAI namespace:
# Option 1: OpenAI-native shape — route to ANY worker advertising "llama3.2". # Familiar if you're coming from the cloud OpenAI SDK. resp = client.openai.chat.completions.create( model="llama3.2", messages=[{"role": "user", "content": "hi"}], ) # Option 2: pin to a SPECIFIC machine by peer_id. # Recommended for production: peer_id is the only cryptographically # verifiable identifier, so you know exactly which worker served the request. resp = client.openai.chat.completions.create( model=workers[0].peer_id, messages=[{"role": "user", "content": "hi"}], )
Both calls work; pick the one that matches your trust model. In a federated mesh, anyone can advertise themselves as running llama3.2, so production code that cares about provenance should pin by peer_id.
What actually happens for each form
The two forms behave very differently when worker state changes. Engine-name routing trades guaranteed placement for automatic fallback. PeerID pinning trades fallback for guaranteed placement.
| Worker state | model="12D3KooW..." (pin) |
model="llama3.2" (engine) |
|---|---|---|
| Worker online with capacity | 200 — served by that exact machine | 200 — served by least-loaded llama3.2 worker |
Worker online but at max_tasks |
503 mesh_overloaded (no fallback) |
200 — served by another llama3.2 worker |
| Worker offline / wrong peer_id | 404 model_not_found |
200 — any other llama3.2 worker |
| All matching workers at capacity | n/a | 503 mesh_overloaded |
For engine-name routing, the gateway picks the worker with the lowest current_tasks/max_tasks ratio (CPU usage as tiebreaker). The OpenAI response doesn't tell you which peer served the request — if you need that, list-and-pick yourself, or use the AgentFM-native client.tasks.run() which returns result.worker_id.
Pick by peer_id when you need provenance, reproducibility, or a specific machine (custom weights, fine-tunes, hardware benchmarking). Use engine-name routing when all matching workers are equivalent and trusted, and you want automatic load balancing and failover.
Async mirror via AsyncAgentFMClient. Full SDK docs: agentfm-python/README.md. PyPI: pypi.org/project/agentfm-sdk.
Documentation
| Topic | Doc |
|---|---|
| Get the binaries | Installation |
| Run an agent on the mesh | Run a Worker |
| Use OpenAI SDKs against your mesh | OpenAI-Compatible API |
| Lock down off-host gateways | Authentication |
| Raw HTTP for non-Python clients | Raw HTTP API |
| Typed Python client | Python SDK |
| Prometheus + structured logs | Observability |
| Stand up a private darknet mesh | Private Swarms |
| Wire protocols + system topology | Architecture |
| Threat model + hardening checklist | Security Model |
| Every flag, every env var | CLI Reference |
| Build from source, run tests | Development |
| Branching + PR conventions | Contributing |
Built with Go, libp2p, and a belief that compute should belong to everyone.
