A calm, powerful coding agent — runs anywhere, ships everywhere.
Daily driver CLI · AI SDK UI native · Remote sandbox in one command · Build your own Agent product
Quick Start · Features · Remote Sandbox · Build Your Own Agent · Docs
What is Bunny Agent?
Bunny Agent is a coding agent built on Pi Coding Agent — multi-model, harness-ready, and designed from the ground up for three jobs at once:
| Mode | What it means |
|---|---|
| 🖥️ Daily CLI agent | Install and use it like a local coding assistant, today |
| ☁️ Remote sandbox agent | bunny remote my-project — spin up a cloud machine for $5/mo |
| 🏗️ Your own Agent product | Next.js SaaS · Desktop app · Build your own OpenClaw alternative |
It outputs a native AI SDK UI stream — meaning you can wire it directly into any useChat() frontend with zero glue code.
✨ Features
🧠 Multi-Model, One CLI
Switch between Claude, Gemini, OpenAI, or any provider — no code changes required.
bunny run --runner pi --model google:gemini-2.5-pro -- "refactor this module"
bunny run --runner claude --model claude-opus-4 -- "review my PR"
bunny run --runner codex -- "fix the failing tests"
Powered by Pi Coding Agent — think of it as the oh-my-zsh of coding agents: pre-wired for every major provider, battle-tested on real engineering tasks.
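The `--model` values in the commands above come in two shapes: a `provider:model` pair (`google:gemini-2.5-pro`) and a bare model name (`claude-opus-4`). The sketch below shows one way such a string could be split. `ModelSpec` and `parseModelSpec` are hypothetical names used for illustration; they are not taken from Bunny's codebase, and the real parsing rules may differ.

```typescript
// Hypothetical shape for a parsed --model value; not part of the Bunny Agent API.
interface ModelSpec {
  provider: string | null; // null when no "provider:" prefix is given
  model: string;
}

// Split on the first colon only, so model names that contain colons survive.
function parseModelSpec(spec: string): ModelSpec {
  const trimmed = spec.trim();
  if (trimmed === "") {
    throw new Error("model spec must not be empty");
  }
  const colon = trimmed.indexOf(":");
  if (colon === -1) {
    return { provider: null, model: trimmed };
  }
  const provider = trimmed.slice(0, colon);
  const model = trimmed.slice(colon + 1);
  if (provider === "" || model === "") {
    throw new Error(`malformed model spec: ${spec}`);
  }
  return { provider, model };
}

const gemini = parseModelSpec("google:gemini-2.5-pro");
const opus = parseModelSpec("claude-opus-4");
console.log(gemini, opus);
```

Under this assumption, a bare name leaves the provider unset so the chosen runner can supply its own default.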
🔧 Harness-Ready — Tools Included
No config needed. Bunny ships with a pre-built tool harness:
| Tool | What it does |
|---|---|
| 🔍 Web Search | Brave / Tavily, auto-detected from env keys |
| 🌐 Web Fetch | Full page content extraction |
| 🖼️ Image Generation | AI image creation from prompts |
| 🔨 Bash Execute | Run shell commands in the sandbox |
| 📁 File Ops | Read / write files in the workspace |
Add your own tools by dropping a skill file — the harness discovers them automatically.
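The Web Search row above says the provider is auto-detected from env keys, and the environment variable list marks Tavily as the fallback. A minimal sketch of that selection rule follows. `detectSearchProvider` is a hypothetical helper, and the "Brave first, Tavily second" order is an assumption read from the word "fallback", not confirmed behaviour.

```typescript
type SearchProvider = "brave" | "tavily";

// Hypothetical selection rule: prefer Brave when its key is set, else fall back to Tavily.
function detectSearchProvider(
  env: Record<string, string | undefined>,
): SearchProvider | null {
  if (env["BRAVE_API_KEY"]) return "brave";
  if (env["TAVILY_API_KEY"]) return "tavily";
  return null; // no key set: web search stays unavailable in this sketch
}

const withBoth = detectSearchProvider({ BRAVE_API_KEY: "b", TAVILY_API_KEY: "t" });
const tavilyOnly = detectSearchProvider({ TAVILY_API_KEY: "t" });
const withNone = detectSearchProvider({});
console.log(withBoth, tavilyOnly, withNone);
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the rule easy to test.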
🐰 Built with a Conscience
Every Bunny Agent ships with a core directive baked into its system prompt:
"Protect Human. Push Humanity Forward."
It's not decoration — it's the guiding principle behind every tool call, every decision, and every line of code Bunny writes.
📡 AI SDK UI Native — Zero Glue
Bunny's stdout is an AI SDK UI stream. Pipe it to your server, pass it to your client, done.
// Next.js API route: this is the entire backend
export async function POST(req: Request) {
  const { messages, sessionId } = await req.json();

  const agent = new BunnyAgent({
    id: sessionId,
    sandbox: new SandockSandbox(),
    runner: { kind: "pi", model: "google:gemini-2.5-pro" },
  });

  return agent.stream({ messages }); // returns a Response with AI SDK UI stream
}

// React client: useChat just works
const { messages, input, handleSubmit } = useChat({ api: "/api/agent" });
No protocol translation. No buffering. Pure passthrough.
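To make "pure passthrough" concrete: a byte stream can be handed to a standard `Response` without ever being read or re-encoded on the server. The sketch below uses only Web platform APIs (`ReadableStream`, `Response`, `TextEncoder`). `passthroughResponse` is a hypothetical helper, and the `text/event-stream` content type is an assumption; use whatever headers your AI SDK version expects.

```typescript
// Hypothetical helper: wrap a byte stream in an HTTP response without reading it.
function passthroughResponse(agentStream: ReadableStream<Uint8Array>): Response {
  return new Response(agentStream, {
    status: 200,
    headers: {
      // Assumed content type for a streamed response; not confirmed by the Bunny docs.
      "Content-Type": "text/event-stream",
      "Cache-Control": "no-cache",
    },
  });
}

// Stand-in for an agent's output: a stream that emits two chunks and closes.
const encoder = new TextEncoder();
const demoStream = new ReadableStream<Uint8Array>({
  start(controller) {
    controller.enqueue(encoder.encode("chunk-1\n"));
    controller.enqueue(encoder.encode("chunk-2\n"));
    controller.close();
  },
});

const demoResponse = passthroughResponse(demoStream);
console.log(demoResponse.status, demoResponse.headers.get("Content-Type"));
```

Because the server never awaits the stream's contents, chunks reach the client as soon as the producer emits them.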
☁️ One-Command Remote Sandbox
Stop worrying about your laptop's specs. Launch a cloud machine instantly:
bunny remote my-project
That's it. You're now in a remote machine backed by Sandock:
- ⚡ NVMe SSD — fast local I/O, not sluggish network storage
- 🗂️ POSIX-compliant filesystem — full compatibility, no quirks for coding agents
- 🔒 Isolated container, persistent volume across sessions
- 💰 Starting at $5 / month — production-grade at hobby prices
- ♾️ Launch as many sandboxes as you need — no local resource constraints
Sandock is purpose-built for coding agents: SSD-backed, POSIX-native, and optimised for the read/write patterns that agents generate. It's the best-performing sandbox at the lowest cost.
💾 Persistent Sessions
Every agent run is tied to an id. Resume exactly where you left off — same filesystem, same context.
bunny run --resume my-project -- "continue where we left off"
🚀 Quick Start
Install
npm install -g @bunny-agent/runner-cli
Set your API key
export ANTHROPIC_API_KEY=sk-ant-...
# or for Gemini:
export GEMINI_API_KEY=...
Run your first task
# Local: uses your current directory
bunny run -- "explain this codebase and suggest improvements"

# Remote: cloud sandbox via Sandock
bunny remote my-project

# Choose a model
bunny run --runner pi --model google:gemini-2.5-pro -- "write unit tests for src/auth.ts"
From source
git clone https://github.com/vikadata/sandagent.git bunny-agent
cd bunny-agent
pnpm install && pnpm build
cd apps/runner-cli
npx bunny-agent run -- "your task here"
☁️ One-Command Remote Sandbox
bunny remote <project-name>
Under the hood this:
- Provisions a Sandock container with NVMe storage
- Mounts a persistent volume for <project-name>
- Drops you into an interactive agent session on that machine
Run multiple sandboxes in parallel:
bunny remote frontend-work   # machine 1
bunny remote backend-api     # machine 2
bunny remote data-pipeline   # machine 3
Each runs in full isolation. No config drift, no "works on my machine".
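The three steps above, together with the isolation guarantee, can be pictured with a small in-memory model. Everything here is hypothetical (`SandboxRegistry`, `RemoteSession`, the id formats); real provisioning happens on Sandock. The sketch only illustrates the assumed contract: the project name keys a persistent volume, while each launch gets its own container.

```typescript
// Hypothetical in-memory model; it does not call Sandock.
interface RemoteSession {
  project: string;
  volume: string;      // persistent: reused when the same project reconnects
  containerId: string; // fresh on every launch
}

class SandboxRegistry {
  private volumes = new Map<string, string>();
  private launches = 0;

  connect(project: string): RemoteSession {
    // Step 1: provision a new container.
    this.launches += 1;
    const containerId = `container-${this.launches}`;

    // Step 2: mount the project's volume, creating it on first use.
    let volume = this.volumes.get(project);
    if (volume === undefined) {
      volume = `vol-${project}`;
      this.volumes.set(project, volume);
    }

    // Step 3: return the session the user would be dropped into.
    return { project, volume, containerId };
  }
}

const registry = new SandboxRegistry();
const frontendFirst = registry.connect("frontend-work");
const backend = registry.connect("backend-api");
const frontendAgain = registry.connect("frontend-work");
console.log(frontendFirst, backend, frontendAgain);
```

Reconnecting to `frontend-work` reuses its volume, while `backend-api` never sees it.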
Get a Sandock API key at sandock.ai — plans start at $5/month.
🏗️ Build Your Own Agent Product
Use the SDK to embed Bunny Agent in any product — a Next.js SaaS, an Electron desktop app, or your own OpenClaw alternative. The architecture is the same either way: your UI talks to an AI SDK stream, Bunny handles the rest.
Architecture
Your Next.js App
│
├── useChat() ───────────────────────── React client (AI SDK)
│
└── POST /api/agent ────────────────── your API route
    │
    └── BunnyAgent.stream() ──────── Bunny Agent SDK
        │
        ├── runner: pi / claude / codex / gemini
        │
        └── sandbox: Sandock / E2B / Daytona / Local
Sandbox options
| Sandbox | Best for | Setup |
|---|---|---|
| Sandock | ⭐ NVMe SSD · POSIX filesystem · coding-agent optimised · from $5/mo | API key from sandock.ai |
| E2B | Managed cloud sandboxes | API key from e2b.dev |
| Daytona | Enterprise / self-hosted | API key from daytona.io |
| Local | Development, no cloud needed | No key required |
Switch with one import — the rest of your code stays unchanged.
import { createBunnyAgent } from "@bunny-agent/sdk";
import { SandockSandbox } from "@bunny-agent/sandbox-sandock";

const agent = createBunnyAgent({
  sandbox: new SandockSandbox(),
  runner: { kind: "pi", model: "anthropic:claude-sonnet-4" },
});

// Returns LanguageModelV3, compatible with Vercel AI SDK
const model = await agent.getModel();
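Switching sandboxes with one import works when every adapter exposes the same surface and the agent code depends only on that surface. The sketch below shows the pattern with stand-in classes. The `Sandbox` interface, its `exec` method, and both stub classes are assumptions for illustration; the real adapter contract in the sandbox packages may look different.

```typescript
// Hypothetical minimal contract; the real adapter interface may differ.
interface Sandbox {
  readonly name: string;
  exec(command: string): string;
}

class LocalSandboxStub implements Sandbox {
  readonly name = "local";
  exec(command: string): string {
    return `[local] ${command}`;
  }
}

class RemoteSandboxStub implements Sandbox {
  readonly name = "remote";
  exec(command: string): string {
    return `[remote] ${command}`;
  }
}

// Agent code depends only on the Sandbox interface, never on a concrete adapter.
function runTask(sandbox: Sandbox, command: string): string {
  return sandbox.exec(command);
}

const localResult = runTask(new LocalSandboxStub(), "ls");
const remoteResult = runTask(new RemoteSandboxStub(), "ls");
console.log(localResult, remoteResult);
```

Only the line that constructs the sandbox changes between the two runs; `runTask` is untouched.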
🔧 CLI Reference
bunny run [options] -- "<task>"
Options:
  -r, --runner <name>     Runner: pi | claude | gemini | codex | opencode (default: claude)
  -m, --model <model>     Model override (e.g. google:gemini-2.5-pro)
  -c, --cwd <path>        Working directory (default: current dir)
  -s, --system-prompt     Custom system prompt
  -t, --max-turns <n>     Maximum turns
      --resume <session>  Resume a previous session
      --yolo              Skip confirmation prompts
  -h, --help              Show help
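The `--` separator in `bunny run [options] -- "<task>"` marks where options end and the task text begins. The sketch below shows how that split could be handled. `parseRunArgs` and `RunArgs` are hypothetical, the parser covers only three value-taking flags from the list above, and it is not Bunny's actual argument parser.

```typescript
// Hypothetical parser for illustration; it handles only value-taking flags.
interface RunArgs {
  runner: string;
  model?: string;
  resume?: string;
  task: string;
}

function parseRunArgs(argv: string[]): RunArgs {
  const separator = argv.indexOf("--");
  if (separator === -1 || separator === argv.length - 1) {
    throw new Error('usage: bunny run [options] -- "<task>"');
  }

  const args: RunArgs = {
    runner: "claude", // documented default runner
    task: argv.slice(separator + 1).join(" "),
  };

  // Everything before "--" is read as flag/value pairs.
  const options = argv.slice(0, separator);
  for (let i = 0; i < options.length; i += 2) {
    const flag = options[i];
    const value = options[i + 1];
    if (value === undefined) throw new Error(`missing value for ${flag}`);
    switch (flag) {
      case "-r":
      case "--runner":
        args.runner = value;
        break;
      case "-m":
      case "--model":
        args.model = value;
        break;
      case "--resume":
        args.resume = value;
        break;
      default:
        throw new Error(`unsupported option in this sketch: ${flag}`);
    }
  }
  return args;
}

const parsed = parseRunArgs([
  "--runner", "pi", "--model", "google:gemini-2.5-pro", "--", "refactor this module",
]);
const defaults = parseRunArgs(["--", "fix the failing tests"]);
console.log(parsed, defaults);
```

Anything after the separator is treated as task text, so a task that itself starts with a dash is not mistaken for a flag.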
Environment Variables
| Variable | Description |
|---|---|
| ANTHROPIC_API_KEY | Claude / Anthropic models |
| GEMINI_API_KEY | Google Gemini models |
| OPENAI_API_KEY | OpenAI models |
| SANDOCK_API_KEY | Sandock remote sandbox |
| E2B_API_KEY | E2B cloud sandbox |
| BRAVE_API_KEY | Brave web search |
| TAVILY_API_KEY | Tavily web search (fallback) |
📦 Packages
| Package | Description |
|---|---|
| @bunny-agent/sdk | Embed Bunny in your app |
| @bunny-agent/runner-harness | Pre-built tool harness (search, bash, files, image gen) |
| @bunny-agent/runner-pi | Pi coding agent runner (multi-model) |
| @bunny-agent/runner-claude | Claude Agent SDK runner |
| @bunny-agent/runner-codex | OpenAI Codex runner |
| @bunny-agent/runner-gemini | Gemini CLI runner |
| @bunny-agent/sandbox-sandock | Sandock sandbox adapter |
| @bunny-agent/sandbox-e2b | E2B sandbox adapter |
| @bunny-agent/sandbox-daytona | Daytona sandbox adapter |
| @bunny-agent/sandbox-local | Local sandbox adapter |
📚 Documentation
- Quick Start Guide
- Architecture
- Technical Specification
- Sandbox Adapters
- Persistence Guide
- API Reference
- Debugging Guide
📊 Benchmark Results
Bunny Agent is evaluated on the GAIA benchmark — a challenging real-world task benchmark designed for general AI assistants.
Model: Gemini 3.1 Pro (via OpenAI-compatible API)
| Level | Tasks | Score | Pass Rate |
|---|---|---|---|
| L1 (simple reasoning) | 42 | 34/42 | 81% |
| L2 (multi-step) | 66 | 55/66 | 83% |
| L3 (complex reasoning) | 19 | 13/19 | 68% |
Results are legitimate zero-shot runs (no answer-revealing hints). Scores significantly exceed typical zero-shot baselines (~50–60% L2, ~11–30% L3).
Benchmarks are run using apps/bunny-bench — the integrated evaluation harness that ships with this repo. Wrong-answer tracking lets you iterate on failures without re-running solved tasks.
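The pass rates in the table are each level's score divided by its task count, rounded to a whole percent. The short sketch below reproduces that arithmetic from the table's own numbers and also derives a combined figure across all three levels. The combined figure is plain arithmetic on the reported scores, not a separately reported result, and `passRate` is a hypothetical helper rather than part of apps/bunny-bench.

```typescript
interface LevelResult {
  level: string;
  passed: number;
  total: number;
}

// Scores copied from the table above.
const results: LevelResult[] = [
  { level: "L1", passed: 34, total: 42 },
  { level: "L2", passed: 55, total: 66 },
  { level: "L3", passed: 13, total: 19 },
];

// Pass rate as a whole-number percentage.
function passRate(passed: number, total: number): number {
  return Math.round((passed / total) * 100);
}

const perLevel = results.map((r) => passRate(r.passed, r.total));
const totalPassed = results.reduce((sum, r) => sum + r.passed, 0);
const totalTasks = results.reduce((sum, r) => sum + r.total, 0);
const overall = passRate(totalPassed, totalTasks);

console.log(perLevel, `${totalPassed}/${totalTasks}`, overall);
```

This gives 81, 83 and 68 for the three levels, and 102/127 (about 80%) across all tasks.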
🤝 Contributing
PRs welcome. See CONTRIBUTING.md for guidelines.
pnpm install     # install all workspace dependencies
pnpm build       # build all packages
pnpm test        # run tests
pnpm typecheck   # type-check everything
License
Apache 2.0 — see LICENSE.
