GitHub - qix/sandclaw

Sandclaw

A human-in-the-loop safety framework for AI agents


Sandclaw is a personal agent designed for safety first.

The core safety mechanism is a split into three parts:

  • Muteworker: does the thinking and works with all the data, but cannot speak to the outside world.
  • Gatekeeper: the web UI and API that holds the keys to the outside world.
  • Confidante: a trusted worker that can run agents on specific trusted websites.

Although the Muteworker is exposed to content from the internet, prompt injections cannot expose any secrets or do any damage, because the Muteworker holds no keys to the outside world. It can still draft emails and messages and request that actions be taken on the internet, but each of those must pass through the Gatekeeper before it takes effect.

The Confidante is an agent intended to run with secrets, so the portion of the public internet it is exposed to is deliberately limited.
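The split between proposing an action and executing it can be illustrated with a minimal sketch. Every name here (`Gatekeeper`, `ProposedAction`, `propose`, `approve`, `reject`) is hypothetical and chosen for illustration; none of it is taken from the Sandclaw codebase. The idea shown is only that the worker can queue requests, while execution happens after a human approves them.

```typescript
// Hypothetical sketch: these types and names are assumptions, not Sandclaw's API.
type ActionKind = "send_email" | "send_message" | "http_request";

interface ProposedAction {
  id: number;
  kind: ActionKind;
  payload: Record<string, string>;
  status: "pending" | "approved" | "rejected";
}

class Gatekeeper {
  private readonly queue = new Map<number, ProposedAction>();
  private nextId = 1;
  // Only the Gatekeeper holds the executor, and with it the credentials.
  private readonly execute: (action: ProposedAction) => void;

  constructor(execute: (action: ProposedAction) => void) {
    this.execute = execute;
  }

  // Called by the worker: records the request but performs nothing.
  propose(kind: ActionKind, payload: Record<string, string>): number {
    const id = this.nextId++;
    this.queue.set(id, { id, kind, payload, status: "pending" });
    return id;
  }

  // Called on behalf of the human reviewer: runs a pending action once.
  approve(id: number): boolean {
    const action = this.queue.get(id);
    if (!action || action.status !== "pending") return false;
    action.status = "approved";
    this.execute(action);
    return true;
  }

  // Called on behalf of the human reviewer: the action never runs.
  reject(id: number): boolean {
    const action = this.queue.get(id);
    if (!action || action.status !== "pending") return false;
    action.status = "rejected";
    return true;
  }

  pending(): ProposedAction[] {
    return Array.from(this.queue.values()).filter((a) => a.status === "pending");
  }
}

// Usage: one legitimate request and one that a prompt injection might produce.
const sent: string[] = [];
const gatekeeper = new Gatekeeper((action) => {
  sent.push(`${action.kind}:${action.payload.to}`);
});
const emailId = gatekeeper.propose("send_email", { to: "alice@example.com", body: "Hi" });
const injectedId = gatekeeper.propose("http_request", { to: "https://attacker.example" });
gatekeeper.approve(emailId);
gatekeeper.reject(injectedId);
```

In this sketch, a request produced by injected content sits in the queue like any other and has no effect unless a person approves it.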

Getting Started

Warning

This is a very early release and a work in progress. It should start up with the chat plugin, but some of the other plugins, especially the builder, will require tweaking on your end to get working.

Scaffolding creates a new project with all three components, prompts for your model provider, and installs dependencies.

Then:

cd your-project

# Add your API keys
edit .env

# Start all three services
npm start

By default the Gatekeeper UI will be available at http://localhost:3000.

Plugins

Sandclaw has a minimal core, and most of the logic is implemented by plugins. Each plugin can add UI panels, API routes, database tables, and agent tools. For personal use, you should generate your own local plugins that do exactly what you need.
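To make the four extension points concrete, here is a rough sketch of what a small local plugin could look like. The `SketchPlugin` and `SketchTool` shapes, the `collectTools` helper, and the `local-notes` plugin are all assumptions made up for this example; the real Sandclaw plugin interface may look quite different.

```typescript
// Hypothetical shapes for illustration only; not Sandclaw's actual plugin API.
interface SketchTool {
  name: string;
  description: string;
  run(input: string): string;
}

interface SketchPlugin {
  name: string;
  panels?: string[]; // titles of UI panels the plugin contributes
  routes?: Record<string, (body: string) => string>; // API path -> handler
  tables?: string[]; // SQL statements creating the plugin's tables
  tools?: SketchTool[]; // tools exposed to the agent
}

// A tiny local plugin that contributes one of each kind of extension.
const notesPlugin: SketchPlugin = {
  name: "local-notes",
  panels: ["Notes"],
  routes: { "/api/notes/echo": (body) => body },
  tables: ["CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"],
  tools: [
    {
      name: "count_words",
      description: "Counts whitespace-separated words in a note",
      run: (input) => String(input.trim().split(/\s+/).filter(Boolean).length),
    },
  ],
};

// Gathers tools from every plugin into one registry, refusing duplicate names.
function collectTools(plugins: SketchPlugin[]): Map<string, SketchTool> {
  const tools = new Map<string, SketchTool>();
  for (const plugin of plugins) {
    for (const tool of plugin.tools ?? []) {
      if (tools.has(tool.name)) {
        throw new Error(`Duplicate tool name: ${tool.name}`);
      }
      tools.set(tool.name, tool);
    }
  }
  return tools;
}

const registry = collectTools([notesPlugin]);
const wordCount = registry.get("count_words")?.run("keep plugins small");
```

Keeping every field optional in this sketch means a plugin only declares what it needs, which suits small single-purpose local plugins.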

Built-in plugins

Plugin                        Description
@sandclaw/memory-plugin       Persistent agent memory
@sandclaw/prompts-plugin      Handles system prompts
@sandclaw/chat-plugin         Chat interface for talking to your agent
@sandclaw/web-search-plugin   Web search via Brave API
@sandclaw/browser-plugin      Browser automation (runs on the Confidante)
@sandclaw/github-plugin       Create PRs, download code, manage repos
@sandclaw/gmail-plugin        Read and send Gmail messages
@sandclaw/whatsapp-plugin     Send and receive WhatsApp messages
@sandclaw/telegram-plugin     Telegram bot integration
@sandclaw/obsidian-plugin     Search, read, and write Obsidian vault notes
@sandclaw/google-maps-plugin  Google Maps lookups
@sandclaw/builder-plugin      Code generation for your sandclaw

Plugins are configured in plugins.ts. Enable or disable them by adding or removing them from the array.
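As a rough illustration of enabling and disabling by array membership, the sketch below uses local stand-in objects instead of importing the real packages, because their exports are not shown here. The `PluginEntry` type and the variable names are hypothetical; only the package names come from the table above.

```typescript
// Hypothetical stand-ins: the real packages' exports are an unknown here.
interface PluginEntry {
  name: string;
}

const memoryPlugin: PluginEntry = { name: "@sandclaw/memory-plugin" };
const promptsPlugin: PluginEntry = { name: "@sandclaw/prompts-plugin" };
const chatPlugin: PluginEntry = { name: "@sandclaw/chat-plugin" };
const webSearchPlugin: PluginEntry = { name: "@sandclaw/web-search-plugin" };

// A plugin is enabled if and only if it appears in this array.
const plugins: PluginEntry[] = [
  memoryPlugin,
  promptsPlugin,
  chatPlugin,
  // webSearchPlugin, // disabled: uncomment this line to enable it
];

const enabledNames = plugins.map((p) => p.name);
const searchEnabled = plugins.indexOf(webSearchPlugin) !== -1;
```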

Project Structure

your-project/
├── gatekeeper.ts      # Web UI + approval API
├── muteworker.ts      # Safe agent (no internet)
├── confidante.ts      # Dangerous agent (has credentials)
├── config.ts          # Ports, model provider, paths, plugins
├── prompts/           # Agent personality and instructions
│   ├── IDENTITY.md
│   ├── SOUL.md
│   ├── SYSTEM.md
│   └── USER.md
├── plugins/           # Custom plugins for your agent
├── memory/            # Persistent agent memory
└── data/              # SQLite database

License

MIT