Show HN: Beelzebub (OSS) – MCP "canary tools" for AI agents

8 points by mariocandela 8 months ago · 1 comment · 2 min read


We’re open-sourcing a simple way to add “canary tools” to AI agents via MCP honeypots. These are functions your agent should never call during normal operation. If a canary is invoked, you get a high-fidelity signal of prompt injection, tool hijacking, or lateral movement—no heuristics, no extra model calls.

What it is:

- Go framework exposing decoy tools over MCP that look legitimate (names/params/descriptions), return safe dummy output, and emit telemetry when invoked.

- Runs alongside your real tools; ships events to stdout, a webhook, or your existing pipeline (Prometheus/Grafana, ELK).

Why it helps:

Agent logs show what happened; canaries mark what must not happen. A single tripwire is an immediate, low-noise indicator of compromise.

Real-world relevance (Nx attack):

Recent reporting on the Nx npm supply-chain incident (“s1ngularity”) shows that malicious versions exfiltrated SSH keys, tokens, and other secrets—and notably abused AI developer tools like Claude and Gemini in the workflow, one of the first documented cases of AI assistants being weaponized in a software supply-chain attack. If your IDE agent (Claude Code or Gemini Code/CLI) had a canary tool registered—e.g., a fake “export secrets” or “repo exfil” action—any unauthorized tool call from the agent side would have triggered a deterministic alert during that incident.

Links:

GitHub: https://github.com/mariocandela/beelzebub

Blog: https://beelzebub-honeypot.com/blog/securing-ai-agents-with-...

Feedback wanted! :)

mutant 7 months ago

Blog link is 404
