HydraMCP
Connect agents to agents.
An MCP server that lets Claude Code query any LLM — compare, vote, and synthesize across GPT, Gemini, Claude, and local models from one terminal.
Quick Start
npx hydramcp setup
That's it. The wizard walks you through everything — API keys, subscriptions, local models. At the end it gives you the one-liner to add to Claude Code.
Or if you already have API keys:
claude mcp add hydramcp -e OPENAI_API_KEY=sk-... -- npx hydramcp
What It Looks Like
Four models, four ecosystems, one prompt. Real output from a live session:
> compare gpt-5-codex, gemini-3, claude-sonnet, and local qwen on this function review
## Model Comparison (4 models, 11637ms total)
| Model | Latency | Tokens |
|----------------------------|-----------------|--------|
| gpt-5-codex | 1630ms fastest | 194 |
| gemini-3-pro-preview | 11636ms | 1235 |
| claude-sonnet-4-5-20250929 | 3010ms | 202 |
| ollama/qwen2.5-coder:14b | 8407ms | 187 |
All four independently found the same async bug. Then each one caught something different the others missed.
And this is consensus with a local judge:
> get consensus from gpt-5, gemini-3, and claude-sonnet. use local qwen as judge.
## Consensus: REACHED
Strategy: majority (needed 2/3)
Agreement: 3/3 models (100%)
Judge: ollama/qwen2.5-coder:14b (686ms)
Three cloud models polled, local model judging them. 686ms to evaluate agreement.
Tools
| Tool | What It Does |
|---|---|
| list_models | See what's available across all providers |
| ask_model | Query any model, optional response distillation |
| compare_models | Same prompt to 2-5 models in parallel |
| consensus | Poll 3-7 models, LLM-as-judge evaluates agreement |
| synthesize | Combine best ideas from multiple models into one answer |
| analyze_file | Offload file analysis to a worker model |
| smart_read | Extract specific code sections without reading the whole file |
| session_recap | Restore context from previous Claude Code sessions |
From inside Claude Code, just say things like:
- "ask gpt-5 to review this function"
- "compare gemini and claude on this approach"
- "get consensus from 3 models on whether this is thread safe"
- "synthesize responses from all models on how to design this API"
How It Works
Claude Code
|
HydraMCP (MCP Server)
|
SmartProvider (circuit breaker, cache, metrics)
|
MultiProvider (routes to the right backend)
|
|-- OpenAI -> api.openai.com (API key)
|-- Google -> Gemini API (API key)
|-- Anthropic -> api.anthropic.com (API key)
|-- Sub -> CLI tools (Gemini CLI, Claude Code, Codex CLI)
|-- Ollama -> local models (your hardware)
Three Ways to Connect Models
API Keys (fastest setup)
Set environment variables. HydraMCP auto-detects them.
| Variable | Provider |
|---|---|
| OPENAI_API_KEY | OpenAI (GPT-4o, GPT-5, o3, etc.) |
| GOOGLE_API_KEY | Google Gemini (2.5 Flash, Pro, etc.) |
| ANTHROPIC_API_KEY | Anthropic Claude (Opus, Sonnet, Haiku) |
Subscriptions (use your monthly plan)
Already paying for ChatGPT Plus, Claude Pro, or Gemini Advanced? HydraMCP wraps the CLI tools those subscriptions include. No API billing.
npx hydramcp setup   # auto-installs CLIs and runs auth
The setup wizard detects which CLIs you have, installs missing ones, and walks you through authentication. Each CLI authenticates via browser once — then it's stored forever.
| Subscription | CLI Tool | What You Get |
|---|---|---|
| Gemini Advanced | gemini | Gemini 2.5 Flash, Pro, etc. |
| Claude Pro/Max | claude | Claude Opus, Sonnet, Haiku |
| ChatGPT Plus/Pro | codex | GPT-5, o3, Codex models |
Local Models
Install Ollama, pull a model, done. Auto-detected.
ollama pull qwen2.5-coder:14b
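Detection here can be as simple as probing Ollama's local API, which lists installed models at /api/tags on port 11434 by default. A minimal sketch of that check (illustrative, not the actual HydraMCP code):

```typescript
// Probe the default local Ollama endpoint; if it answers, list installed models.
async function detectOllama(baseUrl = "http://localhost:11434"): Promise<string[]> {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    if (!res.ok) return [];
    const data = (await res.json()) as { models: { name: string }[] };
    return data.models.map((m) => m.name); // e.g. ["qwen2.5-coder:14b"]
  } catch {
    return []; // Ollama not running
  }
}
```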
Mix and Match
All three methods stack. Use API keys for some providers, subscriptions for others, and Ollama for local. They all show up in list_models together.
Route explicitly with prefixes:
- openai/gpt-5 — force OpenAI API
- google/gemini-2.5-flash — force Google API
- sub/gemini-2.5-flash — force subscription CLI
- ollama/qwen2.5-coder:14b — force local
- gpt-5 — auto-detect (tries each provider)
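One way this routing can work is a split on the first slash: a known prefix pins the backend, anything else falls through to auto-detection. This sketch is illustrative, not the actual MultiProvider source:

```typescript
// Illustrative prefix routing: "openai/gpt-5" pins a backend, a bare "gpt-5" does not.
function routeModel(spec: string, providers: string[]): { provider?: string; model: string } {
  const slash = spec.indexOf("/");
  if (slash > 0) {
    const prefix = spec.slice(0, slash);
    if (providers.includes(prefix)) {
      return { provider: prefix, model: spec.slice(slash + 1) };
    }
  }
  return { model: spec }; // no prefix: auto-detect by asking each provider
}
```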
Setup Details
Option A: npx (recommended)
npx hydramcp setup                        # interactive wizard
claude mcp add hydramcp -- npx hydramcp   # register with Claude Code
Config is saved to ~/.hydramcp/.env and persists across npx runs.
Option B: Clone
git clone https://github.com/Pickle-Pixel/HydraMCP.git
cd HydraMCP
npm install && npm run build
claude mcp add hydramcp -- node /path/to/HydraMCP/dist/index.js
Verify
Restart Claude Code and say "list models". You should see everything you configured.
Architecture
HydraMCP wraps all providers in a SmartProvider layer that adds:
- Circuit breaker — per-model failure tracking. After 3 failures, the model is disabled for 60s and auto-recovers.
- Response cache — SHA-256 keyed, 15-minute TTL. Identical queries are served instantly.
- Metrics — per-model query counts, latency, token usage, cache hit rates.
- Response distillation — set max_response_tokens on any query and a cheap model compresses the response while preserving code, errors, and specifics.
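Those numbers (3 failures, 60 seconds, 15 minutes) imply logic roughly like the following. This is a simplified sketch of the circuit breaker and cache behavior described above, not the real SmartProvider source:

```typescript
import { createHash } from "node:crypto";

// Simplified sketch of the SmartProvider behaviors described above.
const FAILURE_LIMIT = 3;
const COOLDOWN_MS = 60_000;
const CACHE_TTL_MS = 15 * 60_000;

const failures = new Map<string, { count: number; disabledUntil: number }>();
const cache = new Map<string, { value: string; expires: number }>();

// SHA-256 over model + prompt, as the cache description implies.
function cacheKey(model: string, prompt: string): string {
  return createHash("sha256").update(`${model}\n${prompt}`).digest("hex");
}

function getCached(model: string, prompt: string): string | undefined {
  const entry = cache.get(cacheKey(model, prompt));
  return entry && Date.now() < entry.expires ? entry.value : undefined; // hit or miss
}

function setCached(model: string, prompt: string, value: string): void {
  cache.set(cacheKey(model, prompt), { value, expires: Date.now() + CACHE_TTL_MS });
}

// A model is usable unless its breaker is tripped; it auto-recovers after the cooldown.
function isAvailable(model: string): boolean {
  const state = failures.get(model);
  return !state || Date.now() >= state.disabledUntil;
}

function recordFailure(model: string): void {
  const state = failures.get(model) ?? { count: 0, disabledUntil: 0 };
  state.count += 1;
  if (state.count >= FAILURE_LIMIT) {
    state.disabledUntil = Date.now() + COOLDOWN_MS; // trip the breaker for 60s
    state.count = 0;
  }
  failures.set(model, state);
}
```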
Contributing
Want to add a provider? The interface is three methods:
interface Provider {
  healthCheck(): Promise<boolean>;
  listModels(): Promise<ModelInfo[]>;
  query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse>;
}
See src/providers/ollama.ts for a working example. Implement it, register in src/index.ts, done.
Providers we'd love to see: LM Studio, OpenRouter, Groq, Together AI, or anything that speaks HTTP.
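For an OpenAI-compatible HTTP backend (the dialect LM Studio, OpenRouter, Groq, and Together all speak), a sketch against that three-method interface could look like this. The type shapes, base URL, and response handling are assumptions made to keep the example self-contained; match them to the real HydraMCP types before using it:

```typescript
// Sketch of a provider for any OpenAI-compatible endpoint.
// The ModelInfo/QueryOptions/QueryResponse shapes below are placeholders for this
// example; the real types live in the HydraMCP source.
type ModelInfo = { id: string };
type QueryOptions = { max_tokens?: number };
type QueryResponse = { text: string };

class OpenAICompatibleProvider {
  constructor(private baseUrl: string, private apiKey: string) {}

  // healthCheck: the endpoint is reachable if /v1/models answers.
  async healthCheck(): Promise<boolean> {
    const res = await fetch(`${this.baseUrl}/v1/models`, {
      headers: { Authorization: `Bearer ${this.apiKey}` },
    });
    return res.ok;
  }

  // listModels: /v1/models returns { data: [{ id: "..." }, ...] }.
  async listModels(): Promise<ModelInfo[]> {
    const res = await fetch(`${this.baseUrl}/v1/models`, {
      headers: { Authorization: `Bearer ${this.apiKey}` },
    });
    const data = (await res.json()) as { data: { id: string }[] };
    return data.data.map((m) => ({ id: m.id }));
  }

  // query: standard chat completions request; first choice returned as text.
  async query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse> {
    const res = await fetch(`${this.baseUrl}/v1/chat/completions`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
        max_tokens: options?.max_tokens,
      }),
    });
    const data = (await res.json()) as { choices: { message: { content: string } }[] };
    return { text: data.choices[0].message.content };
  }
}
```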
License
MIT