HydraMCP

Connect agents to agents.

An MCP server that lets Claude Code query any LLM — compare, vote, and synthesize across GPT, Gemini, Claude, and local models from one terminal.

Quick Start

npx hydramcp setup

That's it. The wizard walks you through everything — API keys, subscriptions, local models. At the end it gives you the one-liner to add to Claude Code.

Or if you already have API keys:

claude mcp add hydramcp -e OPENAI_API_KEY=sk-... -- npx hydramcp

What It Looks Like

Four models, four ecosystems, one prompt. Real output from a live session:

> compare gpt-5-codex, gemini-3, claude-sonnet, and local qwen on this function review

## Model Comparison (4 models, 11637ms total)

| Model                      | Latency         | Tokens |
|----------------------------|-----------------|--------|
| gpt-5-codex                | 1630ms fastest  | 194    |
| gemini-3-pro-preview       | 11636ms         | 1235   |
| claude-sonnet-4-5-20250929 | 3010ms          | 202    |
| ollama/qwen2.5-coder:14b   | 8407ms          | 187    |

All four independently found the same async bug. Then each one caught something different the others missed.

And this is consensus with a local judge:

> get consensus from gpt-5, gemini-3, and claude-sonnet. use local qwen as judge.

## Consensus: REACHED

Strategy: majority (needed 2/3)
Agreement: 3/3 models (100%)
Judge: ollama/qwen2.5-coder:14b (686ms)

Three cloud models polled, local model judging them. 686ms to evaluate agreement.

Tools

| Tool           | What It Does                                                   |
|----------------|----------------------------------------------------------------|
| list_models    | See what's available across all providers                      |
| ask_model      | Query any model, optional response distillation                |
| compare_models | Same prompt to 2-5 models in parallel                          |
| consensus      | Poll 3-7 models, LLM-as-judge evaluates agreement               |
| synthesize     | Combine best ideas from multiple models into one answer        |
| analyze_file   | Offload file analysis to a worker model                        |
| smart_read     | Extract specific code sections without reading the whole file  |
| session_recap  | Restore context from previous Claude Code sessions             |

From inside Claude Code, just say things like:

  • "ask gpt-5 to review this function"
  • "compare gemini and claude on this approach"
  • "get consensus from 3 models on whether this is thread safe"
  • "synthesize responses from all models on how to design this API"

How It Works

Claude Code
    |
HydraMCP (MCP Server)
    |
SmartProvider (circuit breaker, cache, metrics)
    |
MultiProvider (routes to the right backend)
    |
    |-- OpenAI     -> api.openai.com (API key)
    |-- Google     -> Gemini API (API key)
    |-- Anthropic  -> api.anthropic.com (API key)
    |-- Sub        -> CLI tools (Gemini CLI, Claude Code, Codex CLI)
    |-- Ollama     -> local models (your hardware)

Three Ways to Connect Models

API Keys (fastest setup)

Set environment variables. HydraMCP auto-detects them.

| Variable          | Provider                                |
|-------------------|-----------------------------------------|
| OPENAI_API_KEY    | OpenAI (GPT-4o, GPT-5, o3, etc.)        |
| GOOGLE_API_KEY    | Google Gemini (2.5 Flash, Pro, etc.)    |
| ANTHROPIC_API_KEY | Anthropic Claude (Opus, Sonnet, Haiku)  |

Subscriptions (use your monthly plan)

Already paying for ChatGPT Plus, Claude Pro, or Gemini Advanced? HydraMCP wraps the CLI tools those subscriptions include. No API billing.

npx hydramcp setup   # auto-installs CLIs and runs auth

The setup wizard detects which CLIs you have, installs missing ones, and walks you through authentication. Each CLI authenticates via browser once; after that, the credentials are stored and reused.

| Subscription     | CLI Tool | What You Get                 |
|------------------|----------|------------------------------|
| Gemini Advanced  | gemini   | Gemini 2.5 Flash, Pro, etc.  |
| Claude Pro/Max   | claude   | Claude Opus, Sonnet, Haiku   |
| ChatGPT Plus/Pro | codex    | GPT-5, o3, Codex models      |

Local Models

Install Ollama, pull a model, done. Auto-detected.

ollama pull qwen2.5-coder:14b

Mix and Match

All three methods stack. Use API keys for some providers, subscriptions for others, and Ollama for local. They all show up in list_models together.

Route explicitly with prefixes:

  • openai/gpt-5 — force OpenAI API
  • google/gemini-2.5-flash — force Google API
  • sub/gemini-2.5-flash — force subscription CLI
  • ollama/qwen2.5-coder:14b — force local
  • gpt-5 — auto-detect (tries each provider)
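
A sketch of how that routing could work conceptually (a hypothetical helper, not HydraMCP's actual code): an explicit prefix maps straight to one backend, while a bare name falls through each configured provider until one claims it.

// Hypothetical routing sketch. Explicit "provider/model" prefixes pick a
// backend directly; bare names are auto-detected by probing each provider.
type Backend = "openai" | "google" | "anthropic" | "sub" | "ollama";

interface ProviderLike {
  hasModel(model: string): Promise<boolean>;
}

async function resolveModel(
  name: string,
  providers: Record<Backend, ProviderLike>,
): Promise<{ backend: Backend; model: string }> {
  const slash = name.indexOf("/");
  if (slash !== -1) {
    // e.g. "openai/gpt-5" -> backend "openai", model "gpt-5"
    return { backend: name.slice(0, slash) as Backend, model: name.slice(slash + 1) };
  }
  // Auto-detect: try each provider in turn until one knows the model.
  for (const backend of Object.keys(providers) as Backend[]) {
    if (await providers[backend].hasModel(name)) {
      return { backend, model: name };
    }
  }
  throw new Error(`No provider found for model "${name}"`);
}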

Setup Details

Option A: npx (recommended)

npx hydramcp setup                           # interactive wizard
claude mcp add hydramcp -- npx hydramcp      # register with Claude Code

Config is saved to ~/.hydramcp/.env and persists across npx runs.

Option B: Clone

git clone https://github.com/Pickle-Pixel/HydraMCP.git
cd HydraMCP
npm install && npm run build
claude mcp add hydramcp -- node /path/to/HydraMCP/dist/index.js

Verify

Restart Claude Code and say "list models". You should see everything you configured.

Architecture

HydraMCP wraps all providers in a SmartProvider layer that adds:

  • Circuit breaker — per-model failure tracking. After 3 failures, the model is disabled for 60s and auto-recovers.
  • Response cache — SHA-256 keyed, 15-minute TTL. Identical queries are served instantly.
  • Metrics — per-model query counts, latency, token usage, cache hit rates.
  • Response distillation — set max_response_tokens on any query and a cheap model compresses the response while preserving code, errors, and specifics.
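
A stripped-down sketch of that wrapping, using the thresholds described above (the names and structure are illustrative, not HydraMCP's actual source):

import { createHash } from "node:crypto";

// Illustrative SmartProvider sketch: per-model circuit breaker
// (3 failures -> 60s cooldown, then auto-recovery) and a SHA-256 keyed
// response cache with a 15-minute TTL.
const FAILURE_LIMIT = 3;
const COOLDOWN_MS = 60_000;
const CACHE_TTL_MS = 15 * 60_000;

interface InnerProvider {
  query(model: string, prompt: string): Promise<string>;
}

class SmartProviderSketch {
  private breakers = new Map<string, { failures: number; openUntil: number }>();
  private cache = new Map<string, { response: string; expires: number }>();

  constructor(private inner: InnerProvider) {}

  async query(model: string, prompt: string): Promise<string> {
    const breaker = this.breakers.get(model) ?? { failures: 0, openUntil: 0 };
    if (Date.now() < breaker.openUntil) {
      throw new Error(`${model} is temporarily disabled (circuit open)`);
    }

    // Cache key: SHA-256 over model + prompt; hits are served while the TTL holds.
    const key = createHash("sha256").update(`${model}\n${prompt}`).digest("hex");
    const hit = this.cache.get(key);
    if (hit && hit.expires > Date.now()) return hit.response;

    try {
      const response = await this.inner.query(model, prompt);
      this.breakers.set(model, { failures: 0, openUntil: 0 });
      this.cache.set(key, { response, expires: Date.now() + CACHE_TTL_MS });
      return response;
    } catch (err) {
      breaker.failures += 1;
      if (breaker.failures >= FAILURE_LIMIT) {
        breaker.openUntil = Date.now() + COOLDOWN_MS; // reopens automatically after 60s
        breaker.failures = 0;
      }
      this.breakers.set(model, breaker);
      throw err;
    }
  }
}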

Contributing

Want to add a provider? The interface is three methods:

interface Provider {
  // Return true if the backend is reachable and ready to serve queries.
  healthCheck(): Promise<boolean>;
  // Enumerate the models this provider exposes.
  listModels(): Promise<ModelInfo[]>;
  // Send a single prompt to one model and return its response.
  query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse>;
}

See src/providers/ollama.ts for a working example. Implement it, register in src/index.ts, done.

Providers we'd love to see: LM Studio, OpenRouter, Groq, Together AI, or anything that speaks HTTP.
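
As a rough starting point for one of those, here is a hypothetical provider for an OpenAI-compatible endpoint such as LM Studio's local server. The base URL, the import path, and the exact ModelInfo / QueryResponse fields are assumptions to check against the real types in the repo.

// Hypothetical sketch of a provider for an OpenAI-compatible server such as
// LM Studio (default base URL http://localhost:1234/v1). Adapt the field
// mapping to the actual ModelInfo / QueryResponse types in src/.
import type { Provider, ModelInfo, QueryOptions, QueryResponse } from "../types.js";

export class LMStudioProvider implements Provider {
  constructor(private baseUrl = "http://localhost:1234/v1") {}

  async healthCheck(): Promise<boolean> {
    try {
      const res = await fetch(`${this.baseUrl}/models`);
      return res.ok;
    } catch {
      return false;
    }
  }

  async listModels(): Promise<ModelInfo[]> {
    const res = await fetch(`${this.baseUrl}/models`);
    const body = await res.json();
    // OpenAI-compatible servers return { data: [{ id: "..." }, ...] }
    return body.data.map((m: { id: string }) => ({ id: m.id, provider: "lmstudio" }) as ModelInfo);
  }

  async query(model: string, prompt: string, options?: QueryOptions): Promise<QueryResponse> {
    const res = await fetch(`${this.baseUrl}/chat/completions`, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    });
    const body = await res.json();
    return { model, content: body.choices[0].message.content } as QueryResponse;
  }
}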

License

MIT