Show HN: AgentDX – Open-source linter and LLM benchmark for MCP servers
MCP servers are proliferating fast, but most have vague tool descriptions and incomplete schemas that make LLMs pick the wrong tool or fill parameters incorrectly.
AgentDX is a CLI that measures this. Two commands:
- `npx agentdx lint` — static analysis of tool descriptions, schemas, and naming. 18 rules, zero config, no API key. Produces a lint score.
- `npx agentdx bench` — sends your tool definitions to an LLM (Anthropic, OpenAI, or Ollama) and evaluates tool selection accuracy, parameter correctness, ambiguity handling, multi-tool orchestration, and error recovery. Produces an Agent DX Score (0-100).
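To make the lint side concrete, here is a rough sketch of what a static check over tool descriptions and schemas could look like. It is illustrative only and not AgentDX's actual rule set: the rule names, the 20-character threshold, and the `lintTool` helper are all assumptions. The only thing taken as given is that an MCP tool definition carries a `name`, a `description`, and a JSON Schema `inputSchema`.

```typescript
// Minimal shape of an MCP tool definition: name, description, JSON Schema input.
interface ToolDef {
  name: string;
  description?: string;
  inputSchema?: {
    type?: string;
    properties?: Record<string, { type?: string; description?: string }>;
    required?: string[];
  };
}

interface Finding {
  tool: string;
  rule: string;
  message: string;
}

// Hypothetical rules for illustration; NOT the rules AgentDX ships.
function lintTool(tool: ToolDef): Finding[] {
  const findings: Finding[] = [];
  const add = (rule: string, message: string): void => {
    findings.push({ tool: tool.name, rule, message });
  };

  // Rule: the description must exist and say more than a couple of words.
  const desc = (tool.description ?? "").trim();
  if (desc.length === 0) {
    add("description-missing", "tool has no description");
  } else if (desc.length < 20) { // threshold is an arbitrary assumption
    add("description-too-short", `description is only ${desc.length} characters`);
  }

  // Rule: every parameter should declare a type and a description.
  const props = tool.inputSchema?.properties ?? {};
  for (const [param, schema] of Object.entries(props)) {
    if (!schema.type) add("param-missing-type", `parameter "${param}" has no type`);
    if (!schema.description) add("param-missing-description", `parameter "${param}" has no description`);
  }

  // Rule: every required parameter must be declared in properties.
  for (const req of tool.inputSchema?.required ?? []) {
    if (!(req in props)) add("required-not-declared", `required parameter "${req}" is not in properties`);
  }

  return findings;
}

// A deliberately vague tool definition and a better-specified one.
const vagueTool: ToolDef = {
  name: "get",
  description: "Gets data",
  inputSchema: { type: "object", properties: { q: {} }, required: ["q", "limit"] },
};

const clearTool: ToolDef = {
  name: "search_issues",
  description: "Search issues in a repository by free-text query and return matching titles.",
  inputSchema: {
    type: "object",
    properties: { query: { type: "string", description: "Free-text search query" } },
    required: ["query"],
  },
};

const vagueFindings = lintTool(vagueTool);
const clearFindings = lintTool(clearTool);
console.log(vagueFindings.map((f) => f.rule));
console.log(clearFindings.length);
```

In this sketch the vague tool trips four checks (short description, untyped and undescribed parameter, undeclared required parameter) while the clearer one passes cleanly. How findings would roll up into a lint score is not shown here.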
It auto-detects the server entry point, spawns it, connects as an MCP client, and reads tools via the protocol. Bench auto-generates test scenarios from your tool definitions.
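As a rough illustration of the tool selection part of bench, the sketch below scores a set of scenarios against whatever picks the tool. Everything here is an assumption made for the example: the `BenchScenario` shape, the `selectionAccuracy` helper, the tool names, and the plain percentage. In a real run the picker would be an (async) LLM call; a keyword stub stands in for it so the example runs without an API key. It does not show how AgentDX actually generates scenarios or weights the final Agent DX Score.

```typescript
// Hypothetical scenario shape: a user prompt and the tool a correct agent should choose.
interface BenchScenario {
  prompt: string;
  expectedTool: string;
}

// Stand-in for the model: maps a prompt to a tool name, or null if it picks none.
type ToolPicker = (prompt: string) => string | null;

// Percentage (0-100) of scenarios where the picked tool matches the expected one.
function selectionAccuracy(scenarios: BenchScenario[], pick: ToolPicker): number {
  if (scenarios.length === 0) return 0;
  let correct = 0;
  for (const s of scenarios) {
    if (pick(s.prompt) === s.expectedTool) correct++;
  }
  return Math.round((correct / scenarios.length) * 100);
}

// Keyword stub used instead of an LLM call (assumption for this example).
const keywordPicker: ToolPicker = (prompt) => {
  const p = prompt.toLowerCase();
  if (p.includes("search")) return "search_issues";
  if (p.includes("create")) return "create_issue";
  return null; // no tool selected
};

const benchScenarios: BenchScenario[] = [
  { prompt: "Search for open bugs about login", expectedTool: "search_issues" },
  { prompt: "Create a ticket for the crash on startup", expectedTool: "create_issue" },
  { prompt: "Close issue 42", expectedTool: "close_issue" },
  { prompt: "Search issues assigned to me", expectedTool: "search_issues" },
];

const accuracy = selectionAccuracy(benchScenarios, keywordPicker);
console.log(accuracy); // 3 of 4 scenarios match, so 75
```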
Built in TypeScript, MIT licensed. Early alpha — the bench command works but is slow (sequential LLM calls, parallelization is next). Feedback welcome.