GitHub - fsilavong/agent-eval: Your own eval engineer


Agent Eval is a skill for evaluating agentic AI pipelines, both component by component and end to end. It helps you define what to measure, build or sample eval cases, run repeatable tests, track regressions over time, and turn the results into grounded takeaways: what improved, what regressed, and what to change next.
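To make the component-level versus end-to-end distinction concrete, here is a minimal Python sketch of the general idea. It is not Agent Eval's own interface: the toy pipeline (`retrieve`, `answer`), the `EvalCase` type, and the `pass_rate` and `find_regressions` helpers are all hypothetical names invented for this illustration, and the baseline scores are made-up values.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One eval case: an input and the output we expect for it."""
    name: str
    input: str
    expected: str


def pass_rate(fn: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Fraction of cases where fn(input) exactly matches the expected output."""
    if not cases:
        return 0.0
    passed = sum(1 for case in cases if fn(case.input) == case.expected)
    return passed / len(cases)


def find_regressions(
    baseline: dict[str, float],
    current: dict[str, float],
    tolerance: float = 0.0,
) -> dict[str, float]:
    """Return each metric whose score dropped by more than `tolerance`, with the size of the drop."""
    return {
        name: baseline[name] - current[name]
        for name in baseline
        if name in current and baseline[name] - current[name] > tolerance
    }


# Hypothetical two-step pipeline: a retriever component feeding an answerer.
def retrieve(query: str) -> str:
    docs = {"capital of france": "Paris is the capital of France."}
    return docs.get(query.lower(), "")


def answer(context: str) -> str:
    return context.split(" ")[0] if context else "unknown"


def pipeline(query: str) -> str:
    return answer(retrieve(query))


# Component-level cases test the retriever alone; end-to-end cases test the whole pipeline.
component_cases = [
    EvalCase("retrieve-france", "capital of France", "Paris is the capital of France."),
]
end_to_end_cases = [
    EvalCase("e2e-france", "capital of France", "Paris"),
    EvalCase("e2e-unknown", "capital of Atlantis", "unknown"),
]

current = {
    "retrieve": pass_rate(retrieve, component_cases),
    "end_to_end": pass_rate(pipeline, end_to_end_cases),
}
baseline = {"retrieve": 1.0, "end_to_end": 1.0}  # made-up scores from a previous run
regressions = find_regressions(baseline, current)
```

Scoring the component and the full pipeline separately is what lets a drop in the end-to-end score be traced back to the step that caused it.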

What it offers

[Image: Agent Eval demo]

Install

Install manually by cloning into ~/.claude/skills:

git clone https://github.com/fsilavong/agent-eval.git ~/.claude/skills/agent-eval

Install with the Vercel Skills CLI:

npx skills add fsilavong/agent-eval