Agenteval.org: An Open-Source Benchmarking Initiative for AI Agent Evaluation

scorecard.io

6 points by Rutledge 10 months ago · 1 comment

Rutledge (OP) 10 months ago

This initiative is designed to be community-driven, so we'd welcome your feedback on which agent benchmarking needs exist in your domain. We are starting with legal AI and plan to expand to other industries that need benchmarks for AI agent evaluation.
