OpenGameEval: Eval Framework to Benchmark Agentic AI Assistants

7 points by moneil971 7 months ago · 1 comment

OpenGameEval offers a unique testing ground for evaluating core model capabilities in agentic reasoning and long-horizon task solving.
