OpenGameEval: Eval Framework to Benchmark Agentic AI Assistants

corp.roblox.com

7 points by moneil971 18 hours ago · 1 comment

moneil971 (OP) 18 hours ago

OpenGameEval offers a unique testing ground to evaluate core model capabilities related to agentic reasoning and long-horizon task solving.
