Ivan Bercovich (@neversupervised)
Article: What Makes a Good Terminal Bench Task

Most people write benchmark tasks the way they write prompts. They shouldn't. A prompt is designed to help the agent succeed. A benchmark is designed to find out if it can. I've been a contributor and...

8:35 PM · Mar 21, 2026 · 83K Views