xdotli

Karma
14
Created
2 years ago

About

Founder of BenchFlow.ai, a benchmark company.

Recent Submissions

  1. Chaos of Agent (agentsofchaos.baulab.info)
  2. Native CLI scaffolds consistently outperform OpenCode when using the same model (arxiv.org)
  3. We compare model quality in Cursor (cursor.com)
  4. Automatically Learning Skills for Coding Agents (gepa-ai.github.io)
  5. We Reached 74.8% on terminal-bench with Terminus-KIRA (krafton-ai.github.io)
  6. Self-generated skills don't do much for AI agents, but human-curated skills do (theregister.com)
  7. First Agent Skills Hackathon by the Authors of SkillsBench (skillathon.ai)
  8. The First Agent Skills Benchmark (huggingface.co)
  9. GPT-5.2 got worse on Terminal Bench 2.0, so is GPT-5.2 Pro (twitter.com)
  10. Claude Skills as a Meta Tool (leehanchung.github.io)
