xdotli
- Karma: 14
- Created: 2 years ago
About
Founder BenchFlow.ai, a benchmark company.
Recent Submissions
- 1. ▲ Chaos of Agent (agentsofchaos.baulab.info)
- 2. ▲ Native CLI scaffolds consistently outperform OpenCode when using the same model (arxiv.org)
- 3. ▲ We compare model quality in Cursor (cursor.com)
- 4. ▲ Automatically Learning Skills for Coding Agents (gepa-ai.github.io)
- 5. ▲ We Reached 74.8% on terminal-bench with Terminus-KIRA (krafton-ai.github.io)
- 6. ▲ Self-generated skills don't do much for AI agents, but human-curated skills do (theregister.com)
- 7. ▲ First Agent Skills Hackathon by the Authors of SkillsBench (skillathon.ai)
- 8. ▲ The First Agent Skills Benchmark (huggingface.co)
- 9. ▲ GPT-5.2 got worse on Terminal Bench 2.0, so is GPT-5.2 Pro (twitter.com)
- 10. ▲ Claude Skills as a Meta Tool (leehanchung.github.io)