Show HN: ZetaCrush – An Intelligent LLM Leaderboard
zetacrush.com
Hi all, I wanted to share the leaderboard I have created and am developing to rank LLMs. My results are very similar to those of ARC-AGI 2, the only exception being that DeepSeek is rated higher on my leaderboard. The test is kept closed-source. The plan is that once the top models max out on a given task in the test, we will adopt new criteria to differentiate them.
The test currently comprises 10 scores; on 9 of them, no model scores above 0. Check it out and let me know what you think! Thanks.