SWE-bench will hit 90% this year

fabraix.com

6 points by asfsf23423 2 months ago · 5 comments

upmind 2 months ago

Maybe an unpopular opinion, but I think SWE-bench has done its part at this point and we need a new benchmark, because Gemini scoring at or near the same level as Claude is obviously wrong.

  • lern_too_spel 2 months ago

    Gemini at the same level as Claude is believable. Gemini CLI is not at the same level as Claude Code.

  • amazingamazing 2 months ago

    I use both and think they’re comparable. AMA.

    • zachdotai 2 months ago

      Not sure which version of Gemini you are using, but Claude is so much better for me. Gemini is generally overeager to make a code change even when I am just asking conceptual questions, among other issues.
