SWE-AGI Leaderboard

Typical Core LOC per Task: 10^3 to 10^4

[Figure: Behavior Composition by Model. Each model bar is normalized to 100%; color encodes behavior category, and each segment carries a percentage and a raw action count.]
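The normalization behind the chart can be illustrated with a minimal sketch. The category names and counts below are hypothetical placeholders, not values from the leaderboard; the only thing taken from the page is that each model's raw action counts are rescaled so the bar sums to 100%.

```python
def normalize_counts(counts):
    """Convert raw action counts into percentage shares that sum to 100."""
    total = sum(counts.values())
    if total == 0:
        # No recorded actions: return zero shares rather than dividing by zero.
        return {category: 0.0 for category in counts}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical action counts for a single model (illustrative only).
example_counts = {"read": 120, "edit": 60, "test": 20}
shares = normalize_counts(example_counts)

for category, share in shares.items():
    print(f"{category}: {share:.1f}% ({example_counts[category]} actions)")
```

Because every bar is rescaled independently, the chart compares how models split their effort, not how many actions they took in total.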

| Model | Agent | Organization | Tasks Passed | Test Case Pass Rate | Total Cost | Total Time |
|---|---|---|---|---|---|---|
| GPT-5.3 Codex | Codex CLI | OpenAI | 19/22 | 95.6% | $213.07 | 24.8h |
| GPT-5.2 Codex | Codex CLI | OpenAI | 17/22 | 96.4% | $435.72 | 108.6h |
| Claude Opus 4.6 | Claude Code | Anthropic | 15/22 | 90.8% | $2055.81 | 76.4h |
| Claude Opus 4.5 | Claude Code | Anthropic | 10/22 | 81.7% | $507.94 | 26.8h |
| Gemini 3 Flash | Gemini CLI | Google | 2/6 | 49.8% | $31.61 | 1.5h |
| GLM-4.7 | Claude Code | Zhipu AI | 2/6 | 64.2% | $4.86 | 4.2h |
| Kimi K2.5 | Kimi Code CLI | Moonshot AI | 2/6 | 92.0% | N/A | 5.9h |
| DeepSeek V3.2 | Claude Code | DeepSeek | 1/6 | 16.7% | $4.12 | 20.2h |
| Claude Sonnet 4.5 | Claude Code | Anthropic | 0/6 | 76.1% | $40.67 | 1.9h |
| Gemini 3 Pro | Gemini CLI | Google | 0/6 | 16.5% | N/A | 1.8h |
| Qwen3 Max | Claude Code | Alibaba | 0/6 | 13.9% | $368.37 | 15.5h |

Easy Tier

| Model | Agent | Tasks Passed | Test Case Pass Rate | Avg Time | Avg LOC | Cost |
|---|---|---|---|---|---|---|
| Claude Opus 4.5 | Claude Code | 6/6 | 100.0% | 0.39h | 1092 | $56.69 |
| Claude Opus 4.6 | Claude Code | 6/6 | 100.0% | 0.45h | 1781 | $48.61 |
| Claude Sonnet 4.5 | Claude Code | 0/6 | 76.1% | 0.32h | 930 | $40.67 |
| DeepSeek V3.2 | Claude Code | 1/6 | 16.7% | 3.4h | 1070 | $4.12 |
| Gemini 3 Flash | Gemini CLI | 2/6 | 49.8% | 0.25h | 558 | $31.61 |
| Gemini 3 Pro | Gemini CLI | 0/6 | 16.5% | 0.30h | 710 | N/A |
| GLM-4.7 | Claude Code | 2/6 | 64.2% | 0.70h | 904 | $4.86 |
| GPT-5.2 Codex | Codex CLI | 6/6 | 100.0% | 0.81h | 1081 | $33.51 |
| GPT-5.3 Codex | Codex CLI | 6/6 | 100.0% | 0.28h | 1305 | $15.00 |
| Kimi K2.5 | Kimi Code CLI | 2/6 | 92.0% | 0.99h | 1163 | N/A |
| Qwen3 Max | Claude Code | 0/6 | 13.9% | 2.6h | 850 | $368.37 |

Medium Tier

| Model | Agent | Tasks Passed | Test Case Pass Rate | Avg Time | Avg LOC | Cost |
|---|---|---|---|---|---|---|
| Claude Opus 4.5 | Claude Code | 3/8 | 82.6% | 1.3h | 3304 | $208.43 |
| Claude Opus 4.6 | Claude Code | 5/8 | 93.6% | 3.5h | 4867 | $1183.94 |
| GPT-5.2 Codex | Codex CLI | 7/8 | 98.9% | 5.1h | 4702 | $287.17 |
| GPT-5.3 Codex | Codex CLI | 8/8 | 100.0% | 1.2h | 2575 | $114.14 |

Hard Tier

| Model | Agent | Tasks Passed | Test Case Pass Rate | Avg Time | Avg LOC | Cost |
|---|---|---|---|---|---|---|
| Claude Opus 4.5 | Claude Code | 1/8 | 67.0% | 1.7h | 6603 | $242.82 |
| Claude Opus 4.6 | Claude Code | 4/8 | 81.2% | 5.7h | 10103 | $823.26 |
| GPT-5.2 Codex | Codex CLI | 4/8 | 91.2% | 7.8h | 9034 | $115.04 |
| GPT-5.3 Codex | Codex CLI | 5/8 | 87.9% | 1.7h | 6255 | $83.94 |
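
For the four models that ran all three tiers, the overall Tasks Passed and Total Cost figures appear to be sums of the per-tier rows. The page does not state this, so the sketch below treats it as an assumption and checks it using the numbers copied from the Easy, Medium and Hard tables. The `TIERS` layout and the `aggregate` helper are illustrative names, not part of the benchmark.

```python
# Per-tier results copied from the Easy, Medium and Hard tables above:
# (tasks passed, tasks in tier, cost in USD)
TIERS = {
    "GPT-5.3 Codex": [(6, 6, 15.00), (8, 8, 114.14), (5, 8, 83.94)],
    "GPT-5.2 Codex": [(6, 6, 33.51), (7, 8, 287.17), (4, 8, 115.04)],
    "Claude Opus 4.6": [(6, 6, 48.61), (5, 8, 1183.94), (4, 8, 823.26)],
    "Claude Opus 4.5": [(6, 6, 56.69), (3, 8, 208.43), (1, 8, 242.82)],
}


def aggregate(tiers):
    """Sum passed tasks, task counts and cost across difficulty tiers."""
    passed = sum(p for p, _, _ in tiers)
    attempted = sum(a for _, a, _ in tiers)
    cost = round(sum(c for _, _, c in tiers), 2)
    return passed, attempted, cost


for model, tiers in TIERS.items():
    passed, attempted, cost = aggregate(tiers)
    print(f"{model}: {passed}/{attempted} tasks, ${cost:.2f}")
```

The summed task counts match the overall table for all four models. The summed costs match to within one cent; the GPT-5.3 Codex tiers add up to $213.08 against a reported $213.07, which is consistent with per-tier rounding.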