Laguna XS.2, Poolside, 256K ctx, Open weight, Low confidence, Sparse benchmark evidence
81
70
MiMo-V2.5, Xiaomi, 1M ctx, Low confidence, Not on BenchLM's leaderboard for this metric
68
Gemma 4 31B, Google, 256K ctx, Open weight, Low confidence, Sparse benchmark evidence
68
Mistral Small 4, Mistral, 256K ctx, Open weight, Low confidence, No trusted benchmarks for this metric
67
Grok 4.3, xAI, 1M ctx, Low confidence, Sparse benchmark evidence
66
Ling 2.6 Flash, InclusionAI, 262K ctx, Open weight, Low confidence, Sparse benchmark evidence
66
Step 3.7 Flash, StepFun, 256K ctx, Open weight, Low confidence, Sparse benchmark evidence
63
MiMo-V2.5-Pro, Xiaomi, 1M ctx, Low confidence, Sparse benchmark evidence
63
Phi-4, Microsoft, 16K ctx, Open weight
62
62
61
60
60
59
Hy3 Preview, Tencent, 256K ctx, Open weight, Low confidence, Not on BenchLM's leaderboard for this metric
59
59
59
59
Gemma 4 26B A4B, Google, 256K ctx, Open weight, Low confidence, Sparse benchmark evidence
59
58
Laguna M.1, Poolside, 256K ctx, Low confidence, Sparse benchmark evidence
58
GLM-4.7, Z.AI, 200K ctx, Open weight
58
58
57
57
57
55
55
55
54
54
54
GPT-5.4 mini, OpenAI, 400K ctx, Low confidence, Not on BenchLM's leaderboard for this metric
54
54
54
53
53
53
53
53
53
GLM-5.1, Z.AI, 203K ctx, Open weight
52
Kimi K2.6, Moonshot AI, 256K ctx, Open weight
52
GPT-5.4 nano, OpenAI, 400K ctx, Low confidence, Not on BenchLM's leaderboard for this metric
52
52
52
51
50
49
49
Qwen3.5 Flash, Alibaba, 1M ctx, Low confidence, No trusted benchmarks for this metric
49
49
Mistral Medium 3, Mistral, 128K ctx, Low confidence, No trusted benchmarks for this metric
49
49
Kimi K2.5, Moonshot AI, 256K ctx, Open weight
49
48
o3, OpenAI, 200K ctx
48
48
45
44
44
44
43
GLM-5, Z.AI, 200K ctx, Open weight
42
40
GPT-5.5 Pro, OpenAI, 1M ctx, Low confidence, Sparse benchmark evidence
39
38
38
37
Command A+, Cohere, 128K ctx, Open weight, Low confidence, Sparse benchmark evidence
36
35
35
32
28
28
27
26
21
o1, OpenAI, 200K ctx
20
4