Models Pie · Compare & Rank LLMs by Cost, Speed & Quality

1

Laguna XS.2 · Poolside · 256K ctx · Open weight · Low confidence · Sparse benchmark evidence

81

2

70

3

MiMo-V2.5 · Xiaomi · 1M ctx · Low confidence · Not on BenchLM's leaderboard for this metric

68

4

Gemma 4 31B · Google · 256K ctx · Open weight · Low confidence · Sparse benchmark evidence

68

5

Mistral Small 4 · Mistral · 256K ctx · Open weight · Low confidence · No trusted benchmarks for this metric

67

6

Grok 4.3 · xAI · 1M ctx · Low confidence · Sparse benchmark evidence

66

7

Ling 2.6 Flash · InclusionAI · 262K ctx · Open weight · Low confidence · Sparse benchmark evidence

66

8

Step 3.7 Flash · StepFun · 256K ctx · Open weight · Low confidence · Sparse benchmark evidence

63

9

MiMo-V2.5-Pro · Xiaomi · 1M ctx · Low confidence · Sparse benchmark evidence

63

10

Phi-4 · Microsoft · 16K ctx · Open weight

62

11

62

12

61

13

60

14

60

15

59

16

Hy3 Preview · Tencent · 256K ctx · Open weight · Low confidence · Not on BenchLM's leaderboard for this metric

59

17

59

18

59

19

59

20

Gemma 4 26B A4B · Google · 256K ctx · Open weight · Low confidence · Sparse benchmark evidence

59

21

58

22

Laguna M.1 · Poolside · 256K ctx · Low confidence · Sparse benchmark evidence

58

23

GLM-4.7 · Z.AI · 200K ctx · Open weight

58

24

58

25

57

26

57

27

57

28

55

29

55

30

55

31

54

32

54

33

54

34

GPT-5.4 mini · OpenAI · 400K ctx · Low confidence · Not on BenchLM's leaderboard for this metric

54

35

54

36

54

37

53

38

53

39

53

40

53

41

53

42

53

43

GLM-5.1 · Z.AI · 203K ctx · Open weight

52

44

Kimi K2.6 · Moonshot AI · 256K ctx · Open weight

52

45

GPT-5.4 nano · OpenAI · 400K ctx · Low confidence · Not on BenchLM's leaderboard for this metric

52

46

52

47

52

48

51

49

50

50

49

51

49

52

Qwen3.5 Flash · Alibaba · 1M ctx · Low confidence · No trusted benchmarks for this metric

49

53

49

54

Mistral Medium 3 · Mistral · 128K ctx · Low confidence · No trusted benchmarks for this metric

49

55

49

56

Kimi K2.5 · Moonshot AI · 256K ctx · Open weight

49

57

48

58

o3 · OpenAI · 200K ctx

48

59

48

60

45

61

44

62

44

63

44

64

43

65

GLM-5 · Z.AI · 200K ctx · Open weight

42

66

40

67

GPT-5.5 Pro · OpenAI · 1M ctx · Low confidence · Sparse benchmark evidence

39

68

38

69

38

70

37

71

Command A+ · Cohere · 128K ctx · Open weight · Low confidence · Sparse benchmark evidence

36

72

35

73

35

74

32

75

28

76

28

77

27

78

26

79

21

80

o1 · OpenAI · 200K ctx

20

81

4