Seven Models in Three Weeks: China's AI Labs Aren't Waiting

7min.ai

2 points by fabioperez a month ago · 1 comment

fabioperezOP a month ago

Six Chinese labs shipped seven major models in the past three weeks:

Moonshot AI → Kimi K2.5 (coordinates 100 sub-agents in parallel)

z.ai → GLM-5 (lowest hallucination rate on Artificial Analysis, runs on Huawei chips)

MiniMax → M2.5 (80.2% on SWE-bench, claims ~1/10th cost of Claude Opus per task)

ByteDance → Seedance 2.0 (4K video) + Seed 2.0 (powers Doubao, 155M weekly users)

Kuaishou → Kling 3.0 (native 4K 60fps video)

Alibaba → Qwen 3.5 (397B/17B MoE, claims to beat GPT-5.2 on 80% of benchmarks)

Four of five text models are open-weight under MIT or Apache 2.0. All use MoE architectures. All under $1/M input tokens. For comparison: Claude Opus is $5 and GPT-5.2 is $1.75.
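To put the price gap in perspective, here is a back-of-envelope sketch using only the input-token prices quoted above. Note the assumptions: the $1/M figure is the post's stated ceiling for the Chinese models, and output-token pricing (typically several times higher) is ignored, so this is a floor on the real gap.

```python
# Input-token prices in $/M tokens, as quoted in the post.
PRICES = {
    "Chinese labs (ceiling)": 1.00,
    "GPT-5.2": 1.75,
    "Claude Opus": 5.00,
}

baseline = PRICES["Chinese labs (ceiling)"]
for model, price in PRICES.items():
    # Ratio relative to the $1/M ceiling for the Chinese models.
    print(f"{model}: ${price:.2f}/M input -> {price / baseline:.2f}x")
```

Even on input tokens alone, that's a 1.75x to 5x spread before any output-token or per-task differences are counted.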

The other thing worth paying attention to: every lab is building for agents now, not chatbots. Kimi K2.5 runs 100 sub-agents in parallel. Qwen 3.5 controls apps from screenshots. ByteDance calls Seed 2.0 their "agent era" model.

Most of these scores are vendor-reported, so take them with a grain of salt. But even discounting the benchmarks by 10-15%, the pricing difference is hard to explain away.
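The haircut argument is easy to check. A quick sketch applying the 10-15% discount to the one concrete score in the list (MiniMax M2.5's vendor-reported 80.2% on SWE-bench; treating the discount as a relative reduction is my assumption):

```python
claimed = 0.802  # MiniMax M2.5's vendor-reported SWE-bench score

for haircut in (0.10, 0.15):
    # Apply the discount as a relative reduction to the claimed score.
    adjusted = claimed * (1 - haircut)
    print(f"{haircut:.0%} haircut -> {adjusted:.1%}")
```

Even at the pessimistic end the adjusted score stays above 68%, while the claimed per-task cost is roughly a tenth of Claude Opus's, so the discount doesn't close the value gap on its own.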

So what actually justifies paying 5-10x more for Western models? Reliability? Safety? And honestly, how much do you trust vendor-reported benchmarks here?

Curious whether anyone here has run these Chinese models head-to-head against Opus 4.6 or GPT-5.2 to see how they hold up.
