GLM-5-Turbo
docs.z.ai

Sad to see them distracted by OpenClaw given how bad GLM5 has gotten, with complete radio silence from the zai team.
What started as a groundbreaking open-source model and an affordable coding plan was heavily quantized (including KV-cache quantization) ~14 days ago, leading to the model bleeding thinking tokens into responses, missing assistant messages, and repetition loops at very low context sizes (despite the claimed ~200k context).
The currently served GLM5 is _not_ the same model that was launched and benchmarked, nor the open-source FP8 model served on OpenRouter / Kilo Code / Chutes, etc.
A quick visit to their Discord [1] or Twitter [2] shows both the pain and the radio silence on the severe GLM5 issues.
Perhaps this is a response to GLM5 being abused by OpenClaw, which is stupidly token hungry, but it feels more like a distraction than a focus on stabilizing GLM5.
[1] https://discord.gg/QR7SARHRxK
[2] https://x.com/alkimiadev/status/2030327662808289414#m
[3] https://x.com/flickyobean/status/2029470185053405592#m
Closed Weight Only?
"Note: As an experimental version, GLM-5-Turbo is currently closed-source. All capabilities and findings will be incorporated into our next open-source model release."