Muon Is Scalable for LLM Training (github.com)
5 points by renonce 10 months ago · 1 comment

yorwba 10 months ago
For people who want to know more about the Muon optimizer: https://kellerjordan.github.io/posts/muon/