Muon Is Scalable for LLM Training (github.com)
5 points by renonce a year ago · 1 comment

yorwba a year ago
For people who want to know more about the Muon optimizer: https://kellerjordan.github.io/posts/muon/