Rethinking Language Model Scaling Under Transferable Hypersphere Optimization arxiv.org 2 points by matt_d 6 days ago · 0 comments