Beyond 80/20: High-Entropy Minority Tokens Drive Effective RL for LLM Reasoning (arxiv.org)
3 points by mdp2021 19 days ago · 0 comments