No Train No Gain: Revisiting Efficient Training Algorithms for Transformer-Based Language Models

arxiv.org

11 points by froster 2 years ago · 1 comment

frosterOP 2 years ago

This recent paper highlights how hard it is to create a new optimizer that works as a drop-in replacement for Adam. Sophia and Lion were recently proposed as superior alternatives, but both came out worse in an independent evaluation.
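For context, here's a rough NumPy sketch contrasting the AdamW and Lion update rules as described in their respective papers (the hyperparameter defaults and the toy quadratic loss below are illustrative assumptions, not values from the linked paper):

    import numpy as np

    def adamw_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
                   eps=1e-8, wd=0.0):
        # AdamW: adaptive per-coordinate step sizes from bias-corrected
        # first/second moment estimates, with decoupled weight decay.
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        param = param - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * param)
        return param, m, v

    def lion_step(param, grad, m, lr=1e-4, beta1=0.9, beta2=0.99, wd=0.0):
        # Lion: every coordinate moves by +/- lr, the sign of an
        # interpolated momentum; the momentum buffer is updated afterwards.
        update = np.sign(beta1 * m + (1 - beta1) * grad)
        param = param - lr * (update + wd * param)
        m = beta2 * m + (1 - beta2) * grad
        return param, m

    # Toy usage on a quadratic loss ||param||^2 (gradient = 2 * param).
    rng = np.random.default_rng(0)
    param = rng.normal(size=8)
    m = np.zeros_like(param)
    for step in range(1, 101):
        grad = 2 * param
        param, m = lion_step(param, grad, m)

The interfaces line up (parameters, gradients, per-parameter state), which is part of why these are pitched as drop-in replacements; matching the interface doesn't guarantee matching Adam's quality or wall-clock behaviour in practice.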
