Settings

Theme

Carbs, a hyperparameter optimizer that scales small experiments to LLMs

imbue.com

4 points by thejash 2 years ago · 1 comment

Reader

thejashOP 2 years ago

We successfully scaled from a 7B run to a 70B run on the first try, with minimal training instability and no loss spikes. We also predicted performance of the 70B model based on experiment results from much smaller models.

We accomplished this using our hyperparameter optimizer, CARBS. We’re open-sourcing CARBS today so that other small teams experimenting with novel model architectures can experiment at small scale and trust performance at large scale.

Keyboard Shortcuts

j
Next item
k
Previous item
o / Enter
Open selected item
?
Show this help
Esc
Close modal / clear selection