RedPajama at 440B tokens higher quality than Pythia and StableLM

together.xyz

6 points by jamiedg 3 years ago · 2 comments

jamiedgOP 3 years ago

A week ago we announced RedPajama, a project to create leading open-source models. We released the first step in the project: a training dataset of over 1.2 trillion tokens, following the LLaMA recipe.

Today we shared progress on training our first model on this dataset, a 7B parameter model using the Pythia architecture. So far we are a bit less than 50% of the way through training, at 440B tokens. We published HELM benchmark results on 16 different scenarios for this checkpoint, showing the model's accuracy to be quite high for this stage of training.
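
For anyone who wants to poke at the data, here is a minimal sketch of streaming a slice of the corpus. It assumes the release is hosted on the Hugging Face Hub as togethercomputer/RedPajama-Data-1T and that the slices are named after the LLaMA-recipe sources; the post above doesn't spell out the hosting details, so treat the repo and slice names as assumptions.

    # Minimal sketch (not from the announcement): stream one slice of the
    # RedPajama dataset. Repo and slice names are assumptions.
    from datasets import load_dataset

    # Stream rather than download -- the full corpus is ~1.2T tokens.
    ds = load_dataset(
        "togethercomputer/RedPajama-Data-1T",
        "common_crawl",   # assumed slice name, per the LLaMA recipe
        split="train",
        streaming=True,
    )

    # Peek at the first few documents.
    for i, record in enumerate(ds):
        print(record["text"][:200])
        if i >= 2:
            break

Streaming keeps this runnable on a laptop; materializing the whole slice would require terabytes of disk.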
