MPT-30B: Raising the bar for open-source foundation models

mosaicml.com

34 points by hansonw 3 years ago · 2 comments

thewataccount 3 years ago

It's interesting that they appear to have undertrained their 30B model, at least compared to LLaMA/Falcon.

The coding ability is better, but it's still far behind WizardCoder, which is half the size - of course, WizardCoder hadn't been released when they started training MPT-30B.

The 8k context is an interesting addition. Are there any standard benchmarks that show how coherent models remain at different context lengths - 1k, 2k, 4k, 8k, etc.?
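
One rough way to probe this, sketched below under some assumptions (a Hugging Face causal LM; the model id and input file are placeholders), is to compare the model's average next-token loss on the same long document truncated to each window size:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Placeholder model id -- any causal LM with a long enough context works.
    model_name = "mosaicml/mpt-30b"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    model.eval()

    def avg_loss_at_context(text, context_len):
        # Average next-token cross-entropy over the first `context_len` tokens.
        ids = tokenizer(text, return_tensors="pt").input_ids[:, :context_len]
        with torch.no_grad():
            out = model(input_ids=ids, labels=ids)
        return out.loss.item()

    # "long_document.txt" stands in for any document longer than 8k tokens.
    long_doc = open("long_document.txt").read()
    for n in (1024, 2048, 4096, 8192):
        print(n, avg_loss_at_context(long_doc, n))

This only measures average loss per window, not coherence at the far end of the context, so it's a coarse proxy rather than a standard benchmark.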

ShamelessC 3 years ago

Correction: Foundation _Series_ models
