MPT-30B: Raising the bar for open-source foundation models
mosaicml.com
It's interesting that they appear to have undertrained their 30B model, at least compared to LLaMA/Falcon.
Its coding ability is better, but it's still far behind WizardCoder, which is half the size - of course, WizardCoder hadn't been released when they started training MPT-30B.
The 8k context is an interesting addition. Are there any standard benchmarks to show how coherently models perform at different context lengths - 1k, 2k, 4k, 8k, etc.?
Correction: Foundation _Series_ models