MosaicML MPT-7B: A Commercially-Usable LLaMa-Quality Model

mosaicml.com

119 points by ml_hardware 3 years ago · 11 comments

jpdus 3 years ago

I wonder why this is getting so little traction here.

These models seem to easily beat all other available open-source models, and the blog post is extremely well written, with very good documentation and fine-tuning instructions.

Well done, MosaicML. I am excited to see what comes next and will definitely test out your platform!

  • deepsquirrelnet 3 years ago

    I'm perplexed as well. Here's a commercially licensed model that is competitive with LLaMa 7B (better in half of the major benchmarks), has been tuned into several variants, and accepts 2048-token inputs.

    This is BY FAR the best model of its size that is usable by businesses. I plan to start testing it out soon.

djoldman 3 years ago

Looks competitive with LLaMa:

https://assets-global.website-files.com/61fd4eb76a8d78bc0676...

  • meghan_rain 3 years ago

    What about GPT3.5? I know it's worse, but by how much?

    • thewataccount 3 years ago

      Going purely by the benchmarks from OP - you can essentially consider MPT equivalent to LLaMa. It might be better/worse depending on the specific task but not by much.

      So compared to GPT3.5 - it's not great at all. That said, LLaMa showed significant improvements via fine-tuning and I expect those to apply here as well.

      EDIT: Oh, I forgot this is 7B. I personally haven't spent much time with 7B LLaMa because my hardware can do 15/30B, and honestly 15B LLaMa is very noticeably better, to the point where if you can run it you shouldn't bother with 7B. So this really can't compare to GPT3.5 without fine-tuning, and even then it'll be behind (based on LLaMa models).

ml_hardware (OP) 3 years ago

The repo for training and finetuning this model is open source here: https://github.com/mosaicml/llm-foundry

vsroy 3 years ago

This has a context window of 65K for the storywriter version.

ftxbro 3 years ago

How can I run some inference with this model locally? Do I have to make a huggingface account?
