Apple/OpenELM: Efficient Open-Source Family Language Models

huggingface.co

82 points by panqueca 2 years ago · 15 comments

Roshni1990r 2 years ago

OpenELM, a family of efficient language models developed by Apple, is trending on Hugging Face!

OpenELM offers models from 270M to 3B parameters, both pre-trained and instruction-tuned, with good results across various benchmarks.

My Feedback:

First Phi 3, now OpenELM. It's great to see these small models improving. I know they're not ready for production in all cases, but they're really great for specific tasks.

I see small open-source models as the future because they offer better speed, require less compute, and use fewer resources, making them more accessible and practical for a wider range of applications.

What do you think about this? Would you consider using small open-source models? If so, what are you thinking of building?

I am going to use it on my smartphone.

monkeydust 2 years ago

https://github.com/apple/corenet/tree/main/projects/openelm

panquecaOP 2 years ago

ArXiv Paper:

https://arxiv.org/abs/2404.14619

unraveller 2 years ago

Why'd it drop today? One supposes that instead of pressing shift+delete on their repo they click publish now so they get to write the headline that 2 big tech companies release small language models on the same day.

  • orra 2 years ago

    I presume they're releasing it because they trained it using the just-announced CoreNet library.

    However, the model is proprietary. I'm tired of the open washing.

    • orra 2 years ago

      I retract the claim of proprietary. I misunderstood some of the licence wording. The license appears to be well accepted as a permissive open source license. https://spdx.org/licenses/AML.html

      • vulcan01 2 years ago

        Yep, AML is basically MIT + some stuff about patents. One wonders why they do not use the Apache license instead.

buildbot 2 years ago

Huh, they used the Pile - that's a pretty interesting choice for a corporate research team?

gnabgib 2 years ago

Article title(h1): OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework

sunflowerfly 2 years ago

Any idea how much ram this requires?

  • vineyardmike 2 years ago

    The model sizes are:

    > 270M, 450M, 1.1B and 3B parameters

    Which roughly translates to 3GB for the highest end one, depending on context length used.

    • SushiHippie 2 years ago

      * ~3GB with 8-bit quantization. Without quantization it is ~6GB [0].

      8 bits = 1 byte

      3 billion * 1 byte = 3 gigabytes

      + Some memory for the context of the LLM

      [0]

      3b-instruct has a total file size of 4.94GB + 1.13GB which is 6.07GB which can be seen here:

      https://huggingface.co/apple/OpenELM-3B-Instruct/tree/main

      A bit of overhead will always be there, as you probably want to store some metadata next to the raw weights.
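The back-of-the-envelope arithmetic above can be sketched as a small Python helper. This is an illustrative estimate only (the function name and the decimal-GB convention are my assumptions); it covers the raw weights and ignores KV-cache and metadata overhead, which is why the actual repo files come to ~6.07GB rather than exactly 6GB.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Estimate memory for raw model weights in decimal GB.

    Ignores context (KV cache) and file metadata overhead,
    so real usage will be somewhat higher.
    """
    bytes_per_param = bits_per_param / 8
    return params_billions * 1e9 * bytes_per_param / 1e9


# OpenELM-3B, matching the comments above:
print(weight_memory_gb(3.0, 8))   # 8-bit quantized: 3.0 GB
print(weight_memory_gb(3.0, 16))  # fp16/bf16: 6.0 GB
```

The same helper also shows why the smaller variants (270M, 450M, 1.1B) fit comfortably on a phone even without quantization.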
