GLM-4-9B: open-source model with superior performance to Llama-3-8B

github.com

66 points by marcelsalathe 2 years ago · 17 comments

ilaksh 2 years ago

Looks like terrific technology. However, the translation says that it's an "irrevocable revocable" non-commercial license with a form to apply for commercial use.

  • pilotneko 2 years ago

    That's weird, because the repository states Apache 2.0. https://github.com/THUDM/GLM-4/blob/main/LICENSE

    Oops, you are right. The code is Apache 2.0; the license for the model weights is separate.

  • mikeqq2024 2 years ago

    "non-exclusive, worldwide, irrevocable, non-sublicensable, revocable, photo-free copyright license."

    Translation error? GPT's output: "non-exclusive, global, non-transferable, non-sublicensable, revocable, royalty-free license."

great_psy 2 years ago

I’m excited to hear work is being done on models that support function calling natively.

Does anybody know if performance could be greatly increased if only a single language were supported?

I suspect there’s a high demand for models that are maybe smaller and can run faster if the tradeoff is support for only English.

Is this available in Ollama?

  • freeqaz 2 years ago

    Are there any other models that support function calling?

    • wtarreau 2 years ago

      I ran some tests on phi-3 and mistral-7b, and it's not very hard to teach them to use tools, even though they were not designed for this. It turns out these models follow instructions quite well: when you explain that, whenever they need to look up data on the net or perform a calculation, they must express that request with a specific syntax, they do a pretty good job. You just enable reverse-prompting so that evaluation stops right after the request, your tools do the work (or you simulate it manually), and the model continues its task.
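
      A minimal sketch of that loop, assuming the llama-cpp-python bindings
      (their "stop" strings play the role of llama.cpp's reverse prompt); the
      model path and the CALL/AWAIT_TOOL convention are illustrative, not
      something from the comment:

        # Hypothetical reverse-prompting tool loop; the file name and the
        # CALL/AWAIT_TOOL syntax are made up for illustration.
        from llama_cpp import Llama

        llm = Llama(model_path="mistral-7b-instruct.Q4_K_M.gguf", n_ctx=4096)

        SYSTEM = (
            "If you need to look up data or perform a calculation, write one "
            "line of the form CALL: tool(arguments), then the line AWAIT_TOOL, "
            "and stop. The result will be appended so you can continue."
        )
        prompt = f"{SYSTEM}\n\nUser: What is 37 * 113?\nAssistant:"

        # Generation halts as soon as the model emits the marker.
        out = llm(prompt, max_tokens=128, stop=["AWAIT_TOOL", "\nUser:"])
        text = out["choices"][0]["text"]

        if "CALL:" in text:
            result = str(37 * 113)  # run the tool, or simulate it manually
            prompt += text + f"\nResult: {result}\nAssistant:"
            final = llm(prompt, max_tokens=128, stop=["\nUser:"])
            print(final["choices"][0]["text"])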

abrichr 2 years ago

> GLM-4V-9B possesses dialogue capabilities in both Chinese and English at a high resolution of 1120*1120. In various multimodal evaluations, including comprehensive abilities in Chinese and English, perception & reasoning, text recognition, and chart understanding, GLM-4V-9B demonstrates superior performance compared to GPT-4-turbo-2024-04-09, Gemini 1.0 Pro, Qwen-VL-Max, and Claude 3 Opus.

But according to their own evaluation further down, gpt-4o-2024-05-13 outperforms GLM-4V-9B on every task except OCRBench.

  • czl 2 years ago

    Based on size (parameter count), they compete in different weight classes: GLM-4V-9B is in the lightweight division, while gpt-4o-2024-05-13 fights at middleweight or heavyweight.

norwalkbear 2 years ago

Isn't Llama-3-70B so good that the Reddit llama crowd is saying people should buy hardware just to run it?

Llama-3-8B was garbage for me, but damn, 70B is good enough.

  • reaperman 2 years ago

    The unquantized Llama-3-70B requires about 142 GB of VRAM. Some of the quantized versions are quite decent, but they do tend to degrade once overquantized below around 26.5 GB of VRAM (~3 bits per weight).

    So at minimum you'd be looking at dual RTX 3090s with NVLink for about $4,000. Or, for the highest-performing non-quantized model, you'd spend about $40,000 on two A100s.
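
    Back-of-the-envelope math behind those figures, counting weights only
    (KV cache and runtime overhead push real usage higher), as a quick
    Python check:

      params = 70e9  # Llama-3-70B parameter count

      for name, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4), ("3-bit", 3)]:
          print(f"{name:>5}: {params * bits / 8 / 1e9:6.1f} GB")

      # fp16 : 140.0 GB -> roughly the 142 GB quoted once overhead is added
      # 3-bit:  26.2 GB -> the ~26.5 GB "overquantized" threshold above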

oarth 2 years ago

If those numbers are true then it's very impressive. Hoping for llama.cpp support.

nubinetwork 2 years ago

1M context, but does it really hold up? I've been burned before by 32K models that crap out after 10K...

fragmede 2 years ago

model available, not open source.

refulgentis 2 years ago

Ehhh, man, this is frustrating. 7B was a real sweet spot for hobbyists; 8B is...doable. I've been joking to myself, while simultaneously worried, that Llama 3 8B and Phi-3 "3B" (really 3.8B) would start an "ehhh, +1, might as well be a rounding error" trend. It's a big deal! I measure a 33% throughput decrease just going from 3B to 3.8B when inferencing on CPU.
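
A rough sanity check on that figure, assuming CPU decoding is memory-bandwidth
bound so tokens/sec scales inversely with the bytes read per token (roughly the
model size); the 3B and 3.8B counts are the ones from the comment:

  small, large = 3.0e9, 3.8e9
  print(f"expected throughput drop: {1 - small / large:.0%}")  # ~21%

  # First-order estimate only; a measured 33% can be larger than this due to
  # cache behaviour, threading, or different quantization between the models.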
