MiniGPT-4 Inference on CPU

github.com

102 points by maknee 2 years ago · 12 comments

heyitsguay 2 years ago

I know it's not the main point of this, but... so many multimodal models now that take frozen vision encoders and language decoders and weld them together with a projection layer! I wanna grab the EVA02-CLIP-E image encoder and the Llama-2 33B model and do the same, I bet that'd be fun :D

  • famouswaffles 2 years ago

    Q-Former isn't necessary, just to be clear. LLaVA is just a projection layer.

  • GaggiX 2 years ago

    Not just a projection layer but also a Q-Former. In this case it was already trained for that specific vision encoder, but if you swap in a different encoder you would need to train a Q-Former from scratch.

    • famouswaffles 2 years ago

      Not for MiniGPT-4, but it's just a projection layer for many others (like LLaVA). The Q-Former isn't a necessary part of the equation.
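
To make the pattern in this subthread concrete, here is a minimal Python/PyTorch sketch of the "frozen vision encoder + projection layer + frozen LLM" recipe. The class name, dimensions, and shapes are illustrative assumptions, not MiniGPT-4's or LLaVA's actual code.

    import torch
    import torch.nn as nn

    class VisionToLLMProjector(nn.Module):
        """Maps frozen vision-encoder features into the LLM's embedding space."""
        def __init__(self, vision_dim=1024, llm_dim=4096):  # dims are assumptions
            super().__init__()
            self.proj = nn.Linear(vision_dim, llm_dim)

        def forward(self, image_features):
            # image_features: (batch, num_tokens, vision_dim) from a frozen encoder
            return self.proj(image_features)  # (batch, num_tokens, llm_dim)

    # Both towers stay frozen; only the projector is trained:
    #   vision_encoder.requires_grad_(False); llm.requires_grad_(False)
    projector = VisionToLLMProjector()
    feats = torch.randn(2, 257, 1024)    # e.g. ViT patch tokens + CLS token
    soft_tokens = projector(feats)       # prepended to the text embeddings

MiniGPT-4 itself additionally reuses BLIP-2's pretrained Q-Former ahead of the linear projection, which is why swapping the vision encoder would mean retraining it, as noted above.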

pizzafeelsright 2 years ago

I am not an ML expert. I want to know how to add my own documents without sending them off to a 3rd party.

quickthrower2 2 years ago

Not heard of MiniGPT-4. Why that name? Is it claiming to be specifically a GPT-4 competitor?

Der_Einzige 2 years ago

Any data on inference speed? I’ve found that the non-quantized model was much faster on GPU than the quantized versions, because the quantized versions ran at lower GPU utilization.

  • api 2 years ago

    It's a RAM tradeoff. If you have enough GPU RAM to load the non-quantized model, it may be faster.
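
For context on the tradeoff described here, a hedged sketch (Python, using the Hugging Face transformers + bitsandbytes route, which may differ from what was benchmarked above; the model ID is a placeholder) comparing a full-precision load against a 4-bit quantized one:

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    model_id = "some-org/some-llm"  # placeholder, not a real checkpoint

    # fp16 load: needs the most VRAM, but if the weights fit on the GPU,
    # inference often runs faster since no dequantization is needed.
    model_fp16 = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    # 4-bit quantized load: roughly a quarter of the weight memory, at the
    # cost of per-layer dequantization overhead during inference.
    model_4bit = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=BitsAndBytesConfig(load_in_4bit=True),
        device_map="auto",
    )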
