MiniGPT-4 Inference on CPU
I know it's not the main point of this, but... so many multimodal models now take frozen vision encoders and language decoders and weld them together with a projection layer! I wanna grab the EVA02-CLIP-E image encoder and the Llama-2 33B model and do the same, I bet that'd be fun :D
Just to be clear, the Q-Former isn't necessary. LLaVA is just a projection layer.
It's not just a projection layer here but also a Q-Former. In this case it was already trained for that specific vision encoder, but if you change the encoder you would need to train a Q-Former from scratch.
Not for MiniGPT-4, but it's just a projection layer for many others (like LLaVA). The Q-Former isn't a necessary part of the equation.
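For anyone curious what "just a projection layer" means in practice, here's a minimal PyTorch sketch. The class name and dimensions are illustrative (e.g. a ViT with 1024-d patch features feeding an LLM with a 4096-d hidden size), not the actual sizes any particular model uses:

```python
import torch
import torch.nn as nn

class VisionToLLMProjection(nn.Module):
    """Maps frozen vision-encoder features into the LLM's token-embedding space."""
    def __init__(self, vision_dim: int = 1024, llm_dim: int = 4096):
        super().__init__()
        self.proj = nn.Linear(vision_dim, llm_dim)

    def forward(self, vision_feats: torch.Tensor) -> torch.Tensor:
        # vision_feats: (batch, num_patches, vision_dim) from the frozen encoder
        # returns:      (batch, num_patches, llm_dim), used as "soft" image tokens
        return self.proj(vision_feats)

# Only the projection is trained; the vision encoder and the LLM stay frozen.
proj = VisionToLLMProjection()
dummy_feats = torch.randn(1, 256, 1024)   # stand-in for encoder output
image_tokens = proj(dummy_feats)          # prepend these to the text embeddings
print(image_tokens.shape)                 # torch.Size([1, 256, 4096])
```

MiniGPT-4 additionally puts a (pretrained, frozen) Q-Former between the encoder and the projection; LLaVA-style models skip it and train the linear layer alone.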
I am not an ML expert. I want to know how to add my own documents without sending them off to a 3rd party.
Use a local LLM like Llama 2.
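A rough sketch of what "chat with my own documents, fully local" can look like, assuming llama-cpp-python and sentence-transformers; the model filename and document chunks are placeholders, and real setups usually add proper chunking and a vector store:

```python
# Embed document chunks locally, retrieve by cosine similarity, and pass the
# best matches to a local LLM. Nothing leaves your machine.
import numpy as np
from sentence_transformers import SentenceTransformer   # pip install sentence-transformers
from llama_cpp import Llama                              # pip install llama-cpp-python

embedder = SentenceTransformer("all-MiniLM-L6-v2")       # small, runs fine on CPU
llm = Llama(model_path="llama-2-7b-chat.Q4_K_M.gguf",    # hypothetical local model file
            n_ctx=2048)

# Pretend these came from your own files, split into chunks.
chunks = ["Our refund policy allows returns within 30 days.",
          "Support is available Monday to Friday, 9am-5pm."]
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def answer(question: str, top_k: int = 1) -> str:
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q_vec                           # cosine similarity (vectors are normalized)
    context = "\n".join(chunks[i] for i in np.argsort(scores)[::-1][:top_k])
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}\nAnswer:"
    out = llm(prompt, max_tokens=128)
    return out["choices"][0]["text"].strip()

print(answer("How long do I have to return an item?"))
```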
Not heard of MiniGPT-4. Why that name? Is it claiming to be specifically a GPT-4 competitor?
MiniGPT-4 is a multimodal model. The name (a bad one, IMO) is a reference to GPT-4's multimodal capability.
They should rename it to ManyGPT.
There's more info here: https://github.com/Vision-CAIR/MiniGPT-4 (Linked in the Readme of the repo.)
Any data on inference speed? I’ve found that the non-quantized model was much faster on GPU than the quantized versions, since the quantized ones ran at lower GPU utilization.
It's a RAM tradeoff. If you have enough GPU RAM to load the non-quantized model, it may be faster.
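Back-of-envelope on why it's a RAM tradeoff: weight memory scales with bits per parameter. A rough sketch for a 7B-parameter model (weights only; activations and KV cache add more on top):

```python
# Approximate weight memory for a 7B-parameter model at different precisions.
PARAMS = 7e9
for name, bits in [("fp16 (non-quantized)", 16), ("int8", 8), ("int4", 4)]:
    gib = PARAMS * bits / 8 / 2**30
    print(f"{name:>22}: ~{gib:.1f} GiB of weights")
# fp16 (non-quantized): ~13.0 GiB of weights
#                 int8: ~6.5 GiB of weights
#                 int4: ~3.3 GiB of weights
```

So quantization is mostly about fitting the model into the RAM/VRAM you have; if the full-precision weights already fit, the dequantization overhead can make the quantized version slower.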