Apple releases FastVLM and MobileCLIP2 on HF, real-time video captioning
huggingface.co

FastVLM seems to be working on single images, not video. When I wave my hand, it says the user is holding his hand up, so I think it's just doing one image at a time. Great speed and accuracy though, running just on WebGPU on my MacBook Air M4. PS: do you know of any model that takes an actual video feed?
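To make the "one image at a time" guess concrete, here is a minimal sketch of what per-frame captioning over a stream could look like. This is only an assumption about the general pattern, not how the demo is actually implemented: `caption_stream`, `fake_captioner`, and `every_n` are hypothetical names, and the captioner is a stub standing in for a real model call.

```python
from typing import Callable, Iterable, Iterator, Tuple


def caption_stream(
    frames: Iterable[object],
    caption_frame: Callable[[object], str],
    every_n: int = 10,
) -> Iterator[Tuple[int, str]]:
    """Yield (frame_index, caption) for every `every_n`-th frame.

    Each sampled frame is captioned in isolation: nothing from earlier
    frames is passed to the captioner, so motion such as a wave can only
    show up as a static pose.
    """
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    for index, frame in enumerate(frames):
        if index % every_n == 0:
            yield index, caption_frame(frame)


def fake_captioner(frame: object) -> str:
    # Hypothetical stand-in for a single-image vision-language model call.
    return f"caption for {frame}"


if __name__ == "__main__":
    # Strings stand in for decoded video frames.
    frames = [f"frame-{i}" for i in range(25)]
    for index, text in caption_stream(frames, fake_captioner, every_n=10):
        print(index, text)
```

A model that really handles video would need to see several frames together (or carry state between calls), which this loop deliberately does not do.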