EarlyOom

Karma: 205
Created: 3 years ago

Recent Submissions

1. Replace OCR with Vision Language Models (github.com) 292 points · 9 months ago · 125 comments
2. Show HN: Visually parse an entire YouTube video frame by frame (github.com) 5 points · 9 months ago · 0 comments
3. Ask HN: What are folks using to train/fine-tune Vision Language Models 1 point · 9 months ago · 0 comments
4. A Node.js SDK for calling Vision Language Models (github.com) 6 points · 9 months ago · 0 comments
5. Run structured extraction on documents/images locally with Ollama and Pydantic (github.com) 170 points · 10 months ago · 29 comments
6. Show HN: Vlm Run, Extract JSON from images, videos and documents in a simple API (vlm.run) 2 points · 1 year ago · 0 comments
7. Fine-grained Visual Transcription for YouTube videos (vlm-docs.nos.run) 9 points · 1 year ago · 3 comments
8. "Ok Computer, why are you slow?" (scottloftin.substack.com) 2 points · 1 year ago · 0 comments
9. Show HN: NOS – A fast, and ergonomic PyTorch inference server (github.com) 3 points · 2 years ago · 0 comments
