EarlyOom
- Karma: 205
- Created: 3 years ago
Recent Submissions
- 1. Replace OCR with Vision Language Models (github.com)
- 2. Show HN: Visually parse an entire YouTube video frame by frame (github.com)
- 3. Ask HN: What are folks using to train/fine-tune Vision Language Models
- 4. A Node.js SDK for calling Vision Language Models (github.com)
- 5. Run structured extraction on documents/images locally with Ollama and Pydantic (github.com)
- 6. Show HN: Vlm Run, Extract JSON from images, videos and documents in a simple API (vlm.run)
- 7. Fine-grained Visual Transcription for YouTube videos (vlm-docs.nos.run)
- 8. "Ok Computer, why are you slow?" (scottloftin.substack.com)
- 9. Show HN: NOS – A fast, and ergonomic PyTorch inference server (github.com)