Show HN: Photon – Rust pipeline that embeds/tags/hashes images locally with SigLIP

github.com

3 points by pgbouncer a month ago · 1 comment · 1 min read

Open-source Rust-based image processing pipeline that takes images and outputs structured JSON — 768-dim vector embeddings, semantic tags from a 68K-term vocabulary, EXIF metadata, content hashes, and thumbnails.

Everything runs locally via SigLIP + ONNX Runtime. Single binary, no Python, no Docker, no cloud dependency. Optional BYOK LLM descriptions (Ollama, Anthropic, OpenAI).

pgbouncerOP a month ago

How tagging works: SigLIP (Google's CLIP successor) runs locally through ONNX Runtime. Image embeddings are scored against a 68,000-term vocabulary (drawn from WordNet nouns) via dot product and sigmoid scaling. A self-organizing relevance system adapts the vocabulary to your dataset: frequently matched terms get promoted, and irrelevant ones are demoted to a cold pool. So a photo of red sneakers gets tagged sneakers, footwear, red, fashion without any training or fine-tuning.
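To make the scoring step concrete, here is a minimal sketch of dot product plus sigmoid tagging. It is not Photon's actual code: `tag_image`, `scale`, `bias`, and `threshold` are hypothetical names, and the tiny 2-dim vectors in the usage note stand in for real 768-dim SigLIP embeddings.

```rust
/// Logistic sigmoid, mapping a raw logit into (0, 1).
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Dot product of two equal-length vectors.
fn dot(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Score one image embedding against every term embedding and return the
/// terms whose sigmoid score reaches `threshold`, highest score first.
/// `scale` and `bias` are assumed calibration constants (hypothetical here).
fn tag_image<'a>(
    image: &[f32],
    vocab: &'a [(&'a str, Vec<f32>)],
    scale: f32,
    bias: f32,
    threshold: f32,
) -> Vec<(&'a str, f32)> {
    let mut tags: Vec<(&str, f32)> = vocab
        .iter()
        .map(|(term, emb)| (*term, sigmoid(scale * dot(image, emb) + bias)))
        .filter(|(_, score)| *score >= threshold)
        .collect();
    // Sort descending by score; total_cmp gives a total order on f32.
    tags.sort_by(|a, b| b.1.total_cmp(&a.1));
    tags
}
```

Calling `tag_image(&[1.0, 0.0], &vocab, 10.0, -5.0, 0.5)` with a vocabulary containing `sneakers` at `[1.0, 0.0]`, `footwear` at `[0.6, 0.8]`, and `ocean` at `[-1.0, 0.0]` keeps the first two terms and drops the third. Because each term gets an independent sigmoid score rather than a softmax share, several tags can pass the threshold for the same image.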

The progressive encoding system should take the 90-minute cold start (encoding 68K text terms through SigLIP) down to ~30 seconds by encoding a seed vocabulary first, then background-encoding the rest while you're already processing images.
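The seed-then-background idea can be sketched with plain threads and a shared map. This is an illustration under stated assumptions, not Photon's implementation: `encode_term` is a placeholder for the real SigLIP text encoder, and `start_progressive`, `Vocab`, and the chunk size are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread::{self, JoinHandle};

/// Shared vocabulary: term -> text embedding.
type Vocab = Arc<RwLock<HashMap<String, Vec<f32>>>>;

/// Placeholder for the real text encoder (hypothetical stand-in).
fn encode_term(term: &str) -> Vec<f32> {
    vec![term.len() as f32]
}

/// Encode the seed terms synchronously so tagging can begin right away,
/// then encode the remaining terms on a background thread in chunks.
fn start_progressive(
    seed: Vec<String>,
    rest: Vec<String>,
    chunk: usize,
) -> (Vocab, JoinHandle<()>) {
    let vocab: Vocab = Arc::new(RwLock::new(HashMap::new()));
    {
        let mut map = vocab.write().unwrap();
        for term in seed {
            let emb = encode_term(&term);
            map.insert(term, emb);
        }
    }
    let shared = Arc::clone(&vocab);
    let handle = thread::spawn(move || {
        // chunks() panics on a size of 0, so clamp to at least 1.
        for batch in rest.chunks(chunk.max(1)) {
            // Encode outside the lock so readers only wait for the insert.
            let encoded: Vec<(String, Vec<f32>)> = batch
                .iter()
                .map(|t| (t.clone(), encode_term(t)))
                .collect();
            shared.write().unwrap().extend(encoded);
        }
    });
    (vocab, handle)
}
```

The caller can start scoring images against `vocab.read()` as soon as `start_progressive` returns; terms from `rest` become visible as each chunk lands, and joining the handle waits for the full vocabulary.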

It's pure Rust and ships as a single binary: install with pip install photon-imager, or build from source.

Would love feedback, contributions, and forks. Some areas where help would be especially welcome:

- Windows support (currently macOS + Linux only)
- Additional model backends beyond SigLIP
- Frontend/UI for browsing tagged collections
- Database integration examples (pgvector, Qdrant, etc.)
