Show HN: From Photos to Positions: Prototyping VLM-Based Indoor Maps

arjo129.github.io

55 points by accurrent 7 months ago · 2 comments · 1 min read

Just a fun hack I did while bored over the weekend. My wife was busy shopping, which got me thinking: can VLMs solve the indoor location problem in a mall? Can I just show a VLM a map and an image and have it do a good enough job of locating me? I hacked together this proof of concept, and it seems to work.

rohanrao123 7 months ago

Pretty cool! It reminded me of this work from NVIDIA Research - https://nvidia-ai-iot.github.io/remembr where they used VLMs and RAG on top of a real robot to navigate the Voyager campus in Santa Clara. You also might like the new OpenAI o3 models and how well they can play GeoGuessr ;)

https://simonwillison.net/2025/Apr/26/o3-photo-locations, https://news.ycombinator.com/item?id=43835044, https://www.astralcodexten.com/p/testing-ais-geoguessr-geniu...
