Show HN: Smart glasses that tell me when to stop pouring

github.com

5 points by tash_2s a month ago · 7 comments

I've been experimenting with a more proactive AI interface for the physical world.

This project is a drink-making assistant for smart glasses. It looks at the ingredients, selects a recipe, shows the steps, and guides me in real time based on what it sees. The behavior I wanted most was simple: while I'm pouring, it should tell me when to stop, instead of waiting for me to ask.

The demo video is at the top of the README.

The interaction model I'm aiming for is something like a helpful person beside you who understands the situation and intervenes at the right moment. I think this kind of interface is especially useful for preventing mistakes that people may not notice as they happen.

The system works by running Qwen3.5-27B continuously on the latest 0.5-second video clip every 0.5 seconds. I used Overshoot (https://overshoot.ai/) for fast live-video VLM inference. Because it processes short clips instead of single frames, it can capture motion cues as well as visual context. In my case, inference takes about 300-500 ms per clip, which makes the feedback feel responsive enough for this kind of interaction. Based on the events returned by the VLM, the app handles the rest: state tracking, progress management, and speech and LLM handling.

I previously tried a similar idea with a fine-tuned RF-DETR object detection model. That approach is cheaper to run and could also work on-device. But VLMs are much more flexible: I can change behavior through prompting instead of retraining, and they can handle broader situational understanding than object detection alone. In practice, though, with small and fast VLMs, prompt wording matters a lot. Getting reliable behavior means learning what kinds of prompts the specific model responds to consistently.
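One pattern that tends to make small, fast VLMs more consistent is constraining the output to a fixed event vocabulary rather than asking for free-form description. The prompt below is a hypothetical illustration of that style, not the project's actual wording:

```python
# Hypothetical prompt sketch: pin the model to a closed set of event labels
# so the app can dispatch on the response without parsing free-form text.
POUR_PROMPT = """You are watching a 0.5-second video clip of someone making a drink.
Report exactly one event from this list, and nothing else:
pour_started, pour_stopped, target_reached, no_change.
The target fill level for the glass is two thirds. If the liquid reaches
that level during this clip, report target_reached."""
```

The closed label set doubles as the state machine's input alphabet, so a prompt change (say, a new target level) changes behavior without touching the app code.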

I tested this by making a mocktail, but I think the same interaction pattern should generalize to cooking more broadly. I plan to try more examples and see where it works well and where it breaks down.

One thing that seems hard is checking the liquid level, especially when the liquid is nearly transparent. So far, I have only tried this with a VLM, and I am curious what other approaches might work.

Questions and feedback welcome.

Paulo75 25 days ago

Cool project. Beyond cocktails, I could see something like this being useful for people who want to monitor their alcohol intake before driving - pour tracking that keeps a running count and warns you when you're approaching your limit. The "proactive intervention" model fits that perfectly.

  • tash_2s (OP) 24 days ago

    Yeah, having a system that understands the situation and intervenes could be useful in a lot of cases.

Eawrig05 a month ago

Wonderful! Curious to understand how low light levels affect this; bars are often dimly lit.

  • tash_2s (OP) a month ago

    Thanks! I need to test it in low-light environments, and making it work reliably across a variety of kitchen settings is definitely one of my goals.

dylanhouli 25 days ago

Really cool. Makes me excited to see other uses for smart glasses in the future.

  • tash_2s (OP) 25 days ago

    I feel the same. As AI gets better, I think personal AI for the physical world will become much more useful, and glasses could be a natural interface for it.

stevewave713 a month ago

I have been working on structured prompt templates for different use cases. The biggest improvement I found was using context-first prompting and explicit format specifications. Compiled 30 templates at stevewave713.gumroad.com if anyone wants to check them out.
