Weather Watching

AI is watching this Manhattan doorbell camera and counting the people walking by. It notes what they're wearing and whether they're carrying an umbrella. Sometimes the best forecast is just paying attention.

A simple person-detection AI model called YOLO ("You Only Look Once") watches the video feed continuously and draws a box around every pedestrian it sees. As people walk through the frame, it assigns each one a unique ID. It can follow people across multiple video frames, even if they briefly disappear behind someone else.
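To give a feel for how IDs can survive a brief disappearance, here is a minimal sketch of a greedy box-overlap tracker. It is not the tracker this project uses (that isn't specified here); the class name `IouTracker` and the `iou_threshold` and `max_missed` parameters are made up for illustration, and real trackers use smarter matching.

```python
from dataclasses import dataclass
from itertools import count

Box = tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes, in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    box: Box
    missed: int = 0  # consecutive frames with no matching detection


class IouTracker:
    """Hypothetical tracker: match each track to its best-overlapping new box."""

    def __init__(self, iou_threshold: float = 0.3, max_missed: int = 15):
        self.iou_threshold = iou_threshold
        self.max_missed = max_missed
        self.tracks: dict[int, Track] = {}
        self._next_id = count(1)

    def update(self, detections: list[Box]) -> dict[int, Box]:
        """Match this frame's boxes to known people; return {track_id: box}."""
        unmatched = list(detections)
        assigned: dict[int, Box] = {}
        for track_id, track in self.tracks.items():
            best = max(unmatched, key=lambda d: iou(track.box, d), default=None)
            if best is not None and iou(track.box, best) >= self.iou_threshold:
                track.box, track.missed = best, 0
                assigned[track_id] = best
                unmatched.remove(best)
            else:
                track.missed += 1  # hidden this frame; keep the ID for now
        # Forget people who have been out of view for too long.
        self.tracks = {
            i: t for i, t in self.tracks.items() if t.missed <= self.max_missed
        }
        # Anything left over is a new person.
        for box in unmatched:
            new_id = next(self._next_id)
            self.tracks[new_id] = Track(box)
            assigned[new_id] = box
        return assigned
```

Calling `update` once per frame with that frame's person boxes returns the same ID for a pedestrian as long as their box keeps overlapping its last known position, including after a few empty frames.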

For each person, the system saves a single clear image, their "best shot," to analyze what they're wearing.
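How a "clear" image gets picked isn't spelled out here. One plausible sketch, assuming each sighting carries a bounding box and a detector confidence, is to keep the largest, most confident box per person. The `Sighting` type and the `shot_score` formula are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sighting:
    track_id: int       # which person this is
    frame_index: int    # which video frame the crop came from
    box: tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    confidence: float   # detector confidence in [0, 1]


def shot_score(s: Sighting) -> float:
    """Assumed quality score: bigger, more confident boxes score higher."""
    x1, y1, x2, y2 = s.box
    return s.confidence * (x2 - x1) * (y2 - y1)


def best_shots(sightings: list[Sighting]) -> dict[int, Sighting]:
    """Keep the highest-scoring sighting for each tracked person."""
    best: dict[int, Sighting] = {}
    for s in sightings:
        current = best.get(s.track_id)
        if current is None or shot_score(s) > shot_score(current):
            best[s.track_id] = s
    return best
```

A real system might also weigh sharpness or how centered the person is; area times confidence is just the simplest stand-in.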

All those "best shot" frames get sent in a batch to an LLM, specifically Google's Gemini, a multimodal AI that can interpret images. It looks at each person and answers a few simple questions: shorts or pants? Short sleeves or long sleeves? Carrying an umbrella?
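The exact prompt and reply format aren't given here, so the following is only a guess at the shape of that exchange: a prompt asking for JSON, and a parser that throws away anything that doesn't answer the three questions. The `PROMPT` wording, the key names, and `parse_answers` are all assumptions, and the network call to Gemini is left out entirely.

```python
import json

# Assumed prompt wording; the real one is not published in this post.
PROMPT = (
    "For each numbered image of a pedestrian, reply with a JSON list of "
    'objects with keys "legs" ("shorts" or "pants"), "sleeves" ("short" or '
    '"long"), and "umbrella" (true or false).'
)


def is_valid(item: object) -> bool:
    """True if the item answers all three questions with an allowed value."""
    return (
        isinstance(item, dict)
        and item.get("legs") in ("shorts", "pants")
        and item.get("sleeves") in ("short", "long")
        and isinstance(item.get("umbrella"), bool)
    )


def parse_answers(reply_text: str) -> list[dict]:
    """Parse the model's JSON reply, dropping entries that don't fit."""
    items = json.loads(reply_text)
    return [
        {"legs": i["legs"], "sleeves": i["sleeves"], "umbrella": i["umbrella"]}
        for i in items
        if is_valid(i)
    ]
```

Validating the reply matters because a language model can return unexpected values; it is safer to drop a person than to count a bad answer.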

The answers get tallied across everyone who walked by. The crowd is producing a weather report for you.
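The tallying step can be sketched in a few lines, assuming each person's answers arrive as a small dictionary. The key names (`legs`, `sleeves`, `umbrella`) and the `share` helper are illustrative, not taken from the project.

```python
from collections import Counter


def tally(answers: list[dict]) -> dict[str, Counter]:
    """Count how many people gave each answer to each question."""
    counts = {"legs": Counter(), "sleeves": Counter(), "umbrella": Counter()}
    for answer in answers:
        for question in counts:
            counts[question][answer[question]] += 1
    return counts


def share(counter: Counter, value: object) -> float:
    """Fraction of people who gave this answer (0.0 if nobody walked by)."""
    total = sum(counter.values())
    return counter[value] / total if total else 0.0


# Four made-up pedestrians on a rainy day.
crowd = [
    {"legs": "pants", "sleeves": "long", "umbrella": True},
    {"legs": "pants", "sleeves": "long", "umbrella": True},
    {"legs": "shorts", "sleeves": "short", "umbrella": False},
    {"legs": "pants", "sleeves": "long", "umbrella": True},
]
counts = tally(crowd)
print(f"{share(counts['umbrella'], True):.0%} of people have umbrellas")
```

If most of the crowd is carrying umbrellas, it is probably raining, whatever the official forecast says.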

Thanks to Brian for technical pointers, and James for the idea to do clothes in addition to umbrellas!