Show HN: Build ML training datasets from large-scale satellite/aerial imagery

github.com

2 points by noahgolmant 11 days ago · 0 comments · 2 min read

This is a small tool to label bounding boxes on satellite/aerial imagery and export training datasets for object detection.

Web maps like Google Earth work by stitching together lots of small images called tiles (this is why you see square patches as the page loads). They do this by querying a "tile server" API that reads from sharded raster files in cloud storage. In my day job we built infra to efficiently serve imagery through tile servers for map visualization, and I wanted to test out ML applications of that infra. It turns out the same tile server interface can also be used to label map imagery and fine-tune models on it.
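For readers who haven't worked with tiles before, here is a minimal sketch of how a point on the map resolves to a single tile request. It assumes the common XYZ ("slippy map") Web Mercator scheme; the post doesn't say which scheme a given tile server uses, and the function names and example URL template are made up for illustration.

```python
import math


def lonlat_to_tile(lon_deg, lat_deg, zoom):
    """Return the (x, y) index of the XYZ tile containing a point.

    Assumes the Web Mercator "slippy map" scheme, where zoom level z
    splits the world into 2**z by 2**z tiles and y grows southward.
    """
    n = 2 ** zoom
    lat_rad = math.radians(lat_deg)
    x = int((lon_deg + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still land on a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tile_url(template, x, y, zoom):
    """Fill a {z}/{x}/{y} style URL template (hypothetical template format)."""
    return template.format(z=zoom, x=x, y=y)


if __name__ == "__main__":
    # Placeholder host, not a real tile server.
    template = "https://tiles.example.com/{z}/{x}/{y}.png"
    x, y = lonlat_to_tile(-122.4194, 37.7749, 10)
    print(tile_url(template, x, y, 10))
```

Each tile is then just a small image (often 256x256 pixels) fetched by its zoom, column, and row, which is what makes it easy to treat tiles as individual training images.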

This tool lets you point at any tile server URL, draw labeled bounding boxes, and export labels in COCO annotation format, plus download the underlying tile PNGs for training/inference. The output can feed directly into standard computer vision frameworks like Ultralytics or PyTorch.
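To give a sense of what a COCO export looks like, here is a rough sketch that turns pixel-space boxes into a COCO-style dictionary. It is not the tool's actual export code: the function name, input tuples, and file names are assumptions for illustration. The one fixed detail is the COCO convention that `bbox` is `[x, y, width, height]` measured from the top-left corner.

```python
import json


def boxes_to_coco(images, boxes, categories):
    """Build a COCO-style detection dict.

    images:     list of (file_name, width, height)
    boxes:      list of (file_name, category_name, x_min, y_min, x_max, y_max)
                in pixel coordinates of that image
    categories: list of category names
    """
    cat_ids = {name: i + 1 for i, name in enumerate(categories)}
    img_ids = {fn: i + 1 for i, (fn, _, _) in enumerate(images)}
    coco = {
        "images": [
            {"id": img_ids[fn], "file_name": fn, "width": w, "height": h}
            for fn, w, h in images
        ],
        "categories": [{"id": cid, "name": name} for name, cid in cat_ids.items()],
        "annotations": [],
    }
    for ann_id, (fn, cat, x0, y0, x1, y1) in enumerate(boxes, start=1):
        w, h = x1 - x0, y1 - y0
        coco["annotations"].append({
            "id": ann_id,
            "image_id": img_ids[fn],
            "category_id": cat_ids[cat],
            "bbox": [x0, y0, w, h],  # COCO uses [x, y, width, height]
            "area": w * h,
            "iscrowd": 0,
        })
    return coco


if __name__ == "__main__":
    # Hypothetical tile file names; the second image has no boxes,
    # so it acts as a negative example.
    images = [("10_163_395.png", 256, 256), ("10_164_395.png", 256, 256)]
    boxes = [("10_163_395.png", "building", 10, 20, 60, 100)]
    print(json.dumps(boxes_to_coco(images, boxes, ["building", "car"]), indent=2))
```

An image that appears under `images` with no matching entry in `annotations` is one common way to represent a negative example in COCO-style datasets.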

The workflow is hotkey-driven: draw a box, press 1-9 to assign a category (or N for negative examples). You can also drag-and-drop local GeoTIFFs.

I found this helpful for experimenting with fine-tuning SAM 3 on local aerial imagery. It was nice to zip up the PNGs + COCO file, drag and drop them into a Colab notebook, and run inference.

There are other interesting applications of this I'd like to explore, like in-browser map-based segmentation / object detection with ONNX.
