Home


mapcv turns a bounding box and a set of polygon labels into a ready-to-train image segmentation dataset. It fetches XYZ map tiles, rasterizes KML or GeoJSON annotations onto the tile grid, extracts fixed-size image/mask patches, and writes them to disk. The tool has no GDAL dependency, which keeps the footprint small and installation simple. Tile stitching, rasterization, and patch sampling are implemented in Rust and are available through both the CLI and the Python API.

Rust-accelerated core

Tile stitching, polygon rasterization, and patch sampling all run in compiled Rust, exposed to Python through PyO3. The project describes this core as orders of magnitude faster than other implementations.

Many tile sources

Built-in support for Esri World Imagery, Google Satellite, OpenStreetMap, CartoDB basemaps, and any custom XYZ URL template.
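
For readers new to XYZ tiles, the sketch below shows the standard slippy-map arithmetic that turns a bounding box into tile indices and fills a URL template. It is an illustration in plain Python, not mapcv's own code: the function names, the example.com URL, and the {z}/{x}/{y} placeholder syntax are assumptions, and the template format mapcv actually accepts may differ.

```python
import math


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> tuple[int, int]:
    """Return the (x, y) index of the XYZ tile containing a WGS-84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so lon = 180 and latitudes beyond the Mercator limit stay in the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west: float, south: float, east: float, north: float,
                   zoom: int) -> list[tuple[int, int]]:
    """List every (x, y) tile that overlaps the bounding box, row by row."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # tile y grows southward
    x_max, y_max = lonlat_to_tile(east, south, zoom)
    return [(x, y)
            for y in range(y_min, y_max + 1)
            for x in range(x_min, x_max + 1)]


def tile_url(template: str, x: int, y: int, zoom: int) -> str:
    """Fill a URL template that uses {x}, {y}, {z} placeholders (assumed syntax)."""
    return template.format(x=x, y=y, z=zoom)


if __name__ == "__main__":
    template = "https://tiles.example.com/{z}/{x}/{y}.png"  # hypothetical server
    for x, y in tiles_for_bbox(-1.0, -1.0, 1.0, 1.0, zoom=1):
        print(tile_url(template, x, y, zoom=1))
```

The number of tiles grows roughly fourfold with each zoom level, so the zoom setting largely determines how much imagery a given bounding box downloads.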

KML & GeoJSON labels

Parse polygon annotations directly from KML or GeoJSON files. Multiclass labels via a configurable field name. Automatic WGS-84 to Web Mercator projection.
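
As a rough picture of what the projection step involves, the sketch below reads polygon features from a GeoJSON string, picks a class from a configurable property, and converts each exterior ring from WGS-84 degrees to Web Mercator (EPSG:3857) metres using the standard spherical formula. The function names and the default field name "class" are hypothetical; mapcv performs this step internally and its API is not shown here.

```python
import json
import math

EARTH_RADIUS_M = 6378137.0  # WGS-84 semi-major axis, as used by EPSG:3857


def wgs84_to_web_mercator(lon: float, lat: float) -> tuple[float, float]:
    """Project a WGS-84 lon/lat pair (degrees) to Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y


def project_polygons(geojson_text: str, class_field: str = "class"):
    """Return (label, projected exterior ring) for each Polygon feature.

    class_field is an assumed property name; holes and MultiPolygons are
    ignored to keep the sketch short.
    """
    collection = json.loads(geojson_text)
    projected = []
    for feature in collection["features"]:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue
        properties = feature.get("properties") or {}  # properties may be null
        label = properties.get(class_field)
        exterior = geometry["coordinates"][0]  # first ring is the outer boundary
        ring = [wgs84_to_web_mercator(lon, lat) for lon, lat, *_ in exterior]
        projected.append((label, ring))
    return projected
```

Once the rings are in the same projection as the tile grid, mapping them to pixel coordinates is a linear scale and offset, which is what makes rasterizing onto the stitched image straightforward.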

Flexible patch sampling

Grid or random sampling with configurable stride, edge strategies (pad / drop / shift), and empty-patch filtering.
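
To make the edge strategies concrete, here is one plausible reading of pad, drop, and shift for grid sampling, written as a small Python sketch. The function names are hypothetical and the semantics are an assumption: mapcv's actual behavior at image borders may differ in detail.

```python
def patch_starts(length: int, patch: int, stride: int, edge: str = "drop") -> list[int]:
    """Start offsets of patches along one axis of an image.

    Assumed meaning of the edge strategies:
      drop  - discard the leftover strip that no full patch covers
      pad   - add one more patch that overhangs the edge (caller pads it)
      shift - add one more patch moved back so it ends exactly at the edge
    """
    if edge not in ("pad", "drop", "shift"):
        raise ValueError(f"unknown edge strategy: {edge}")
    if length < patch:
        # The image is smaller than one patch: only padding can produce output.
        return [0] if edge == "pad" else []

    starts = list(range(0, length - patch + 1, stride))
    covered = starts[-1] + patch  # first pixel not covered by the last patch
    if covered < length:
        if edge == "pad":
            next_start = starts[-1] + stride
            if next_start < length:  # skip patches that would be all padding
                starts.append(next_start)
        elif edge == "shift":
            starts.append(length - patch)
    return starts


def patch_origins(height: int, width: int, patch: int, stride: int,
                  edge: str = "drop") -> list[tuple[int, int]]:
    """Top-left (row, col) of every patch on a 2-D grid."""
    rows = patch_starts(height, patch, stride, edge)
    cols = patch_starts(width, patch, stride, edge)
    return [(r, c) for r in rows for c in cols]
```

For a 10-pixel axis with patch size 4 and stride 4, this gives starts 0 and 4 under drop, an extra start at 8 under pad, and an extra start at 6 under shift. Shift keeps every patch inside the image at the cost of some overlap, while pad keeps the grid regular at the cost of synthetic border pixels.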

Train / val / test splits

Stratified or random dataset splitting from the JSON manifest. Configurable labeled-data fractions for semi-supervised workflows.
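
The idea behind a stratified split can be sketched in a few lines of Python. The manifest layout used here (a list of records with a "class" key) is a hypothetical stand-in, since the schema of mapcv's JSON manifest is not described on this page, and the function name is likewise illustrative.

```python
import random
from collections import defaultdict


def stratified_split(records, key="class", fractions=(0.7, 0.15, 0.15), seed=0):
    """Split records into train/val/test while preserving class proportions.

    records: list of dicts, each with a class label under `key` (assumed schema).
    fractions: train and val shares; test receives whatever remains.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    by_class = defaultdict(list)
    for record in records:
        by_class[record[key]].append(record)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_class):  # sorted for a deterministic order
        group = by_class[label]
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        splits["train"] += group[:n_train]
        splits["val"] += group[n_train:n_train + n_val]
        splits["test"] += group[n_train + n_val:]
    return splits
```

Splitting each class separately is what distinguishes this from a purely random split: a rare class cannot end up entirely in one subset by chance, provided it has enough patches to divide.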

Simple CLI

Run the full pipeline with mapcv generate config.yaml. One command, one config file, ready-to-train output.