Creating Training Data
Use AI Segmentation to quickly create labeled training polygons for machine learning models and image classification workflows.
The Challenge
Geospatial ML models (object detection, land cover classification, change detection) require labeled training data: polygons representing the objects you want the model to recognize.
Manually tracing building footprints, tree canopies, or field boundaries can take hours or days.
How AI Segmentation Helps
The plugin cuts annotation time from hours to minutes:
- Click an object to label it
- The AI generates a precise polygon in real time
- Refine with positive/negative clicks if needed
- Save and move to the next object
- Export all polygons as a GeoPackage
Typical Workflow
Step 1: Prepare Your Imagery
Load your raster layer in QGIS (drone orthomosaics, satellite imagery, aerial photos, or online tile layers).
Step 2: Segment Sample Objects
Use the plugin to segment representative examples of each class you want to detect. For example, if you are building a tree detection model, segment 50-100 trees across your study area.
For better model performance, sample across different conditions: varying object sizes, lighting, and surrounding context. Avoid drawing all samples from a single area.
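One rough way to check that samples are not clustered in a single area is to bin polygon centroids into a coarse grid and count the samples per cell. The sketch below assumes the centroids are already available as (x, y) pairs in a projected CRS; the function name, the example coordinates, and the 500-unit cell size are illustrative and not part of the plugin.

```python
import math
from collections import Counter


def samples_per_cell(centroids, cell_size):
    """Count samples per square grid cell of side `cell_size` (map units)."""
    counts = Counter()
    for x, y in centroids:
        # floor() keeps negative coordinates in the correct cell
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        counts[cell] += 1
    return counts


# Hypothetical centroids of four labeled polygons
centroids = [(120.0, 80.0), (130.0, 95.0), (910.0, 450.0), (140.0, 60.0)]
for cell, n in sorted(samples_per_cell(centroids, cell_size=500.0).items()):
    print(cell, n)
```

If most of the counts land in one or two cells, the samples are probably too concentrated and more objects should be segmented elsewhere in the study area.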
Step 3: Add Class Labels
After exporting the polygons from the plugin, open the layer's attribute table and add a field for your class labels, for example "tree", "building", "road", or "water". You can also use numeric codes (1, 2, 3, 4) if your ML framework expects them.
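If you label with text and later need numeric codes, the conversion can be scripted. The following is a minimal sketch that assumes the labeled layer has been saved as GeoJSON and loaded into a Python dict; the field names `class` and `class_id` and the name-to-code mapping are assumptions chosen for illustration.

```python
import copy

# Assumed mapping; use whatever codes your ML framework expects
CLASS_CODES = {"tree": 1, "building": 2, "road": 3, "water": 4}


def add_class_codes(feature_collection, name_field="class", code_field="class_id"):
    """Return a copy of a GeoJSON FeatureCollection with a numeric code per feature."""
    out = copy.deepcopy(feature_collection)
    for feature in out["features"]:
        name = feature["properties"][name_field]
        if name not in CLASS_CODES:
            # Fail loudly on typos such as "tre" instead of silently skipping
            raise ValueError(f"Unknown class label: {name!r}")
        feature["properties"][code_field] = CLASS_CODES[name]
    return out


# Hypothetical two-feature layer (geometries omitted for brevity)
layer = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"class": "tree"}, "geometry": None},
        {"type": "Feature", "properties": {"class": "water"}, "geometry": None},
    ],
}
coded = add_class_codes(layer)
print([f["properties"]["class_id"] for f in coded["features"]])
```

Raising on unknown labels catches spelling mistakes made in the attribute table before they reach training.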
Step 4: Export for Your ML Pipeline
Export the labeled layer as a Shapefile, GeoJSON, or GeoPackage. Vector-aware tools such as QGIS SCP and geopandas read these formats directly. Deep learning frameworks usually need the polygons converted to raster masks first, for example with rasterio.
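Before handing a GeoPackage to a pipeline, it can help to confirm which vector layers it contains. A GeoPackage is an SQLite database, so the sketch below uses only Python's standard `sqlite3` module to read the `gpkg_contents` and `gpkg_geometry_columns` tables defined by the GeoPackage standard. The function name and the file name in the usage comment are hypothetical; in practice you would more likely load the layer with geopandas.

```python
import sqlite3


def list_feature_layers(gpkg_path):
    """Return (table_name, geometry_type) for each vector layer in a GeoPackage."""
    query = """
        SELECT c.table_name, g.geometry_type_name
        FROM gpkg_contents AS c
        JOIN gpkg_geometry_columns AS g ON g.table_name = c.table_name
        WHERE c.data_type = 'features'
        ORDER BY c.table_name
    """
    conn = sqlite3.connect(gpkg_path)
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()


# Usage (hypothetical file name):
# print(list_feature_layers("training_polygons.gpkg"))
```

Note that `sqlite3.connect` creates an empty file if the path does not exist, so check the path first when scripting this.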
What You Can Train
- Object detection: buildings, trees, vehicles, solar panels
- Land cover classification: vegetation, water, urban, bare soil
- Change detection: comparing features over time
- Semantic segmentation: pixel-level classification from polygon labels
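For semantic segmentation, polygon labels are usually burned into a raster mask that matches the image grid. The sketch below illustrates the idea in pure Python with an even-odd point-in-polygon test on pixel centers. It assumes a single polygon ring already expressed in pixel coordinates, and the function names are illustrative. A real pipeline would typically use a library rasterizer such as `rasterio.features.rasterize`, which also handles georeferencing.

```python
def point_in_ring(x, y, ring):
    """Even-odd ray casting test for a point against a ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_ring(ring, width, height, class_id=1):
    """Return a height x width mask with class_id where pixel centers fall inside."""
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            if point_in_ring(col + 0.5, row + 0.5, ring):
                mask[row][col] = class_id
    return mask


# Hypothetical rectangular label in pixel coordinates
square = [(1, 1), (4, 1), (4, 3), (1, 3)]
for line in rasterize_ring(square, width=5, height=4):
    print(line)
```

Each pixel holds the class code of the polygon covering it and 0 elsewhere, which is the mask layout most segmentation training loops expect.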
Complementary Tools
- QGIS Semi-Automatic Classification Plugin (SCP): for supervised classification using spectral signatures
- Python ML libraries: scikit-learn, TensorFlow, PyTorch with geospatial data loaders
- Google Earth Engine: for large-scale classification tasks