Creating Training Data

Use AI Segmentation to quickly create labeled training polygons for machine learning models and image classification workflows.

The Challenge

Geospatial ML models (object detection, land cover classification, change detection) require labeled training data: polygons representing the objects you want the model to recognize.

Manually tracing building footprints, tree canopies, or field boundaries can take hours or days.

How AI Segmentation Helps

The plugin cuts annotation time from hours to minutes:

  1. Click on the object you want to label
  2. The AI generates a precise polygon in real time
  3. Refine with positive/negative clicks if needed
  4. Save and move to the next object
  5. Export all polygons as a GeoPackage

Typical Workflow

Step 1: Prepare Your Imagery

Load your raster layer in QGIS (drone orthomosaics, satellite imagery, aerial photos, or online tile layers).

Step 2: Segment Sample Objects

Use the plugin to segment representative examples of each class you want to detect. For example, if you are building a tree detection model, segment 50-100 trees across your study area.

For better model performance, sample across different conditions: varying sizes, lighting, and surrounding context. Avoid drawing all samples from a single area, since a model trained that way may not generalize to the rest of your study area.

Step 3: Add Class Labels

After exporting the polygons from the plugin, open the layer's attribute table in QGIS and add a column for your class labels, for example "tree", "building", "road", or "water". You can also use numeric codes (1, 2, 3, 4), depending on what your ML framework expects.
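The labeling step can also be scripted. The following is a minimal sketch, assuming the polygons are available as GeoJSON and that each feature already carries a text label. The property names `class_name` and `class_id`, the `CLASS_CODES` mapping, and the `add_class_codes` helper are hypothetical names chosen for illustration and are not part of the plugin.

```python
import json

# Hypothetical mapping from class names to numeric codes; adjust to your classes.
CLASS_CODES = {"tree": 1, "building": 2, "road": 3, "water": 4}


def add_class_codes(collection, name_field="class_name", code_field="class_id"):
    """Add a numeric class code next to each feature's text label.

    Returns a new FeatureCollection and leaves the input untouched.
    Raises ValueError if a feature has a missing or unknown label.
    """
    labeled = []
    for index, feature in enumerate(collection["features"]):
        properties = dict(feature.get("properties") or {})
        name = properties.get(name_field)
        if name not in CLASS_CODES:
            raise ValueError(f"feature {index}: unknown class label {name!r}")
        properties[code_field] = CLASS_CODES[name]
        labeled.append({**feature, "properties": properties})
    return {**collection, "features": labeled}


# Small in-memory example standing in for json.load() on an exported file.
example = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"class_name": "tree"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]],
            },
        },
        {
            "type": "Feature",
            "properties": {"class_name": "water"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[5, 5], [9, 5], [9, 8], [5, 8], [5, 5]]],
            },
        },
    ],
}

coded = add_class_codes(example)
print(json.dumps([f["properties"] for f in coded["features"]]))
```

Failing loudly on an unknown label is a deliberate choice here: a typo such as "tre" would otherwise create a silent extra class in the training data.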

Step 4: Export for Your ML Pipeline

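Export the labeled layer as a Shapefile, GeoJSON, or GeoPackage. Vector-aware tools such as QGIS SCP and geopandas read these formats directly. Raster-oriented tools and deep learning frameworks typically need the polygons rasterized into label masks first, for example with rasterio.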
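Before handing the export to a training pipeline, it can help to check how many polygons each class has. The sketch below assumes a GeoJSON export with a hypothetical `class_name` property. The threshold of 50 borrows the lower end of the 50-100 range suggested in Step 2 and is an assumption, not a rule, and the helper names are illustrative.

```python
from collections import Counter

MIN_SAMPLES = 50  # assumption: lower end of the 50-100 range suggested in Step 2


def class_counts(collection, field="class_name"):
    """Count features per class label; unlabeled features are grouped together."""
    counts = Counter()
    for feature in collection["features"]:
        label = (feature.get("properties") or {}).get(field)
        counts["<unlabeled>" if label is None else str(label)] += 1
    return counts


def under_sampled(counts, minimum=MIN_SAMPLES):
    """Return the class labels with fewer than `minimum` polygons, sorted by name."""
    return sorted(label for label, n in counts.items() if n < minimum)


def make_feature(label, x, y, size=1.0):
    """Build a labeled square polygon feature (used only for the demo data)."""
    ring = [[x, y], [x + size, y], [x + size, y + size], [x, y + size], [x, y]]
    return {
        "type": "Feature",
        "properties": {"class_name": label},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Demo data standing in for json.load() on an exported GeoJSON file.
demo = {
    "type": "FeatureCollection",
    "features": [make_feature("tree", i, 0) for i in range(60)]
    + [make_feature("building", i, 5) for i in range(12)],
}

counts = class_counts(demo)
print(dict(counts))
print("Need more samples:", under_sampled(counts))
```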

What You Can Train

  • Object detection: buildings, trees, vehicles, solar panels
  • Land cover classification: vegetation, water, urban, bare soil
  • Change detection: comparing features over time
  • Semantic segmentation: pixel-level classification from polygon labels
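
For semantic segmentation, polygon labels are usually converted into a pixel mask aligned with the imagery. The sketch below illustrates the idea in plain Python with an even-odd point-in-polygon test on pixel centers. It is a dependency-free illustration, not how the plugin works; in practice a library routine such as `rasterio.features.rasterize` would normally do this job. The `class_id` property, the grid parameters, and the function names are assumptions made for the example.

```python
def point_in_ring(x, y, ring):
    """Even-odd ray casting test for a point against a closed ring of [x, y] pairs."""
    inside = False
    for start, end in zip(ring, ring[1:]):
        x1, y1 = start[0], start[1]
        x2, y2 = end[0], end[1]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(features, origin_x, origin_y, pixel_size, width, height,
              code_field="class_id"):
    """Burn polygon class codes into a height x width grid (0 = background).

    (origin_x, origin_y) is the top-left corner of the grid, and rows run
    from north to south, as in most georeferenced rasters.
    """
    mask = [[0] * width for _ in range(height)]
    for feature in features:
        geometry = feature["geometry"]
        if geometry["type"] != "Polygon":
            continue  # this sketch handles simple Polygons only
        exterior, *holes = geometry["coordinates"]
        code = feature["properties"][code_field]
        for row in range(height):
            y = origin_y - (row + 0.5) * pixel_size  # pixel center
            for col in range(width):
                x = origin_x + (col + 0.5) * pixel_size
                in_hole = any(point_in_ring(x, y, hole) for hole in holes)
                if point_in_ring(x, y, exterior) and not in_hole:
                    mask[row][col] = code
    return mask


# One square "building" polygon (class code 2) on a 4 x 4 grid of 1-unit pixels.
building = {
    "type": "Feature",
    "properties": {"class_id": 2},
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[1, 1], [3, 1], [3, 3], [1, 3], [1, 1]]],
    },
}

mask = rasterize([building], origin_x=0, origin_y=4, pixel_size=1, width=4, height=4)
for row in mask:
    print(row)
```

Where polygons overlap, later features overwrite earlier ones in this sketch, so the order of the features decides which class wins.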

Complementary Tools

  • QGIS Semi-Automatic Classification Plugin (SCP): for supervised classification using spectral signatures
  • Python ML libraries: scikit-learn, TensorFlow, PyTorch with geospatial data loaders
  • Google Earth Engine: for large-scale classification tasks