Supervised Classification

Supervised classification is a technique for extracting information from image data. The goal is to assign each pixel in an image to a class based on the features of that pixel. There are two stages: a training stage and a classification stage. During the training stage, a set of vectors called training samples (each vector is associated with a pixel) is used to train a classifier. Each training sample vector is made up of the class the pixel belongs to and the feature values of the pixel. In the classification stage, the trained classifier is used to classify pixels with known feature values but unknown class.

The user has a choice of training and saving a new classifier, or loading a previously saved classifier, to perform classification.

To do the former, type in a new filename.

To do the latter, select a filename from the list.


Train and save a classifier

The user is required to provide the number of training samples to use.

Note that in addition to training, evaluation of the classifier is also performed. E.g., if the number of training samples to use is 5000, then 5000 x 2 = 10,000 samples will be extracted. The first 5000 will be used as training samples and the remaining 5000 will be used as test samples for evaluation.
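The arithmetic above can be illustrated with a minimal sketch. The function name split_samples is hypothetical and is not part of the operator. Raising an error when too few samples are available is an assumption made for the example, since the text does not say how the operator handles that case.

```python
def split_samples(samples, num_training):
    """Split extracted samples into a training half and a test half.

    Twice as many samples as requested are needed: the first
    num_training are used for training, the next num_training for
    evaluation.
    """
    needed = 2 * num_training
    if len(samples) < needed:
        # Assumption for this sketch only; the operator's behaviour
        # in this situation is not described.
        raise ValueError(f"need {needed} samples, got {len(samples)}")
    training = samples[:num_training]
    test = samples[num_training:needed]
    return training, test


# 5000 requested training samples means 10,000 extracted samples.
samples = list(range(10_000))
training, test = split_samples(samples, 5000)
print(len(training), len(test))  # prints "5000 5000"
```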

Two separate files are created: one with extension .class and one with extension .xml. The evaluation results are in a file with extension .txt.

The user can choose to train on a raster or on vectors.

Train on Raster

The user can choose one band (from the first product listed in the ProductSet-Reader) as the training band. If none is chosen, the first band will be used as the training band.

The user can choose bands from all the source products as feature bands. If none is chosen, all bands (except for the training band) will be used as feature bands.

There is an option to quantize class values if the values of the chosen training band are not already discrete.

If the training band contains discrete labels, such as land cover classes, there is no need to quantize.

However, if the training band data is continuous, such as biomass, there will be as many classes as there are distinct biomass values in the training set. Quantizing the values is recommended in such cases.

E.g., if the range of values in the training band is [0.0, 1.0], the user can set min class value to 0.0, class value step size to 0.1 and class levels to 10 to quantize the values to 10 levels: 0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9.
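The quantization example above can be sketched as follows. This is an illustration under stated assumptions, not the operator's actual code: the function names are hypothetical, the use of floor division is an assumption, and so is the clamping of out-of-range values to the first or last level.

```python
import math


def quantize(value, min_class_value, step, levels):
    """Return the level index (0 .. levels - 1) for a continuous value."""
    index = math.floor((value - min_class_value) / step)
    # Assumption: values outside the range are clamped to the
    # nearest level rather than discarded.
    return max(0, min(levels - 1, index))


def class_value(index, min_class_value, step):
    """Return the class value that a level index stands for."""
    return min_class_value + index * step


# Settings from the example: min 0.0, step 0.1, 10 levels.
for v in (0.05, 0.25, 0.95, 1.0):
    i = quantize(v, 0.0, 0.1, 10)
    print(v, "-> level", i, "class value", round(class_value(i, 0.0, 0.1), 1))
```

With these settings 0.05 maps to level 0 and 0.95 maps to level 9. In this sketch the upper end of the range, 1.0, is clamped into level 9 as well. Values that sit exactly on a level boundary can land one level lower because of floating-point rounding.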

Train on Vectors

The user can choose a number of training vectors (from the first product listed in the ProductSet-Reader) as classes. E.g., the training vectors could be regions (polygons) each representing a separate class such as water, urban or forest. A training vector called "water" will become a class label called "water".

Regions can be created using the "New Vector Data Container" tool and other drawing tools such as "Rectangle drawing tool".

A pixel inside a training vector region will have the name of the region as its class instead of its data value.

Feature bands are chosen in the same manner as for Train on Raster.

The operator will endeavour to extract the same number of samples for each class when constructing the training and test sample sets.
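One way to picture class-balanced sampling is the sketch below. It is only an illustration: the function name balanced_sample, the random selection, and the rule of taking everything available when a class has too few pixels are all assumptions, since the text does not describe how the operator actually selects samples.

```python
import random
from collections import defaultdict


def balanced_sample(labelled_pixels, total, seed=0):
    """Pick roughly total samples with an equal share for each class.

    labelled_pixels is a list of (class_label, feature_values) pairs.
    """
    by_class = defaultdict(list)
    for label, features in labelled_pixels:
        by_class[label].append((label, features))
    if not by_class:
        return []

    per_class = total // len(by_class)
    rng = random.Random(seed)
    picked = []
    for label in sorted(by_class):
        pool = by_class[label]
        # Assumption: a class with fewer pixels than its share
        # contributes all of the pixels it has.
        picked.extend(rng.sample(pool, min(per_class, len(pool))))
    return picked


pixels = (
    [("water", [0.1, 0.2])] * 100
    + [("urban", [0.7, 0.5])] * 50
    + [("forest", [0.3, 0.9])] * 10
)
chosen = balanced_sample(pixels, total=60)
for name in ("water", "urban", "forest"):
    print(name, sum(1 for label, _ in chosen if label == name))
```

Here each class is entitled to 20 samples, so "water" and "urban" contribute 20 each while "forest" contributes only the 10 it has.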


Load a previously saved classifier

To use a saved classifier, the user needs to know, at a minimum, the list of features. This list is contained in the XML file, along with other useful information.

The user can specify more than one source feature product.

For each name in "featureNames" in the XML file, the operator will search for a band in the feature products whose name contains it. It loops through the products in the order they are listed in ProductSet-Reader and uses the first band it can find that contains the feature name. E.g., if the name is "g0" and there are two feature products and both contain a band named "g0", then the band from the first feature product will be used.
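The search order described above can be sketched as follows. The function name find_feature_bands and the data layout (a list of product name and band name pairs) are hypothetical and chosen only for illustration. Leaving unmatched feature names out of the result is also an assumption, as the text does not say what the operator does when no band matches.

```python
def find_feature_bands(feature_names, products):
    """Map each feature name to the first band whose name contains it.

    products is a list of (product_name, band_names) pairs, in the
    order they are listed in ProductSet-Reader.
    """
    matches = {}
    for feature in feature_names:
        for product_name, band_names in products:
            # Substring match: a band named "g0_norm" would also
            # match the feature name "g0".
            hit = next((b for b in band_names if feature in b), None)
            if hit is not None:
                matches[feature] = (product_name, hit)
                break  # the first product with a match wins
    return matches


products = [
    ("product_1", ["g0", "g1"]),
    ("product_2", ["g0", "b5"]),
]
result = find_feature_bands(["g0", "b5"], products)
print(result["g0"])  # prints "('product_1', 'g0')"
print(result["b5"])  # prints "('product_2', 'b5')"
```

Both products contain a band named "g0", and the one from the first product is used, matching the example in the text.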