Supervised classification is a technique for extracting
information from image data. The goal is to classify pixels in an image
into different classes based on features of the pixels. There are two
stages: a training stage and a classification stage. During the training
stage, a set of vectors called training samples (one vector per pixel) is
used to train a classifier. Each training sample vector consists of the
class the pixel belongs to and the feature values of the pixel. In the
classification stage, the trained classifier is used to classify pixels
with known feature values but unknown class.
The user can either
- train (and save) a classifier; or
- load a previously saved classifier
to perform classification.
To train a new classifier, type in a new filename.
To load a saved classifier, select a filename from the list.
Train and save a classifier
The user must provide the number of training samples to use.
Note that in addition to training, the classifier is also evaluated, so
twice that number of samples is extracted.
E.g., if the number of training samples to use is 5000, then 5000 x 2 =
10,000 samples will be extracted. The first 5000 will be used as
training samples and the remaining 5000 will be used as test samples
for evaluation.
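The split in the 5000-sample example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the operator's code; the function name split_samples and the stand-in sample list are hypothetical.

```python
def split_samples(samples, num_training):
    """Split extracted samples into a training half and a test half.

    The first `num_training` samples are used for training and the next
    `num_training` samples are used for evaluation.
    """
    if len(samples) < 2 * num_training:
        raise ValueError("need at least 2 * num_training extracted samples")
    training = samples[:num_training]
    test = samples[num_training:2 * num_training]
    return training, test


# Stand-in for 10,000 extracted sample vectors (hypothetical data).
extracted = list(range(10_000))
training, test = split_samples(extracted, 5000)
print(len(training), len(test))
```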
Two separate files are created: one with extension .class and one with
extension .xml. The evaluation results are written to a file with
extension .txt.
The user can choose to train on a raster or on vectors.
Train on Raster
The user can choose one band (from the first product listed in the
ProductSet-Reader) as the training band. If none is chosen, the first
band will be used as the training band.
The user can choose bands from all the source products as feature bands.
If none is chosen, all bands (except for the training band) will be used
as feature bands.
There is an option to quantize class values if the values of the chosen training band are not already discrete.
If the training band consists of discrete labels such as landcover classes, then there is no need to quantize.
However, if the training band data is continuous, like biomass, then there
will be as many classes as there are distinct biomass values in the
training set. It is recommended to quantize the values in such cases.
E.g., if the range of values in the training band is [0.0, 1.0], the user
can set min class value to 0.0, class value step size to 0.1 and class
levels to 10 to quantize the values to 10 levels: 0.0, 0.1, 0.2, 0.3,
0.4, 0.5, 0.6, 0.7, 0.8, 0.9.
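As a rough illustration of the quantization example, the sketch below maps a continuous value to one of the 10 levels. It is an assumption about how such a mapping could work, not the operator's actual rule: the function names are hypothetical, each value is assumed to be rounded down to the level at or below it, and out-of-range values (including 1.0 itself) are assumed to be clamped to the nearest level.

```python
import math


def quantize_index(value, min_class_value, step, levels):
    """Return the index (0 .. levels - 1) of the level a value falls into."""
    # The small epsilon guards against floating-point results such as
    # 0.3 / 0.1 == 2.9999999999999996, which would otherwise floor to 2.
    index = math.floor((value - min_class_value) / step + 1e-9)
    # Assumption: values outside the covered range are clamped.
    return max(0, min(levels - 1, index))


def quantize(value, min_class_value, step, levels):
    """Return the class value (lower edge of the level) for a value."""
    index = quantize_index(value, min_class_value, step, levels)
    # Rounding removes floating-point noise such as 0.30000000000000004.
    return round(min_class_value + index * step, 10)


# Settings from the example: min class value 0.0, step 0.1, 10 levels.
for v in (0.05, 0.3, 0.37, 0.99, 1.0):
    print(v, "->", quantize(v, 0.0, 0.1, 10))
```

With these settings, every biomass value between 0.3 and just under 0.4 ends up in the same class, which is what keeps the number of classes at 10 rather than one per distinct value.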
Train on Vectors
The user can choose a number of training vectors (from the first product
listed in the ProductSet-Reader) as classes. E.g., the training vectors
could be regions (polygons) each representing a separate class such as
water, urban or forest. A training vector called "water" will become a
class label called "water".
Regions can be created using the "New Vector Data Container" tool and other drawing tools such as "Rectangle drawing tool".
A pixel inside a training vector region will have the name of the region as its class instead of its data value.
Feature bands are chosen in the same manner as when training on a raster.
The operator will endeavour to extract the same number of samples for
each class when constructing the training and test sample sets.
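One way to picture this balancing is the sketch below. It is not the operator's implementation; the function name balanced_sample, the random selection and the handling of classes with too few pixels are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(labelled_pixels, total_samples, seed=0):
    """Pick roughly total_samples pixels, the same number from each class.

    labelled_pixels is a list of (class_label, feature_values) pairs.
    """
    by_class = defaultdict(list)
    for label, features in labelled_pixels:
        by_class[label].append((label, features))
    if not by_class:
        return []

    per_class = total_samples // len(by_class)
    rng = random.Random(seed)
    selected = []
    for label in sorted(by_class):
        candidates = by_class[label]
        # Assumption: a class with fewer pixels than requested
        # contributes everything it has.
        take = min(per_class, len(candidates))
        selected.extend(rng.sample(candidates, take))
    return selected


# Hypothetical labelled pixels from three training vector regions.
pixels = (
    [("water", [0.1])] * 50
    + [("urban", [0.7])] * 30
    + [("forest", [0.4])] * 5
)
sample = balanced_sample(pixels, 30)
counts = Counter(label for label, _ in sample)
print(dict(counts))
```

In this example "forest" has only 5 pixels, so it cannot supply the 10 requested per class, which illustrates why equal class sizes can only be attempted rather than guaranteed.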
Load a previously saved classifier
To use a saved classifier, the user needs to know at least the list of
features, which is contained in the XML file along with other useful
information.
The user can specify more than one source feature product.
For each name in "featureNames" in the XML file, the operator will search
for a band in the feature products whose name contains it. The operator
loops through the products in the order they are listed in the
ProductSet-Reader and uses the first band it can find that contains the
feature name. E.g., if the name is "g0" and there are two feature
products and both contain a band named "g0", then the band from the
first feature product will be used.
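The lookup order described above can be sketched as follows. This is a simplified model, not the operator's code: products are represented as plain (product name, band names) pairs, the function name find_feature_bands is hypothetical, and raising an error when no band matches is an assumption.

```python
def find_feature_bands(feature_names, products):
    """Resolve each feature name to the first band whose name contains it.

    products is an ordered list of (product_name, band_names) pairs,
    in the same order as they are listed in the ProductSet-Reader.
    """
    resolved = {}
    for feature in feature_names:
        for product_name, band_names in products:
            # "Contains" is a substring match, so "g0" also matches "g0_db".
            match = next((b for b in band_names if feature in b), None)
            if match is not None:
                resolved[feature] = (product_name, match)
                break  # the first product with a match wins
        else:
            # Assumption: a missing feature band is treated as an error.
            raise LookupError(f"no band found for feature '{feature}'")
    return resolved


# Two hypothetical feature products that both contain a band named "g0".
products = [
    ("product_1", ["g0", "g1"]),
    ("product_2", ["g0", "ndvi"]),
]
bands = find_feature_bands(["g0", "ndvi"], products)
print(bands)
```

Because the match is by substring and the first hit wins, the order of the products in the ProductSet-Reader decides which band is used when several candidates exist.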