Classification Tutorial - MultiSpec 32 PC
Modified from the GLOBE Toolkit Unsupervised Clustering Tutorial
To use this tutorial, you will need one image file: corpus_christi_600X600.lan.
It should be included in your MultiSpec \images directory.
Select the folder "MultiSpec." The files bevsub.lan and bevsub.cls
will work with either the Macintosh or PC versions of MultiSpec.
Objectives:
The principal objective of this exercise is to provide expertise in classifying
image data using MultiSpec.
In this tutorial, you will:
1) utilize the software to automatically create an unsupervised classification
based solely on pixel characteristics.
2) manually train the software to recognize regions (AOIs) that contain
known attributes on the ground, thereby producing a supervised classification.
Overview:
Each pixel in your Landsat TM image contains a wealth of information about
the surface materials that reflected light from that pixel to the satellite
sensors. Each pixel contains a value which can range from 0 to 255, for
each TM band supplied with your image. If, for instance, your image contains
data for five bands, then each pixel contains five pieces of data, each
potentially ranging from 0 to 255, as shown in the sample pixel diagram
below.
This means that your image could contain 256^5 (that's
approximately 1.1 trillion) different possible spectral combinations. Not
every one of these combinations represents a different type of land
cover; most of these variations represent very small and, to us, "unseeable"
differences in surface reflectance.
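The count above is simple arithmetic, and a short sketch makes it easy to repeat for other band counts. The function name and constant below are illustrative only and are not part of MultiSpec.

```python
# Number of distinct values one band can take for a single pixel (0 to 255).
LEVELS_PER_BAND = 256


def spectral_combinations(num_bands: int) -> int:
    """Count the possible spectral combinations for a pixel with num_bands bands."""
    return LEVELS_PER_BAND ** num_bands


five_band = spectral_combinations(5)
print(f"{five_band:,}")  # 1,099,511,627,776 (about 1.1 x 10^12)
```

With all seven TM channels the same function gives 256^7, which shows how quickly the number of possible combinations grows as bands are added.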
In most instances, your computer monitor will be displaying only 256
different colors, hence only 256 different pixel values. Even set to "thousands"
of colors, only a small part of the many different pixel values can be displayed.
Even if a monitor could display all the different possible pixels, your
eyes could recognize only a small number of differences in their appearance.
Because there is a limited number of different land cover types (the
Modified UNESCO Classifications scheme, MUC, contains about 157 different
types), and no GLOBE study site will have all of those different land cover
types, it is necessary to group pixels together into a smaller number of
closely related "classes." This process, whereby pixels with similar spectral
characteristics are grouped, is called "Classification," and is done in
two different ways.
In a supervised classification, you "train" the software to recognize
that certain types of pixels represent specific land cover types. This
is done on the basis of your knowledge of your own area, and field work
you may do. The software then classifies the pixels of your image into
the groups you have specified.
In an unsupervised classification, or "Clustering", we enter the number
of groups, or "clusters," we wish to have, and certain other specifications.
The software then examines the pixels in the image and groups them according
to similar spectral characteristics. These groupings are not made on the
basis of land cover, but on the similarity of the spectral characteristics
of the pixels.
Part 1: Unsupervised Classification or "Clustering"
To demonstrate clustering, you will use a subscene of the Path 26, Row 41 image
centered on Corpus Christi, Texas. This 600 x 600 pixel subscene will allow
the demonstration process to proceed more quickly than the clustering of
a larger image, and will allow you to follow exactly the steps outlined
in this tutorial.
-
Launch MultiSpec and Open the corpus_christi_600X600.lan
image. This is a subscene of a 7-channel, multispectral, Landsat Thematic
Mapper image.
-
At this point, select Project, Close Project to close all open
projects. (In later sections of this exercise you will be prompted to save
an untitled project. Select No and continue with your work.
We won't get into the problems of dealing with MultiSpec's project capabilities.)
-
From the Processor menu, select Cluster.... "Clustering"
is MultiSpec's terminology for an Unsupervised Classification. The Set
Cluster Specifications window opens. It is in this window that you
select a clustering "algorithm" (method by which the software clusters)
and enter certain values for the software to use.
-
First, click the Image Area button to select it. Verify that
the area to classify indicates 1 (start), 601 (end), 1 (interval) for both
line and column.
-
Check the Disk File box. This saves your project to disk.
-
Leave all other choices as shown.
-
Lastly, click the ISODATA button, as indicated by the cursor in
the diagram above. ISODATA is the algorithm, or mathematical process,
that MultiSpec will use in the clustering process.
A new window, the Set ISODATA Cluster Specifications window will
open. It is in this window that you tell MultiSpec how you want the
clustering to proceed. The information you need to provide is:
-
Be certain that the Image Area radio button is checked, as shown
above, and that 1, 601, 1 are shown for both line and column, as before.
-
Select "Along first cov. eigenvector." This setting tells MultiSpec
how to place the initial cluster centers before clustering begins.
-
Leave the settings in the Other options boxes unchanged for this
exercise.
Notes:
For a discussion of MultiSpec's algorithms, see "An Introduction to
MultiSpec," by David Landgrebe and Larry Biehl, Purdue Research Foundation,
1995. This document may be downloaded from the Purdue/LARS WWW site at:
http://dynamo.ecn.purdue.edu/~biehl/MultiSpec/documentation.html
"Number of clusters" tells the software how many different
groups you wish for the classification. The number 10 is used, for now,
because we are clustering a small area.
During the classification, the program goes through the data over and
over. This is called "iteration." Each iteration is called
a "pass". The system makes "passes" through the image until
a preset percentage of the pixels in the image are not changed during
the pass. The clustering then ends. This percentage is called the "Convergence."
"Minimum cluster size" tells the system the smallest sized area
to work with. Areas smaller than this minimum size will not be clustered.
After you have made these settings, click OK.
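The idea of "passes" and "Convergence" described in these notes can be sketched in a few lines. This is a simplified nearest-mean clustering loop written for illustration, not MultiSpec's actual ISODATA implementation: the function names, the starting means, and the sample pixels are all assumptions, and the "Minimum cluster size" setting is left out.

```python
def nearest(pixel, means):
    """Index of the cluster mean closest to `pixel` (squared Euclidean distance)."""
    return min(
        range(len(means)),
        key=lambda k: sum((p - m) ** 2 for p, m in zip(pixel, means[k])),
    )


def cluster(pixels, initial_means, convergence=98.0, max_passes=50):
    """Make passes until `convergence` percent of pixels keep their cluster."""
    means = [list(m) for m in initial_means]  # copy so the caller's list is untouched
    labels = [None] * len(pixels)
    passes = 0
    for passes in range(1, max_passes + 1):
        new_labels = [nearest(px, means) for px in pixels]
        unchanged = sum(old == new for old, new in zip(labels, new_labels))
        percent_unchanged = 100.0 * unchanged / len(pixels)
        labels = new_labels
        # Move each cluster mean to the average of the pixels assigned to it.
        for k in range(len(means)):
            members = [px for px, lab in zip(pixels, labels) if lab == k]
            if members:
                means[k] = [sum(band) / len(members) for band in zip(*members)]
        if percent_unchanged >= convergence:
            break
    return labels, means, passes


# Hypothetical two-band pixels forming a dark group and a bright group.
pixels = [(10, 12), (11, 13), (12, 11), (200, 210), (205, 207), (198, 202)]
labels, means, passes = cluster(pixels, initial_means=[(0, 0), (255, 255)])
print(labels, passes)  # [0, 0, 0, 1, 1, 1] 2
```

In this sketch the first pass can never report unchanged pixels, because no pixel has a previous cluster to compare against, so at least two passes are always needed before the convergence test can succeed.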
-
The Set Cluster Specifications window appears again.
In the lower left-hand corner of the box is the Classification
threshold: entry box. Change the value in this box to "100".
Setting this "threshold" value to 100 forces the system to
assign every pixel in the image to one of the clusters. A value of less
than 100 sets a tolerance for the assignment of pixels, so some pixels
will not be assigned to any cluster. In
this clustering, you are interested in large, fairly homogeneous areas,
so individual pixels of slightly different spectral characteristics dotting
the map are unnecessary.
-
Click OK.
-
The Save Project
-
The Save Report/Map As: dialog box appears. There is a default
name for your classified image file, UntitledProject.clu. You should
change the UntitledProject portion to cc_unsupervised, and leave
the .clu extension to tell you and the system what type of file
this is.
-
The system then makes its first pass through the image to initially determine
the clusters present as shown in the Status box.
-
The "Pass 1" clustering Status box then appears, as shown below. During
this initial iteration, Pass 1, the "Percent of Pixels Not Changed" shows
no value. Also note that a time is given for completion of this operation.
-
The Percent of Pixels Not Changed entry does not change until the
end of Pass 2. During Pass 3, the value will be displayed, as shown below,
along with a time to completion of the pass.
-
During subsequent passes, the "Percentage of Pixels Not Changed"
increases, until it reaches the value given in the "Convergence (%)"
specification. The time for each pass to be completed is given in this
window.
You can expect the system to make several passes to achieve a 98% Convergence.
The time required for this process is dependent upon the processing speed
of your computer. On a 386-based machine (the minimum processor requirement),
you can expect the entire process to take about 5 minutes for this "sub"
image. On a Pentium machine, the process is done very quickly. A full 512
pixel by 512 pixel GLOBE Study Site image will take longer to process.
-
If you press "Cancel" during a pass, you will be asked if you wish
to cancel immediately, or complete the iteration. Canceling immediately
terminates the clustering, while finishing the iteration ends the clustering
at a Convergence less than initially specified.
-
After the clusters are determined, the system will display the "Classifying
Selected Image Area" window, below. Here the system assigns individual
image pixels to the clusters it has determined.
-
After the clustering is complete, you will see the Saving Statistics
in Project File window. Clustering is now complete.
The Results of Clustering
There are two results of clustering:
-
A description of clustering activity and a "text map" in the TEXT
OUTPUT window,
-
A clustered Thematic image.
-
From the Windows menu, select untitled project. Scroll to
the top of this "text window", and you will have statistics describing
the clustering and its results. A sample text output (not the corpus_christi_600X600.lan
image) is shown below. Listed are the number of clusters produced and the
average value (mean) of the pixel values for each band in each of the classes.
Also produced is a text map of the clustered area. The system assigns
a number or letter to each of the clusters, and then displays a map of
the clustered area using this code. A sample code is shown below.
A sample portion of the Text Map from the Clustering process. Each number/letter
represents a pixel and the clustered group to which it belongs. You can
see, even from this representation, that the system has identified several
large, homogeneous areas, identified by the appearance of the same letter/number
in an area.
-
In this sample of the Massachusetts coastline (beverly.lan), you can see
that the clustering process has used the letter "A" to represent the ocean
area (assigned cluster no. 10, as in the cluster classes diagram, shown
previously).
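The text map described in this section can be imitated with a small sketch. The symbol table below (digits 1 to 9, then capital letters) is an assumption chosen so that cluster 10 prints as "A", matching the sample described above; MultiSpec's own symbol assignment may differ, and the grid values are made up.

```python
import string

# Hypothetical symbol table: digits 1-9 first, then capital letters A-Z.
SYMBOLS = "123456789" + string.ascii_uppercase


def text_map(cluster_grid):
    """Render a 2-D grid of cluster numbers (starting at 1) as lines of symbols."""
    return ["".join(SYMBOLS[c - 1] for c in row) for row in cluster_grid]


# A tiny made-up clustered area: cluster 10 (shown as "A") dominates the left side.
grid = [
    [10, 10, 10, 3],
    [10, 10, 3, 3],
    [10, 2, 2, 3],
]
for line in text_map(grid):
    print(line)
```

Large homogeneous areas show up as runs of the same symbol, which is exactly what you look for when reading the text map in the output window.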
Examining the Clustered Image
-
From the File menu, select Open Image.
-
Select the .clu file name you used earlier, and click Open.
-
The Set Thematic Display Specifications window opens. You can experiment
later with some of the other palettes in this menu, but for now accept
the default settings and press OK.
-
Notice that there are 10 numbered classes, plus a class labeled "Thresholded."
This "thresholded" class contains no pixels, because you set the "Classification
threshold" value to 100 earlier in this exercise. Each class is assigned a color
by the system; the color has nothing whatsoever to do with what the cluster
represents. The clusters are produced and arranged in order of descending
level of brightness. That is, clusters near the top of the list represent
surface materials that are "brighter" (have greater reflectance) than those
near the bottom of the list.
-
You can change the color used to represent a cluster by double clicking
with the left mouse button and assigning your color choice.
-
You may print the image from the File menu. When you do, the clustering
key will be printed along with the image.
-
You may use some of MultiSpec's regular tools with this Thematic Map. Tools
such as the Zoom feature and the Coordinate Bar, from the
View menu, function normally. The New Selection Graph
feature will show a plot with only one piece of data, because this map is no longer
"multispectral." Each pixel no longer contains data for different Landsat
bands, or channels. Each pixel contains only one value, which identifies
its class and therefore its color.
-
If you do a clustering with a larger number of classes, you may not be
able to see them all in the "Classes" column. To scroll through
this column: Move your cursor into the column, hold the mouse
button down, and drag to either the top or bottom of the column.
The classes will scroll up and down.
-
You and your students will probably want to prepare a thematic map from
this clustered image in which you identify some of the clustered areas
by their actual land cover. To do this, you may save the image as a TIFF
file from the File menu. This process does not save the clustering
key, only the image area will be saved. The TIFF file may then be brought
into any one of a number of paint or draw programs to be "fancied up" as
a thematic map.
-
If you wish to have an image that contains the clustering key, and can
also be moved into paint or draw programs you can capture the entire screen
using one of a variety of "screen capture" programs that are available
in the public domain or as "shareware." You will want to examine the features
of these to determine that they save "captures" in a format that can be
read by your paint/draw program.
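Earlier in this section, the class list was described as being arranged in order of descending brightness. A rough sketch of such an ordering follows. It assumes that "brightness" means the average of a cluster's band means; how MultiSpec actually measures brightness is not stated here, and the sample means are invented.

```python
def order_by_brightness(cluster_means):
    """Return cluster means sorted from brightest to darkest.

    Brightness is assumed to be the average of the band means.
    """
    return sorted(cluster_means, key=lambda m: sum(m) / len(m), reverse=True)


# Hypothetical three-band cluster means.
cluster_means = [[30, 25, 20], [120, 130, 110], [80, 60, 90]]
ordered = order_by_brightness(cluster_means)
for number, mean in enumerate(ordered, start=1):
    print(f"Cluster {number}: {mean}")
```

Under this assumption, the darkest materials, such as deep clear water, would be expected near the bottom of the list.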
At this point, you will need to close all projects and images that are
currently open in MultiSpec.
Select Project, Close Project, answer yes to "Save
UntitledProject.Prj," and save as cc_unsupervised.Prj. Remember,
you have already saved your classification image as cc_unsupervised.clu.
Next, clear the text output window by selecting Edit, Select
All Text, and then pressing the Delete key.
Part 2: Supervised Classification
To demonstrate supervised classification, you will use the same subscene
of Path 26, Row 41 image centered on Corpus Christi, Texas.
-
Open the corpus_christi_600X600.lan image.
-
From the Project menu, select New Project.
-
From the Processor menu, select Statistics.... This
will bring up the Set Project Options dialog window. Select OK.
-
The Select Field window will appear. It is here that you will
assign names to your training fields.
The object here is to select large areas of homogeneous pixel "groups"
that you will assign as training fields. For example, the image contains
large bodies of water that could be classed as "water" in a training field.
To do this,
-
First, in the Select Field box, select "New" in the Class
Field box.
-
Next, select a large body of water in an offshore region (lower right corner)
by holding the left mouse button, dragging a rectangle over the area, and releasing.
-
Select Add to List.
-
The Define Class and/or Field Description box appears. With
Class 1 highlighted, enter a new name to represent your training area,
such as "water". You might have realized that there are differences
in the water bodies on the image. Offshore areas are deep and clear,
nearshore areas are shallow and turbid. You may wish to differentiate
between the two types by indicating two separate water classes, deep water
and shallow water.
-
Select OK. Your selection will be identified on the original
image by the field name you gave to it, and by a Field number.
-
The more fields that you have in a particular class, the better your classification
results will be. You can identify more of the same type of training areas
simply by selecting more rectangles of each, adding each to the list, and
selecting OK. Do not change the class name... the software will automatically
assign a new field number to the same class if you don't change it.
-
In the Select Field box, select "New" in the Class Field
box.
-
Next, select a dense area of vegetation.
-
Select Add to List.
-
The Define Class and/or Field Description box appears again.
With Class 2 highlighted, enter a new name to represent your training area,
such as vegetation. As with the water, you may have noticed different
"levels" of vegetation on the image, from native grasses to dense
groves of trees. Some differentiation of vegetation types would be
useful. Once again, more training fields per class will improve your
results.
***It is important that you try to identify as many KNOWN classes
as possible, and use multiple training fields for each. If you are
unsure of what the pixel represents, it's probably not a good idea to use
it as a training class. Keep it simple.
Once you are satisfied with your selection of training fields, you can
classify the image. At this point it is advisable to save your project.
Select Save Project As... and name your project cc_supervised.Prj.
-
Select Processor, and choose Classify...
-
Leave all selections as is, except check "Write classification results
to disk file", and save as an ERDAS.GIS type file.
-
Choose OK to update project statistics.
-
Save your classification as cc_supervised.GIS.
-
The classification routine will start. When finished, open your
new cc_supervised.GIS file.
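The training and classification steps in Part 2 can be pictured with a deliberately simple stand-in. The sketch below uses a minimum-distance-to-means rule, which is not necessarily the algorithm MultiSpec applies with its default settings; the function names and the two-band pixel values are assumptions made for illustration.

```python
def class_means(training_fields):
    """Average the pixels of all training fields that belong to each class."""
    means = {}
    for name, fields in training_fields.items():
        pixels = [px for field in fields for px in field]
        means[name] = [sum(band) / len(pixels) for band in zip(*pixels)]
    return means


def classify(pixel, means):
    """Assign `pixel` to the class whose mean is closest (squared Euclidean distance)."""
    return min(
        means,
        key=lambda name: sum((p - m) ** 2 for p, m in zip(pixel, means[name])),
    )


# Hypothetical training data: each class has one or more fields,
# and each field is a list of two-band pixels.
training_fields = {
    "water": [[(20, 10), (22, 12)], [(18, 8)]],
    "vegetation": [[(60, 120), (64, 130)]],
}
means = class_means(training_fields)
print(classify((25, 15), means))   # water
print(classify((58, 110), means))  # vegetation
```

Adding more training fields to a class changes its mean, which is one way to see why several well-chosen fields per class tend to give better results than a single rectangle.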
How Valid are these Classification Processes?
It is necessary for you to be confident that the process of "unsupervised
classification" actually yields clusters that are related to land cover
types, just as the supervised classification does with manual training
of known regions. To compare the results of each classification process:
-
open your unsupervised classification image by File, Open Image...,
select cc_unsupervised.clu
-
next, open your supervised classification image by File, Open
Image..., select cc_supervised.GIS
-
Resize the images and arrange them side-by-side on the screen, if possible.
-
Compare the areas identified in the supervised classification (.GIS image)
to the clusters produced by the system in your unsupervised clustering
(.clu image.)
You should see that the unsupervised clustering provides, at least in this
case, a good indication of the locations of large areas of uniform land
cover that could be investigated for verification studies.
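If you want something more systematic than a side-by-side visual comparison, the two maps can be cross-tabulated pixel by pixel. The sketch below assumes both maps have already been read into flat lists of equal length; reading the .clu and .GIS files is not shown, and the cluster numbers and class names are invented.

```python
from collections import Counter


def cross_tabulate(cluster_map, class_map):
    """Count how often each (cluster, class) pair occurs at the same pixel."""
    return Counter(zip(cluster_map, class_map))


def dominant_class(table):
    """For each cluster, report the supervised class that covers most of its pixels."""
    best = {}
    for (cluster, cls), count in table.items():
        if cluster not in best or count > best[cluster][1]:
            best[cluster] = (cls, count)
    return {cluster: cls for cluster, (cls, _) in best.items()}


# Hypothetical flattened maps covering the same eight pixels.
cluster_map = [1, 1, 1, 2, 2, 2, 3, 3]
class_map = ["water", "water", "water", "vegetation",
             "vegetation", "water", "urban", "urban"]

table = cross_tabulate(cluster_map, class_map)
print(dominant_class(table))
```

A cluster whose pixels fall almost entirely into one supervised class is a good candidate for that land cover label; a cluster split across several classes is one to check during verification.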
Desk Verification
It is important to use as much data as you can obtain to validate your
classification schemes. The desk verification process could involve
the use of local maps (topographic, land cover, soil, political, etc.),
other local references (aerial photos, people, agencies, etc.) and the
combined experiences of both you and your students to identify some of
the clusters produced by MultiSpec. Use whatever resources you can to identify
your classifications.
Field Verification
If there are clusters that you cannot identify "from the desk," you will
have to go out into the field to determine what they are. Ground truthing
is an integral part of remote sensing and should be done to verify and
validate your classification processes, supervised or unsupervised.
Renaming the Clusters
In your unsupervised classification, the software produced clusters
identified only by a number, and arranged in order of decreasing brightness.
Once you have identified the land cover for each of these clusters, your
Thematic Map display may be customized to show these clusters either by
name or by MUC identification code. You can, in effect, produce
two different Thematic Maps on the same image; one in which each cluster
is identified by a name (e.g. Ocean, Transportation) and the other by MUC
designations (e.g. 72, 93.)
The secret to this process is that your Thematic Map can display
both "Groups" and "Classes." When it is produced, both "Groups"
and "Classes" have the same set of colors and labels. To see this:
-
Click on the Classes pull-down menu, as shown below.
-
The pull-down menu will show the choices illustrated below.
-
Select "Groups/Classes," then immediately pull down the menu again
and select "Groups."
-
You can now switch between "Groups" and "Classes." You will
see that the information in each view is identical.
You might decide that the "Groups" will contain descriptive names, while
"Classes" contains MUC labels.
To change the name of a cluster in either view, at any time:
-
Double Click on the cluster name (for example, "Cluster 1")
-
The Edit Thematic Class Name dialog box, as shown above, opens.
-
You may now enter either a descriptive name or MUC identification number
for this class.
-
Once you enter a descriptive name in, say, "Groups," use the pull-down
menu to select "Classes" and enter the corresponding MUC identification
number for that same Cluster.
Once you have entered this data, you should save your work. Saving
your data can be done in a number of ways.
-
If you are working with a single .clu image, simply close it (X
in the upper right corner) and save both group and class information when
prompted. You will save information to your named file, except with
a .trl extension.
-
If you are working with a project file, select File, select Save
Thematic Group Info, select Project, select Save Project, and
save as you did above.
You can return to either of your classification images at any time to change
descriptive information.
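The "Groups" and "Classes" views can be thought of as two independent sets of labels attached to the same cluster numbers. The sketch below is only a mental model, not how MultiSpec stores this information in its .trl file; pairing "Ocean" with MUC code 72 is assumed for illustration.

```python
# Both label sets start out identical, keyed by cluster number.
groups = {1: "Cluster 1", 2: "Cluster 2"}
classes = dict(groups)  # a separate copy, so edits to one view do not affect the other


def rename(labels, cluster, new_name):
    """Change the label shown for one cluster in a single view."""
    labels[cluster] = new_name


rename(groups, 1, "Ocean")  # descriptive name in the Groups view
rename(classes, 1, "72")    # MUC code in the Classes view (assumed pairing)

print(groups)
print(classes)
```

Cluster 2 keeps its original label in both views until you rename it, which mirrors what you see when you switch between "Groups" and "Classes" before editing anything.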