A Comparison of Camp W.G. Williams and Gap Analysis Vegetation Classifications

Abstract: A comparison of the Army National Guard Bureau's Camp W.G. Williams vegetation classification and the National Biological Service's Gap Analysis (GAP) vegetation classification was performed using a cross-walk of 9 classes. Both Williams and GAP were classified from Landsat Thematic Mapper (TM) imagery, with minimum mapping units (MMU) of 0.09 hectares and 100 hectares and overall accuracies of 89% and 75%, respectively (Edwards et al., 1995; Van Neil, 1995). The Williams classification was generalized to 1 hectare, and the two coverages were combined into a single coverage for comparison. The Oakbrush, Sagebrush, Agriculture, and Urban classes showed good agreement (60-80%), whereas the Juniper, Bare ground/Annual weeds, Riparian, Salt desert scrub, and Pinon-Juniper classes showed little or no agreement.

An unsupervised classification (MMU 0.09 ha) of four principal components was also performed using the same 1993 TM scene used for the Williams classification. Only four classes were delineated: Oakbrush, Sagebrush, Juniper, and Other. An overlay of all three datasets for the Oakbrush class revealed near-exact agreement between the principal components and Williams classifications. The GAP dataset showed spotty inclusion errors, probably due to generalization, and a large area of exclusion attributable to a fire that occurred between the imagery dates (GAP 1988 and Williams 1993).

Introduction:

Camp W.G. Williams is a National Guard Bureau training installation located west of Salt Lake City, Utah, in Salt Lake and Utah counties. The installation covers 9,712 hectares (24,000 acres) of hilly terrain, with vegetation composed primarily of Quercus gambelii (oakbrush) and Artemisia tridentata (sagebrush). Two vegetation classifications have been performed over the Camp Williams area at separate resolutions and separate time periods. The Camp Williams vegetation classification was created by Tom Van Neil at a minimum mapping unit (MMU) of 0.09 hectares using 1993 Thematic Mapper (TM) imagery. The dataset has 10 classes and yields an overall accuracy of 89% (Van Neil, 1995). The second vegetation classification was created principally by Collin Homer and the Utah Gap Analysis team. The Gap Analysis (GAP) classification was created over the entire state of Utah using 1988 TM imagery and has an MMU of 100 hectares. The GAP dataset has 11 classes that fall within the Camp Williams boundary, and yields a 76% overall accuracy for the Basin and Range ecosystem, which encompasses Camp Williams.

The goal of this research is two-fold. The first goal is simply to identify the amount of agreement between the two datasets, which is accomplished through both qualitative and quantitative methods. The second goal is to identify the impact of the MMU and of separate classification dates. This is accomplished by reclassifying the 1993 imagery at the 0.09 hectare MMU using a technique independent of either classification.

Methods:

Coverage Acquisition:
Four coverages are required for the analysis: the Camp Williams boundary, vegetation classification, and TM scene, and the Gap Analysis vegetation classification. The Camp Williams boundary and vegetation classification coverages were obtained and their projections defined as UTM zone 12. Two 1:100,000 scale Gap Analysis quadrangle tiles were obtained, joined, and clipped to the Camp Williams boundary. Finally, the 1993 TM scene used in the Williams classification was obtained in ERDAS (.lan) format. As no image clip routine exists in ARC/INFO, an AML (Arc Macro Language) script was created. The AML converts the image to a GRID using IMAGEGRID, clips the GRID using LATTICECLIP, and converts the GRID back to an image using GRIDIMAGE.

Coverage Integration:
The next step is to generalize the Williams 0.09 hectare GRID coverage to a 1 hectare minimum mapping unit. The Smart-Raster-Eliminate program (Bassett, 1995) is used with an ecologically based elimination matrix:

12
12  7  6 11  9 10  1  8  5  1 -1  1
 7 12  8  9 11  3  3  3 10  3 -1  3
11  5 12  5  5  5  5  5  5  5 -1 10
11  9  5 12 11 10  5  5  5  5 -1  5
 5  5  5 10 12 11  5  5  9  5 -1  5
 5  5  5  5 11 12  5  5 10  5 -1  5
 5  5  5  5  5  5 12 11  5  5 -1  5
 5  5  5  5  5 10  5  5 12 11 -1  9
 5  5  5  5  5  5  5  5 11 12 -1 10
 5  5  5  5  5  5  5  5  5  5 12  5
10  5  10   5  5  5  5  10 10 -1 12
The matrix contains as many rows and columns as there are vegetation classes in the GRID. In this case, there are 12 vegetation classes, and therefore 12 rows and columns. Each row displays a ranking of ecologically similar cover types. Cover types that are ecologically similar are placed higher in the ranking. For example, if row 1 represents the "Oakbrush" class, column 4 "Oakbrush, Sagebrush, Scrub", and column 12 "Riparian", then column 4 has a higher ranking, "11" as compared to column 12 with a ranking of "1", since column 4 is ecologically closer to row 1, "Oakbrush". Classes that cannot be expanded such as "Cloud" are given a value of "-1". The matrix is then utilized by the program to determine how to merge polygons less than the specified minimum area. The program is run twice, once with a 4x4 filter, and again with a 9x9 filter. For each filter, the program is run in steps, doubling the minimum area each time until the chosen MMU is obtained. For Camp Williams, the steps ran from 2 pixels to 4 to 8 to the 11 pixel MMU. The process ran for 5 minutes in a batch job. For more information, see Bassett, 1995.
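
The ranking logic described above can be illustrated with a minimal Python sketch. It is not the Smart-Raster-Eliminate program itself (see Bassett, 1995); it only assumes that an undersized patch is merged into whichever adjacent class has the highest rank in the patch's own matrix row, and that a class ranked -1 is never a valid target. The row used is the first row of the matrix above, and the function names are hypothetical. The second function reproduces the doubling schedule, assuming a 30 m TM pixel (0.09 ha), so that a 1 hectare MMU is about 11 pixels.

```python
# Row 1 of the elimination matrix above (list index = class number - 1).
ROW_1 = [12, 7, 6, 11, 9, 10, 1, 8, 5, 1, -1, 1]


def best_merge_target(rank_row, neighbor_classes):
    """Return the adjacent class (1-based) with the highest rank, or None.

    Assumption: a class ranked -1 can never absorb a patch.
    """
    candidates = [c for c in neighbor_classes if rank_row[c - 1] >= 0]
    if not candidates:
        return None
    return max(candidates, key=lambda c: rank_row[c - 1])


def mmu_steps(mmu_pixels, start=2):
    """Minimum-area steps that double from `start` and end at the MMU."""
    steps = []
    size = start
    while size < mmu_pixels:
        steps.append(size)
        size *= 2
    steps.append(mmu_pixels)
    return steps


PIXEL_HA = 0.09                   # one 30 m x 30 m TM pixel, in hectares
mmu_pixels = int(1.0 / PIXEL_HA)  # whole pixels that fit in 1 hectare

print(best_merge_target(ROW_1, [4, 12]))  # neighbours: class 4 and class 12
print(best_merge_target(ROW_1, [11]))     # only neighbour is ranked -1
print(mmu_steps(mmu_pixels))
```

With row 1 standing for "Oakbrush", a small oakbrush patch bordered by class 4 and class 12 is assigned to class 4, which matches the example in the text.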

After the GRID is generalized to the MMU, it is polygonized using the GRIDPOLY command in ARC. If the GRID is not in integer format, it first needs to be converted inside GRID using the <out-grid> = int(<in-grid>) command.

Once both coverages are in vector polygon format, they can be merged using the UNION command. This creates a single coverage where each polygon contains an attribute from both parent coverages. Slivers less than the MMU are created in the merge, and can either be deleted in ARC/EDIT, or RESELECTed out when computing statistics.

Next, a cross-walk between the two vegetation classifications is created. The cross-walk attempts to provide a reasonable basis for comparison between two separate classification systems. The cross-walk used for this comparison is as follows:

Camp Williams			Gap Analysis			Cross-walk
---------------------------------------------------------------------------------------
Oakbrush			Oak				Oakbrush
Oakbrush, Sagebrush, Grass	Mountain shrub
---------------------------------------------------------------------------------------
Juniper				Juniper				Juniper
---------------------------------------------------------------------------------------
Sagebrush			Sagebrush			Sagebrush
Grass, Sparse shrub		Sagebrush, Perennial grass
---------------------------------------------------------------------------------------
Vegetated Agriculture		Agriculture			Agriculture
Barren Agriculture
---------------------------------------------------------------------------------------
Riparian			Lowland riparian		Riparian
				Highland riparian
---------------------------------------------------------------------------------------
Barren or Annual weeds		n/a				Barren or Annual Weeds
---------------------------------------------------------------------------------------
Urban				Urban				Urban
---------------------------------------------------------------------------------------
n/a				Salt Desert Scrub		Salt Desert Scrub
---------------------------------------------------------------------------------------
n/a				Pinon-Juniper			Pinon-Juniper
---------------------------------------------------------------------------------------
After the cross-walk system has been established, two items are added to the union coverage, one for each parent coverage: "xwalk1" and "xwalk2". Each parent attribute is then RESELECTed, and the corresponding "xwalk" item is CALCULATEd to match the cross-walk value. Thus, the Williams parent attributes vegetated and barren agriculture would be RESELECTed and the "xwalk1" attribute CALCULATEd as agriculture.
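
As an illustration of this recoding step, the cross-walk table can be written as two lookup dictionaries, one per parent coverage. This is a Python sketch, not the ARC/INFO RESELECT and CALCULATE sequence actually used. The dictionary names, the function name, and the "williams" and "gap" keys are hypothetical, and the grouping assumes that every row between two dashed lines in the table belongs to one cross-walk class. Parent classes with no counterpart (the "n/a" entries) are simply absent from the corresponding dictionary.

```python
# Parent class -> cross-walk class, read from the table above.
WILLIAMS_XWALK = {
    "Oakbrush": "Oakbrush",
    "Oakbrush, Sagebrush, Grass": "Oakbrush",
    "Juniper": "Juniper",
    "Sagebrush": "Sagebrush",
    "Grass, Sparse shrub": "Sagebrush",
    "Vegetated Agriculture": "Agriculture",
    "Barren Agriculture": "Agriculture",
    "Riparian": "Riparian",
    "Barren or Annual weeds": "Barren or Annual Weeds",
    "Urban": "Urban",
}
GAP_XWALK = {
    "Oak": "Oakbrush",
    "Mountain shrub": "Oakbrush",
    "Juniper": "Juniper",
    "Sagebrush": "Sagebrush",
    "Sagebrush, Perennial grass": "Sagebrush",
    "Agriculture": "Agriculture",
    "Lowland riparian": "Riparian",
    "Highland riparian": "Riparian",
    "Urban": "Urban",
    "Salt Desert Scrub": "Salt Desert Scrub",
    "Pinon-Juniper": "Pinon-Juniper",
}


def recode(polygon):
    """Return a copy of a union polygon with xwalk1 and xwalk2 filled in."""
    out = dict(polygon)
    out["xwalk1"] = WILLIAMS_XWALK.get(polygon["williams"])  # None if unmapped
    out["xwalk2"] = GAP_XWALK.get(polygon["gap"])
    return out


poly = recode({"williams": "Barren Agriculture", "gap": "Agriculture"})
print(poly["xwalk1"], poly["xwalk2"])

xwalk_classes = set(WILLIAMS_XWALK.values()) | set(GAP_XWALK.values())
print(len(WILLIAMS_XWALK), len(GAP_XWALK), len(xwalk_classes))
```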

Coverage Comparison:
Now the coverages are ready to be compared. Two methods are used for the comparison: a qualitative method and a quantitative method. The qualitative method involves displaying each class on a map, and the quantitative method involves computing statistics.

For the qualitative comparison, nine maps are created - one for each cross-walk class. For each class, "xwalk1" is RESELECTed and drawn in white. The union coverage is then ASELECTed, and the identical class in "xwalk2" is RESELECTed and drawn in green. Next, the same class in "xwalk1" is again RESELECTed and drawn in red. Thus for each class, parent one only is drawn in white, parent two only is drawn in green, and areas in agreement are drawn in red. An example is shown here for the Oakbrush class. Williams oakbrush is shown in white, Gap Analysis oakbrush is shown in green, and areas of agreement are shown in red.
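
The three selections amount to a simple rule applied to each polygon of the union coverage. A minimal Python sketch of that rule follows. The function name and the dictionary layout are hypothetical, and black is assumed for polygons that neither parent assigns to the class (the map background).

```python
def overlay_color(polygon, xwalk_class):
    """Colour of one union polygon on the map for one cross-walk class."""
    in_williams = polygon.get("xwalk1") == xwalk_class  # parent one
    in_gap = polygon.get("xwalk2") == xwalk_class       # parent two
    if in_williams and in_gap:
        return "red"      # both parents agree
    if in_williams:
        return "white"    # Williams only
    if in_gap:
        return "green"    # Gap Analysis only
    return "black"        # neither parent


polygons = [
    {"xwalk1": "Oakbrush", "xwalk2": "Oakbrush"},
    {"xwalk1": "Oakbrush", "xwalk2": "Sagebrush"},
    {"xwalk1": "Sagebrush", "xwalk2": "Oakbrush"},
    {"xwalk1": "Juniper", "xwalk2": "Juniper"},
]
colors = [overlay_color(p, "Oakbrush") for p in polygons]
print(colors)
```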

For the quantitative comparison, statistics are calculated in a similar way. For each RESELECTion, the STATISTICS command is run with the SUM AREA option, and the AREA is appended to a file. The file is then imported into a spreadsheet and three percentages are calculated for each class: "area both" divided by "area of parent 1", "area both" divided by "area of parent 2", and "area both" divided by "total area". The first two percentages show a percent agreement for each parent and the last percentage shows a percent agreement for both based on the total area classified by both. The following statistics were calculated for the comparison:

Class	GAP Area (m2)	Both/GAP	WIL Area (m2)	Both/WIL	BOTH Area (m2)	Both/Total
1 	23594436.81093 	0.828036087 	32058602.69055 	0.609416615 	19537045.12651 	0.540952713 
2 	9827524.503621 	0.22897877 	4469108.777856 	0.503521974 	2250294.474337 	0.186803186 
3 	2486394.670363 	0.625633514 	3153334.058249 	0.493310194 	1555571.834741 	0.380879549 
5 	52716703.80796 	0.41290365 	29130536.22549 	0.747220004 	21766919.40199 	0.362296991 
8 	1886368.131069 	0 		95557.20440375 	0 		0 		0 
9 	0 		0 		2792867.541231 	0 		0 		0 
12 	958227.1636927 	0.540317142 	733641.875 	0.705721116 	517746.5625 	0.440964697 
32 	1495989.31243 	0 		0 		0 		0 		0 
33 	1439773.646896 	0 		0 		0 		0 		0 
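
The three percentages can be recomputed from the area columns with a short Python sketch. It assumes that "total area" means the area classified by either parent, that is, GAP area plus Williams area minus the area of agreement; this assumption reproduces the Both/Total column of the table. The function name is hypothetical, and a ratio with a zero denominator is reported as 0, as in the table.

```python
def agreement(gap_area, wil_area, both_area):
    """Return (Both/GAP, Both/WIL, Both/Total) for one cross-walk class."""
    # Assumption: total = area classified by either parent (union).
    total_area = gap_area + wil_area - both_area

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return (
        ratio(both_area, gap_area),
        ratio(both_area, wil_area),
        ratio(both_area, total_area),
    )


# Areas in square metres, copied from the table rows for classes 1 and 9.
class_1 = agreement(23594436.81093, 32058602.69055, 19537045.12651)
class_9 = agreement(0.0, 2792867.541231, 0.0)

print([round(value, 3) for value in class_1])
print(class_9)
```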

Image Classification:
An unsupervised classification of four principal components is performed on the 1993 TM image. Using IMAGINE 8.0, a principal components analysis is performed by selecting the "Interpreter" button, then the "Spectral Enhancement" button, and finally the "Principal Components Analysis" button. Four principal components are chosen for the output. Next, an unsupervised classification is performed on the four principal components with 20 output classes, a convergence threshold of 0.95, and a maximum of 10 iterations.

The classification is then converted to a GRID using IMAGEGRID, and the GRID is converted to vector polygons using GRIDPOLY. The classified image is shaded with POLYGONSHADE, and for each cross-walk class the area of agreement in the union coverage is overlaid. The IDENTIFY command is used to find which of the 20 classes in the classified image correspond to the overlaid cross-walk class. These classes are then RESELECTed and the "xwalk" item is CALCULATEd to the appropriate class. The resulting principal components classification has four classes: brown is oakbrush, olive green is juniper, green is sagebrush, and black is other.

Finally, the principal components classification is overlaid with the union coverage to show agreement visually. In the overlay for the Oakbrush class, the Gap Analysis classification is drawn in green, the principal components classification in blue hatch, and the Williams classification in white hatch. The red polygons show the areas where fire occurred between the Gap Analysis and Camp Williams classifications.
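
The two stopping parameters of the unsupervised classification can be illustrated with a small Python sketch. It uses a plain one-dimensional k-means loop rather than IMAGINE's clustering routine, and it assumes that the convergence threshold is the fraction of pixels whose cluster label is unchanged between successive iterations, so that 0.95 stops the loop once at least 95% of pixels keep their labels. The function name, the sample values, and the starting means are hypothetical.

```python
def cluster_1d(values, start_means, threshold=0.95, max_iterations=10):
    """Plain k-means on scalar values with an unchanged-fraction stop rule."""
    means = list(start_means)
    labels = [None] * len(values)
    iterations = 0
    for iterations in range(1, max_iterations + 1):
        # Assign every value to the nearest current cluster mean.
        new_labels = [
            min(range(len(means)), key=lambda k: abs(v - means[k]))
            for v in values
        ]
        unchanged = sum(a == b for a, b in zip(labels, new_labels))
        labels = new_labels
        # Recompute each mean; an empty cluster keeps its previous mean.
        for k in range(len(means)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:
                means[k] = sum(members) / len(members)
        # Stop once enough labels stayed the same as in the previous pass.
        if unchanged / len(values) >= threshold:
            break
    return labels, means, iterations


pixels = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8, 9.0, 9.1]
labels, means, iterations = cluster_1d(pixels, [0.0, 4.0, 10.0])
print(labels, iterations)
```

In this toy case the labels settle on the first pass, so the second pass finds every label unchanged and the loop ends well before the 10 iteration limit.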

Discussion

Generalization:
It was deemed necessary to generalize the Camp Williams vegetation classification from 0.09 hectares to 1 hectare for multiple reasons. First, it is improbable that management decisions will be made concerning areas smaller than 1 hectare. Second, misregistration error can offset a scene such that areas smaller than 1 hectare misrepresent a significant portion of their true surficial extent. Third, generalization to 1 hectare significantly reduces the number of polygons, thus improving data handling.

Several generalization commands exist within Arc/Info (MERGE, GENERALIZE); however, none of them incorporates ecological similarity in cover type. The routines simply merge a polygon with the neighbor that shares the longest border. The neighbor may or may not be a reasonable match. For example, a 0.5 hectare sagebrush polygon may share a border with a cloud polygon and a grass, sparse shrub polygon. If the cloud polygon shares the longer border, the sagebrush will be subsumed into the cloud polygon even though the cloud is an ecologically separate entity. By using the Smart-Raster-Eliminate program (Bassett, 1995) it is possible to weight class types so they are subsumed into ecologically related neighbors. The weighting matrix utilized was carefully evaluated by Tom Van Neil, who performed the Camp Williams vegetation classification.

Cross-walking:
Camp Williams and Gap Analysis utilized two separate classification conventions. The Camp Williams classification was primarily driven by the specific vegetation associations at the installation, whereas the Gap Analysis classification was labeled using the UNESCO system as required for National Gap Analysis standards. Because of the two systems, a direct cross-walk could not be established. Thus it was necessary to choose a grouping that made the most ecological sense. Grouping the classes becomes difficult when one class in the first system closely corresponds to two or more classes in the other system. In this case two possible solutions arise. The first is to cross-walk the one class to both of the other classes, and the second is to cross-walk the two other classes to the one class. As an example, Williams classifies agriculture into "Vegetated" and "Non-vegetated", whereas GAP only specifies "Agriculture". The first solution is to classify GAP "Agriculture" as both "Vegetated" and "Non-vegetated", whereas the second solution is to group the Williams "Vegetated" and "Non-vegetated" into one class, "Agriculture". The first solution is beneficial since it maintains the original number of input classes, but the result is a duplication in areal extent, because one class is used in two separate comparisons. The latter solution was chosen to eliminate this problem, but the tradeoff is a reduction in the number of classes, since classes from both sides must be subsumed to establish a reasonable comparison. The cross-walk began with 10 Williams classes and 11 GAP classes, and ended with 9 cross-walk classes.

Comparisons:
Two comparison methods are used, because neither is sufficient alone. Due to the spatial nature of the data, a purely statistical representation misses a great deal of spatial information, while a purely visual comparison lacks numerical precision. For the visual comparisons, a Venn diagram is the most intuitive. A Venn diagram comparison of two entities can be shown using only four colors. Shown below is a Venn diagram representation of the visual comparison. The two ovals represent the two classifications. White represents Camp Williams, green Gap Analysis, red both, and black neither.

The statistical comparison is also chosen for its simplicity. By necessity, it is assumed that the area identified by both classifications is correct. This assumption is realistic as two independent classifications are in agreement. However, it does not take into account temporal changes in land cover such that each classification may have correctly classified other areas but in the time between the classifications, the land cover may have changed. Thus the assumption is conservative in that it underrepresents the true accuracies. Three percentages are computed based on the areas of agreement: Both/GAP, Both/Williams, and Both/Total. All three percentages are measures of relative agreement. The first two measure how much GAP and Williams overclassified individually, and the third measures how much GAP and Williams overclassified together. It is important to remember that these percentages are relative to the area of agreement, and not necessarily relative to the actual spatial extent of the cover type.

Following is a brief display of the results of both comparisons for each of the nine cover types:

Oakbrush
Map
Both/GAP .828 Both/Williams .609 Both/Total .541
Oakbrush shows the highest agreement. The Williams percentage is low due to a fire occurrence between the classification dates.

Juniper
Map
Both/GAP .229 Both/Williams .504 Both/Total .187
Slightly better agreement could be obtained by including the GAP "Pinon-Juniper" class, but pinon-juniper does not exist within the Camp Williams boundary.

Agriculture
Map
Both/GAP .626 Both/Williams .493 Both/Total .381

Sagebrush
Map
Both/GAP .413 Both/Williams .747 Both/Total .362
Sagebrush also shows a high agreement, and the GAP percentage is low due to the fire occurrence. (Fire burned sagebrush which was replaced with oakbrush.)

Riparian
Map
Both/GAP .000 Both/Williams .000 Both/Total .000
Riparian areas were screen digitized for Gap Analysis separate from the unsupervised classification method. Riparian areas are highly dependent upon the date of the TM scene. 1988 (GAP) was a dry year, and 1993 (Williams) was a wet year.

Bare ground, Annual Weeds
Map
Both/GAP .000 Both/Williams .000 Both/Total .000
This is a Williams class, and no corresponding GAP class that falls within the Camp Williams boundary applies.

Urban
Map
Both/GAP .540 Both/Williams .706 Both/Total .441

Salt Desert Scrub
Map
Both/GAP .000 Both/Williams .000 Both/Total .000
Salt Desert Scrub classified by GAP does not exist within Camp Williams.

Pinon-Juniper
Map
Both/GAP .000 Both/Williams .000 Both/Total .000
Pinon-juniper classified by GAP does not exist within Camp Williams.

Principal Components Classification:
One final comparison was made with the datasets in order to determine the effect of temporal change in land cover independent of the classification system. The 1993 TM scene used to classify the Williams dataset was reclassified using a different classification technique, an unsupervised classification of principal components. An overlay of the three classifications for the Oakbrush class yielded a surprising result. The overlay shows GAP in green, principal components in blue hatch, Williams in white hatch, and fire boundaries in red. The classifications performed with the 1993 scene (Williams and principal components) match almost exactly, whereas the Gap Analysis classification performed with 1988 imagery shows two areas of non-agreement. The first area is a set of small polygons classified as Oakbrush by GAP but not by the other two. This is deemed to be a result of the generalization of the GAP dataset to 100 hectare polygons. The second area is a large polygon not classified as Oakbrush by GAP, but classified as Oakbrush by the other two datasets. This difference can be explained by a fire occurrence between 1988 and 1993.

Conclusion:

The amount of agreement between the two datasets was pleasantly surprising. Due to the large MMU and spatial coverage of the GAP dataset, it was anticipated that there would be far less agreement than the research identified. The high agreement among the major land cover types adds to the strength of the GAP dataset. As could be expected, the flaws in the GAP classification can be attributed primarily to the 100 hectare MMU. Misclassified areas smaller than the MMU were either excluded or included in incorrect classes.

The classes that were completely misclassified were generally of small spatial extent; however, the riparian class should be viewed as highly suspect. Riparian areas are highly sensitive to damage, and the fact that no overlap occurred in this class is cause for concern. The GAP classification was performed during a dry year and showed more riparian area than the Williams classification, which was performed during a wet year. This is definitely an area that requires further research.

The overlay of the principal components classification with the other datasets was very intriguing. The overlay added to the strength of the MMU misclassification hypothesis, but more surprisingly it showed the strong impact that classification date has upon the dataset. The area of largest discrepancy in the comparison occurs where a fire burned between the two classifications.

Further research is suggested to define more closely the effect of the minimum mapping unit, and the accuracy of the riparian areas needs to be identified.

References

Edwards, Thomas C. Jr., Collin G. Homer, Scott D. Bassett, Allan Falconer, R. Douglas Ramsey, and Doug W. Wight. 1995. Utah Gap Analysis: An Environmental Information System, Report. Utah Cooperative Fish and Wildlife Research Unit, Logan, Utah.

Van Neil, Tom. 1995. Vegetative Change Detection of Camp W.G. Williams, Utah Using Soil Adjusted Vegetation Indices Derived From Landsat Satellite Imagery. Master's Thesis. Utah State University.