REPRESENTING REALITY IN A GIS AND DATA INPUT



A. INTRODUCTION

The contents of a spatial database represent a particular view of the world based on the needs and perceptions of the individual or organization. This section will describe various ways of depicting reality in a GIS and different methods used to capture that information in digital form.

A basic concept is that measurements and samples contained in the database must present as complete and accurate a view of the world as possible. Therefore, the contents of the database must be relevant in terms of (1) the themes and characteristics of each layer, (2) the time period covered, and (3) the study area.

Features in the real world can be observed in three modes: spatial, temporal, and thematic.

The spatial mode deals with variation from place to place
The temporal mode deals with variation from time to time
The thematic mode deals with variation from one characteristic to another (one layer to another)




B. REALITY?

Reality is represented in a GIS in two ways: as continuous values constantly changing as the terrain changes or as discrete objects that have definite boundaries.

Discrete objects are represented as a function of scale. As we discussed, the decision to represent a building as a point that defines location or as a polygon that defines shape and location depends on scale. In all cases, discrete points, lines, and polygons are used to represent that reality by identifying location, length, or shape. Once the boundary is crossed, that feature ceases to occupy that space.

The graphic below shows road features stored in a GIS. These roads are discrete and occur in the landscape as evidenced by the digital orthophotoquad of the same area.




We have chosen to represent each road as a single line based on the scale. In reality, roads have both width and length. In the GIS, these roads have only length; width can be stored as a non-graphic attribute.
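As a rough illustration of the idea above, the sketch below stores a road as a centerline whose width is carried only as an attribute. The class name `Road`, its field names, and the coordinates are hypothetical and are not tied to any particular GIS package; map units are assumed to be meters.

```python
import math
from dataclasses import dataclass


@dataclass
class Road:
    """A road stored as a centerline; width is a non-graphic attribute."""
    name: str          # hypothetical label
    vertices: list     # [(x, y), ...] in map units (meters assumed)
    width_m: float     # kept as an attribute, not drawn as geometry

    def length(self) -> float:
        """Sum the straight-line lengths of the centerline segments."""
        return sum(math.dist(a, b)
                   for a, b in zip(self.vertices, self.vertices[1:]))


road = Road("Main St", [(0.0, 0.0), (300.0, 400.0), (300.0, 900.0)], width_m=7.5)
print(road.length())   # 1000.0
print(road.width_m)    # 7.5
```

The geometry answers "where" and "how long", while the width is available for queries even though it is never drawn.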

Some features that are commonly represented as discrete objects do not occur in the real world. Contour lines are a common method of representing elevation. These lines do not exist on the ground; they are an abstraction of reality that depicts continuous variation in a discrete form.




CONTINUOUS VARIATION

Features like elevation, atmospheric temperature, and barometric pressure exist everywhere and vary continuously over the earth's surface.



We can represent such variation as an image made up of pixels where change is represented by varying the value of the pixel. In the case of the elevation dataset above, brighter values depict increasing elevation.

Creating such a dataset requires sampling at various intervals. The sampling interval determines the scale of the dataset.

This type of representation of terrain features is always approximate and can never account for all the variation. In the case of the elevation data above, the grain (resolution) of this data is 30 meters on the ground surface. All variation below this resolution is lost.
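The loss of detail below the sampling resolution can be illustrated with a small sketch. The elevation values and the 10 m spacing are invented for the example, and the function name `sample` is hypothetical.

```python
def sample(profile, step):
    """Keep every `step`-th value, i.e. sample at a coarser interval."""
    return profile[::step]


# Hypothetical elevations (meters) measured every 10 m along a transect.
fine = [100, 106, 100, 110, 130, 108, 120, 121, 119]

# Sampling every third value gives one elevation per 30 m.
coarse = sample(fine, 3)

print(coarse)                   # [100, 110, 120]
print(max(fine), max(coarse))   # 130 120
```

The 130 m peak falls between two sample points, so it does not appear in the 30 m dataset at all.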





SCALES OF MEASUREMENT

Layers in a GIS can be attributed using four scales of measurement: nominal, ordinal, interval, and ratio.

NOMINAL - Names or labels. If numbers are used, the values have no mathematical relationship.

ORDINAL - Attributes are ordered in sequence. There is no mathematical relationship between values, but we can say that 2 is greater (or better) than 1.

INTERVAL - On interval scales, the difference (interval) between numbers is meaningful mathematically, but the numbering scale does not start at a true zero. Temperature scales (Celsius and Fahrenheit) are good examples.

RATIO - On a ratio scale, measurement starts at a true zero, so both the differences and the ratios between numbers are meaningful. Using the same temperature example, the Kelvin scale starts at a true zero point (absolute zero).

Features on a landscape that are represented in a discrete manner are usually described with nominal or ordinal measurements (e.g., conifer, house, road). When the landscape is represented in a continuous fashion, an interval or ratio scale is usually used (e.g., temperature, elevation).
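The difference between interval and ratio scales can be checked with a little arithmetic. This is only a sketch using two arbitrary temperatures; the helper name `celsius_to_kelvin` is introduced here for illustration.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15


a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree with each other.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are only meaningful on the ratio (Kelvin) scale.
ratio_c = b_c / a_c   # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = b_k / a_k   # about 1.035

print(diff_c, round(diff_k, 6))
print(ratio_c, round(ratio_k, 3))
```

The Celsius ratio of 2.0 is an artifact of where that scale places its zero; the Kelvin ratio reflects the actual physical relationship.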




GIS DATA INPUT

Data input is by far the single most time-consuming exercise in operating a GIS. It forms the major bottleneck and can consume over 80% of available funds. It may often seem that database construction is never-ending and that users never reach the analysis stage. In cases like this, project design and management must identify the critical layers that need to be generated and the levels of accuracy that must be reached in order to perform the analysis.

There have been many successful efforts aimed at automating the manual digitizing process, including scanning and analysis of remote sensing imagery. However, the cost of automation should be weighed against the cost of manual digitizing. In some cases it may be cheaper and faster to digitize a map manually.

The best way to avoid the data input bottleneck is to use data already generated by third parties such as federal and state agencies. Data sharing is becoming commonplace in the GIS world and is fostering a new era of data standardization and the generation of metadata (data about data) to describe the lineage and estimated accuracy of GIS layers. It is important to understand that data generated by a third party must still meet the needs of the GIS task. If data of the proper scale and type is not available, manual digitizing is still the only way to input data.




MODES OF DATA INPUT

Keyboard entry for non-spatial attributes and occasionally locational data

Manual locating devices (e.g., digitizers and computer mouse)

Automated devices (e.g., scanners)

Conversion directly from other digital sources



B. DIGITIZERS

Digitizers are the most common device for extracting spatial information from maps and photographs.

HARDWARE
A digitizer consists of a tablet underlain with a wire mesh. The tablet can be of various sizes to fit a variety of map sheets. By attaching a map sheet to the tablet (over the wire mesh) and tracing lines or points on the map with a stylus, a user can input spatial information. The digitizer records the proximity between the stylus and the wire mesh using a magnetic field. This proximity is interpolated into x,y coordinate pairs and the position is transferred to the computer. The digitizer literally plays the game of "connect-the-dots" between the stylus and the wire mesh. When a map is positioned between the wire mesh and the stylus, the connected dots form the outline (or position) of the features on the map.





THE DIGITIZING OPERATION

The map is attached to the digitizing table.

Four or more control points ("reference points", "tics", etc.) are located at known locations on the map and digitized. These reference points are typically intersections of latitude/longitude lines, intersections of eastings and northings, or any feature on the map whose exact geographic position can be determined.

The control points are used by the system to calculate the mathematical transformations needed to convert all digitizer coordinates to the final geographic coordinate system.
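The transformation step above can be sketched as a least-squares affine fit. The notes do not say which transformation a given system uses, so the affine model is an assumption. The function names (`solve3`, `fit_affine`, `apply_affine`, `rms_error`) and the tic coordinates are hypothetical.

```python
def solve3(m, v):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_affine(table_pts, map_pts):
    """Least-squares affine fit from digitizer (table) to map coordinates."""
    rows = [(x, y, 1.0) for x, y in table_pts]
    # Normal equations (A^T A) p = A^T b, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for axis in (0, 1):
        atb = [sum(r[i] * p[axis] for r, p in zip(rows, map_pts))
               for i in range(3)]
        params.append(solve3(ata, atb))
    return params  # [[a, b, c], [d, e, f]]


def apply_affine(params, pt):
    """Map one digitizer coordinate to geographic coordinates."""
    x, y = pt
    (a, b, c), (d, e, f) = params
    return (a * x + b * y + c, d * x + e * y + f)


def rms_error(params, table_pts, map_pts):
    """Root-mean-square residual at the control points, in map units."""
    total = 0.0
    for src, (ex, ny) in zip(table_pts, map_pts):
        px, py = apply_affine(params, src)
        total += (px - ex) ** 2 + (py - ny) ** 2
    return (total / len(table_pts)) ** 0.5


# Hypothetical tics: table inches -> map meters (easting, northing).
tics_table = [(1.0, 1.0), (21.0, 1.0), (21.0, 16.0), (1.0, 16.0)]
tics_map = [(400000.0, 4600000.0), (420000.0, 4600000.0),
            (420000.0, 4615000.0), (400000.0, 4615000.0)]

params = fit_affine(tics_table, tics_map)
print(apply_affine(params, (11.0, 8.5)))   # approximately (410000, 4607500)
print(rms_error(params, tics_table, tics_map))
```

With more than three control points the fit is overdetermined, so the RMS residual gives a rough check on how well the tics were digitized and on how much the map sheet has distorted.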


PROBLEMS WITH DIGITIZING MAPS

Most maps are generated for the purpose of displaying information to the user and do not always depict the spatial location of objects exactly. Further, maps made of paper are susceptible to shrinking and swelling, which alters the spatial relationships of features on the map. The best map base to digitize from is mylar (a plastic base), which is more stable.
Most digitizing errors can be attributed to poor map bases and scale. Human error is also a concern and can cause significant error depending on a number of factors that influence the ability to trace lines on a consistent basis for long periods of time.

Therefore, the accuracy of any GIS database is directly related to the quality of the digitizing process.





C. SCANNERS

Scanners are a common item in most computer facilities and allow users to input graphic and text information directly into a computer. Scanners are used in GIS to input map and photo information and the quality of this information is related to the quality of the scanner and the quality of the base map being scanned.

Most desktop scanners give users an inexpensive method of data input for small maps. However, most of these scanners are not designed for accurate planimetric applications and are susceptible to distortions along the edge of the scan area.

Scanners designed for planimetric work are often very expensive but can reduce the cost of data input significantly. These scanners include flatbed and drum scanners that can input map information from a variety of media sizes.

While the scanner offers a quick solution to data input, the data cleanup process may offset any cost savings over conventional digitizing. To be effective, the map to be scanned should contain only the features that are to be input. Most maps include graphic and text features such as contour lines, roads, building locations, rivers, and text. Scanning a map that contains multiple feature types creates a data cleanup problem, since it is desirable that each of these feature types be stored in a separate database.

The map to be scanned must be clean. Lines on the map should be wide enough and have enough contrast to be detected by the scanner. Maps should be clear of text and other graphical features that are not required.

Following the scanning process, the map is stored in a raster format with pixels representing the location of features. If a vector product is required, line-tracing algorithms must be used to convert the raster to vector. This process may introduce error depending on the quality of the line-tracing algorithm.
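To give a feel for what line tracing involves, here is a deliberately simplified sketch. It assumes a clean, one-pixel-wide, non-branching line of 1s in a small grid. Real scanned maps rarely meet those assumptions, and this is not the algorithm used by any particular package. The function name `trace_line` is hypothetical.

```python
def trace_line(raster):
    """Follow a one-pixel-wide line of 1s from one endpoint to the other.

    Returns the ordered list of (row, col) cells along the line.
    """
    cells = {(r, c)
             for r, row in enumerate(raster)
             for c, value in enumerate(row) if value}

    def neighbours(cell):
        r, c = cell
        return [(r + dr, c + dc)
                for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0) and (r + dr, c + dc) in cells]

    # An endpoint touches exactly one other line cell.
    start = min(cell for cell in cells if len(neighbours(cell)) == 1)

    path, seen = [start], {start}
    while True:
        unvisited = [n for n in neighbours(path[-1]) if n not in seen]
        if not unvisited:
            break
        path.append(unvisited[0])
        seen.add(unvisited[0])
    return path


scan = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 1],
]
print(trace_line(scan))   # [(0, 0), (1, 1), (2, 2), (2, 3)]
```

Stray pixels, gaps, thick lines, and intersections all break this simple walk, which is one reason raster-to-vector conversion can introduce error and why clean source maps matter.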





D. CONVERSION FROM OTHER DIGITAL SOURCES

More and more data is becoming available on magnetic media:

USGS digital cartographic data (DLGs - digital line graphs)
Digital elevation models (DEMs)
TIGER and other census-related data
Data from CAD/CAM systems (AutoCAD, DXF)
Data from other GIS

These data are generally supplied on digital tapes or CD-ROMs that must be read into the computer.





GLOBAL POSITIONING SYSTEM (GPS)

GPS receivers are becoming very popular as input devices for GIS. A GPS receiver computes its geographic position with relative accuracy using a constellation of 24 satellites orbiting the earth. This constellation allows a receiver to locate itself anywhere on the surface of the earth.

Author: R. Douglas Ramsey Doug@nr.usu.edu