GIS - Computational Problems: § 1: GIS Overview

Instructors: G.X. Ritter and M.S. Schmalz

In the vast majority of human societies that have developed in this historical epoch, land ownership or control is or has been the primary basis for wealth. Agriculture, access to fisheries, mining, and access to shelter are all constrained by control of land. Upon these basic activities civilization is constructed, including ancient institutions such as barter markets, universities, places of worship, or seats of government as well as more modern institutions such as manufacturing, technology, or finance. Today, we denote land control by two phrases -- land ownership and regulation of land use.

Background. Different cultures have different policies and methods for regulating land use. For example, in the United States, much land is publicly owned (e.g., over 70 percent of the land area in the states of Nevada and Alaska is directly owned and controlled by the Federal government). Hence, local, state, and federal governments have much to say about how land is used (e.g., zoning, pollution control, public works construction, etc.). In contrast, in the United Kingdom, where very little land is publicly owned, land use policy tends to be approached more from the perspective of consensus among private individuals or organizations who may be guided by a government advisory institution, group, or individual.

In order to optimize land usage for the benefit of society as well as the individual, one must know the attributes of a given land area. For example, in agriculture one might ask if the terrain is suitable for farming, what is the climate and rainfall, and who is (are) the landowner(s)? Also, what is the proximity of one's prospective farm site to nearby markets, and what is the flow of commerce (e.g., market potential) in those areas?

To help answer such questions, the discipline of cartography (i.e., the creation and study of maps) was united with computer science subdisciplines of database design and management, computer graphics, image processing, and with the mathematical subdiscipline of spatial analysis, to produce an area of study called geographic information systems (GIS). This subdiscipline of computer science did not emerge full-grown, but evolved as computer hardware and software became more capable and available. The GIS evolutionary process took approximately 30 years to reach its present state, and geographic information systems are still in a state of adolescence, as evidenced by numerous implementation problems. For example, algorithms for database query and search are not sufficiently fast to accommodate the very large amounts of GIS data currently accumulated. Similarly, workstations generally have insufficient computational power for merging GIS datasets in real time, with high accuracy. In some cases, dataset merging algorithms are not available that perform the desired types of feature detection, recognition, and combination.

In this course, we briefly examine the history and culture of GIS, then consider a number of key computational problems. The purpose of this course is to (a) provide orientation to the concepts and general practice of GIS, (b) provide sufficient background information to facilitate analysis of theory and system performance, and (c) pose research questions and support design and development at the leading edge of GIS theory and algorithms, analysis of GIS data, and efficient, accurate GIS implementation.

Definition. A geographic information system is:

Section Overview. In this section, for purposes of orientation, we summarize basic concepts of GIS. It is important to note that this section does not have as its goal the teaching of technique for using any particular GIS software system. The material is structured as follows:

In Section 1.1, we trace the development of GIS from map databases to the present state-of-the-art systems with graphical map interfaces and integrated map imagery or positional analysis capabilities. Section 1.2 contains an overview of GIS applications, which range from resource management to determining the suitability of a given land area for setting up a particular type of business. In Section 1.3, we discuss the high-level organization of GIS systems and data, which introduces a summary of GIS software and data analysis tools in Section 1.4. The historic and detailed topic of map projections is overviewed in Section 1.5, with comments on the adaptation of classical map projection theory to the needs of GIS system designers. Resources for obtaining GIS map data and satellite/airborne imagery are discussed in Sections 1.6 and 1.7, with special emphasis devoted to on-line resources available via the World-Wide Web.

1.1. Brief History of GIS System Development.

The following synopsis is abstracted largely from Coppock and Rhind's excellent history of GIS presented in [Magu91], with details added from [Toml88].

Manual GIS systems evolved from the discipline of cartography, where architects or site designers needed to visually compare the building plan with the site survey. With more elaborate urban planning in the 19th and early 20th centuries, map overlays became popular. These were translucent paper (and later, plastic) sheets upon which maps were drawn or printed. For example, a map of site drainage could be overlaid on a topographic map of the site, or on a street plan. This could help the urban planner or site designer determine whether or not to locate houses or other buildings at a given place (e.g., close to a creek or river floodplain). The same overlay process can be used to superimpose structural data on an aerial photograph. This technique was often used by site designers or architects when presenting designs to a prospective client.

It was the experience of publishing atlases of national and international scope which convinced mapmakers that computers could provide cost-effective means of map drawing, cataloging, and analysis [Bick63]. Perhaps the earliest attempt to automate map production was the Atlas of British Flora, which used a punch-card tabulator to produce maps on preprinted paper from cards that recorded map coordinates of plant occurrences [Per62]. Although not repeated due to its primitive nature at a time when computers were evolving rapidly, this effort anticipated later (and frequent) practices of producing approximate maps via line printer. Slightly later work by Bertin (circa 1967) involved modification of IBM Selectric typewriters driven by punch-card readers to produce maps with proportional symbols.

In North America, the earliest ancestors of GIS appeared at the University of Washington in the early 1950s, where geographers and transportation engineers developed quantitative methods for analyzing transportation study data [Duek74]. In 1964, IBM introduced the System 360 computer, which was the first available general-purpose machine, having 400 times greater speed and 32 times greater storage than its predecessor, the IBM 1401. By 1965 the US Bureau of the Budget compiled an inventory of automatic data processing (ADP) in the United States (US), which noted the significant use of computers to handle land use and land title data [Cook66]. In 1967, a symposium on comprehensive unified land management systems at the University of Cincinnati was advised about usefulness and design constraints of ADP machines for these purposes.

The first demonstrations of address matching, computer mapping, and small area data analysis were provided through the 1967 New Haven Census Use Study [USBC69]. Launch of the DIME workshops in 1970 and the development and distribution of ADMATCH (address matching) software influenced the creation of the Spatially Orientated Referencing Systems Association (SORSA), which still holds international conferences. The increasing availability of computers to universities strongly motivated development of the quantitative revolution in academic geography in the early 1960s [Jame78, Huds79], particularly in the field of spatial analysis. These applications, despite their capability of handling geographic data, had little interaction with computer mapping, since the statistical methodology was primarily aspatial. An exception was the previously-mentioned making of crude maps using a line printer [Rush69].

In the 1960s, computers available to domestic government agencies and university laboratories had few graphics capabilities, generally operated in batch mode, and were quite expensive and unreliable by today's standards. Despite this, the US National Ocean Survey (NOS) was creating charts on mechanical plotters for production of bathymetric maps, and military organizations such as Rome Laboratory (then the Rome Air Development Center) and the CIA were active in this area [Diel98, Toml72]. By the end of the 1960s, map production by computer was widespread. Although little cost analysis information is available, it appears as though the automated methods were not yet competitive with manual map production [Toml85], due largely to the high cost of procuring and maintaining computer hardware.

Unlike the United Kingdom, where automated map digitization was in progress, with automated map production from 1973 onward, the United States Geological Survey's (USGS) Topographic Division did not implement automatic production of topographic maps until the 1980s, which severely hindered the development of many GIS projects in the US. However, the universities, hampered by lack of funds, continued to produce line printer maps. Although crude, this method allowed sufficient visualization capabilities that effort could be devoted to the development of primitive analysis software, which was designed to operate in conjunction with the cartographic systems.

In 1963, a grant was obtained from the Ford Foundation to initiate the Laboratory for Computer Graphics at Harvard University, where Fisher and his colleagues built a team of scientists, engineers, and programmers who eventually created SYMAP. This was a computer mapping package that could produce isoline, choropleth, and proximal maps on the line printer. It was particularly useful for analyzing census data, and was widely distributed (approximately 500 legal copies, about half of which were in universities). A subsequent package, CALFORM, which used pen plotters and produced more accurate maps, was less successful, probably due to the high cost of plotter hardware. SYMAP was the first widely distributed package for handling geographical data and introduced large numbers of users to the potential of computer mapping. Steinitz and Sinton produced a cell-based program, GRID, which permitted overlays of data in conjunction with SYMAP. The Laboratory also produced a number of professional cartographers and computer scientists who contributed to the design and construction of ODYSSEY, the prototype of contemporary vector GIS [Chri88].

At approximately the same time that Fisher was developing computer mapping at Harvard, Tomlinson (working with the Canadian government) guided the creation of what may have been the first GIS [Toml88]. Indeed, Tomlinson is often hailed as the "father of GIS", due to his persuasion of the Canadian Government that the creation of the Canada Geographic Information System (CGIS) in 1966 was worthwhile. This effort dated from the early 1960s, when Tomlinson worked for Spartan Air Services and had numerous conversations with colleagues and Department of Agriculture (DOA) administrators regarding the utility of digital computers in mapping. Tomlinson eventually worked within the Agricultural Rehabilitation and Development Administration (ARDA) which, in cooperation with IBM, led to the following significant developments in GIS technology:

The system Tomlinson helped develop was fully operational in 1971 and contained (as of 1991) over 1,000 maps on more than 100 different topics. Maguire [Macg91] states that, excluding systems based on remote sensing data and the recent TIGER system, CGIS may be the largest GIS in operation and the only one to cover a continental area in great detail. Other factors, such as land management policy in Canada, the apparently passive nature of CGIS' custodianship, and apparent lack of computer networking within the various agencies that administered CGIS tended to contribute to its under-utilization. Despite his success at ARDA, Tomlinson left in 1969 and became a private consultant in GIS, continuing to chair the International Geographic Union's Commission on Geographical Data Sensing and Processing, from its establishment in 1968 through 1980.

By 1976, there were at least 285 computer software packages that handled spatial data and were developed outside USGS [Toml76], which number rose to over 500 in 1980 [Marb80]. Due to lack of contact between software developers, much duplication resulted. Large states in the US developed their own GIS software, some using CGIS, others adapting SYMAP, and still others generating proprietary, in-house products. Tomlinson [Toml88] describes the 1970s as a period of "lateral diffusion" rather than innovation. In the defense community the CIA developed a world data bank that subsequently was made available in the public domain [Ande78]. No comprehensive history exists of these local approaches, but the Bureau of the Census and USGS have made representative progress, namely linking cover from the USGS 1:100,000 topographic maps with tracts for the 1990 census, which has motivated further GIS developments in the US [Call84].

The Census Bureau's involvement in geographical data processing began with the New Haven Census Use Study in 1967 and led to the Dual Independent Map Encoding (DIME) scheme that featured data encoding for census areas and experimental computer-generated maps of census data [Schw73]. DIME recorded the topological relationships of streets, but did not use coordinate information in its earliest versions. During 1972, the Bureau developed (with Harvard Graphics Laboratory) the Urban Atlas Project, which digitized approximately 35,000 metropolitan census tracts in a cost-effective manner. This required software development, which was supported mathematically by Corbett's paper [Corb79] on the topological principles underlying cartography and GIS. From these beginnings emerged DIME, ARITHMICON (an improved system with analytical capabilities), and TIGER, a large, comprehensive civilian GIS [Rhin91].

USGS was concurrently involved in the creation of the Geographical Information Retrieval and Analysis System (GIRAS), developed specifically for handling information on land use and land cover. Input was manually-produced maps at a scale of 1:250,000 derived from aerial photography, which are currently updated automatically using Landsat imagery [Mitc77]. As graphics hardware became available the batch-mode GIRAS-1 evolved into an interactive version (GIRAS-2).

An advanced project undertaken at the state level was the Minnesota Land Management Information System (MLMIS), which transitioned from early developments in 1976 at the University of Minnesota's Center for Urban and Regional Analysis to the state level, where it operated on a "fee for service" basis. MLMIS was based on a digital land use map of the state that was prepared from aerial photography, and was unfortunately based on a coarse grid. Despite its shortcomings, MLMIS has supported several hundred successful GIS projects in its lifetime [Robi91].

As mentioned previously, after the termination of the original Ford Foundation grant, Harvard's Graphics Laboratory produced the vector-based GIS system, ODYSSEY, which was in operation by 1979. Unfortunately, the commercial vendor, ISSCO, who contracted with the Laboratory, withdrew after early advertisement of the software, incurring heavy debt that eventually caused termination of the Laboratory. A more fortunate circumstance occurred with the establishment of the Environmental Systems Research Institute (ESRI) by J. Dangermond in 1969 [Dang88], who eventually produced the ARC/INFO package that has become something of a standard in GIS. Intergraph, ComputerVision, and Synercom were also major players in the 1970s, and most of these approached GIS from the CAD/CAM area. However, ESRI's excellent record of high-quality products is exemplary, as discussed in the following paragraph.

ESRI began as a not-for-profit organization that developed the cell-based GRID package, which remained its main applications system until the introduction of ARC/INFO in 1982. A three dimensional version of GRID was called GRID TOPO, and in the late 1970s ESRI marketed a vector-based system called the Planning Information Overlay System (PIOS). ESRI has been the most successful vendor throughout the 1980s and 1990s, due to its ARC/INFO system, now in Revision 7 and available cross-platform for a wide variety of applications. ESRI's commitment to high-quality products and their ongoing service provision have led many state and local governments to adopt ARC/INFO as their standard.

In the 1990s, GIS has matured somewhat, with research directed away from the basic issues of map production and encoding, which have been solved after a fashion. The current issues of importance are more related to engineering concerns, such as map accuracy, precise co-registration of maps and imagery, and the vexing problem of combining datasets of diverse formats. In the latter case, much data has accumulated from earlier vector-format GIS, which is difficult to accurately integrate with current GIS systems. It is issues such as these, including the troublesome problem of errors in GIS data, that we will address in this course.

1.2. Overview of GIS System Applications.

As we have seen from the historical overview, GIS have been used widely in land management, census, forestry, resource management, and other environmental areas. What has not been discussed is the utility of GIS in two key areas, namely, military and commercial applications.

In the military sector of government, one needs to answer several questions, namely,

  1. What resources are deployed over a given area?
  2. What are the temporal and spatial relationships among various resources?
  3. What are the attributes of each resource that are of salient operational interest?

Additionally, battle planners must reason over these databases to extract information that describes, for example, shortest (and safest) routes and travel times across various types of friendly or hostile terrain, probable destinations of mobile threats, or likelihood of engagement under various force projection scenarios. Military GIS systems are often integrated with battle simulation systems (wargaming software) to yield high information content in support of key tactical decisions. This capability is expected to enhance the current trend toward speedup of warfare, which will further reinforce the development of military GIS applications.
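As an illustration of the route queries mentioned above, the following is a minimal sketch of a least-cost path search over a raster of terrain costs, using Dijkstra's algorithm on a 4-connected grid. The function name `least_cost_path`, the cost values, and the convention that `None` marks impassable cells are assumptions made for this example; they do not describe any particular military or commercial GIS.

```python
import heapq

def least_cost_path(cost, start, goal):
    """Dijkstra's algorithm on a 4-connected grid.

    cost[r][c] is the (hypothetical) cost of entering cell (r, c);
    None marks an impassable cell. Returns (total_cost, path), or
    (None, []) when the goal cannot be reached.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0}          # cheapest known cost to reach each cell
    prev = {}                  # predecessor of each cell on that route
    queue = [(0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if dist > best[cell]:
            continue           # stale queue entry, a cheaper route exists
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and cost[nr][nc] is not None:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    prev[(nr, nc)] = cell
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    if goal not in best:
        return None, []
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return best[goal], path[::-1]

# Hypothetical terrain: 1 = easy going, 9 = difficult or hazardous.
terrain = [
    [1, 1, 9, 1],
    [1, 9, 9, 1],
    [1, 1, 1, 1],
]
total, route = least_cost_path(terrain, (0, 0), (0, 3))
print(total, route)
```

In this toy grid the search detours around the high-cost cells rather than crossing them, which is the "shortest and safest" trade-off in miniature. Travel time or threat exposure could be encoded in the cell costs in the same way.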

In practice, DoD's mapping effort has grown manyfold in the 1990s, and currently emphasizes terrestrial mapping at a putative resolution of one meter (although interpolation schemes have been developed that enable much greater resolution). The data storage, manipulation, and analysis technology required to support global warfare using such large-scale systems challenges the best of computer science and technology. Indeed, this thrust serves as a motivator for much of the research discussed in this course.

The second application of GIS is in commercial forecasting, for example, in mass marketing of products and services. GIS are widely available for personal computers and workstations that have large marketing databases which can be searched in a content-based fashion using graphical user interface (GUI) directed queries based on displayed map information. This fusion of spatial analysis with commercial databases is currently being accelerated by the field of database mining, which provides useful information on temporal trends by exploiting archival data acquired over periods ranging from years to decades. Since much of the census data in the US is publicly available, and datasets about consumer behavior are frequently bought and sold, there are rich, extensive resources to support GIS-based commercial analyses.

As an example of different types of queries that may be issued to a GIS, consider the following query and analysis modes:

In order to better understand how GIS systems deal with such queries, we next examine GIS organization and datatypes.

1.3. Organization of GIS Systems and Data.

GIS have four primary components, namely,

  1. Data, which may be of type spatial, temporal, or attribute;

  2. Engines that perform various data storage, retrieval, analysis, reporting, and communication functions;

  3. Interfaces such as GUIs having widgets based on toolboxes such as X-Windows or MOTIF; and

  4. Hardware, including workstations and networks, disk and tape storage, digitizers, plotters, and communications devices.

Since we assume that students in this course are well acquainted with interfaces and hardware, we will concentrate first on data, then develop the theory, algorithms, and implementations for specialized GIS engines as the course progresses. Engines are further discussed in Section 1.4.

1.3.1. Data and Databases.

GIS data has traditionally been classified as raster (array-based) or vector (a line segment defined by its endpoints).

Raster data are array-based, and are able to represent a large range of computable objects, although at limited resolution. Rasters stored as uncompressed raw data can be extremely inefficient spacewise. Compression of rasters to meet feasible storage requirements increasingly involves error due to approximation in the compression and decompression transforms. This is a matter of some concern where land use data are involved in commercial practice (e.g., lot boundary surveys in congested areas) or when targeting precision is required in military applications.

Vector data have the advantages of storage efficiency and infinite resolution within the limits of accuracy and precision at which the data was acquired or can be computed or displayed. Vectors are appropriate for a wide range of spatial data, especially as found in maps, due to the fact that region boundaries tend to occupy only a small fraction of a map or image. Hence, the vector information can be extremely compact in relation to uncompressed raster data at equivalent resolution. Unfortunately, vectors imply a hard boundary model that does not match observations of gradations between region boundaries in the natural world. For example, consider the boundary between a meadow that is gradually submerged as it blends into an adjacent swamp. Where does the swamp start and the meadow end?
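The storage contrast between the two representations can be made concrete with a rough calculation. The sketch below assumes one byte per raster cell and two 8-byte coordinates per polygon vertex; the area size, cell size, and vertex count are hypothetical values chosen only for illustration.

```python
def raster_bytes(width_cells, height_cells, bytes_per_cell=1):
    """Uncompressed size of a raster layer (assumed one value per cell)."""
    return width_cells * height_cells * bytes_per_cell

def vector_bytes(num_vertices, bytes_per_coordinate=8):
    """Size of a polygon boundary stored as (x, y) vertex pairs."""
    return num_vertices * 2 * bytes_per_coordinate

# Hypothetical example: a 10 km x 10 km area at 1 m cell size,
# versus a region boundary described by 500 vertices.
raster_size = raster_bytes(10_000, 10_000)
vector_size = vector_bytes(500)
print(raster_size, vector_size, raster_size // vector_size)
```

The comparison ignores attribute tables, topology, and compression, so it should be read as an order-of-magnitude argument rather than a measurement.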

A serious drawback of the raster/vector dichotomy is the conversion between formats, which is a key topic in this course. Rasterization of vector data has associated quantization errors, as does vectorization of raster data. Furthermore, approximation errors in coarse-grained rasters can often lead to serious misinterpretations of data, with unfortunate results for GIS users who are not aware of these problems. In Section 2, we discuss the related problem of data format conversion and coregistration of datasets, with theory given for error analysis and profiling. The format conversion and merging of GIS datasets is a topic of keen research interest that is discussed in detail in this course.
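The quantization error introduced by rasterization can be illustrated with a simple experiment: rasterize a disc by testing cell centres, then compare the rasterized area with the true area. This is a minimal sketch under the assumption of cell-centre sampling; the function name `rasterized_area` and the radius and cell sizes are choices made for this example.

```python
import math

def rasterized_area(radius, cell_size):
    """Area of a disc centred at the origin after cell-centre rasterization."""
    n = int(math.ceil(radius / cell_size))
    count = 0
    for i in range(-n, n):
        for j in range(-n, n):
            # Coordinates of the centre of cell (i, j).
            x = (i + 0.5) * cell_size
            y = (j + 0.5) * cell_size
            if x * x + y * y <= radius * radius:
                count += 1
    return count * cell_size ** 2

radius = 100.0
true_area = math.pi * radius ** 2
for cell in (50.0, 10.0, 1.0):
    approx = rasterized_area(radius, cell)
    rel_err = abs(approx - true_area) / true_area
    print(f"cell={cell:5.1f}  area={approx:10.1f}  relative error={rel_err:.4f}")
```

Other rasterization rules (for example, marking any cell that the boundary touches) give different errors, which is one reason the conversion rule should be recorded alongside the data.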

1.3.2. Data Quality and Usefulness.

The value of a database (a category that includes GIS) derives primarily from the quality (e.g., accuracy, precision, scope, and depth) and usefulness of its data. The following issues pertain:

A database is a repository of data that should be logically unified but may be physically distributed. GIS databases are created and maintained using database management systems (DBMS), which have the following requirements for usefulness:

Associated requirements also exist for data capture and display. Since the focus of this course is on data manipulation, we simply assume that data carry a prespecified error and concentrate on error propagation. Worboys' excellent text [Worb95] and Maguire's comprehensive text [Macg91] both furnish ample discussion of the sources and propagation of errors in GIS datasets.

1.4. GIS Software and Data Analysis Tools.

Thus far, we have discussed GIS modes of operation, including data generation, storage, modification, and retrieval. The utility of graphical interfaces for manipulation of data in terms of map coordinates has also been mentioned. Together with map creation and printing, and the formatting and printing of data analysis reports, these can be viewed as the basic capabilities of GIS.

1.4.1. Software Capabilities.

In order to provide these capabilities to users, GIS systems usually include the following software modules:

1.4.2. Graphics Capabilities.

Graphics is a key component of GIS that has progressed from low-level routines that were device- and manufacturer-dependent, to general purpose device-independent software packages. Modern graphics hardware and software can be used by many different software packages (e.g., spreadsheets, word processors, drawing programs, computer games, multimedia software, and GIS). Until the 1990s, most graphic displays were two-dimensional, but 3-D software and rendering devices are now widely available.

Graphics software typically provides the capability of performing many operations (e.g., rotation, translation, scaling, reflection) on a collection of graphics primitives (e.g., point, line segment, polyline, B-spline, rectangle, circle, and ellipse). User-friendly features such as differing pen widths, patterns, colors, and shape modification extend previous graphical interfaces to be quite paper-like. This conformance to evolved manual practice is a win-win situation for the user and the software manufacturer. That is, the user is able to work within a paradigm that has evolved to meet human needs and limitations, which is familiar to him. The software manufacturer is able to provide a convenient, efficient user interface with a "look and feel" that is familiar to the user. Thus, the user is happy with the software, which can increase productivity, and the manufacturer has a loyal customer base, which could increase profitability.
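The geometric operations listed above (rotation, translation, scaling, reflection) are commonly expressed as 3x3 matrices acting on homogeneous coordinates, so that a sequence of operations collapses into one matrix product. The following is a minimal sketch in plain Python; the function names are hypothetical and do not correspond to GKS, PHIGS, or any other graphics package.

```python
import math

def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]

def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]

def rotation(theta):
    """Counter-clockwise rotation about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def reflection_about_x_axis():
    return scaling(1, -1)

def apply(matrix, points):
    """Apply a homogeneous transform to a list of (x, y) points."""
    out = []
    for x, y in points:
        out.append((matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
                    matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]))
    return out

polyline = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
# Rotate by 90 degrees first, then translate by (2, 0).
# The rightmost matrix in the product is applied first.
transform = matmul(translation(2, 0), rotation(math.pi / 2))
moved = apply(transform, polyline)
print(moved)
```

Because the operations do not commute, the order of the matrix product matters: translating before rotating would place the polyline elsewhere.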

Common standards for graphics packages are the Graphics Kernel System (GKS), established in 1985 and extended to three dimensions in 1988, and the Programmer's Hierarchical Interactive Graphics System (PHIGS, 1988). PHIGS supported full three-dimensional capabilities from the outset, and allows the definition of hierarchies of graphics primitives. Furthermore, PHIGS provides the convenient feature of defining 3D primitives in device-independent coordinate space.

1.4.3. Hardware Customarily Employed with GIS.

GIS, having evolved in part from the CAD/CAM field, in part from cartography, and being increasingly involved with image processing, has a diverse base of hardware requirements and resources. A brief list of useful hardware devices follows:

We next consider on-line GIS resources that provide lists of software implementations, as well as available GIS data.

1.4.4. GIS Resources on the World-Wide Web.

Here follows a list of pointers to Web pages that contain information about GIS software and mapping capabilities. This list will be updated as the course progresses.

The United States Geological Survey's GIS Page has numerous links to access various types of data. Here follow several key links for cartographic and map data:

1.5. Map Projections - Theory and Practice.

Many of us have scanned through atlases and have seen a given map displayed in several different ways. For example, consider the polar projection, which looks like a bird's-eye view of the Earth. The Mercator projection allows most (but not all) of the globe to be mapped to a cylinder that can be "unrolled" to yield a flat map. Similarly, there are map projections that maintain the continental areas approximately in the same proportion as they are when one rotates a globe of the Earth and inspects the continents visually. An example of this projection is the equal-area projection.

In this section, we discuss map projections in terms of underlying concept, theory, and practice. Much of this discussion is based on Maling's excellent summary [Mali91] in [Macg91], to which the reader is referred for further detail.

1.5.1. Basic Concepts.

In GIS, map projections transform spatial data to facilitate coregistration with other spatial data. The results of analyses on such data can be output as cartographic documents called maps. The primary sources for GIS spatial data are the large databases of paper maps maintained by government entities. These maps, when digitized, are converted into machine readable form in one of two coordinate systems:

  1. Terrestrial Coordinates in three dimensions, denoted in the Cartesian coordinate system by (x,y,z); and

  2. Plane Coordinates in two dimensions (latitude and longitude), denoted by $(\phi, \lambda)$.

Plane coordinates may be rectangular or polar coordinates, a raster grid, or a map projection. Prior to the emergence of high-capacity storage devices, plane coordinates were preferred to terrestrial coordinates, which are now being more extensively adopted.
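One common way to relate latitude and longitude to three-dimensional Cartesian coordinates is sketched below. It assumes a spherical Earth with a mean radius of 6371 km, which is a simplification (production systems normally use an ellipsoid and a named datum). The function names and the sample point are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius of a spherical Earth

def geographic_to_cartesian(lat_deg, lon_deg, radius=EARTH_RADIUS_KM):
    """Convert latitude and longitude (degrees) to (x, y, z) on a sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = radius * math.cos(lat) * math.cos(lon)
    y = radius * math.cos(lat) * math.sin(lon)
    z = radius * math.sin(lat)
    return x, y, z

def cartesian_to_geographic(x, y, z):
    """Inverse conversion: recover latitude and longitude in degrees."""
    radius = math.sqrt(x * x + y * y + z * z)
    lat = math.degrees(math.asin(z / radius))
    lon = math.degrees(math.atan2(y, x))
    return lat, lon

# Hypothetical sample point.
xyz = geographic_to_cartesian(29.65, -82.32)
print(xyz, cartesian_to_geographic(*xyz))
```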

The key reason for using map projections is the transformation of digitized map data into a uniform system of spatial referencing within a given GIS. This obviates preprocessing when layers of a GIS are compared, analyzed, or rendered, and is especially important for combining vector and raster data.

In this section, we briefly review concepts, theory, and practice of map projections, which are fundamental to GIS map rendering. This is not a course in map projection, but is designed to provide background so that students can understand subsequent terminology. A more detailed notational background is given in the Overview of Image Algebra.

1.5.2. Fundamental Theory.

Definition. A geographic map $a$ is a mathematical mapping from a spatial domain $X$ (customarily a subset of Euclidean $n$-space) to a value set $F$. We write $a : X \to F$, which can be expressed more concisely as $a \in F^X$.

Definition. If $a \in F^X$, then $X = \mathrm{domain}(a)$ and $F = \mathrm{range}(a)$.

Observation. In GIS practice, $X \subset \mathbb{R}^3$, the set of three-dimensional terrestrial coordinates.

Definition. The graph of $a \in F^X$ is denoted by:

$$G(a) = \{(x, a(x)) : a(x) \in F,\ x \in X\}.$$

Notice that this formalizes the association of a given map coordinate with its attributes in F.

Observation. We also write $a \equiv G(a)$, where $\equiv$ denotes equivalence. Note that $p_1(G(a)) = X = \mathrm{domain}(a)$ and $p_2(G(a)) = F = \mathrm{range}(a)$, where $p_k$ denotes projection onto the $k$th coordinate.
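The definitions above can be mirrored in code by storing a map as a dictionary from coordinates to values. This is only an illustrative sketch: the tiny domain, the land-cover labels, and the helper names `graph`, `p1`, and `p2` are assumptions, and the value set F is taken here to be exactly the set of values the map attains.

```python
# A map a : X -> F stored as a dictionary from coordinates to attributes.
a = {
    (0, 0): "water",
    (0, 1): "meadow",
    (1, 0): "swamp",
    (1, 1): "meadow",
}

def graph(mapping):
    """G(a): the set of (coordinate, value) pairs."""
    return {(x, value) for x, value in mapping.items()}

def p1(pairs):
    """Projection onto the first coordinate: the spatial domain X."""
    return {x for x, _ in pairs}

def p2(pairs):
    """Projection onto the second coordinate: the values attained."""
    return {value for _, value in pairs}

g = graph(a)
print(sorted(p1(g)))
print(sorted(p2(g)))
```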

Map projections are based on spatial transformations, which manipulate the map domain.

Definition. Let $a$ denote a map in $F^X$, and define a spatial transformation $f : Y \to X$. For purposes of simplicity, assume that $f$ is a one-to-one and onto mapping.

The composition of $a$ with $f$ is denoted by:

$$b = a \circ f \equiv \{(y, b(y)) : b(y) = a(f(y)),\ y \in Y\}.$$
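A small sketch of the composition of a with f, again using a dictionary for the map: each point y of the new domain Y looks up its value at f(y) in the original map. The transposition used for f and the helper name `compose` are hypothetical choices for this example.

```python
# Original map a on X, stored as a dictionary.
a = {
    (0, 0): "water",
    (0, 1): "meadow",
    (1, 0): "swamp",
    (1, 1): "meadow",
}

def f(y):
    """A one-to-one, onto spatial transformation Y -> X (here: swap axes)."""
    row, col = y
    return (col, row)

def compose(mapping, transform, new_domain):
    """b = a o f: b(y) = a(f(y)) for every y in Y."""
    return {y: mapping[transform(y)] for y in new_domain}

Y = [(0, 0), (0, 1), (1, 0), (1, 1)]
b = compose(a, f, Y)
print(b)
```

Note that the values are never altered; only the coordinates at which they are found change, which is the sense in which a spatial transformation manipulates the map domain.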

Observation. Each map projection can be defined as the spatial transformation of a given map domain. Prior to the development of GIS, such transformations were well understood (since they were the basis for cartography), but inverse transformations were often not well defined. GIS requires that a map transformation have an inverse, so one can create, query, or modify GIS data in one map projection (which may be more convenient for that type of data), then revert to the previous map projection for further manipulation.
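The requirement that a projection be invertible can be illustrated with the spherical form of the Mercator projection, for which both directions have closed-form expressions. The sketch below assumes a spherical Earth of radius 6371 km and hypothetical function names; it is not the ellipsoidal formulation used in production mapping.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumed spherical Earth radius

def mercator_forward(lat_deg, lon_deg, radius=EARTH_RADIUS_M):
    """Spherical Mercator: latitude/longitude in degrees -> plane (x, y)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y

def mercator_inverse(x, y, radius=EARTH_RADIUS_M):
    """Inverse spherical Mercator: plane (x, y) -> latitude/longitude."""
    lon = x / radius
    lat = 2 * math.atan(math.exp(y / radius)) - math.pi / 2
    return math.degrees(lat), math.degrees(lon)

# Round trip for a hypothetical point: project, then revert.
x, y = mercator_forward(51.5, -0.12)
lat, lon = mercator_inverse(x, y)
print((x, y), (lat, lon))
```

The forward formula diverges as latitude approaches the poles, which is why the Mercator projection covers most, but not all, of the globe.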

1.5.3. Map Projection Practice.

The simplest method for comparing map data is the grid cell, which comprises a close pattern of spherical quadrilateral cells that are derived by subdividing the spatial domain of a map into one degree or one-half degree units. Grid cells are not rectangular, since their sides are formed by curved lines that describe two meridians (lines of longitude) and two parallels (lines of latitude). The meridians converge to the poles. Therefore, grid cells are similar to a trapezoid.
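The convergence of the meridians can be quantified by computing the area and edge lengths of a one-degree cell at several latitudes. The sketch below assumes a spherical Earth of radius 6371 km; on a sphere, the area between two parallels and two meridians is R^2 times the longitude difference times the difference of the sines of the bounding latitudes. The function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed spherical Earth radius

def grid_cell_area_km2(lat_south_deg, cell_deg=1.0, radius=EARTH_RADIUS_KM):
    """Area of a cell bounded by two parallels and two meridians."""
    lat_s = math.radians(lat_south_deg)
    lat_n = math.radians(lat_south_deg + cell_deg)
    dlon = math.radians(cell_deg)
    return radius ** 2 * dlon * (math.sin(lat_n) - math.sin(lat_s))

def edge_lengths_km(lat_south_deg, cell_deg=1.0, radius=EARTH_RADIUS_KM):
    """East-west lengths of the southern and northern edges of the cell."""
    dlon = math.radians(cell_deg)
    south = radius * math.cos(math.radians(lat_south_deg)) * dlon
    north = radius * math.cos(math.radians(lat_south_deg + cell_deg)) * dlon
    return south, north

for lat in (0, 30, 60, 89):
    south, north = edge_lengths_km(lat)
    print(f"lat {lat:2d}: area={grid_cell_area_km2(lat):9.1f} km^2  "
          f"south edge={south:6.1f} km  north edge={north:6.1f} km")
```

The unequal southern and northern edge lengths are what give each cell its trapezoid-like shape, and the shrinking area toward the poles shows why cells of equal angular size are not equal in ground area.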

GIS generalizes the concept of grid cells by using map projections to modify spatial data, for the following two reasons:

  1. The database for a large area (e.g., nation or continent) will be large and does not necessarily lend itself to a contiguous spatial model. Thus, it is reasonable to segment the map into grid cells that can be further subdivided using techniques discussed below.

  2. In the case of large areas, convenient approximations (e.g., a flat, plane Earth model with rectangular grid cells) yield infeasibly large errors, which are manifested as spatial distortions. Thus, a nonplanar map model must be employed.

Conventions in GIS Map Projection Practice.

The GIS framework most likely to be employed in large-area surveys is the Universal Transverse Mercator (UTM) or the Lambert Conformal Conical (LCC) projection. However, since the coverage of GIS increasingly extends over multiple governmental entities, the different reference points and projections that are local standards must be reconciled. Although the need for a common map projection has been discussed extensively in the literature [Brig89,Moun91], little progress has been made in this area. For example, in the design of an environmental GIS for Antarctica, a collection of LCC and stereographic projections was used in the role of raster grids [Siev89]. This rather involved framework formed the mathematical basis for the system.

Another consideration is limiting resolution, i.e., the size of the smallest object that can be shown legibly on a map. This is usually assumed to be approximately 0.15 mm (approximately 0.006 inch) [Mali91] and is often called the zero dimension (a term from earlier cartographic practice). If a particular computation affects spatial location by less than the zero dimension, then a simpler (e.g., approximate) computation may be used. In this sense, the zero dimension permits one to make assumptions about the accuracy of spatial data and its utility in various scenarios. This will be discussed in later sections of these notes and in the class.
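The zero dimension translates into a ground distance once a map scale is fixed. The following sketch assumes the 0.15 mm figure quoted above and a map scale of 1:N; the function names and the example scales are hypothetical.

```python
ZERO_DIMENSION_MM = 0.15  # limiting resolution quoted in the text


def ground_resolution_m(scale_denominator, zero_dimension_mm=ZERO_DIMENSION_MM):
    """Ground distance (metres) represented by the zero dimension at 1:scale."""
    return zero_dimension_mm * scale_denominator / 1000.0


def is_negligible(displacement_m, scale_denominator):
    """True if a positional error falls below the zero dimension at this scale."""
    return displacement_m < ground_resolution_m(scale_denominator)
```

For instance, at a scale of 1:50,000 the zero dimension corresponds to roughly 7.5 m on the ground, so a computation that shifts a point by 5 m could be approximated at that scale but not at 1:10,000.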

An additional issue in GIS is that the complex shape of the geoid (Earth) can be approximated by a spheroid. Unfortunately, there is no standard spheroid, and hence map projections exhibit a certain amount of error that is a function of the accuracy with which their reference points are mapped to a given spheroidal model. A further problem is that some paper maps were made with particular spheroidal or ellipsoidal models in mind. Given these more complex (or different) models, errors accrue when digitizing paper maps to fit simpler models (e.g., planar or spherical models). For local surveys, planar models may be apropos. (Aside: This leads to the interesting implementational concept of the surface of a flatbed plotter as a datum plane for each map produced on that plotter.)

Examples of Map Projections.

The Cartesian coordinates $(x, y)$ of a point on a map are related to latitude $\varphi$ and longitude $\lambda$ via the projections $x = f_x(\varphi, \lambda)$ and $y = f_y(\varphi, \lambda)$, which can be computed via three methods:

  1. Analytical transformation, whereby points are located and plotted from their geographical coordinates, in an approximation to classical cartographic methods;

  2. Direct or grid-on-grid transformation, whereby a map domain referenced to a grid is warped using, for example, a linear or affine transformation; and

  3. Polynomial transformation, which provides a mechanism for allowing control points within a grid cell to constrain the projective geometry (which need not be linear).
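As a hedged illustration of the grid-on-grid method, the sketch below fits an affine transformation (a first-order polynomial) to exactly three control points by solving two 3-by-3 linear systems with Cramer's rule. All function names and the control points in the usage note are hypothetical. A production system would more likely use many control points with a least-squares fit, or higher-order polynomials.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(m, rhs):
    """Solve the 3x3 system m * c = rhs by Cramer's rule."""
    d = det3(m)
    if d == 0:
        raise ValueError("control points are collinear")
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return coeffs


def fit_affine(src, dst):
    """Fit x' = c0 + c1*x + c2*y (and likewise y') from three control points."""
    m = [[1.0, x, y] for x, y in src]
    cx = solve3(m, [p[0] for p in dst])
    cy = solve3(m, [p[1] for p in dst])
    return cx, cy


def apply_affine(coeffs, point):
    """Warp one point of the source grid into the target grid."""
    cx, cy = coeffs
    x, y = point
    return (cx[0] + cx[1] * x + cx[2] * y, cy[0] + cy[1] * x + cy[2] * y)
```

For example, mapping the control points (0,0), (1,0), (0,1) to (10,20), (12,20), (10,23) yields a transformation that scales by 2 in x and by 3 in y and then translates, so that the point (1,1) is warped to (12,23).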

The conversion from geographical to plane coordinates, called the forward transformation, is the normal practice in cartography. The inverse transformation, which yields geographical coordinates from map coordinates, is a more recent development, driven by the need for spatial format conversion (e.g., transformation between different map projections) in GIS. The vast majority of map projection texts only provide theory for forward transformations. In this review, we will exemplify both forward and inverse transformation using the Mercator projection.
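A minimal sketch of the forward and inverse Mercator transformations follows. It assumes a spherical Earth model with radius 6,371,000 m and a central meridian `lon0_deg`; both are assumptions of this sketch rather than values given in these notes, and the function names are hypothetical. The ellipsoidal form of the projection is more involved and is not shown.

```python
import math

R = 6371000.0  # assumed spherical Earth radius, in metres


def mercator_forward(lat_deg, lon_deg, lon0_deg=0.0, radius=R):
    """Forward transformation: (latitude, longitude) in degrees -> (x, y) in metres."""
    phi = math.radians(lat_deg)
    lam = math.radians(lon_deg - lon0_deg)
    x = radius * lam
    y = radius * math.log(math.tan(math.pi / 4.0 + phi / 2.0))
    return x, y


def mercator_inverse(x, y, lon0_deg=0.0, radius=R):
    """Inverse transformation: (x, y) in metres -> (latitude, longitude) in degrees."""
    lat = math.degrees(2.0 * math.atan(math.exp(y / radius)) - math.pi / 2.0)
    lon = math.degrees(x / radius) + lon0_deg
    return lat, lon
```

Applying `mercator_inverse` to the output of `mercator_forward` should recover the original latitude and longitude to within floating-point error, which is the round-trip property that GIS format conversion relies on. The forward formula diverges at the poles, so latitudes of exactly plus or minus 90 degrees must be excluded.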

1.7. Satellite and Airborne Imagery Libraries.

Many USGS data products are available in the Spatial Data Transfer Standard (SDTS) format, a new Federal standard to ensure data compatibility, or in the older Digital Line Graph (DLG) format. Others are in more specialized formats, which are described in their metadata. Whenever you retrieve a data set, it is important to retrieve the metadata as well, since the metadata provides important information for using the data set. The Internet browser software used to view this page is likely able to print the metadata files or save them locally.

Several sites where map and imagery data may be obtained follow:



This concludes our introductory discussion of GIS issues.
We next consider computational problems related to GIS features.