Daniel Elroi
GIS Manager
Knight Piésold & Co.
1600 Stout Street, Suite 800
Denver, Colorado 80202
ABSTRACT
A successful GIS implementation not only plans ahead but also keeps a record of its own past. At some point in their future, most GIS installations face the challenge of showing how data evolved to their present state, or how a decision was made using the GIS. Such justification may be required for economic reasons (e.g. to justify a recommendation), or for legal reasons (e.g. to defend a decision). Therefore, documentation should be a key component of every GIS installation.
This paper presents two case studies illustrating proven methods for documenting GIS databases. The first comes from the mining industry, where documentation is used to lay the foundation for spatial databases that often will be used for many decades. The second comes from a city planning setting, where automated documentation helps track zoning and comprehensive plan changes.
The need for documentation also extends to computer applications constructed around GIS databases. Documentation helps to prevent rapid data and application obsolescence, which is ordinarily brought on by changes in personnel, the introduction of newer technologies, and even the fading of individuals' memories. This paper discusses internal program documentation, user manuals, and programmer guides, and is illustrated with examples.
INTRODUCTION
Regardless of how new or well established a GIS installation is, the need for careful documentation never diminishes. To the contrary, documentation can be likened to an insurance policy whose value can only be truly recognized after a misfortune occurs. And like an insurance policy, documentation can be expensive and appear superfluous, but is useless if it is permitted to lapse and become obsolete. Documentation in the GIS arena permeates all aspects of the lifecycle of the operation, most especially database design, data conversion, and the creation of application programs. Documentation may appear to be a superfluous academic exercise at first. This image is easily dispelled, however, the first time documentation is shown to save a GIS installation money, time, prestige, or support. It is the purpose of this paper to show both why and how documentation should be applied to GIS projects.
THE NEED FOR DOCUMENTATION
Sometime, somewhere, somebody is going to need to know the precise history and chronology of some aspect of the GIS, and if that need cannot be satisfied, the results can have severe ramifications. Here are a few examples of why a database needs to be documented:
Years after photogrammetric work is performed to provide base data for a GIS, a user needs to know the precision and accuracy of the original mapping, in order to predict the level of accuracy of computations performed with these data, such as areal and volumetric computations.
Mapping needs to be expanded beyond the initial mapped area, and the specifications used for the original mapping need to be known so that the two phases of mapping produce compatible results.
Following political or legal pressure, or merely as a result of the expiration of regulations, zoning designations governing real property need to be rolled back to their original status.
Following a failure in design, a consultant needs to justify decisions made based on a GIS database, in order to prove that the premises of the design were sound, and that fault lies elsewhere.
Due to a lawsuit, a company needs to prove that it had taken all reasonable precautions in designing and operating its facility.
In addition to documenting the history of the data contained in a GIS database, there is also a need to document the processes applied to GIS data in obtaining analytical results. This is important in the following scenarios:
A siting study progresses all the way to the design phase before a flaw is discovered and the site selection has to start over.
Updated data need to be incorporated into an existing analysis, such as the selection of a migration corridor for an endangered species.
An analytical process needs to be documented and explained in court, in defense of a lawsuit.
An analysis performed in one project needs to be repeated in another, similar project.
Beyond knowing how data evolved over time, and what analytical processes were applied to them, it is also imperative to document the way in which these processes were applied to the data by means of application programs. Here are examples of situations in which this becomes important:
The government regulations embodied in an application program change, requiring a modification in the program, which must be performed by a programmer other than the original programmer, or even by the original programmer, but many years after the original program was written.
New employees are hired who need to learn to use programs which were installed a long time in the past.
A project needs to be resurrected with updated criteria after being archived for a long time, or a project needs to be duplicated with brand new criteria and data.
For these reasons, and many more, it is crucial for a GIS manager to budget for, and insist on, meticulous database, process, and programming documentation, as a form of insurance against such eventualities. Likewise, organizations seeking outside help with their GIS need to figure the cost of documentation into their budgets.
DATABASE DOCUMENTATION
In GIS, database documentation can be organized into feature-level, entity-level, or theme-level documentation, depending on the nature and needs of the GIS installation.
Feature-level documentation
Feature-level documentation refers to the documentation of individual geographic elements, i.e. points, lines, and polygons. This is an intensive form of documentation, since it attaches specific historical information to every graphic element in the database, which consumes storage space as well as operator and processing time. This type of documentation is often used in databases where, for legal or commercial reasons, it is important to track the genesis and metamorphosis of each element. A classic example is the street centerline data maintained by Thomas Bros. of California, who have pioneered the use of GIS in the production of street atlases. Each street segment is individually documented and is never discarded, only moved to another map layer. This type of documentation should permit users to roll back individual features, and possibly even to roll them forward again in the event that a feature was erroneously rolled back.
A popular GIS database structure that incorporates feature-level documentation principles is the TIGER file structure. The very first item in the TIGER file records the version number of each line segment, and this is complemented by the assignment of an individual record number (RECNUM) to each line segment throughout the country. Although the version item is provided for the use of the Bureau of the Census, users of the data can apply their own codes to it to explain the changes made to each line segment.
Feature-level documentation is recommended only in situations where the elements, or features, within layers are usually updated individually, and where these changes vary in character from feature to feature. In other words, this level of documentation is appropriate when changes cannot be easily described, and where every change is unique.
Entity-level documentation
Entity-level documentation pertains to GIS data structures that are object-oriented in some fashion, and that treat groups of features as logical entities. This type of structure is not commonly found in the popular GIS packages, but where it is found, it is usually associated with utility databases, such as those of electric and telephone utilities. In such databases, points, lines, and polygons make up such entities as electricity poles, transformers, and sub-stations. Historical information sometimes needs to be applied to individual features, such as when individual conductors within a sub-station are replaced, and sometimes to whole entities, such as when a light pole blows down in a storm, along with all the conductors and switches associated with it. This type of documentation is especially worthwhile in very dynamic, facilities-oriented databases. In these situations it is sufficient to describe what has been done to a logical grouping of features, and not to each graphic element individually.
Theme-level documentation
Theme-level documentation refers to the documentation of a whole theme, layer, map, or coverage. This is a more economical method of documentation, since only one record needs to be maintained for each theme, and its various versions. The type of information that might be kept is the source of the theme, its accuracy and precision, method of conversion, and any alterations done to the data. This type of documentation assumes that changes to the theme are either few, or are fairly global, so that the change is easily described in the documentation.
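A theme-level record of the kind described above can be sketched as a small data structure. The following Python sketch is only an illustration: the class name ThemeRecord, its field names, and the sample values are assumptions and do not come from any particular GIS package.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ThemeRecord:
    """One documentation record for a whole theme (layer or coverage)."""
    theme: str
    source: str              # where the data came from
    accuracy: str            # stated accuracy of the source mapping
    precision: str           # resolution at which coordinates were captured
    conversion_method: str   # e.g. digitizing, scanning, import
    alterations: list = field(default_factory=list)  # (date, note) pairs

    def log_alteration(self, when: date, note: str) -> None:
        """Append a dated note describing a change made to the theme."""
        self.alterations.append((when, note))

    def summary(self) -> str:
        """Return a short printable summary, e.g. for a paper catalog."""
        lines = [
            f"Theme: {self.theme}",
            f"Source: {self.source}",
            f"Accuracy: {self.accuracy}",
            f"Precision: {self.precision}",
            f"Conversion: {self.conversion_method}",
        ]
        for when, note in self.alterations:
            lines.append(f"{when.isoformat()}  {note}")
        return "\n".join(lines)


# Hypothetical sample values, for illustration only.
record = ThemeRecord(
    theme="topography",
    source="hypothetical photogrammetric mapping",
    accuracy="illustrative value only",
    precision="illustrative value only",
    conversion_method="manual digitizing",
)
record.log_alteration(date(1995, 3, 1), "Extended coverage to the north")
print(record.summary())
```

Because only one such record exists per theme, the cost of maintaining it stays small as long as changes are few or global.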
METHODS OF DATABASE DOCUMENTATION
Database documentation can range in complexity from simple handwritten notes to a fully automated, on-line database tracking system. The level of sophistication depends on the complexity of the database, the number of changes, the perceived benefit of documenting the database, the capability of the GIS, and the technical adeptness of its users.
Paper trail
Keeping a paper trail is the most basic method of database documentation, and it remains crucial in some form as a complement to even the most sophisticated automated methods. It is highly informative to keep a well-organized book or binder containing originals or copies of the source documents used to create and update the database. The documents can be referenced to the database itself by also including a simple representation of the documented database theme in the documentation book. It is also helpful to include a standard form which an operator or the system administrator can fill out whenever an activity takes place. This form of documentation, if well arranged, can serve as a historical record of the database. It is best suited for theme-level documentation, simply because it would be very tedious to document changes to individual features, or even entities, by hand.
Computerized documentation
The form-filling part of the documentation method described above can be replaced by an on-line method, using either a word processor or some form of database manager. Ideally, this form of documentation should be linked or attached to the GIS data itself, so that as the data evolve and change, the documentation automatically accompanies them. An example of this might be a text file in which the user can log changes, or a database file which is automatically duplicated when an update copy is made of a database layer. This form of documentation is also better suited for theme-level, or possibly entity-level, documentation because, like the manual method, it depends on human patience and attention to detail. This method benefits from keeping copies of source materials just as much as the manual paper trail, and for the same reasons.
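The idea of a log file that travels with its layer can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the layer is represented by a single stand-in file, and the ".doc.txt" naming convention and the function names are hypothetical, not a feature of any GIS package.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path

DOC_SUFFIX = ".doc.txt"  # hypothetical naming convention for the log file


def doc_path(layer: Path) -> Path:
    """Return the path of the documentation file that sits beside a layer."""
    return layer.with_name(layer.name + DOC_SUFFIX)


def log_change(layer: Path, operator: str, note: str) -> None:
    """Append one timestamped entry to the layer's documentation file."""
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M")
    with doc_path(layer).open("a", encoding="utf-8") as fh:
        fh.write(f"{stamp}\t{operator}\t{note}\n")


def copy_layer(src: Path, dst: Path, operator: str) -> None:
    """Copy a layer file and carry its documentation file along with it."""
    shutil.copy2(src, dst)
    if doc_path(src).exists():
        shutil.copy2(doc_path(src), doc_path(dst))
    log_change(dst, operator, f"Copied from {src.name}")


# Demonstration in a temporary directory with a stand-in layer file.
workdir = Path(tempfile.mkdtemp())
original = workdir / "zoning.dat"
original.write_text("stand-in for layer data\n", encoding="utf-8")
log_change(original, "operator1", "Initial conversion from source maps")

update = workdir / "zoning_update.dat"
copy_layer(original, update, "operator1")

history = doc_path(update).read_text(encoding="utf-8").splitlines()
print(len(history), "entries in the copied log")
```

The update copy inherits the full history of the original and adds one entry of its own, so the record cannot be separated from the data by an ordinary copy operation.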
Automated documentation
A sophisticated GIS can implement an automated database tracking and documentation method, documenting changes all the way from theme-level to feature-level. Such a method relies on a combination of intelligent user interaction and the ability of the GIS to record the activities performed on the database by the user. For example, the computer can automatically assign a flag to an individual feature, such as a line, or change its ID, whenever the feature is moved or its attributes are changed. Another variation of this method places each deleted feature, and the previous version of each altered feature, in a duplicate layer, thereby documenting changes visually.
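Both variations described above, flagging a feature whenever it changes and keeping superseded versions in a duplicate layer, can be sketched together. The Python class below is a hypothetical, in-memory illustration only; the name TrackedLayer and its methods are assumptions, and a real GIS would apply the same idea inside its own editing tools.

```python
from copy import deepcopy


class TrackedLayer:
    """A layer that archives the prior version of every changed feature."""

    def __init__(self):
        self.features = {}   # feature id -> {"version", "geometry", "attrs"}
        self.history = []    # the "duplicate layer" of superseded versions

    def add(self, fid, geometry, attrs):
        """Add a new feature at version 1."""
        self.features[fid] = {"version": 1, "geometry": geometry,
                              "attrs": dict(attrs)}

    def _archive(self, fid, action):
        """Copy the current state into the history layer before a change."""
        old = deepcopy(self.features[fid])
        old.update({"id": fid, "action": action})
        self.history.append(old)

    def alter(self, fid, geometry=None, **attrs):
        """Move or re-attribute a feature and bump its version flag."""
        self._archive(fid, "altered")
        feature = self.features[fid]
        if geometry is not None:
            feature["geometry"] = geometry
        feature["attrs"].update(attrs)
        feature["version"] += 1

    def delete(self, fid):
        """Remove a feature from the live layer but keep it in history."""
        self._archive(fid, "deleted")
        del self.features[fid]

    def roll_back(self, fid):
        """Restore the most recently archived version of a feature."""
        for i in range(len(self.history) - 1, -1, -1):
            if self.history[i]["id"] == fid:
                old = self.history.pop(i)
                self.features[fid] = {key: old[key] for key in
                                      ("version", "geometry", "attrs")}
                return
        raise KeyError(f"no archived version of feature {fid!r}")


layer = TrackedLayer()
layer.add("L1", [(0, 0), (10, 0)], {"class": "road"})
layer.alter("L1", geometry=[(0, 0), (10, 5)])   # archived as version 1
layer.delete("L1")                              # archived as version 2
layer.roll_back("L1")                           # undoes the deletion
print(layer.features["L1"]["version"], len(layer.history))
```

The operator never fills in a form here; the record is a by-product of editing, which is what makes feature-level tracking affordable.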
COMPUTERIZED THEME-LEVEL DOCUMENTATION: KNIGHT PIÉSOLD & CO., DENVER, CO
The document file is used in two ways. First, it can be listed on the screen at any time while work is being performed on the coverage, and it can be consulted before an analysis is performed to determine whether the coverage contains information of the necessary type and quality. Second, the documentation file is printed out and filed in a Spatial Database Document catalog. This ring-bound book contains summary plots depicting the essential elements of each coverage, a printout of the document file, a full description of the coverage, and all supporting source materials used to create or alter the coverage. Users of the database, other than those immediately involved with its administration, receive a shortened version of this catalog. This ensures that all users are aware of which data are available in the database, of the latest status of the database, and of the availability of the latest GIS resources.
HYBRID AUTOMATED FEATURE-LEVEL DOCUMENTATION: CITY OF LOS ANGELES, CA
The City Planning Department of the municipality of Los Angeles has been using the ARC/INFO GIS software to maintain its zoning and General Plan maps for many years. Gradually, a set of macros has been developed to document transactions which are of a simple entity-level nature. These macros automatically document graphic or tabular changes, or both. Because the macros still rely on human interaction, they might be termed a hybrid of computerized and automated methodologies. Typical transactions consist of zone designation changes, polygon splits, graphic modifications, and repairs resulting from a quality control process. Because these changes are legal changes, it is important for the City to document them carefully, in the event that they have to be legally defended in the future. Furthermore, if any of these changes turn out to be based on erroneous information, documentation helps in rolling them back.
The type of documentation used at the City of Los Angeles might be termed simple entity-level documentation because any of these changes, whether affecting lines or polygons, are documented as they relate to polygons. For example, if the boundary line between a commercially-zoned area and a residentially-zoned area is moved, it is the two affected polygons which are documented, not the line.
Each time a change is made, and several dozen may be made on any workday at the City of Los Angeles, the labels associated with all affected polygons are assigned a new identification number. A short form menu prompts the operator to indicate the nature of the change and the source used to make it. The system then records the date and the operator's name, and produces an 8.5 x 11 inch plot depicting the change. This plot shows the portion of the graphical database immediately surrounding the affected polygon, and lists the polygon's attributes before and after the change. The other information recorded in the form menu is also printed on this plot. These plots are placed in binders, accompanied by copies of the source materials which justified the changes. The macro system also prints out a weekly summary of all the changes in tabular form, using the identification number as the key item. The identification number is designed in such a way that it helps future users easily determine when a change occurred, and therefore helps them locate the appropriate documentation, which is arranged chronologically.
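The paper does not describe the actual numbering scheme used by the City, so the following Python sketch assumes a purely hypothetical one: the date written as YYYYMMDD followed by a three-digit daily sequence number. It shows how such an identification number can serve both as the key of a weekly summary and as a pointer into chronologically arranged binders. All names and sample values are illustrative.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ChangeRecord:
    change_id: int
    operator: str
    nature: str       # e.g. "zone change", "polygon split"
    source: str       # document that justified the change
    before: dict      # polygon attributes before the change
    after: dict       # polygon attributes after the change


def make_change_id(when: date, sequence: int) -> int:
    """Build an ID as YYYYMMDD followed by a three-digit daily sequence."""
    if not 1 <= sequence <= 999:
        raise ValueError("sequence must be between 1 and 999")
    return int(when.strftime("%Y%m%d")) * 1000 + sequence


def change_date(change_id: int) -> date:
    """Recover the date of a change from its ID."""
    ymd = change_id // 1000
    return date(ymd // 10000, ymd // 100 % 100, ymd % 100)


def weekly_summary(records):
    """Return one tab-separated line per change, ordered by ID."""
    rows = sorted(records, key=lambda r: r.change_id)
    return [f"{r.change_id}\t{change_date(r.change_id)}\t{r.operator}\t"
            f"{r.nature}\t{r.source}" for r in rows]


# Hypothetical sample transactions.
records = [
    ChangeRecord(make_change_id(date(1996, 5, 21), 1), "operator2",
                 "polygon split", "hypothetical source document B",
                 {"zone": "R1"}, {"zone": "R1"}),
    ChangeRecord(make_change_id(date(1996, 5, 20), 7), "operator1",
                 "zone change", "hypothetical source document A",
                 {"zone": "R1"}, {"zone": "C2"}),
]
for line in weekly_summary(records):
    print(line)
```

Because the IDs sort in the same order as the dates they encode, sorting the summary by ID also sorts it chronologically, matching the arrangement of the binders.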
PROCESS AND PROGRAMMING DOCUMENTATION
The documentation of the analytical processes applied to GIS data and the documentation of the application programs built around GIS data are best addressed together. The reason for this is that analytical GIS work is almost always performed by means of programs. Whether or not these programs are part of a comprehensive, user-friendly, multi-task application, the documentation methodology is similar. Such documentation can be divided into three components: internal program documentation, user manuals, and programmer guides.
Internal program documentation
Every program, whether it can perform alone, or is an integral part of a larger application, should be well documented internally. A statement of the purpose of the program should be followed by all the conditions required for the program to operate, such as required arguments, global variables, and file and user inputs. This header information should also detail all the program outputs, and any other programs which are called by the program. If specific logic, a formula, or special parameters are used in the program, they should be clearly documented, so that they can be easily changed. This is where the analytical process embodied in the program is documented. The program should then be broken up into manageable parts, which are individually documented, so that a programmer can easily debug and modify the program in future.
Internal program documentation is the closest parallel between process and application documentation and database documentation, because it aids in justifying or defending both analytical processes and application programs.
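A program header of the kind described above might look as follows. The sketch is written in Python rather than in a GIS macro language, and the program name site_score, the parameters, and the scoring formula are all hypothetical, chosen only to show where each piece of documentation belongs.

```python
"""
Program:   site_score (hypothetical example)
Purpose:   Rank candidate sites by a weighted suitability score.
Arguments: sites - list of (name, slope_percent, distance_to_road_m)
Globals:   MAX_SLOPE, MAX_DISTANCE, SLOPE_WEIGHT (set below)
Inputs:    none read from files or from the user
Outputs:   list of (name, score), best site first
Calls:     score_site()
History:   record the author, date, and reason for every modification here
"""

# --- Parameters: illustrative values, change them here only ---------------
MAX_SLOPE = 30.0        # percent; steeper sites score 0 for slope
MAX_DISTANCE = 5000.0   # meters; more distant sites score 0 for access
SLOPE_WEIGHT = 0.6      # the access weight is 1 - SLOPE_WEIGHT


def score_site(slope, distance):
    """Formula: w * (1 - slope/MAX_SLOPE) + (1 - w) * (1 - dist/MAX_DISTANCE).

    Each term is floored at 0 before weighting.
    """
    slope_term = max(0.0, 1.0 - slope / MAX_SLOPE)
    access_term = max(0.0, 1.0 - distance / MAX_DISTANCE)
    return SLOPE_WEIGHT * slope_term + (1.0 - SLOPE_WEIGHT) * access_term


def rank_sites(sites):
    # --- Part 1: score every site -----------------------------------------
    scored = [(name, score_site(slope, dist)) for name, slope, dist in sites]
    # --- Part 2: sort, highest score first --------------------------------
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_sites([("A", 15.0, 2500.0), ("B", 3.0, 1000.0)])
print(ranking[0][0])
```

Keeping the formula and its parameters in one clearly marked place is what allows the analysis to be defended later, and allows a different programmer to change the criteria without reading the whole program.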
User manuals
User manuals should guide the users of programs or applications in their proper use. They should also document the assumptions and processes used in the programs, much as the internal documentation should, but in a way that is oriented towards the user rather than the programmer. User manuals can be published on paper or provided as on-line help files. They can be structured as tutorial manuals or as command references. Without user manuals, users are left to guess or assume the optimal and correct way of using the programs. User manuals also provide the justification and defense for an analysis performed by the user. In the event that the programmer and the user are the same person, the need for user documentation is reduced or eliminated altogether, unless of course the application is passed on to other users or the original programmer is replaced.
Programmer guides
Programmer guides help programmers modify programs or applications in the future, whether such programmers work for an organization that purchased the programs, work for the same organization that wrote them, or indeed are the same programmers who wrote them. Such guides instruct the programmer in the structure and function of each program, and explain the logic by which the various programs are tied into a single application. Preferably, they anticipate the types of changes that will most likely be required in future, and point the programmer directly to the places that will need to be changed. Careful documentation can eliminate costly re-learning of these programs and their logic, as well as erroneous presumptions. It can open complex programs to modification by less sophisticated programmers, thereby making the programs flexible and durable in the long run.
COMPLETE PROCESS AND PROGRAM DOCUMENTATION: GeoMaP CONSORTIUM, MD
A good example of process and program documentation can be found in certain applications written for the GeoMAP consortium of local governments in suburban Washington, DC. One such application is a plot-generating module, which works as the graphic output portion of a comprehensive user interface to ARC/INFO. Each macro program is documented with a standard header and follows a standard structure. Ample comments are provided in the body of the programs. In addition to the usage information provided in the header of each program, certain programs which are directly addressable by the users are documented in a user manual. This manual closely approximates the format of the user manuals that are supplied with the ARC/INFO software. In the latest version of ARC/INFO it is in fact possible to write command references for newly created macros and incorporate them completely into the on-line user documentation. Finally, some of the programs which are most likely to need occasional modifications are accompanied by a programmer guide. This guide explains to the programmer exactly which lines need to be duplicated and modified in order to achieve various effects. Together, these documentation components are designed to ensure the long-term usefulness of the applications, and to help the team that wrote the programs turn them over effectively to the end users.
CONCLUSION
Documentation is expensive, but like an insurance policy, its cost rises with the risk it covers, and it can save far greater expense when it is called into action. Unfortunately, since the cost of GIS can be great to begin with, there is often a tendency to skimp on documentation. Therefore, documentation needs to be addressed and budgeted for at the outset of a project. This is difficult for consultants and other competitive bidders to do on their own initiative, and therefore documentation must be required by the end users of GIS. Whether GIS databases and applications are developed internally or not, the people responsible for specifying the work must be aware of the need for documentation and demand it. Failure to do this not only leaves the GIS installation vulnerable to liability, but also ties the future success of the installation to specific individuals, whose departure can then cause chaos.