University of Maryland
Undergraduate Honors Project
Date: May 2004
Cartographic generalization is ``a constraint-based process used by cartographers to reduce the complexity of a map in a scale reduction process.''  Until recently, cartographic generalization was performed manually by humans -- specifically, by trained cartographers with significant experience in the area. However, automatic generalization by computer has been a research area in both computer science and cartography for many years, in an attempt to codify a set of rules that make up generalization. Still, because of the subjective and artistic decisions that go into this aesthetic process, there is no one algorithm that performs adequately on all maps.
McMaster and Shea  list ten spatial transformations that can be applied during cartographic generalization. These include simplification, smoothing, aggregation, amalgamation, merging, collapse, refinement, exaggeration, enhancement, and displacement. In this study, we focus on simplification, which selects the characteristic or shape-defining elements of a given feature; and refinement, which is selecting the most essential features of the collection. Specifically, we will study applications of these transformations to networks of polylines.
The PM quadtree  refers to a set of hierarchical data structures designed to store line segments in an orderly fashion to speed up their search and retrieval. These structures work on the principle of dividing a two-dimensional region repeatedly into four quadrants, and only storing a subset of the line segments in each quadrant.
In this study, we focus on the PM quadtree. This structure stores a set of line segments by continuously splitting a region into four equal-size quadrants until each quadrant contains at most one line segment. Quadrants may contain more than one line segment only if all those segments intersect at a common point in that quadrant. Each quadrant may contain no more than one point.
The PM quadtree is in reality a trie structure. Since the hierarchical decomposition is always regular and never considers the current line segments in the tree, all of the data is stored at the leaves of the tree. Therefore, internal nodes of the tree are only used for searching purposes, and do not store any line segments.
This occurred due to the design specification of the PM quadtree: the quadtree only stores line segments, and maintains no records of segments that happen to belong to the same polyline -- all line segments are created equal. Therefore, when our generalization algorithm only displays line segments at certain depth, some polylines will have pieces missing in the middle, resulting in an unusable map.
To rectify this situation, we can have a line segment maintain a record of which polyline it belongs to, and also modify the definition of the PM quadtree to help us in generalization. This will give us a new data structure that can store a polygonal map at multiple resolutions simultaneously, while retaining all the benefits of the quadtree. The main idea is to put the internal nodes of the tree to work: at each internal node, we store a generalized representation of the line segments contained in the subtree rooted at that node. At higher levels in the tree, the subtrees will become larger, necessitating more levels of cartographic generalization. The end result will be a structure storing multiple versions of a polygonal map, one for each level in the quadtree.
The idea behind the algorithm is that a polyline is never generalized and stored at some level of the quadtree if it can fit completely within some child node. This forces smaller, ``less important'' lines to lower depths in the tree. Once this critical level is found, each polyline is generalized down to three points, using some predefined method to select the significant middle point. Currently, our algorithm chooses the median point in the polyline as the significant point.
To actually display the map at a certain resolution, all line segments at a certain ``cutoff'' depth (supplied by the viewer) are displayed. This results in a generalized subset of the polylines in the map to be shown.
This algorithm was also tested on a road network of the city of Alexandria, Virginia. The results are displayed in figures 2-5. These figures show the resulting generalized road networks for depth levels 0, 1, 2, 3, 4, 5, 8, and 21. At level 21, all the roads are drawn exactly as specified, with no generalization. Clearly, as the depth increases, the resolution also increases: the shapes of the roads become more defined, and smaller roads appear as well. With a map of this size and the naked eye, it is almost impossible to tell the difference between generalization level 8, and level 21, the maximum resolution. This tells us that for many applications, a significantly generalized polygonal map will suffice in place of the ``perfectly drawn'' one. An additional benefit gained by this observation is that when displayed in real-time on a computer monitor or other device, these generalized maps will display much faster than the ungeneralized ones. This is because significantly fewer line segments need to be drawn to visually represent the polyline network.
Currently, when the algorithm is instructed to pick the ``most significant'' point in a given polyline, it simply chooses the median point. In other words, if the points in the polyline are numbered from 1 to , it picks the point numbered . Occasionally, this results in some strange-looking results. However, there are a number of well-established line simplification algorithms, one of the best-performing being the Douglas-Peucker algorithm . This algorithm works well because it uses a holistic approach for generalization, rather than a local one used by many other algorithms. By integrating Douglas-Peucker into our algorithm, it is likely that the generated maps will give a better overall visual impression of the underlying data.
A problem that can occur with our algorithm is that occasionally, two polylines that cross in the original dataset, when generalized, may appear to intersect at a different point than in reality. It is also possible for two roads that do not intersect at all to be displayed crossing each other. This can be seen in Figures 2 and 3 -- specifically, the two roads near the bottom of the images of generalization levels 1 and 2.
The explanation for this phenomenon is as follows. Consider the case where two polylines have some point in common. If this point happens to be ``generalized away'' and both of the resultant generalized polylines are displayed, it is likely that they will appear to intersect at a completely different point altogether.
The fix for this problem is to consider all local polylines when generalizing a specific one. The algorithm should never remove an intersection point in a polyline unless all other polylines sharing that point will not be displayed at the current generalization level in the tree.
A second problem that can occur is when an otherwise insignificant road happens to cross a boundary that separates regions at a high level in the quadtree. In this case, the polyline will be given more significance than it deserves, and therefore displayed at higher generalization levels than it otherwise should be. This can be seen in Figure 6, which shows the result of generalization level 2 on the roads of Arlington County, Virginia.
This document was generated using the LaTeX2HTML translator Version 2002 (1.62)
Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer
Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -split 0 -nonavigation -antialias_text -antialias honors
The translation was initiated by Phillip Kirlin on 2004-05-10