Cartographic Generalization of Polylines
Stored in Quadtrees

Phillip Kirlin
University of Maryland
Undergraduate Honors Project
pkirlin@umd.edu

Date: May 2004

Abstract:

When geographic maps are displayed at lower resolutions than at which they were originally designed, it is common for features to appear too close together to distinguish between them, or too small to see at all. Cartographic generalization seeks to resolve this problem by applying various transformations to the features on the map, hopefully resulting in a new map that is easier to read and use. While there are numerous algorithms in use today, there is still no general algorithm for cartographic generalization that works well in all situations. In this paper, we explore a generalization algorithm for polylines (representing roads) stored in PM quadtrees.

Introduction

When a map is reduced in scale, there is a tendency for the clarity of the map to be reduced as well. Features such as roads, rivers, or cities often end up so close together in the reduced-scale map that they end up either colliding, coalescing, or becoming so minuscule in size as to essentially be invisible to the person using the map. Cartographic generalization attempts to resolve this issue by redrawing the map in a different fashion when the scale is reduced, producing a more useable, smaller-scale map.

Cartographic generalization is ``a constraint-based process used by cartographers to reduce the complexity of a map in a scale reduction process.'' [2] Until recently, cartographic generalization was performed manually by humans -- specifically, by trained cartographers with significant experience in the area. However, automatic generalization by computer has been a research area in both computer science and cartography for many years, in an attempt to codify a set of rules that make up generalization. Still, because of the subjective and artistic decisions that go into this aesthetic process, there is no one algorithm that performs adequately on all maps.

McMaster and Shea [3] list ten spatial transformations that can be applied during cartographic generalization. These include simplification, smoothing, aggregation, amalgamation, merging, collapse, refinement, exaggeration, enhancement, and displacement. In this study, we focus on simplification, which selects the characteristic or shape-defining elements of a given feature; and refinement, which is selecting the most essential features of the collection. Specifically, we will study applications of these transformations to networks of polylines.

Polylines and Quadtrees

In cartographic data, a polyline may represent a road, river, trail, boundary, or any sort of linear feature that may contain multiple points. A road network, for example, may be stored as a list of polylines, one for each street or highway in the network.

The PM quadtree [4] refers to a set of hierarchical data structures designed to store line segments in an orderly fashion to speed up their search and retrieval. These structures work on the principle of dividing a two-dimensional region repeatedly into four quadrants, and only storing a subset of the line segments in each quadrant.

In this study, we focus on the PM quadtree. This structure stores a set of line segments by continuously splitting a region into four equal-size quadrants until each quadrant contains at most one line segment. Quadrants may contain more than one line segment only if all those segments intersect at a common point in that quadrant. Each quadrant may contain no more than one point.

The PM quadtree is in reality a trie structure. Since the hierarchical decomposition is always regular and never considers the current line segments in the tree, all of the data is stored at the leaves of the tree. Therefore, internal nodes of the tree are only used for searching purposes, and do not store any line segments.

A First Attempt

The first attempt at generalizing the line segments stored in a PM quadtree was not successful. The idea was to only display line segments that were stored at or above a user-supplied ``depth'' parameter. The result was a roadmap that looked like Swiss cheese. This can be seen in Figure 1, which used a dataset of a road network from the city of Alexandria, Virginia.

**Figure 1:** A less-than-satisfactory generalization
$\includegraphics[width=3in]{cheese}$

This occurred due to the design specification of the PM quadtree: the quadtree only stores line segments, and maintains no records of segments that happen to belong to the same polyline -- all line segments are created equal. Therefore, when our generalization algorithm only displays line segments at certain depth, some polylines will have pieces missing in the middle, resulting in an unusable map.

To rectify this situation, we can have a line segment maintain a record of which polyline it belongs to, and also modify the definition of the PM quadtree to help us in generalization. This will give us a new data structure that can store a polygonal map at multiple resolutions simultaneously, while retaining all the benefits of the quadtree. The main idea is to put the internal nodes of the tree to work: at each internal node, we store a generalized representation of the line segments contained in the subtree rooted at that node. At higher levels in the tree, the subtrees will become larger, necessitating more levels of cartographic generalization. The end result will be a structure storing multiple versions of a polygonal map, one for each level in the quadtree.

The Algorithm

We modify the insertion algorithm for the PM

quadtree to support inserting an entire polyline at once; this will solve the problem of losing pieces of roads when generalizing. The algorithm works by repeatedly breaking a polyline into sections at the points where it crosses a quadtree region boundary and generalizing each of those pieces separately. The end result is similar to Ballard's strip trees [1] in that we end up with a hierarchical decomposition of a polyline. The final algorithm to insert a polyline $P = (p_1, p_2, \ldots, p_n)$ in a quadrant

is given as Algorithm 1.

$\begin{algorithm} % latex2html id marker 34 [htbp] \caption{Inserting and genera... ...he four children \EndIf \EndFor \EndProcedure \end{algorithmic}\end{algorithm}$

The idea behind the algorithm is that a polyline is never generalized and stored at some level of the quadtree if it can fit completely within some child node. This forces smaller, ``less important'' lines to lower depths in the tree. Once this critical level is found, each polyline is generalized down to three points, using some predefined method to select the significant middle point. Currently, our algorithm chooses the median point in the polyline as the significant point.

To actually display the map at a certain resolution, all line segments at a certain ``cutoff'' depth (supplied by the viewer) are displayed. This results in a generalized subset of the polylines in the map to be shown.

Results

**Figure 2:** Generalization levels 0 and 1
$\includegraphics[width=3in]{alex0}$ $\includegraphics[width=3in]{alex1}$

**Figure 3:** Generalization levels 2 and 3
$\includegraphics[width=3in]{alex2}$ $\includegraphics[width=3in]{alex3}$

**Figure 4:** Generalization levels 4 and 5
$\includegraphics[width=3in]{alex4}$ $\includegraphics[width=3in]{alex5}$

**Figure 5:** Generalization levels 8 and 21
$\includegraphics[width=3in]{alex8}$ $\includegraphics[width=3in]{alex21}$

This algorithm was also tested on a road network of the city of Alexandria, Virginia. The results are displayed in figures 2-5. These figures show the resulting generalized road networks for depth levels 0, 1, 2, 3, 4, 5, 8, and 21. At level 21, all the roads are drawn exactly as specified, with no generalization. Clearly, as the depth increases, the resolution also increases: the shapes of the roads become more defined, and smaller roads appear as well. With a map of this size and the naked eye, it is almost impossible to tell the difference between generalization level 8, and level 21, the maximum resolution. This tells us that for many applications, a significantly generalized polygonal map will suffice in place of the ``perfectly drawn'' one. An additional benefit gained by this observation is that when displayed in real-time on a computer monitor or other device, these generalized maps will display much faster than the ungeneralized ones. This is because significantly fewer line segments need to be drawn to visually represent the polyline network.

Future Enhancements

There are a number of features that may be added to this algorithm to increase its effectiveness in creating more useful and aesthetically-pleasing maps.

Currently, when the algorithm is instructed to pick the ``most significant'' point in a given polyline, it simply chooses the median point. In other words, if the points in the polyline are numbered from 1 to , it picks the point numbered $\lfloor n/2 \rfloor$ . Occasionally, this results in some strange-looking results. However, there are a number of well-established line simplification algorithms, one of the best-performing being the Douglas-Peucker algorithm [3]. This algorithm works well because it uses a holistic approach for generalization, rather than a local one used by many other algorithms. By integrating Douglas-Peucker into our algorithm, it is likely that the generated maps will give a better overall visual impression of the underlying data.

A problem that can occur with our algorithm is that occasionally, two polylines that cross in the original dataset, when generalized, may appear to intersect at a different point than in reality. It is also possible for two roads that do not intersect at all to be displayed crossing each other. This can be seen in Figures 2 and 3 -- specifically, the two roads near the bottom of the images of generalization levels 1 and 2.

The explanation for this phenomenon is as follows. Consider the case where two polylines have some point in common. If this point happens to be ``generalized away'' and both of the resultant generalized polylines are displayed, it is likely that they will appear to intersect at a completely different point altogether.

The fix for this problem is to consider all local polylines when generalizing a specific one. The algorithm should never remove an intersection point in a polyline unless all other polylines sharing that point will not be displayed at the current generalization level in the tree.

A second problem that can occur is when an otherwise insignificant road happens to cross a boundary that separates regions at a high level in the quadtree. In this case, the polyline will be given more significance than it deserves, and therefore displayed at higher generalization levels than it otherwise should be. This can be seen in Figure 6, which shows the result of generalization level 2 on the roads of Arlington County, Virginia.

**Figure 6:** Insignificant roads being erroneously displayed
$\includegraphics[width=3in]{arl_bad}$

Looking at this image, there are two vertical ``bands'' of short roads that appear to be out of place, and correspond to where the quadtree region boundaries are located. This issue could be eliminated by taking into account other factors besides crossing a region boundary, such as the length of a polyline.

Conclusion

This paper discusses a new cartographic generalization algorithm that works fairly well when applied to polygonal maps representing street networks. Generalization algorithms of this type are important because they can produce new maps that are much easier to interpret at lower resolutions, and also can be displayed, real-time, at much higher speeds on computer monitors. The algorithm is based on the idea of hierarchically decomposing a polyline into pieces, and storing the more significant pieces at higher levels in a quadtree. This ultimately results in a data structure that can store and display a polyline network at multiple resolutions, while retaining all the efficiencies inherent to quadtrees.

Acknowledgments

The author would like to thank his honors adviser, Dr. Hanan Samet, for his help and advice with this project. The author would also like to thank Frank Morgan, Jagan Sankaranarayanan, and Houman Alborzi for the many useful conversations that occurred during the work on this project. All spatial data was obtained through the United States Department of Transportation, Bureau of Transportation Statistics.

Bibliography

1: Dana H. Ballard.
Strip trees: a hierarchical representation for curves.
Commun. ACM, 24(5):310-321, 1981.
2: B. Jiang and C. Claramunt.
A structural approach to the model generalization of an urban street network.
Geoinformatica, 8(2):157-171, 2004.
3: Robert B. McMaster and K. Stuart Shea.
Generalization in Digital Cartography.
Association of American Geographers, 1992.
4: Hanan Samet.
The Design and Analysis of Spatial Data Structures.
Addison-Wesley Longman Publishing Co., Inc., 1990.

About this document ...

Cartographic Generalization of Polylines
Stored in Quadtrees

This document was generated using the LaTeX2HTML translator Version 2002 (1.62)

The command line arguments were:
latex2html -split 0 -nonavigation -antialias_text -antialias honors

The translation was initiated by Phillip Kirlin on 2004-05-10

Phillip Kirlin 2004-05-10

Cartographic Generalization of Polylines Stored in Quadtrees

Abstract:

Cartographic Generalization of Polylines
Stored in Quadtrees