

July 1, 2000, Vol. 1, No. 1


INTEGRATED TECHNIQUE FOR AUTOMATED DIGITIZATION OF RASTER MAPS
Serguei Levachkine and Evgueni Polchkov

(continued)

3 Raster Map Processing

The main goal of this stage, the principal one in the automated vectorization of raster maps, is the recognition of cartographic images, i.e. the generation of vector layers and attribute information for electronic maps. From our point of view, the most promising line of software development is the creation of methods, algorithms and programs which focus on locating and identifying specific cartographic objects. Each cartographic image has its own graphical representation parameters, which can be used for automated object recognition on a raster map. The particular attributes depend on the topological class of the object. In traditional GIS, vector map objects are divided into three types: points, arcs and polygons, representing point, linear and area objects respectively. This classification can easily be extended to the analysis of cartographic images in raster maps. Objects are drawn on thematic maps in the form of graphical symbols, which are the same for all objects in a given group. Graphical images carry geometric (location) and attribute (quantitative and qualitative parameters) information, which we combine to form the concept of a cartographic image. The main graphical coding attributes of cartographic images of the three classes are shown in Table 4.

Object type    Graphical representation attributes

Point          Shape, Size, Symbol

Arc            Type, Color, Thickness

Polygon        Area
               Outline: Type
               Fill: Color
               Crosshatching: Type, Color, Thickness, Angle, Density
               Crape: Shape, Size, Pattern, Density, Color, Thickness

Table 4. Main attributes used for graphical coding of cartographic images.

The classification of cartographic images is different when the vectorization of raster maps is considered. All objects on a raster map have area, and in this sense they are all polygons. It is not easy to reduce the graphical coding elements of cartographic images to elements that correspond to the geometric categories "point" (a coordinate pair), "line" (a sequence of coordinate pairs) and "polygon" (a closed set of line segments which do not intersect and which form the border of a geometrical figure). Nevertheless, the classification into point, linear and polygonal objects must clearly be preserved, because the extent of a cartographic image in one or two directions (for lines and points respectively) can be neglected relative to the extent of the map field. A recognition program that recognizes point objects, for example, does not have to distinguish between a point cartographic image itself and an element of a polygon fill pattern.

Note that there may be other graphical objects involved in recognition of cartographic images, which are nearly always absent from raster maps. Principally, these are letters and digits (toponyms and the quantitative and qualitative characteristics of objects). Additionally, there may be other graphical elements in the map (footnotes to lines, insets, etc.). It is thus convenient to use the classification presented in Table 5.

Type of object recognized    Cartographic images and their elements

Point                        Symbols of point objects
                             Element of polygon fill pattern

Arc                          Symbols of linear objects
                             Explicit polygon borders
                             Crosshatched line

Polygon                      Symbols of polygonal objects with implicit borders given by:
                               Solid fill
                               Crosshatching
                               Pattern fill

Text                         Toponyms
                             Altitude marks
                             Road distances
                             Parameter values on the contour lines
                             Tags on geodetic points, hydrometric monitoring posts, etc.

Additional graphics          Text footnotes
                             Guides
                             Tick-marks on contour lines
                             Insets, etc.

Table 5. Classification of cartographic objects from the point of view of automated vectorization.

An important element of an automated raster map digitization technique is the development of an optimal sequence for cartographic image recognition, successively eliminating elements already decoded from the raster map field and restoring images which were hidden by the eliminated elements. The basic principle of this optimized ordering must be "from simple to complex". Nevertheless, the recognition strategy must also provide for the possibility of using information from objects already digitized (whether manually or by an automated system). For example, the point layer of hydrological monitoring posts can be successfully used for recognition of linear elements of the river network. Moreover, the symbols for these posts generally cover images of the river, complicating automated identification of the rivers. Taking this into account, it becomes clear that hydrological monitoring posts must be vectored before the river network is digitized. Once they have been eliminated from the raster map, their locations and attribute data (mainly altitude marks) can be used to aid in recognition of elements of the river network.
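The ordering constraint described above can be sketched as a dependency graph that is sorted so that every layer is processed after the layers it relies on. This is only an illustration under stated assumptions: the layer names and the `recognition_order` helper are hypothetical, and the only dependency taken from the text is that hydrological monitoring posts precede the river network.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical layer names. Each layer maps to the set of layers that
# must be recognized (and erased from the raster) before it.
dependencies = {
    "hydrological_posts": set(),
    "altitude_marks": set(),
    # From the text: posts are vectored before the river network.
    "river_network": {"hydrological_posts"},
}

def recognition_order(deps):
    """Return one processing order in which every layer follows its prerequisites."""
    return list(TopologicalSorter(deps).static_order())

order = recognition_order(dependencies)
```

Any order returned this way places `hydrological_posts` ahead of `river_network`; layers without mutual dependencies may appear in either order.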

Developing this approach further, we suggest using existing small-scale vector maps for recognition of the corresponding cartographic images on large-scale maps. A small-scale map contains generalized (in a broad sense) information about a considerable proportion of the objects on the corresponding large-scale map. As a rule, the generalization involved in decreasing the map scale consists of simplifying the geometric shape of the object and eliminating part of the object. For example, a river is represented by a polygon on a large-scale map, but by a line on the small-scale map. In general, a given object can be expected to change in topological type when the degree of generalization changes. Even if the topological type of an object is preserved after generalization, several objects on a large-scale map may correspond to a single object on a small-scale map. Examples of the correspondence between the objects in maps of different scales are presented in Table 6.

Scale                  Examples
Small      Large       Small                            Large

Point      Point       Altitude marks                   Altitude marks
           Line        Out-of-scale irrigation block    Irrigation channels
           Polygon     Out-of-scale populated place     Territory of populated place

Line       Point       Dotted lines                     Separate dots
           Line        Electric transmission line       Electric transmission line
           Polygon     Line of riverbed                 River water area

Polygon    Point       Watershed                        Separate dots
           Line        Watershed                        Dotted lines
           Polygon     Bog                              Bog region

Table 6. Correspondence of cartographic objects between maps of different scale.

The use of small-scale maps solves a difficult problem in automated digitization: the search for objects over the whole raster map field. In this case, an object to be vectored can be found in the immediate neighborhood of its generalized analogue, and nowhere else.

The search zone for paired point objects can be restricted to a circle whose radius is defined by the relationship between the scale of the map being vectored and the scale of the small-scale map used.
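As a rough illustration of such a circular search zone, the sketch below derives a ground radius from a positional tolerance on the small-scale sheet and keeps only the candidates that fall inside the circle. The text does not give a formula, so the rule used here (tolerance in sheet millimetres multiplied by the scale denominator) is an assumption, and the function names `search_radius_m` and `candidates_in_zone` are hypothetical.

```python
import math

def search_radius_m(tolerance_mm, small_scale_denominator):
    """Ground radius in metres (assumed rule: sheet tolerance times scale)."""
    return tolerance_mm * small_scale_denominator / 1000.0

def candidates_in_zone(center, candidates, radius):
    """Keep the candidate points lying within `radius` of `center`."""
    cx, cy = center
    return [(x, y) for x, y in candidates
            if math.hypot(x - cx, y - cy) <= radius]

# Example: 0.5 mm tolerance on a 1:250,000 sheet (ground coordinates in metres).
radius = search_radius_m(0.5, 250_000)
found = candidates_in_zone((0.0, 0.0),
                           [(100.0, 0.0), (0.0, 130.0), (50.0, 50.0)],
                           radius)
```

With these inputs the radius is 125 m, so the candidate 130 m away is rejected and the other two are kept.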

We suggest the use of the "caterpillar" algorithm (the name reflects the shape of the illustrated algorithm) for searching for paired linear objects. The caterpillar algorithm involves the construction of a system of line segments perpendicular to the contour of the small-scale analogue and bisected by it. The length of each segment can be chosen from the relationship between the scales of the maps used, and the density of the segments from the curvature of the generalized line. The object sought is then located along the segments constructed in this way, which yields the sequence of reference points of the curve being searched for. The reference points obtained can be joined by straight line segments in an interactive digitization system without any intervention by the operator.
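The construction of the perpendicular search segments can be sketched as follows. This is a simplified illustration rather than the authors' implementation: the function name `caterpillar_segments` and its parameters are hypothetical, the segments are placed at a fixed spacing along the generalized line (the text instead ties their density to curvature), and the step of locating the object on the raster along each segment is not shown.

```python
import math

def caterpillar_segments(polyline, spacing, half_length):
    """Return segments perpendicular to `polyline`, each bisected by it.

    polyline    -- list of (x, y) vertices of the small-scale analogue
    spacing     -- distance along the line between segments (assumed fixed)
    half_length -- half the length of each search segment
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    segments = []
    next_at = 0.0    # distance along the line of the next segment
    travelled = 0.0  # length of the edges already processed
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy)
        if length == 0.0:
            continue  # skip repeated vertices
        ux, uy = dx / length, dy / length  # unit tangent of this edge
        nx, ny = -uy, ux                   # unit normal of this edge
        while next_at <= travelled + length:
            t = next_at - travelled
            px, py = x0 + ux * t, y0 + uy * t  # point on the line
            segments.append(((px - nx * half_length, py - ny * half_length),
                             (px + nx * half_length, py + ny * half_length)))
            next_at += spacing
        travelled += length
    return segments

segs = caterpillar_segments([(0.0, 0.0), (10.0, 0.0)], spacing=5.0, half_length=2.0)
```

For the straight test line this produces three vertical segments of length 4 centred on the line at x = 0, 5 and 10. A curvature-dependent spacing could replace the fixed `spacing` without changing the rest of the construction.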

Automated cartographic image recognition is simplified, and its reliability increased, by the use of the corresponding vector layers of a small-scale map for digitization of isolines and other regular systems of linear objects (such as the coordinate grid, or urban blocks with linear or radial planning). In this case, however, not all lines of a large-scale map have small-scale analogues. For example, the contour lines on a vectored 1:50,000 map may have a 10 m contour interval, while on the corresponding 1:250,000 map the interval is 50 m. In such a case, the contour lines that have counterparts in the generalization (0, 50, 100, etc.) are vectored first by the caterpillar algorithm. Next, the "stair" algorithm (the name reflects the shape of the illustrated algorithm) is applied for the recognition of the intermediate contour lines. The stair algorithm constructs a system of curves between each adjacent pair of already vectored contour lines, perpendicular to each of these contour lines. The density of these curves is defined by the curvature of the basic lines, just as in the caterpillar algorithm. Points of the intermediate contour lines being searched for are then located along the curves constructed in this way. Between two index contour lines, the number of additional lines to be found is well defined (for example, between the 100 m and 150 m contour lines there are always four additional contour lines, at 110, 120, 130 and 140 m, and these can be found). Once all the necessary reference points have been found, they can be joined in succession using the same program tools as in the caterpillar algorithm.
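The bookkeeping in this example is simple arithmetic and can be sketched as below. The function names are hypothetical, and elevations are assumed to be whole metres; the sketch only enumerates which contour levels fall to the caterpillar step and which to the stair step, not the geometric search itself.

```python
def contour_levels(lowest, highest, interval):
    """All contour elevations from lowest to highest at the given interval."""
    return list(range(lowest, highest + 1, interval))

def split_levels(levels, small_scale_interval):
    """Separate levels that have a small-scale counterpart from the rest."""
    index = [z for z in levels if z % small_scale_interval == 0]
    intermediate = [z for z in levels if z % small_scale_interval != 0]
    return index, intermediate

def levels_between(lower, upper, interval):
    """Intermediate contour elevations strictly between two index contours."""
    return list(range(lower + interval, upper, interval))

levels = contour_levels(0, 100, 10)             # large-scale map, 10 m interval
index, intermediate = split_levels(levels, 50)  # small-scale map, 50 m interval
between = levels_between(100, 150, 10)
```

Here `index` holds the levels 0, 50 and 100 handled first, and `between` holds 110, 120, 130 and 140, the four lines expected between the 100 m and 150 m contours.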

The sequence of reference points of a vectored linear object can be copied from the layer which contains the corresponding point objects. For example, shoreline structures (hydrometric monitoring posts, bridges, docks etc.) can be used as reference points to digitize the contours of rivers and lakes. The hydrometric monitoring posts are particularly useful here. Their coordinates and attribute data (name of the river or lake and altitude mark) can be used in automated recognition algorithms for the elements of the hydrological network on the raster map. Note that in this case automated digitizing reverses the order of operations compared to traditional techniques. Traditionally, the operator first digitized the hydrological network manually, and then vectored the location points of the shoreline structures using vector editing tools.

In other words, making maximal use of existing information (directly or indirectly related to the vectored objects) as a general principle of automated cartographic image recognition can increase efficiency and reliability. For example, algorithms that use digital models of a region based on small-scale maps can be developed for digitization of the hydrological network. If the relevant layers are already vectored, they can be used to generate the sequence of reference points of the curves to be recognized; otherwise these points can be indicated manually as described above. This simplifies automated digitization and increases its reliability.

Summarizing the processing of raster maps, we note that the methods and algorithms used must provide complete, even redundant, cartographic image recognition. Erroneously recognized objects can then be eliminated afterwards, since visual control and correction of the vector layers can be carried out more quickly than manual digitization of missed objects.

To conclude this section, we note that, from our point of view, the process of automated cartographic image recognition (processing) should follow the scheme presented in Table 7, where, as before, experts assign scores indicating the degree of possible automation of the various steps.

Operation                                                                                       Score

1. Development of strategy for automated digitization of raster maps

2. Elaboration of sampling matrices of raster maps
   2.1 Classification of recognized objects                                                     -
   2.2 Selection of the size, pattern and color fill of basic sampling matrices of raster maps  75
   2.3 Estimation of statistical weights of separated elements of the cartographic images       75

3. Recognition of cartographic images
   3.1 Digitization of objects which have vector analogues                                      75
   3.2 Digitization of objects which do not have vector analogues                               50
   3.3 Elimination of superfluous recognized objects                                            -

4. Recognition of attribute data of vectored objects
   4.1 Classification of attribute information carriers                                         -
   4.2 Location and identification of attribute information                                     75
   4.3 Correction of errors in attribute data recognition                                       75

5. Elimination of recognized images from raster map
   5.1 Restoration of image covered by recognized object                                        75
   5.2 Correction of restored image                                                             75

Table 7. The degree of possible automation of the processing operations.

     




Dirección General de Servicios de Cómputo Académico-UNAM
Ciudad Universitaria, México D.F.