
Construction of the County Boundary Database of China

A working document for the CITAS-CASM Collaborative Project

Qin Tang

December 27, 1994

I. Goal

The goal of the project is to facilitate the reconstruction of the county boundary map of China at any given date between 1982 and 1992, and subsequently to allow the reconstruction of boundary maps of provinces. The map should contain not only the geometric information on the boundaries, but also the corresponding date-specific GB codes and other important information (name, status ...).

II. Database Structure

The database, developed in the ARC/INFO environment, consists of three basic components: a Composition Map (CM), a historical Guo Biao Code Table (GBCT), and a Transition Table (TT) (Fig. 1).

The Composition Map is polygon-oriented. It records all the boundary information including historical changes. Each polygon on the map represents a minimum unit that belongs (or did belong) to one and only one administrative unit at any given time. It may belong to more than one administrative unit over the time span. Whenever a boundary change takes place, it will create a new polygon on the map (Fig. 2). Each polygon has a unique identification number and has one or more entries in the Transition Table.

Although polygons on the composition map are the information that is used to reconstruct the boundary map, information on the boundary type, such as county boundary or provincial boundary, should also be recorded. Most likely, the arcs added to reflect historical boundary changes would be county boundaries. With this information on the arcs, it will be easier to draw boundaries at different levels, e.g., provincial boundaries and district (Di Qu) boundaries. We may encounter some problems here, for instance, if a boundary was a district-county boundary at one time but became only a county boundary at another time. It is possible to deal with arcs in a manner similar to polygons. This work may be done at a later stage.
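One way to picture "dealing with arcs in a manner similar to polygons" is a small date-stamped table for arcs. The sketch below is only an illustration under stated assumptions: the `ArcEntry` record, the boundary-type labels, arc number 7, and all dates are hypothetical and are not part of the project design.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArcEntry:
    arc_id: int
    boundary_type: str  # assumed labels: "province", "district", "county"
    from_date: date
    to_date: date


def boundary_type_on(entries, arc_id, on):
    """Return the boundary type of an arc on the given date, or None."""
    for e in entries:
        if e.arc_id == arc_id and e.from_date <= on <= e.to_date:
            return e.boundary_type
    return None


# Hypothetical arc 7: a district boundary at first, later only a county boundary.
ARC_TABLE = [
    ArcEntry(7, "district", date(1982, 1, 1), date(1985, 6, 30)),
    ArcEntry(7, "county", date(1985, 7, 1), date(1992, 12, 31)),
]
```

Looking up arc 7 for a date in 1984 would give "district", and for a date in 1990 would give "county", so the same arc can be drawn at different levels depending on the requested date.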

The GB Code Table records all the GB codes ever used during the time period covered by the database. Each entry in the table reflects a unique status of an administrative unit. Each entry contains such items as code, name, administrative status, position of the unit in the administrative hierarchy, type of change(s), the GB code and name of the prefectural-level unit to which this unit belongs, from_date, and to_date. The from_date and to_date indicate the effective period of the code (Fig. 3). If the creation of such a comprehensive table conflicts with the work schedule, we may start with a simpler GB Code Table (including only five fields: code, name, status, from_date, and to_date) and then expand the table to include all of the items mentioned here at a later stage.

Usually, the GB codes reflect all administrative changes except boundary changes, that is, name changes, status changes, and changes in hierarchical structure. A boundary change may not lead to any code change. For instance, part of a county may be reassigned to a neighboring city while both the county and the city keep their own codes. In most cases, when a change took place, a new GB code was assigned to the unit. In a few cases, when the name of a unit was changed, no new code was assigned. Even in these cases, the GBCT should record the change so that the correct name can be retrieved. Therefore, each entry in the table corresponds to an administrative change. When a unit has only one GB code but has had a name change, it should have two entries in the table, one for each status (name). At any given date, however, only one code for each unit is valid.
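The simpler five-field table and the name-change case can be sketched as follows. This is a minimal illustration, not the project's ARC/INFO table: the `GBCodeEntry` record, the code 999901, the names, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GBCodeEntry:
    code: str
    name: str
    status: str
    from_date: date
    to_date: date


def entry_on(table, code, on):
    """Return the single entry for a code that is valid on the given date."""
    matches = [e for e in table
               if e.code == code and e.from_date <= on <= e.to_date]
    if len(matches) > 1:
        raise ValueError(f"code {code} has overlapping entries on {on}")
    return matches[0] if matches else None


# Hypothetical unit that kept its code but changed its name in 1988.
GBCT = [
    GBCodeEntry("999901", "Old Name", "county", date(1982, 1, 1), date(1987, 12, 31)),
    GBCodeEntry("999901", "New Name", "county", date(1988, 1, 1), date(1992, 12, 31)),
]
```

Because the code alone no longer identifies an entry, the lookup always takes a date as well as a code.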

The Transition Table serves as a bridge connecting the composition map and the code table. Each entry in the table represents an association between a polygon on the composition map and a GB code in the GB Code Table, together with the time stamps that identify when that polygon was associated with the code. Each polygon on the map may have one or more entries in the table, with each entry indicating one association. The table has four items and is indexed by the polygon identification numbers on the map (Fig. 4).

The number of entries in the table for any polygon on the map depends on the number of administrative changes that have ever affected that polygon. For instance, if a polygon has never experienced an administrative change, that polygon will have only one entry in the table. If a polygon has been under two different administrative statuses during the period covered by the database, it will have two entries in the table, one for each status. At any given date, each polygon has only one valid entry in this Transition Table.
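The rule that each polygon has only one valid entry at any date can be checked mechanically. The sketch below assumes the four items are polygon id, GB code, from_date, and to_date, and that both dates are inclusive; the `TTEntry` record, the codes, and the dates are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TTEntry:
    polygon_id: int
    gb_code: str
    from_date: date
    to_date: date


def find_conflicts(tt):
    """Return the ids of polygons whose entries overlap in time."""
    by_polygon = defaultdict(list)
    for e in tt:
        by_polygon[e.polygon_id].append(e)
    bad = []
    for pid, entries in by_polygon.items():
        entries.sort(key=lambda e: e.from_date)
        # With inclusive dates, the next entry must start after the previous ends.
        for prev, nxt in zip(entries, entries[1:]):
            if nxt.from_date <= prev.to_date:
                bad.append(pid)
                break
    return sorted(bad)


TT = [
    TTEntry(1, "990001", date(1982, 1, 1), date(1992, 12, 31)),
    TTEntry(2, "990001", date(1982, 1, 1), date(1986, 12, 31)),
    TTEntry(2, "990002", date(1987, 1, 1), date(1992, 12, 31)),
    # Polygon 3 is deliberately wrong: both entries are valid on 1987-01-01.
    TTEntry(3, "990002", date(1982, 1, 1), date(1987, 1, 1)),
    TTEntry(3, "990003", date(1987, 1, 1), date(1992, 12, 31)),
]
```

With this sample data only polygon 3 is reported, because its two entries share a date.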

III. Reconstruction Procedure

To reconstruct a boundary map from the database, a date must be specified because administrative changes take place continuously. The boundary map at the start of a year is different from one at the end of the year. The following describes briefly the procedures for reconstructing a boundary map from the database, followed by a detailed algorithm to implement these procedures.

Procedures

(1) Extract the entries from the Transition Table that are valid for the given date to create a polygon_code table. Each polygon on the composition map should have one and only one entry in this table.

(2) Relate the polygon_code table to the composition map. Some adjacent polygons will have the same code, meaning that those polygons belonged to the same administrative unit on that date.

(3) Combine the polygons with the same code into one single polygon. Rebuild the polygon topology. This will produce the boundary map requested.

(4) Extract all the GB codes and desired items for the given date from the GB Code Table to create a code_for_the_date table.

(5) Relate the code_for_the_date table to the reconstructed boundary map.

Figures 5(a) through 5(e) illustrate the procedures.
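Steps (1) through (5) can be sketched at the attribute level in a few lines. This is an illustration only, assuming plain tuples in place of ARC/INFO tables; it groups polygon ids by code rather than merging geometry, and the codes, names, and dates are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Transition Table rows: (polygon_id, gb_code, from_date, to_date)
TT = [
    (1, "990001", date(1982, 1, 1), date(1992, 12, 31)),
    (2, "990001", date(1982, 1, 1), date(1986, 12, 31)),
    (2, "990002", date(1987, 1, 1), date(1992, 12, 31)),
    (3, "990002", date(1982, 1, 1), date(1992, 12, 31)),
]

# GB Code Table rows: (gb_code, name, from_date, to_date)
GBCT = [
    ("990001", "Unit A", date(1982, 1, 1), date(1992, 12, 31)),
    ("990002", "Unit B", date(1982, 1, 1), date(1992, 12, 31)),
]


def reconstruct(tt, gbct, on):
    # Step 1: one valid Transition Table entry per polygon.
    polygon_code = {}
    for pid, code, start, end in tt:
        if start <= on <= end:
            if pid in polygon_code:
                raise ValueError(f"polygon {pid} has two valid entries on {on}")
            polygon_code[pid] = code
    # Steps 2-3: polygons sharing a code form one administrative unit.
    units = defaultdict(list)
    for pid, code in sorted(polygon_code.items()):
        units[code].append(pid)
    # Step 4: codes and names valid on the date.
    code_for_the_date = {code: name for code, name, start, end in gbct
                         if start <= on <= end}
    # Step 5: attach the names to the reconstructed units.
    return {code: {"name": code_for_the_date.get(code), "polygons": pids}
            for code, pids in units.items()}
```

For a date in 1985, polygons 1 and 2 fall under the first unit; for a date in 1990, polygon 2 has moved to the second unit.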

Algorithm

Get the date from the user, check if it is valid
    -- not valid, return;
    -- valid, continue

Copy TT structure to an empty table plycdtbl

For all records in TT, test:
    If From_date <= date and To_date >= date
        Copy the record from TT to plycdtbl

Delete items From_date and To_date from plycdtbl

Copy CM to CM_date

Join plycdtbl with CM_date attribute table
    -- Primary key for join is polygon-id

Dissolve common boundaries on CM_date using GB codes

Rebuild CM_date polygon topology

If desired, create cfdtbl from GBCT to add more items to CM_date attribute table:

    Copy GBCT structure to an empty table cfdtbl

    For all records in GBCT, test:
        If From_date <= date and To_date >= date
            Copy the record from GBCT to cfdtbl

    Delete items From_date and To_date from cfdtbl

    Join cfdtbl with CM_date table
        -- Primary key for join is GB code

    Delete items unwanted from CM_date table
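The dissolve step in the algorithm above can be pictured at the arc level: an arc stays on the reconstructed map only if the polygons on its two sides carry different GB codes on the requested date. The sketch below is an illustration in Python rather than the ARC/INFO operation itself; the tuple layout, the use of 0 for the outside polygon, and all ids and codes are assumptions.

```python
# Arc rows: (arc_id, left_polygon, right_polygon); polygon 0 is the outside.
ARCS = [
    (10, 0, 1),
    (11, 1, 2),
    (12, 2, 3),
    (13, 3, 0),
    (14, 0, 2),
]


def dissolve(arcs, polygon_code):
    """Return the ids of arcs that separate polygons with different codes."""
    kept = []
    for arc_id, left, right in arcs:
        # The outside polygon has no code, so outer arcs are always kept.
        if polygon_code.get(left) != polygon_code.get(right):
            kept.append(arc_id)
    return kept


# Hypothetical polygon_code tables for two dates.
CODES_1985 = {1: "990001", 2: "990001", 3: "990002"}
CODES_1990 = {1: "990001", 2: "990002", 3: "990002"}
```

With the 1985 codes, arc 11 disappears because polygons 1 and 2 belong to the same unit; with the 1990 codes, arc 12 disappears instead.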

IV. Constructing CM

The Composition Map contains all the spatial information (including historical changes) on boundaries. It consists of polygons. Each polygon represents a minimum unit that has belonged to one and only one administrative unit at any given time. The following describes a few guidelines on the construction of the map. Finding efficient means to implement them is not a simple task and will require some experimentation.

The first step is to collect documents on administrative boundaries, especially on boundary changes. We assume that CASM can accomplish this. With the information, we can identify which counties have experienced boundary changes. Since we work backward in time, if two units were created from one unit, we have two minimum units already. If one administrative unit was formed by combining two or more units or parts of units, we need to create new minimum units by adding arcs to the existing boundary map because this means that parts of that administrative unit have belonged to two or more units.

The second step is to add the necessary arcs to create minimum units all over the map. We take the base map (a mylar copy is preferred in order to reduce spatial distortion caused by paper stretching and folding) and draw the historical boundaries on the map. This will produce many polygons. A unique polygon identification number must be assigned to each newly formed polygon. When an existing polygon is divided into two or more polygons, one of the parts keeps the old polygon number. After all boundary changes are drawn on the map and polygon IDs are assigned, the base map is complete.

The last step is to convert the Composition Map (base map containing minimum units) into digital data file(s). How to do this depends on the base map quality and original digitizing controls. If the base map has exact control points that match well with those of the existing digital map, we can use those points to register the base map sheets and digitize the lines that were added in the second step using a digitizer. If this approach does not work, we may try to visually digitize these lines on the screen. In this case, the base map with the boundary changes must be drawn at a much larger scale, for example 1:500,000. The digital map can then be drawn at the same or a larger scale. The preferred approach needs to be determined through experimentation. Both methods may be used in different situations.

V. Creating GBCT

The GB Code Table contains comprehensive information on the administrative units. It is code-indexed and consists of GB Code, Name, From_date, To_date, and other items.

CITAS has done substantial work on creating such a table. The detailed description of the table contents and the procedure used in creating the table is attached. The present table needs some revision and verification before it can be used in this project.

VI. Creating TT

Creating the Transition Table is a time-consuming job. It is done by combining information from the Composition Map and the GB Code Table. The polygons on CM have to be matched with the information in the GBCT. The relationship is many-to-many. One polygon on CM may be related to several records in the GBCT. On the other hand, one record in GBCT may be related to several polygons. The temporal information for each code, that is, From_date and To_date, can be obtained from the GB Code Table.

To start the work, we can create an initial table from the current boundary file. The table would have information on existing polygons and their status. We then add entries to these polygons (they are administrative units at this moment) that have experienced code changes. For example, if polygon 101 has current code of 322101, but its previous code was 322123, then in the Transition Table, it should have two entries, one for 322101, the other for 322123. Since very few units have ever had a code change, most of these polygons would have only one entry in this table.
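The polygon 101 example can be sketched as follows. The codes 322101 and 322123 come from the text; everything else is an assumption made for illustration: the tuple layout, the period limits taken from the 1982 to 1992 span, polygon 102 and its code, and the change date, which in practice would come from the GB Code Table.

```python
from datetime import date, timedelta

PERIOD_START = date(1982, 1, 1)   # assumed start of the covered period
PERIOD_END = date(1992, 12, 31)   # assumed end of the covered period


def initial_table(current_codes):
    """One entry per existing polygon, valid for the whole period."""
    return [(pid, code, PERIOD_START, PERIOD_END)
            for pid, code in sorted(current_codes.items())]


def add_previous_code(tt, polygon_id, old_code, change_date):
    """Split a polygon's entry so old_code applies before change_date."""
    result = []
    for pid, code, start, end in tt:
        if pid == polygon_id and start < change_date <= end:
            result.append((pid, old_code, start, change_date - timedelta(days=1)))
            result.append((pid, code, change_date, end))
        else:
            result.append((pid, code, start, end))
    return result


# Polygon 102 and the change date are hypothetical.
tt = initial_table({101: "322101", 102: "322102"})
tt = add_previous_code(tt, 101, "322123", date(1987, 3, 1))
```

After these two calls polygon 101 has two entries, one per code, while polygon 102 keeps its single entry.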

After these "old" polygons have been processed, we then work on the newly formed polygons, which resulted from adding arcs. These polygons have two or more entries, one for the "current" situation and the others for the previous codes they were associated with.

The tables are not available at this time. Please stay tuned.


Last modified on 12 May, 1997