Harmonizing census data is not a new idea. It was first proposed in 1872 at the International Statistical Congress held in St. Petersburg, but little progress was made until the second half of the twentieth century. One of the signal achievements of the United Nations Statistics Division has been the international harmonization of census concepts, from the enumeration form to the publication of final tables. While incomplete, the effort has enjoyed widespread support from statistical agencies around the globe. Since 1991, the IPUMS-USA project has worked to harmonize United States census data for the period since 1850, and IPUMS-International has capitalized on this experience.

International census samples employ differing numeric classification systems, and reconciling these codes is a major part of this project. Variables must be easy to use for comparisons across time and space, which requires that we provide the lowest common denominator of detail that is fully comparable. On the other hand, we must retain all meaningful detail in each sample, even when it is unique to a single dataset. For most variables, it is impossible to construct a single uniform classification without losing information: some samples provide far more detail than others, so the lowest common denominator of all samples inevitably loses important information.

Composite coding schemes offer a solution. The first one or two digits of the code provide information available across all samples. The next one or two digits provide additional information available in a broad subset of samples. Finally, trailing digits provide detail that is only rarely available. For example, in IPUMS-International, the first digit of the variable for marital status is comparable across all samples. The second digit delineates consensual unions from other forms of marriage (where appropriate) and distinguishes among the categories separated, divorced, and married with spouse absent.
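The layered structure of a composite code can be illustrated with a minimal Python sketch. It assumes a three-digit code in which each digit adds one level of detail. The function name `split_composite`, the level names, and the sample code values are hypothetical and chosen only for illustration; they are not the project's actual codes or software.

```python
from collections import Counter


def split_composite(code, widths=(1, 1, 1)):
    """Split a composite code into progressively longer prefixes.

    Hypothetical helper: each prefix corresponds to one level of detail,
    from the most widely comparable (first digit) to the rarest.
    """
    total = sum(widths)
    digits = str(code).zfill(total)  # restore leading zeros lost in integer storage
    if len(digits) != total or not digits.isdigit():
        raise ValueError(f"expected a non-negative code of at most {total} digits: {code!r}")
    levels = {}
    end = 0
    for name, width in zip(("general", "intermediate", "detailed"), widths):
        end += width
        levels[name] = digits[:end]
    return levels


# Illustrative (made-up) marital status codes from several samples.
codes = [100, 211, 212, 220, 310]

# Comparing across all samples uses only the first digit.
general_counts = Counter(split_composite(c)["general"] for c in codes)

# A single sample with richer detail can still use the full code.
full_detail = split_composite(212)
```

Because every level is a prefix of the next, a researcher can truncate to the first digit for comparisons across all samples without discarding the detail stored in the trailing digits.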
The final digit of the marital status code provides additional detail within the married and married-spouse-absent categories (such as polygamous marriages in Kenya). The basic goal of our harmonization efforts is to simplify use of the data while losing no meaningful information.

In addition to providing harmonized codes for variables and accompanying documentation, the IPUMS-International project is carrying out a variety of additional tasks to improve data quality, not all of which have been implemented in this preliminary release of the data. These tasks include the following: