Inspired by Vincent Arel-Bundock’s well-known countrycode, we created regioncode to provide similar functionality for China studies.
regioncode converts regions’ formal names, commonly used names, and administrative division codes into one another.
The Chinese government assigns a unique geocode to each county, city (prefecture), and provincial-level administrative unit. These so-called “administrative division codes” have been adjusted repeatedly to match national and regional development plans. The adjustments cause trouble when researchers merge data that use different versions of geocodes or region names. In particular, when researchers plot statistical data on a map of China, mismatched geocodes between the map data and the statistical data can garble the output or the visualization.
The package is developed to overcome these difficulties and to match regional data across years more conveniently and correctly. In the current version, regioncode converts formal names, commonly used names, and division codes of Chinese prefecture-level regions (‘地级市’ in Chinese) into one another and across thirty-four years, from 1986 to 2019.
The toy data are based on a chunk of Yuhua Wang’s China’s Corruption Investigations Dataset. The dataset includes information on almost 20,000 officials who were investigated during Xi Jinping’s anti-corruption campaign. We randomly drew an eighteen-row sample. The division codes in the original data follow the 2019 version. We kept the variables for prefectural names and division codes, and added a column of the prefectures’ short names to further illustrate how the software works.
In the regioncode package, we refer to administrative division codes as code, regions’ formal names as name, and their commonly used short names as sname. The current version can convert between any pair of them.
The regioncode function accepts numeric and character vectors as input division codes and region names, respectively. For an accurate conversion, users must correctly specify the year of the source data in the argument year_from. They can then set the year they want for the output in the argument year_to. That’s it. A typical use is converting 2019-version codes to the 1999 version.
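As a minimal sketch of the basic call, the snippet below converts the 2019-version codes in the toy data to their 1999 equivalents. It assumes that the toy data ship with the package as an object named corruption that can be loaded with data(), and that the codes sit in the prefecture_id column. Adjust these names if your installation differs.

```r
library(regioncode)

# Assumption: the toy data are bundled with the package under this name.
data("corruption")

# Convert division codes from the 2019 version to the 1999 version.
codes_1999 <- regioncode(data_input = corruption$prefecture_id,
                         year_from = 2019,
                         year_to = 1999)

# Inspect the first few converted codes.
head(codes_1999)
```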
In some cases, the original data may contain only division codes or only region names, while users need the other form or both. For such cases, regioncode can convert division codes from any year to region names in any year. Users only need to change the conversion method, for example to “code2name”, to convert division codes to region names.
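A code-to-name conversion might look like the sketch below. It takes the argument name method from this vignette, which may differ in other releases of the package, and again assumes the bundled corruption data with a prefecture_id column.

```r
library(regioncode)

# Assumption: the toy data are bundled with the package under this name.
data("corruption")

# Turn 2019-version division codes into the region names used in 1999.
# `method` is the argument name used in this vignette; check ?regioncode
# if your installed version reports an unused argument.
names_1999 <- regioncode(data_input = corruption$prefecture_id,
                         year_from = 2019,
                         year_to = 1999,
                         method = "code2name")

head(names_1999)
```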
Similarly, one can get codes from names or, less commonly, get the names used in one year from the names used in another. Users need to set the method argument to “name2code” or “name2name” for these conversions.
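The two name-based conversions can be sketched as follows. The column name prefecture for the formal names is an assumption about the toy data, not something stated above, so substitute the column that holds full names in your own data.

```r
library(regioncode)

# Assumption: the toy data are bundled with the package under this name.
data("corruption")

# Assumption: formal prefecture names are stored in a column named `prefecture`.
full_names_2019 <- corruption$prefecture

# Names (2019 version) to division codes (1999 version).
codes_from_names <- regioncode(data_input = full_names_2019,
                               year_from = 2019,
                               year_to = 1999,
                               method = "name2code")

# Names (2019 version) to names (1999 version).
names_from_names <- regioncode(data_input = full_names_2019,
                               year_from = 2019,
                               year_to = 1999,
                               method = "name2name")

head(codes_from_names)
head(names_from_names)
```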
regioncode provides two advanced options for more complicated conversions. The first applies when the data source includes only the commonly used short names of the cities instead of the full, official ones. regioncode can still accomplish the conversion in this case when users set the incompleteName argument to “from”. (regioncode can also produce short names from full names, short names, or division codes. See the Details section of the help file for more information.)
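A conversion that starts from short names might be written as in this sketch. The incompleteName argument and its “from” value come from the text above. The column name prefecture_sname is an assumption about how the added short-name column is labelled in the toy data.

```r
library(regioncode)

# Assumption: the toy data are bundled with the package under this name.
data("corruption")

# Assumption: the short names live in a column named `prefecture_sname`.
short_names_2019 <- corruption$prefecture_sname

# Tell regioncode that the *input* names are short (incomplete) names,
# then convert them to 1999-version division codes.
codes_from_snames <- regioncode(data_input = short_names_2019,
                                year_from = 2019,
                                year_to = 1999,
                                method = "name2code",
                                incompleteName = "from")

head(codes_from_snames)
```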
The second advanced application covers the municipalities directly under the central government (“zhixiashi” in Chinese Pinyin), which are common in national survey data. regioncode handles this case as long as the user sets the argument zhixiashi to TRUE.
    # In the sample data, the division codes of the municipalities are coded as NA.
    # Fill them with the codes of their provincial-level units.
    library(dplyr)  # provides %>% and mutate()

    # Beijing, Tianjin, Shanghai, Chongqing
    code_zhixiashi <- c("110000", "120000", "310000", "500000")

    corruption <- corruption %>%
      mutate(prefecture_id = ifelse(province_id %in% code_zhixiashi,
                                    province_id, prefecture_id))

    # Converting
    regioncode(data_input = corruption$prefecture_id,
               year_from = 2019, year_to = 1999,
               zhixiashi = TRUE)
regioncode provides a convenient way to convert Chinese administrative division codes, official names, and commonly used short names into one another. This vignette offers a quick overview of the package’s features and a short tutorial for users.
The development of the package is ongoing. Future versions aim to add more administrative levels, from the province level to the county level. The data are also being enriched. Please contact us with any questions, bug reports, or comments.
Dr. Yue Hu
Department of Political Science,