Information Discovery in the Chinese Recorder Index

  1. Jieh Hsiang

    National Taiwan University

  2. Jung-Wei Kong

    National Taiwan University

  3. Allan Sung

    National Taiwan University


The Chinese Recorder
The Chinese Recorder (CR) was a journal (first bi-monthly, then monthly) published by Protestant missionaries in China between 1867, a few years after the 1860 treaty that allowed missionaries to enter China, and 1941, when the U.S. entered the Pacific theater of the Second World War. Apart from a nineteen-month interruption from June 1872 to the end of 1873, CR was the longest-running English-language missionary journal in China. The period that CR covered was a tumultuous time in Chinese history, during which the country went through the Taiping Rebellion, the Boxer Rebellion, various wars with foreign powers, the Republican Revolution of 1911, the civil wars among the warlords, the rise of communism, the Japanese invasion, and the great cultural and social transformation of the late 19th and early 20th centuries. Being based within China, CR provided a close look at the full spectrum of Chinese society: not only missionary affairs, but also Chinese civilization, healthcare, education, the political situation, opium, and other social issues of the day. CR is unique in that its articles were written by missionaries in China for the benefit of their fellow missionaries. Because it was supported by missions and not sponsored by any government, Chinese or foreign, the views presented in CR were more candid and less colored by political agendas. It thus provides an angle unlike any other. Although one cannot say that CR is unbiased, at least it is biased in its own special way.
The Chinese Recorder Index
With 73 volumes, one for each year, and over 50,000 pages in total, CR is difficult to use on its own. In 1986 Kathleen Lodwick published the two-volume The Chinese Recorder Index: A Guide to Christian Missions in Asia, 1867–1941 (CRI), which made it much easier for scholars to use CR in their research (Lodwick 1986).
CRI is more than an index. It consists of three indices and six special lists (Persons by Affiliation, Persons by Location, List of Women, etc.). The Person Index (PI) includes 8,391 names with 192,149 page records; the Mission/Organization Index (MI), 712 mission entries with 34,851 page records; and the Subject Index, 4,691 entries with 83,636 page records. Altogether, there are 13,794 entries and 310,626 page records, averaging 22.5 page records per entry.
Unlike a conventional book index, which only lists the pages on which an entry appears, CRI assigns tags to the page references in PI and MI, thus providing additional information about the nature of each occurrence of the entry. Tables 1 and 2 list the tags, their values, their meanings, and the counts that we have tabulated.
There is a great wealth of information hidden in the tags. For example, if a certain page appears in the entry for person A, and that page also appears in the entry for person B under an ART (article) tag, then we know that A is mentioned in an article written by B. By checking the same page number against all entries, we can find all the person names, locations, missions, subjects, etc. that appear in the same article. If the same page appears in a Subject Index entry for a certain event, then we may even ‘guess’ what the article is about without reading the article itself.
As another example, the ATT (attacks) tag tells how many attacks on missions and missionaries were reported in CR, who was attacked, and when and where the attacks occurred. This information should be valuable to anyone studying the attitude of Chinese society towards Christianity when it was re-introduced to China in the late 19th century. However, since ‘attack’ is not an entry in itself, this information is scattered all over CRI. One has to pore over more than 13,000 entries to collect every relevant bit of information.
Table 1: Person Index
Tag | Tag value | Number / no value | Explanation
AFF (Affiliation) | Name of mission | 27,678 / 14 | Affiliation
ARR (Arrival) | Year | 8,705 / 6,283 | Arrival year (if no value, it is indicated by the volume number)
ART (Articles) | - | 11,125 | Article written by the person
ATT (Attacks) | - | 1,164 | Attack on the person
CHI (Children) | - | 7,765 | Children of the person
CON (Conferences) | Location, year | 3,449 / 1,793 | Conference attended
COR (Correspondence) | - | 2,276 | Correspondence to the editor
DAT (Dates in China) | Year | 139 / 2 | Dates in China
DEA (Death) | Precise year | 2,459 / 2,035 | Death (if no value, it is indicated by the volume number)
DEP (Departures) | Year of departure | 5,496 / 5,000 | Departure in the year (if no value, it is indicated by the volume number)
ITI (Itinerancy) | Location | 445 / 413 | Itinerancy
LOC (Location) | Location | 39,744 / 8 | Location
OTH (Other Publications) | - | 10,095 | Other publications
POS (Positions) | Title and mission | 6,658 / 2 | Position of the person
SPO (Spouse) | - | 13,212 | Spouse of the person
UNS (Unspecified) | - | 51,739 | Unspecified
Table 2: Mission/Organization Index
Tag | Tag value | Number / no value | Explanation
ATT (Attacks) | - | 275 | Attacks on the M/O
CON (Convert) | - | 996 | Converts
FIN (Finances) | - | 1,585 | Finances of the M/O
HIS (History) | - | 650 | History of the M/O
HOS (Hospitals) | - | 639 | Hospital
MEE (Meetings) | - | 2,340 | Meeting
LOC (Location) | Location | 9,533 / 32 | Location
OPR (Opium Refuges) | - | 0 | Opium refuge
ORD (Ordained Asians) | - | 349 | Asians ordained by mission
ORP (Orphanages) | - | 32 | Orphanages
PER (Personnel) | - | 1,520 | Personnel
PRE (Press) | - | 1,554 | Press
REP (Reports) | - | 772 | Reports by the M/O
SCH (Schools) | - | 1,309 | School run by the M/O
STA (Statistics) | - | 2,510 | Statistics about the M/O
UNS (Unspecified) | - | 10,763 | Unspecified
Reindexing the CRI
To fully utilize the wealth of information embedded in CRI, we developed a system that incorporates the 3 indices of CRI into one uniform framework, under which the indices are fully integrated and cross referenced. This integration enables the user to explore the rich information hidden within CRI.
Our approach starts with a data structure that decomposes an index entry into a number of page records, each consisting of the entry name (n), a volume number (vol), page numbers (the start page s and the end page e), a tag (t), and a tag value (v). (The tag part is absent if no tag is indicated.) Thus, a page record is a tuple (n; vol:s-e; t:v). If only one page is indicated, then the start page and the end page are the same. For instance, the first page record in the George Leslie Mackay entry (Figure 1) is (Mackay, George Leslie; 13:74-74; AFF: CPM) and the second is (Mackay, George Leslie; 13:312-312; AFF: CPM). (The George Leslie Mackay entry shown in this example becomes 46 page records.) This process decomposes the 13,794 entries in the three indices into 310,626 page records. We then designed algorithms to mine the relationships among the page records, mainly using the page numbers and the tags as reference points (Kong 2011). While the details of the algorithms cannot be covered in this abstract, we use two simple examples to demonstrate the outcome.

Figure 1: The ‘George Mackay’ entry in Person Index
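The page-record tuple (n; vol:s-e; t:v) described above can be sketched in Python as follows. This is only an illustration and not the authors' implementation: the class name PageRecord, the helpers parse_ref and decompose, and the grouped input format are all assumptions made for this sketch, since the layout of a printed CRI entry is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PageRecord:
    """One page record (n; vol:s-e; t:v). All names here are hypothetical."""
    name: str
    vol: int
    start: int
    end: int
    tag: Optional[str] = None    # absent when the index gives no tag
    value: Optional[str] = None  # absent when the tag carries no value

    def __str__(self) -> str:
        tag_part = ""
        if self.tag:
            tag_part = f"; {self.tag}" + (f": {self.value}" if self.value else "")
        return f"({self.name}; {self.vol}:{self.start}-{self.end}{tag_part})"


def parse_ref(ref: str):
    """Parse 'vol:page' or 'vol:start-end' (an assumed input notation)."""
    vol, pages = ref.split(":")
    start, _, end = pages.partition("-")
    # A single page means the start and end pages are the same.
    return int(vol), int(start), int(end or start)


def decompose(name, groups):
    """Turn one index entry into a flat list of page records.

    `groups` is an assumed intermediate form: (tag, value, [page refs]).
    """
    records = []
    for tag, value, refs in groups:
        for ref in refs:
            vol, start, end = parse_ref(ref)
            records.append(PageRecord(name, vol, start, end, tag, value))
    return records


# Three of the page records quoted in the text for George Leslie Mackay.
mackay = decompose("Mackay, George Leslie", [
    ("AFF", "CPM", ["13:74", "13:312"]),
    ("LOC", "Fukien, Amoy", ["16:214"]),
])
for record in mackay:
    print(record)
```

Keeping the record immutable and flat makes it easy to pool the records of all three indices into one collection, which is what the cross-referencing relies on.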
One of the page records in the Person Index entry for George Mackay is (Mackay, George Leslie; 16:214-214; LOC: Fukien, Amoy). Since the entry for Thomas Barclay contains the page record (Barclay, Thomas; 16:214-215; ART), we know that Mackay was mentioned in an article written by Barclay. Other page records show that the same page appears in the Subject Index entry for the Sino-French War (1884–1885) and in the Person Index entries for James Maxwell, William Thow, and others (Figure 2). Thus, when George Mackay is issued as a query, instead of simply returning Vol. 16, p. 214 as one of the pages on which Mackay appears, the system organizes and returns all the information in CRI about that page (Figure 3). (Indeed, it was an article written by Barclay about the return of the missionaries from Amoy to Taiwan after the blockade of Taiwan was lifted at the end of the Sino-French War.) Figure 4 shows the webpage resulting from the query ‘George Mackay’. Note that a chronological distribution is presented, and a foldable/expandable classification of the results appears on the left.

Figure 2: Some entries containing Vol. 16, p. 214

Figure 3: Information returned about Vol. 16, p. 214 for the query ‘George Mackay’

Figure 4: Query result of ‘George Mackay’
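The page-based cross-referencing in the Mackay example can be illustrated with a minimal Python sketch. It is not the algorithm of Kong (2011), whose details are not given in this abstract; it simply treats two page records as related when they share a volume and their page ranges overlap. The names Rec, overlaps, and context_for are hypothetical, and the page range given for the Subject Index record is an assumption.

```python
from collections import namedtuple

# index: "PI" (Person), "MI" (Mission/Organization) or "SI" (Subject).
Rec = namedtuple("Rec", "index name vol start end tag value")

records = [
    Rec("PI", "Mackay, George Leslie", 16, 214, 214, "LOC", "Fukien, Amoy"),
    Rec("PI", "Barclay, Thomas", 16, 214, 215, "ART", None),
    # Page range of this Subject Index record is assumed for illustration.
    Rec("SI", "Sino-French War (1884-1885)", 16, 214, 214, None, None),
    Rec("PI", "Mackay, George Leslie", 13, 74, 74, "AFF", "CPM"),
]


def overlaps(a, b):
    """True if both records are in the same volume and share a page."""
    return a.vol == b.vol and a.start <= b.end and b.start <= a.end


def context_for(query, records):
    """For each page record of `query`, collect what else CRI says about that page."""
    results = []
    for hit in records:
        if hit.name != query:
            continue
        others = [r for r in records if r.name != query and overlaps(hit, r)]
        results.append({
            "page": (hit.vol, hit.start),
            # A Person Index record tagged ART marks the author of the article.
            "authors": [r.name for r in others if r.index == "PI" and r.tag == "ART"],
            "subjects": [r.name for r in others if r.index == "SI"],
        })
    return results


for row in context_for("Mackay, George Leslie", records):
    print(row)
```

This linear scan is quadratic in the number of records; with some 310,000 page records, a practical system would presumably first group the records by volume or by page, but that design choice is an assumption rather than something stated in the text.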
Figure 5 shows the result of the query ‘attack’, for which a total of 574 different pages in the entire CR are retrieved. (There are altogether 1,439 page records with the ATT tag, but some refer to the same pages.) The peak occurred in 1900, during the Boxer Rebellion, which is not surprising. However, the post-classification of the query result (Figure 6), an important feature of our system, lists the authors, persons, missions, and subjects involved, which may provide interesting leads for scholars to pursue further.
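A tag query such as ‘attack’ can be sketched in the same spirit: collect the distinct pages carried by ATT-tagged records, count them per volume, and then classify every record that touches one of those pages. The sample records below are made up purely for illustration, the function names are hypothetical, and two points are assumptions: that a multi-page range counts as several pages, and that counts are grouped by volume number as a stand-in for the chronological distribution.

```python
from collections import Counter, defaultdict

# (index, entry name, volume, start page, end page, tag); invented sample data.
records = [
    ("PI", "Person A", 31, 400, 400, "ATT"),
    ("PI", "Person B", 31, 400, 400, "ATT"),
    ("MI", "Mission X", 31, 400, 401, "ATT"),
    ("PI", "Person A", 26, 120, 120, "ATT"),
    ("PI", "Person C", 31, 400, 400, "ART"),
]


def pages_of(rec):
    """Expand a record into its individual (volume, page) pairs."""
    _, _, vol, start, end, _ = rec
    return [(vol, page) for page in range(start, end + 1)]


def query_tag(records, tag):
    """Distinct pages referenced by any record carrying `tag`."""
    pages = set()
    for rec in records:
        if rec[5] == tag:
            pages.update(pages_of(rec))
    return pages


def classify(records, pages):
    """Group the records touching the result pages into facets."""
    facets = defaultdict(Counter)
    for rec in records:
        index, name, _, _, _, tag = rec
        if not any(page in pages for page in pages_of(rec)):
            continue
        if index == "PI" and tag == "ART":
            facets["Author"][name] += 1
        elif index == "PI":
            facets["Person"][name] += 1
        elif index == "MI":
            facets["Mission"][name] += 1
        elif index == "SI":
            facets["Subject"][name] += 1
    return facets


attack_pages = query_tag(records, "ATT")
per_volume = Counter(vol for vol, _ in attack_pages)
facets = classify(records, attack_pages)
print(len(attack_pages), dict(per_volume), {k: dict(v) for k, v in facets.items()})
```

In this toy data, four ATT records collapse into three distinct pages, which mirrors how 1,439 ATT page records reduce to 574 different pages in the full index.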
The method and system that we have developed provide a more global view of both CR and CRI. They integrate the three indices of CRI and thus reveal the relations among them and the information implicitly embedded in them. Our work should make CR and CRI more accessible to the research community. The approach is also a general one that can be applied to any index with a similar structure.

Figure 5: Summary of the query ‘attack’

Figure 6: Classifications of the query result of ‘attack’ according to Author, Person, Mission, and Subject
Kong, J.-W. (2011). Design and Implementation of a Retrieval System for the Chinese Recorder Index. MS Thesis, National Taiwan University.
Lodwick, K. (1986). The Chinese Recorder Index: A Guide to Christian Missions in Asia, 1867-1941. Scholarly Resources Inc. Washington D.C.

Conference Info


ADHO - 2012
"Digital Diversity: Cultures, languages and methods"

Hosted at Universität Hamburg (University of Hamburg)

Hamburg, Germany

July 16, 2012 - July 22, 2012

Series: ADHO (7)

Organizers: ADHO

  • Keywords: None
  • Language: English
  • Topics: None