Applying the TEI: Problems in the classification of proper nouns

multipaper session
  1. Julia Flanders

    Women Writers Project, Brown University

  2. Sydney Bauman

    Brown University

  3. Paul Caton

    Brown University

  4. Mavis Cournane

    University College Cork

  5. Willard McCarty

    King's College London

  6. John Bradley

    King's College London

Keywords: name, TLH, WWP, TEI

The testing of the TEI Guidelines since their release has thus far taken a somewhat private form. Scholarly text encoding projects have availed themselves of the Guidelines' exceptional richness and nuance, but with the aim of doing the greatest possible justice to the complexity of their own data, or the particular needs of their own users, rather than with any concern for developing consistency between projects. To a certain extent this is justifiable; the point of a flexible standard is precisely that it can accommodate the multiple needs of its various users. However, where divergence is only the result of random choice among equivalent options, rather than being motivated by real constraints, it serves no purpose and only impedes the exchange and use of data. Now that the Guidelines have been in use long enough to create a substantial base of encoded data, projects whose source material and encoding strategies are similar can benefit from comparing approaches to common problems, and assessing whether their divergences are justified by differences in data or philosophy, or merely represent unnecessary variation in the application of the TEI.

One area of primary source transcription which deserves examination along these lines is the classification of proper nouns and similar words and phrases, using the elements described in Chapter 20 of the TEI Guidelines: <name>, <rs>, and the suite of more specific elements such as <placeName>, <orgName>, <foreName>, <surname>, <roleName>, etc. These elements describe a set of phenomena whose retrieval and processing are important to the scholarly user of the encoded text, but whose boundaries are fluid and often involve the application of theoretical considerations quite unrelated to text encoding. (For example, is "God" a personal name?)
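To make the retrieval question concrete, here is a minimal sketch of how a reader's tool might pull name-like elements out of an encoded passage. Everything in it is an assumption for illustration: the sample sentence is invented, the `type` values ("deity", "person") are not prescribed by any project discussed here, <persName> is used as one of the more specific Chapter 20 elements, and the helper `collect_names` is hypothetical. It shows only that the choice of element and attributes determines what a simple query can find.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical encoded sentence; the type values are illustrative assumptions,
# not the practice of any particular project.
SAMPLE = (
    "<p>In <placeName>London</placeName>, "
    "<persName><roleName>Queen</roleName> <foreName>Elizabeth</foreName></persName> "
    'gave thanks to <name type="deity">God</name>; '
    '<rs type="person">the queen</rs> then withdrew.</p>'
)

# Top-level naming elements we want to retrieve (assumed selection).
NAME_ELEMENTS = ("name", "rs", "persName", "placeName", "orgName")


def collect_names(xml_text):
    """Group (text, type) pairs by the element that encodes them."""
    root = ET.fromstring(xml_text)
    found = defaultdict(list)
    for elem in root.iter():
        if elem.tag in NAME_ELEMENTS:
            # Join nested text (e.g. roleName + foreName) and normalise spaces.
            text = " ".join("".join(elem.itertext()).split())
            found[elem.tag].append((text, elem.get("type")))
    return dict(found)


names = collect_names(SAMPLE)
for tag, entries in names.items():
    print(tag, entries)
```

In this sketch a search for personal names via <persName> alone would miss both "God" and "the queen", which is exactly the kind of divergence in practice that affects what users can retrieve.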

The proposed session will present several perspectives on this problem, with three aims: first, to allow the participating projects (and those represented in the audience) to compare practices and discuss the status of their variation; second, to situate the specific problem of encoding proper nouns within the context of scholarly analysis, so as to create a more precise sense of the needs which the encoding is intended to address; and third, to think more broadly about the pressures and constraints on classification systems in text encoding.

Two of the three papers in this session come from encoding projects which use the TEI, and which have used Chapter 20 in particularly detailed and carefully considered ways. The first of these is the Brown University Women Writers Project, in a paper co-authored by Julia Flanders, Paul Caton, and Sydney Bauman, which will address the WWP's approach to the use of Chapter 20, and its attempt to balance scholarly needs and cost-effectiveness. The second is the Thesaurus Linguarum Hiberniae (TLH) project, discussed by Mavis Cournane, who will examine TLH's use of TEI to classify and specify different kinds of proper nouns within TLH's corpus of writing in Ireland. The last paper, by Willard McCarty and John Bradley, will consider these issues from the perspective of a non-TEI project dealing intensively with names and their classification, An Analytical Onomasticon to the Metamorphoses of Ovid. This paper will discuss the encoding of names in relation to the complex issues of literary criticism and analysis, with an in-depth exploration of examples from the Metamorphoses.

