An XML Schema to Interpret Networked Biographies: Reading Mid-Range

paper, specified "short paper"
Authorship
  1. Alison Booth

    University of Virginia

  2. Worthy N. Martin

    Institute for Advanced Technology in the Humanities (IATH) - University of Virginia

Work text

Collective Biographies of Women (CBW) is an open-access project supported by the Institute for Advanced Technology in the Humanities, Scholars’ Lab, and the English Department at the University of Virginia, as well as an ACLS Digital Innovation Fellowship. In recent years it has grown from an online bibliography of all English-language books that collect three or more short biographies of women into a digital prosopography that interrelates women, printed books, and narratives in what we call documentary social networks (introduced at DH 2013). CBW stands out as a literary study of prosopographies in the print era, primarily the transatlantic nineteenth century (see the bibliography, http://womensbios.lib.virginia.edu). Most research that employs the term prosopography allies itself with history or classical and medieval studies and today relies on databases and websites. We work with the concept as it is often defined, as collective biography: printed prose collections of short biographies (see the selective bibliography for context on prosopography, nonfiction narrative, and our method of mid-range reading).
The CBW database associates some 8700 persons, 13,000 chapters (biographies), and more than 1200 books of various types published in English between 1830 and 1940 (see the developing database at http://cbw.iath.virginia.edu/cbw_db). Our project, however, is neither a textual archive nor a biographical database but an experiment in interpretation using the tools of DH to recognize the conventions of a genre, biography, and the history of gender conventions in a certain social context. Specifically, we want to get at the conditions of nonfiction, which generate multiple versions and encourage cutting and pasting with relatively little respect for authorship. Could narrative theory of nonfiction be developed through a technique of digital markup that allows us to compare multiple versions of one life and interrelated types of person and text? With Daniel Pitti, Suzanne Keen, postdoctoral Project Manager Rennie Mapp, and teams of graduate assistants, we have developed and deployed a stand-aside XML schema, Biographical Elements and Structure Schema (BESS), in sample archives of digitized collective biographies that include designated individuals (e.g. all collections in our bibliography that include Caroline Herschel, the astronomer).
Briefly, BESS is an XML schema with a controlled vocabulary for narrative elements that appear in a given text:
StageofLife: before, beginning, middle, culmination, end, after, relative to the lifetime of the biography’s subject
EventType e.g. illness, persona’s
AgentType e.g. mother, unnamed
Setting:
  Location, e.g. city
  Structure, e.g. school
  Time: Dates, TimeofDay, Season
PersonaDescription e.g. physically daring
Discourse: e.g. retrospective, figureOrImage flower
Topos: e.g. influence, disgrace
Each editor in a trained team creates a separate XML file that in effect is an annotated outline, tagging types of elements identified in numbered paragraphs of a TEI file of the biographical narrative (from 3-100+ paragraphs).
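To make the idea of a stand-aside annotated outline concrete, the following is a minimal sketch in Python of how such a file might be read and tallied. The element names are taken from the vocabulary listed above, but the wrapper elements (`bess`, `paragraph`), the `n` and `biography` attributes, and the sample content are illustrative assumptions, not the published BESS schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical stand-aside annotation. The <bess> and <paragraph> wrappers and
# their attributes are assumptions made for illustration; only the element
# type names (StageofLife, EventType, ...) come from the vocabulary above.
SAMPLE = """
<bess biography="hypothetical-bio-001">
  <paragraph n="1">
    <StageofLife>beginning</StageofLife>
    <AgentType>mother, unnamed</AgentType>
    <Location>city</Location>
  </paragraph>
  <paragraph n="2">
    <StageofLife>middle</StageofLife>
    <EventType>illness, persona's</EventType>
    <Topos>influence</Topos>
  </paragraph>
</bess>
"""


def read_annotations(xml_text):
    """Return (paragraph number, element type, value) for every tagged element."""
    root = ET.fromstring(xml_text)
    rows = []
    for para in root.findall("paragraph"):
        number = int(para.get("n"))
        for element in para:
            rows.append((number, element.tag, (element.text or "").strip()))
    return rows


def count_by_type(rows):
    """Count how often each element type is tagged in one annotation file."""
    return Counter(tag for _, tag, _ in rows)


rows = read_annotations(SAMPLE)
counts = count_by_type(rows)
```

Because the annotation only points at numbered paragraphs, the TEI text itself is never modified; the same text could carry several such outlines from different editors.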

BESS analysis enables us to compare versions of the same person’s life. When we have analyzed all versions in our corpus, we give unique ID numbers to the essential events in all versions (kernels) and the more or less common optional events (satellites, common or rare), and can compare the placement of these in the versions, much as folklorists have charted the variations on the main events of a tale. (Narrative theorists have developed analyses of events in these terms, but not for nonfiction.) BESS analysis reveals differences in narrative technique in books that take different perspectives on women’s roles and that select persons of different types. Thus, beyond the literal level of actions (events), we can measurably correlate, for example, the instances of “direct address, use of ‘we’” with not only the topos (i.e. situation or underlying scenario) “work as social service” but also the topos “temptation of status or goods.” The conjunction of different elements in these biographies often challenges our own later assumptions about historical women and gender norms. As BESS work is completed, we expand a body of data that for the first time documents the distinctive characteristics of third-person narratives about real people.
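The comparison of kernels and satellites across versions can be sketched as follows. This is a simplified illustration in Python under stated assumptions: the event IDs and version names are invented, and presence in every version is used as a stand-in for an editor's judgment that an event is essential. The 0.5 threshold separating common from rare satellites is likewise an arbitrary assumption.

```python
from collections import Counter

# Hypothetical data: each version of one life lists event IDs in narrative order.
versions = {
    "version_a": ["E1", "E2", "E3", "E5"],
    "version_b": ["E1", "E3", "E4", "E5"],
    "version_c": ["E2", "E1", "E3", "E5"],
}


def classify_events(versions, common_threshold=0.5):
    """Label each event as a kernel, common satellite, or rare satellite.

    Assumption: an event found in every version counts as a kernel; the
    threshold for "common" is illustrative only.
    """
    total = len(versions)
    seen = Counter(e for events in versions.values() for e in set(events))
    labels = {}
    for event, count in seen.items():
        if count == total:
            labels[event] = "kernel"
        elif count / total >= common_threshold:
            labels[event] = "common satellite"
        else:
            labels[event] = "rare satellite"
    return labels


def relative_positions(versions, event):
    """Placement of an event in each version, scaled from 0.0 (first) to 1.0 (last)."""
    positions = {}
    for name, events in versions.items():
        if event in events:
            positions[name] = events.index(event) / max(len(events) - 1, 1)
    return positions


labels = classify_events(versions)
placement = relative_positions(versions, "E1")
```

Here `placement` shows where one kernel falls in each retelling, which is the kind of variation in ordering that the comparison of versions is meant to expose.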

Fig. 1:
Currently working with a web designer, we have a sustainable, accessible database that functions well for team workflow, with parallel display of text and BESS analyses. We plan to develop the visualizations of BESS beyond current designs of tables and graphs.

Fig. 2:
The CBW project’s BESS “reading” of many narratives is time-consuming and detailed, as in many literary digital projects. As in all DH, we encounter challenges when visualizing quantities of variable data, an issue that this paper merely touches upon. Our aim, instead, is to introduce and make a case for the mid-range approach of BESS, with some reference to other possible approaches.
Many methods of text “reading” en masse might be useful with the CBW books. Broadly, options range from word strings and topic clusters across a large corpus of digitized texts to systematic encoding of all textual features and variants in an online edition of an archive (e.g. Online Froissart; the Rossetti Archive). On the vast end of the scale, we recognize the astounding range of a Google N-gram kind of data capture as well as the precision of some text-mining projects. On the closer focus of the scale, we think human curation is best suited for patterns of narration and ideology, and we begin with books and place them in a context of genre and publishing history. Thus we try to benefit from the precedents of literary editions, and yet CBW is not a project in textual editing. We have no wish to fine-tune exact digital surrogates of these books. The BESS schema and approach to many-versioned biographies within social and historical contexts is designed to moderate between distant and close reading: a comprehensive digital model of variations within a genre and fine textual details.
Many in DH have addressed the question of what to do with a million books. Franco Moretti has espoused distant reading, a term applicable to many kinds of directed and unsupervised queries in big archives. This is understood as opposed to "close reading," the usual literary method (without computational mediation), one text, one person. The findings of singular textual analysis are less appropriate when describing patterns across a genre, especially of nonfiction where there are many versions of the “same” narrative and the author is less important than the protagonist, the representation of a real person. Thus, our method has some affinity for Sharon Marcus and Stephen Best’s concept of “surface reading,” as advocated in their manifesto to put a stop to the required "critique" or theoretical digging into a text for what it does not say for ideological purposes. Yet we retain a framing conceptual commitment to ideological critique, as we want to know about the changing gender ideology and historical contexts for women’s lives. More directly, we are pursuing the kind of "social reading" promoted by Alan Liu, as we hope to extend BESS as a tool available for other projects in digital interpretation of biographical narratives. BESS, interlinked with a database that reveals networks among historical persons and books about them, recruits computation to aggregate the interpretations of readers to parse a genre.
We call for a new metaphor or spatial model for hybrid methods of mid-range digital interpretation, whether using a stand-aside markup like BESS or other approaches. Our editing teams “hover,” not at satellite level, but like balloon aerial digital cameras scanning a neighborhood, producing images that can zoom in and out. Such records do much less harm to the person, the individual record, than surveillance or drones. Our schema functions, alternatively, like GPR, “ground-penetrating radar,” used to detect buried structures two or three feet into the ground. Although such metaphors for our mid-range reading method with BESS have the inherent comedy of balloons and robot-like go-carts, we can seriously enhance what we know above or beneath the surface without destroying it, without murdering to dissect. I remind the BESS team that the texts are always still awaiting any method of interpretation, unmangled after we have subjected them to our skewed adaptation for prosopographical purposes. We’re not pretending to let machines discover without distortion. Nor are we clinging to the requirement that reading is an individual act; on the contrary. Like reading, biographies trope toward the collective and typological, the mid-range, even in monographic form. Inviting open-access play, we expect to be surprised by the details in the mosaic or wave-like picture from above or below.
References

Booth, A. (2004). How to Make It as a Woman: Collective Biographical History from Victoria to the Present. Chicago: University of Chicago Press.
Booth, A. (2005). “Fighting for Lives in the ODNB, or Taking Prosopography Personally,” Journal of Victorian Culture, 10: 267-79.
Booth, A. “Prosopography.” The Encyclopedia of Victorian Literature, ed. Dino Felluga, Pamela Gilbert, and Linda Hughes (Wiley Blackwell), forthcoming.
Bradley, J. and H. Short. (2002). “Using Formal Structures to Create Complex Relationships: The Prosopography of the Byzantine Empire—A Case Study.” In K. S. B. Keats-Rohan (ed.), Resourcing Sources Prosopographica et Genealogica, vol. 7. http://prosopography.modhist.ox.ac.uk/publications.htm.
Cameron, A., ed. (2003). Fifty Years of Prosopography. Oxford: Oxford University Press.
Oldfield, S. (1999). Collective Biography of Women in Britain, 1550-1900. London: Mansell.
Keats-Rohan, K. S. B. (2007). “Biography, Identity and Names: Understanding the Pursuit of the Individual in Prosopography.” Prosopography Approaches and Applications A Handbook. Oxford: Occasional Publications UPR. 139–81. Prosopographica et Genealogica 13.
Stone, L. (1971). “Prosopography.” Daedalus 100: 57-9.
Bauer, J. Project Quincy. http://projectquincy.rubyforge.org/
Brown, S., with Clements, Grundy, et al. Orlando: Women’s Writing in the British Isles from the Beginnings to the Present. Cambridge: Cambridge University Press, 2006. http://orlando.cambridge.org/
Clergy of the Church of England Database. http://www.theclergydatabase.org.uk/index.html
Liu, A. Research Oriented Social Environment (RoSE) http://rose.english.ucsb.edu/
Perdue, S. People of the Founding Era. http://documentscompass.org/projects/pfe/
Pitti, D. Social Networks and Archival Context Project (SNAC). http://socialarchive.iath.virginia.edu/
Prosopography of the Byzantine World. http://www.pbw.kcl.ac.uk/
The Prosopography of the Neo-Assyrian Empire http://www.helsinki.fi/science/saa/pna.html

Conference Info

Complete

ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO