Using METS to Document Metadata for Image-Based Electronic Editions

paper
Authorship
  1. Dorothy Porter

    Libraries - University of Kentucky


The Project: Electronic Boethius

(http://beowulf.engl.uky.edu/~kiernan/eBoethius/inlad.htm)

In October of 1731, a fire swept through the library of the aptly named Ashburnham House in London, destroying or damaging many of the holdings originally collected by Sir Robert Cotton in the early seventeenth century. Among the artifacts affected by the fire was a tenth-century manuscript, the only known copy of an Old English dual verse and prose translation of Boethius's sixth-century Consolation of Philosophy.

The Consolation had a rich history throughout the Middle Ages. It was first translated into English by King Alfred the Great, who ruled Anglo-Saxon England from 871 to 899 and was responsible for the translation of several important Latin texts during his reign. Chaucer made his own translation of the Consolation, as did Queen Elizabeth I. It was Alfred's translation that was damaged in the Ashburnham House fire.

Although this tenth-century manuscript (now British Library, Cotton Otho A. vi) was damaged to the point of near illegibility, there are several reasons why the fire was not a total loss for scholarship. One other complete manuscript of Alfred's translation of the Consolation survives, Oxford, Bodleian Library MS Bodley 180, although it is a later, twelfth-century, prose-only version of the text. More remarkably, in the seventeenth century the Dutch antiquarian and scholar Francis Junius made a copy of Bodley 180 and collated it with the then-undamaged Otho A. vi, preserving the readings that differ between the two manuscripts. Until now, this transcript and collation (Bodleian Library MS Junius 12) has been the only source for those parts of Otho A. vi that are damaged to the point of illegibility. Editions of the Old English Consolation have been published since the fire, but of the two standard editions (both based on the tenth-century manuscript), one (Sedgefield's 1899 edition of King Alfred's Anglo-Saxon Version of Boethius, de Consolatione Philosophiae) is entirely in prose, while the other (Krapp's 1932 edition of the Meters of Boethius) is entirely in verse.

In recent years, technological advances have enabled access to Otho A. vi that would have been unimaginable in 1731. Using ultraviolet light and high-resolution digital photography, we are now able to read portions of text that have not been accessible for nearly 270 years. Taking advantage of these effectively new texts, and with the support of the National Endowment for the Humanities and the Mellon Foundation, we are creating an image-based electronic edition of the Consolation, using Otho A. vi as our base text. The final edition will consist of a variety of files, including:

  • Daylight and UV images of Otho A. vi
  • Daylight images of Bodley 180 and Junius 12
  • The prose and verse text of Otho A. vi
  • Supplemental text from Bodley 180, Junius 12, and earlier editions, for those portions of text that are missing or still illegible
  • Pervasive and complex XML encoding that describes many different aspects of the text and manuscript including:
      • Basic codicological features (quires, folios, lines)
      • Basic textual divisions (prose and verse sections, verse and prose lines, half-lines, words)
      • Editorial emendations
      • Metrical information
      • Paleographic descriptions of individual letters
      • Description of the condition of the damaged manuscript
  • Search Tool: for searching XML encoding and text

Metadata Requirements for the Edition

Because the final edition will be very complex, we wanted an approach to metadata documentation that would simplify the project rather than complicate it further. Three issues in particular needed to be addressed before we settled on that approach.

Granularity: to what level should the edition be described?

In many ways, the question of granularity was the most important, and most difficult, issue to address. At one extreme is a MARC record, a relatively simple description of the edition as a whole that includes only basic bibliographical information. Though simple, this approach was not realistic given the complexity and fragility of the edition's contents. At the other extreme is a completely pervasive approach, with a TEI header for every text and technical data for every image file. We finally decided to err on the side of caution: we provide metadata for every text and image file, and also provide higher-level description.

Intellectual Rights

Because various elements of the Electronic Boethius are controlled by different parties (the manuscript images, for example, belong to their respective libraries), we wanted to be able to describe clearly which portions of the edition belong to whom.

Organization and Software

Given the complexity of the elements of the edition and their relationships with one another -- multiple versions of folio images; folio images in relation to text and markup -- we wanted the metadata to express these relationships explicitly. When someone views the metadata file, it should be clear what files are in the edition, where they are in relation to the other files, and how they all interact. Since the edition also includes some interactive software, we wanted to be able to describe that as well.
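
The relationships described here can be made concrete with a small data model. The following Python sketch is only an illustration, not the project's actual tooling: the class names, file identifiers, file paths, folio label, and the rights holder recorded for the transcription are all hypothetical, assumed for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EditionFile:
    """One file in the edition (all field values below are illustrative)."""
    file_id: str
    path: str
    kind: str      # "image" or "text"
    variant: str   # e.g. "daylight", "uv", "transcription"
    owner: str     # party that controls the rights to this file


@dataclass
class Folio:
    """A folio together with every file that represents it."""
    label: str
    files: list = field(default_factory=list)

    def variants(self, kind):
        """Return the sorted variants of the given kind recorded for this folio."""
        return sorted(f.variant for f in self.files if f.kind == kind)

    def owners(self):
        """Return the set of rights holders with a stake in this folio."""
        return {f.owner for f in self.files}


# Hypothetical folio: two image versions plus the encoded text.
folio = Folio("1r", [
    EditionFile("IMG-1r-DAY", "images/otho_1r_daylight.tif", "image", "daylight", "British Library"),
    EditionFile("IMG-1r-UV", "images/otho_1r_uv.tif", "image", "uv", "British Library"),
    EditionFile("TXT-1r", "text/otho_1r.xml", "text", "transcription", "Edition editors"),
])

print(folio.variants("image"))
print(sorted(folio.owners()))
```

Grouping files by folio rather than by file type keeps the image versions and the encoded text for the same leaf side by side, which is the relationship the metadata is meant to make visible.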

Once we had made our decisions concerning these issues, we chose the Metadata Encoding & Transmission Standard (METS) to hold the metadata we need to describe both the elements of the edition and the edition as a whole. Although METS was designed almost exclusively for implementation by libraries, archives, and digital resource repositories, and not by individual scholars, it addresses all of our documentation concerns. This presentation will illustrate how we are using METS to document metadata for the image-based electronic edition.

  • We will visit the various metadata schemas that we are using to describe the elements of the edition: for text, the Schema for Technical Metadata for Text (http://dlib.nyu.edu/METS/textmd.htm); for image technical metadata, the NISO Metadata for Images in XML Schema (http://www.loc.gov/standards/mix/); for descriptive metadata, the Metadata Object Description Schema (http://www.loc.gov/standards/mods/); and for intellectual rights information, the Schema for Rights Declaration (http://www.loc.gov/standards/rights/METSRights.xsd).
  • We will give examples of how we are taking advantage of METS' file section and structural map section to describe how the elements relate to one another.
  • We will give examples of how we are using the behavior section to describe our delivery software and how it relates to the files within the edition.
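
As a rough indication of the kind of document the file section and structural map produce, the following Python sketch builds a skeletal METS record with the standard library. The element and attribute names (fileSec, fileGrp, file, FLocat, structMap, div, fptr) come from the METS schema; the file identifiers, paths, and folio label are hypothetical, and the output is neither validated against the schema nor meant to reflect the edition's actual METS files.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag):
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(manuscript_label, folio_label, files):
    """Build a skeletal METS tree.

    files is a list of (file_id, use, mimetype, href) tuples.
    """
    root = ET.Element(q("mets"))

    # File section: one fileGrp per USE value, one file element per file.
    file_sec = ET.SubElement(root, q("fileSec"))
    groups = {}
    for file_id, use, mimetype, href in files:
        if use not in groups:
            groups[use] = ET.SubElement(file_sec, q("fileGrp"), {"USE": use})
        file_el = ET.SubElement(groups[use], q("file"),
                                {"ID": file_id, "MIMETYPE": mimetype})
        ET.SubElement(file_el, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": href})

    # Structural map: manuscript > folio, with a pointer to each file.
    struct_map = ET.SubElement(root, q("structMap"), {"TYPE": "physical"})
    ms_div = ET.SubElement(struct_map, q("div"),
                           {"TYPE": "manuscript", "LABEL": manuscript_label})
    folio_div = ET.SubElement(ms_div, q("div"),
                              {"TYPE": "folio", "LABEL": folio_label})
    for file_id, _use, _mimetype, _href in files:
        ET.SubElement(folio_div, q("fptr"), {"FILEID": file_id})
    return root


# Hypothetical files for a single folio (identifiers and paths are assumptions).
EXAMPLE_FILES = [
    ("IMG-1r-DAY", "daylight", "image/tiff", "images/otho_1r_daylight.tif"),
    ("IMG-1r-UV", "uv", "image/tiff", "images/otho_1r_uv.tif"),
    ("TXT-1r", "transcription", "text/xml", "text/otho_1r.xml"),
]

mets = build_mets("Cotton Otho A. vi", "1r", EXAMPLE_FILES)
print(ET.tostring(mets, encoding="unicode"))
```

In this sketch the file section only inventories the files, while the structural map says which files belong to which folio; the descriptive, technical, and rights metadata named above, and the behavior section, would be attached in further sections not shown here.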
We hope to use METS to document the Electronic Boethius as completely and thoroughly as possible. If our original files were to suffer a second Ashburnham House fire, our metadata should be complete enough to rebuild the edition -- just as we are re-creating the original manuscript.

Conference Info

ACH/ALLC / ACH/ICCH / ALLC/EADH - 2004

Hosted at Göteborg University (Gothenburg)

Gothenburg, Sweden

June 11, 2004 - June 16, 2004


Organizers: ACH, ALLC
