Document-Centric Framework for Navigating Texts Online, or, the Intersection of the Text Encoding Initiative and the Metadata Encoding and Transmission Standard

paper
Authorship
  1. John A. Walsh

    Indiana University, Bloomington

  2. Michelle Dalmau

    Indiana University, Bloomington

Work text

Electronic text projects range in complexity from simple
collections of page images to bundles of page images and
transcribed text from multiple versions or editions (Haarhoff,
Porter). To facilitate navigation within texts and across texts,
an increasing number of digital initiatives, such as the Princeton
University Library Digital Collections (http://diglib.princeton.edu/)
and the Oxford Digital Library (http://www.odl.ox.ac.uk/),
and cultural heritage organizations, such as Culturenet
Cymru (http://www.culturenetcymru.com) and the Library
of Congress (http://www.loc.gov/index.html), are relying on
the complementary strengths of open standards such as the
Text Encoding Initiative (TEI) and the Metadata Encoding and
Transmission Standard (METS) for describing, representing,
and delivering texts online.
The core purpose of METS is to record, in a machine-readable
form, the metadata and structure of a digital object, including
files that comprise or are related to the object itself. As such,
the standard is a useful tool for managing and preserving
digital library and humanities resources (Cantara; Semple).
According to Morgan Cundiff, Senior Standards Specialist for
the Library of Congress, the maintenance organization for the
METS standard:
METS is an XML schema designed for the purpose of
creating XML document instances that express the
hierarchical structure of digital library objects, the names
and locations of the files that comprise those digital objects,
and the associated descriptive and administrative metadata.
(53)
Similarly, the TEI standard was developed to capture both
semantic (e.g., metadata) and syntactic (e.g., structural
characteristics) features of a document in machine-readable
form that promotes interoperability, searchability and textual
analysis. According to the Text Encoding Initiative web site, the
TEI standard is defined as follows:
The TEI Guidelines for Electronic Text Encoding and Interchange
define and document a markup language for representing
the structural, renditional, and conceptual features of texts.
They focus (though not exclusively) on the encoding of
documents in the humanities and social sciences, and in
particular on the representation of primary source materials
for research and analysis. These guidelines are expressed
as a modular, extensible XML schema, accompanied by
detailed documentation, and are published under an open-source
license. (“TEI Guidelines”)
METS, as its name suggests, is focused more exclusively on
metadata. While digital objects—such as a text, image, or
video—may be embedded within a METS document, METS
does not provide guidelines, elements and attributes for
representing the digital object itself; rather, the aim of METS is
to describe metadata about a digital object and the relationships
among an object’s constituent parts. In sum, METS is data-centric;
TEI is document-centric.
In April 2006, the Indiana University Digital Library
Program released a beta version of METS Navigator
(http://metsnavigator.sourceforge.net/), a METS-based, open source
software solution for the discovery and display of multi-part
digital objects. Using the information in the METS structural
map elements, METS Navigator builds a hierarchical menu that
allows users to navigate to specific sections of a document,
such as the title page, individual chapters, or illustrations of a
book. METS Navigator also allows simple navigation to the
next, previous, first, and last page image or component part
of a digital object. METS Navigator can also make use of the
descriptive metadata in the METS document to populate the
interface with bibliographic and descriptive information about
the digital object. METS Navigator was initially developed at
Indiana University (IU) for the online display and navigation
of brittle books digitized by the IU Libraries’ E. Lingle Craig
Preservation Laboratory. However, realizing the need for
such a tool across a wide range of digital library projects and
applications, we designed the system to be generalizable and
configurable. To assist with the use of METS Navigator for new
projects, a METS profile, also expressed in XML, is registered
with the Library of Congress
(http://www.loc.gov/standards/mets/profiles/00000014.html). The profile provides detailed
documentation about the structure of the METS documents
required by the METS Navigator application.
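
The menu-building step described above can be illustrated with a
minimal Python sketch. It is not METS Navigator's own code: the
sample METS document, its labels and file IDs, and the helper names
build_menu and render are all assumptions made for illustration.
Only the METS namespace and the structMap, div, and fptr elements
come from the METS standard itself.

```python
import xml.etree.ElementTree as ET

METS_NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample: a tiny METS structural map for a short book.
SAMPLE_METS = """\
<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap TYPE="physical">
    <mets:div TYPE="book" LABEL="Sample Book">
      <mets:div TYPE="titlePage" LABEL="Title Page">
        <mets:fptr FILEID="page001"/>
      </mets:div>
      <mets:div TYPE="chapter" LABEL="Chapter 1">
        <mets:fptr FILEID="page002"/>
        <mets:fptr FILEID="page003"/>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>
"""


def build_menu(div):
    """Turn a mets:div into a nested dict of label, file IDs, and children."""
    return {
        "label": div.get("LABEL", div.get("TYPE", "")),
        # Only direct fptr children: the files attached to this division.
        "files": [f.get("FILEID") for f in div.findall("mets:fptr", METS_NS)],
        # Recurse into direct child divisions to preserve the hierarchy.
        "children": [build_menu(d) for d in div.findall("mets:div", METS_NS)],
    }


def render(menu, depth=0):
    """Return the menu as indented text lines, one line per division."""
    lines = ["  " * depth + menu["label"]]
    for child in menu["children"]:
        lines.extend(render(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE_METS)
top_div = root.find("mets:structMap/mets:div", METS_NS)
menu = build_menu(top_div)
print("\n".join(render(menu)))
```

Because each division keeps its file pointers in document order, the
same structure could also drive next, previous, first, and last
navigation by flattening the "files" lists into a single sequence.
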
Cantara reported in her brief article, “The Text-Encoding
Initiative: Part 2,” that discussion of the relationship between
the TEI and METS was a primary focus during the Fourth
Annual TEI Consortium Members’ Meeting held in 2004 at
Johns Hopkins University (110). The relationship between
the standards was also a topic of discussion during the
2007 Members’ Meeting, and was specifically raised in Fotis
Jannidis’ plenary entitled “TEI in a Crystal Ball.” Despite
these discussions, the community still lacks a well-documented
workflow for deriving METS documents
from authoritative TEI files. We believe the TEI, if properly
structured, can be used as the “master” source of information
from which a full METS document can be automatically
generated, facilitating the display of text collections in METS
Navigator. The TEI, much like METS, provides a rich vocabulary
and framework for encoding descriptive and structural
metadata for a variety of documents. The descriptive metadata
typically found in the TEI Header may be used to populate the
corresponding components (descriptive metadata section) of
a METS document. The embedded structural metadata that
describe divisions, sections, headings and page breaks in a
TEI document may be used to generate the structural map
section of a METS document. Unlike the TEI, the METS scheme
has explicit mechanisms in place for expressing relationships
between multiple representations of the digital content, such
as encoded text files and page images. By integrating the TEI-METS
workflow into METS Navigator, scholars and digital
library and humanities programs can more easily implement
online text collections. Further, the intersection between TEI
and METS documents can provide the foundation for enhanced
end-user exploration of electronic texts.
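
The derivation described above can be sketched in Python as follows.
This is an illustrative sketch under stated assumptions, not the
workflow implemented at Indiana University: the sample TEI document,
the choice of Dublin Core as the descriptive metadata wrapper, the
use of the facs attribute on pb to name page images, the generated
IDs, and the function name tei_to_mets are all hypothetical. For
simplicity it handles only top-level divisions and attaches each
page to the division in which its page break occurs.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
METS = "http://www.loc.gov/METS/"
DC = "http://purl.org/dc/elements/1.1/"
XLINK = "http://www.w3.org/1999/xlink"
NS = {"tei": TEI}

# Hypothetical TEI input; the title, author, and image names are invented.
SAMPLE_TEI = f"""\
<TEI xmlns="{TEI}">
  <teiHeader><fileDesc><titleStmt>
    <title>Sample Book</title><author>A. Writer</author>
  </titleStmt></fileDesc></teiHeader>
  <text><body>
    <div type="chapter"><head>Chapter 1</head>
      <pb n="1" facs="page001.jpg"/><p>Text of page one.</p>
      <pb n="2" facs="page002.jpg"/><p>Text of page two.</p>
    </div>
    <div type="chapter"><head>Chapter 2</head>
      <pb n="3" facs="page003.jpg"/><p>Text of page three.</p>
    </div>
  </body></text>
</TEI>
"""


def q(ns, tag):
    """Build a namespace-qualified name in ElementTree's {uri}tag form."""
    return f"{{{ns}}}{tag}"


def tei_to_mets(tei_root):
    """Derive a small METS tree from a TEI tree (illustrative mapping only)."""
    mets = ET.Element(q(METS, "mets"))

    # 1. TEI header -> descriptive metadata section (assumed: Dublin Core).
    dmd = ET.SubElement(mets, q(METS, "dmdSec"), ID="dmd1")
    wrap = ET.SubElement(dmd, q(METS, "mdWrap"), MDTYPE="DC")
    xml_data = ET.SubElement(wrap, q(METS, "xmlData"))
    stmt = tei_root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt", NS)
    for tei_name, dc_name in (("title", "title"), ("author", "creator")):
        for el in stmt.findall(f"tei:{tei_name}", NS):
            ET.SubElement(xml_data, q(DC, dc_name)).text = el.text

    # 2. Page breaks -> file section; divisions -> structural map.
    file_sec = ET.SubElement(mets, q(METS, "fileSec"))
    file_grp = ET.SubElement(file_sec, q(METS, "fileGrp"), USE="page images")
    struct = ET.SubElement(mets, q(METS, "structMap"), TYPE="logical")
    book = ET.SubElement(struct, q(METS, "div"), TYPE="book", DMDID="dmd1")
    for div in tei_root.findall("tei:text/tei:body/tei:div", NS):
        head = div.find("tei:head", NS)
        label = head.text if head is not None else ""
        mets_div = ET.SubElement(
            book, q(METS, "div"), TYPE=div.get("type", "section"), LABEL=label
        )
        for pb in div.findall(".//tei:pb", NS):
            file_id = f"img{pb.get('n')}"  # hypothetical ID scheme
            file_el = ET.SubElement(file_grp, q(METS, "file"), ID=file_id)
            ET.SubElement(
                file_el,
                q(METS, "FLocat"),
                {"LOCTYPE": "URL", q(XLINK, "href"): pb.get("facs")},
            )
            ET.SubElement(mets_div, q(METS, "fptr"), FILEID=file_id)
    return mets


# Register prefixes so the serialized output uses readable namespace names.
for prefix, uri in (("mets", METS), ("dc", DC), ("xlink", XLINK)):
    ET.register_namespace(prefix, uri)

mets_doc = tei_to_mets(ET.fromstring(SAMPLE_TEI))
print(ET.tostring(mets_doc, encoding="unicode"))
```

A production workflow would also need to handle nested divisions,
pages that span division boundaries, and multiple representations
of each page (for example, an encoded text file alongside each page
image), which METS can express through additional file groups.
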
As a result of its work to enhance the functionality and
modularity of the METS Navigator software, the IU Digital Library
Program is also formalizing and integrating the TEI-METS
workflow in support of online page turning. Our paper
will trace the development of METS Navigator including the
TEI-METS workflow, demonstrate the METS Navigator system
using TEI-cum-METS documents, review METS Navigator
configuration options, detail findings from recent user studies,
and outline plans for current and future development of
METS Navigator.

References
Cantara, Linda. “Long-term Preservation of Digital
Humanities Scholarship.” OCLC Systems & Services 22.1
(2006): 38-42.
Cantara, Linda. “The Text-encoding Initiative: Part 2.” OCLC
Systems & Services 21.2 (2005): 110-113.
Cundiff, Morgan V. “An introduction to the Metadata Encoding
and Transmission Standard (METS).” Library Hi Tech 22.1
(2004): 52-64.
Haarhoff, Leith. “Books from the Past: E-books Project at
Culturenet Cymru.” Program: Electronic Library and Information
Systems 39.1 (2004): 50-61.
Porter, Dorothy. “Using METS to Document Metadata
for Image-Based Electronic Editions.” Joint International
Conference of the Association for Computers and the
Humanities and the Association for Literary and Linguistic
Computing, June 2004. Göteborg: Centre for Humanities
Computing (2004): 102-104. Abstract. 18 November 2007
<http://www.hum.gu.se/allcach2004/AP/html/prop120.html>.
Semple, Najla. “Developing a Digital Preservation Strategy at
Edinburgh University Library.” VINE: The Journal of Information
and Knowledge Management Systems 34.1 (2004): 33-37.
“Standards, METS.” The Library of Congress. 18 November
2007 <http://www.loc.gov/standards/mets/>.
“TEI Guidelines.” The Text Encoding Initiative. 18 November
2007 <http://www.tei-c.org/Guidelines/>.


Conference Info

Complete

ADHO - 2008

Hosted at University of Oulu

Oulu, Finland

June 25, 2008 - June 29, 2008

135 works by 231 authors indexed

Conference website: http://www.ekl.oulu.fi/dh2008/

Series: ADHO (3)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None