Applications of the Open Archives Initiative

  1. Martin Halbert

    Emory University


Computing solutions are regularly applied to assist scholars with their research and the analysis of their findings. Computing technologies have become increasingly important in libraries and archives as these institutions strive to make the many unique resources they hold more easily accessible to their users. As information institutions begin to exploit digital technologies, an often underappreciated distinction is the one between digital content creation and digital library services. Aggressive programs and initiatives to digitize primary source content and archival finding aids have produced vast quantities of heterogeneous, unorganized data, some of it interesting and some trivial, leaving the user in a state of “information overload”. Digitization efforts are typically organized at a local level, and scholars elsewhere are often unaware that information resources of possible interest have been digitized. What has been lacking are the infrastructure technologies and protocols that support the development and implementation of digital library services, which organize and make more useful the diverse and widely distributed collections of digital content and descriptive metadata that have been, and continue to be, created at a rapid pace.
This panel session will discuss experiences so far with the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), an emerging digital library protocol designed to enable new and innovative online services, to facilitate interoperability, and to improve the ways in which scholars can discover and access unique information resources wherever they reside. Each participant will give a brief overview of their involvement with OAI, followed by a discussion of the infrastructures and techniques that various projects have used to explore the scalability and extensibility of the protocol.
OAI-PMH BRIEF BACKGROUND
The Open Archives Initiative (OAI) was launched in October 1999 with a meeting of digital librarians and computer scientists specializing in archiving, metadata, and interoperability. The shared objective of this group was to pave the way for universal public archiving of the scientific and scholarly research literature on the Web. What was needed were conventions that would allow any paper deposited in any public archive to be found from anyone’s desktop, worldwide. This was essentially a model for a virtual public library. The group recognized that many different archive initiatives were likely to emerge from this vision, each with potentially different conceptual, organizational, and technical foundations. Keeping the archives “open” would be crucial for these initiatives to become part of the scholarly communication system, and so the technical issues of interoperability would be crucial as well.
From these original discussions it was determined that a harvesting approach, in which the metadata describing the resources held by archives is aggregated, would be the simplest to implement and would scale better as more archives became available. From this emerged the OAI Protocol for Metadata Harvesting (OAI-PMH), first released in January 2001. Since then it has emerged as a practical foundation for interoperability among digital library efforts. OAI-PMH involves two distinct yet simple components. At one end, data providers use the OAI-PMH to expose metadata in a variety of forms. At the other end, service providers use it to harvest the metadata from the data providers and subsequently process it, adding value in the form of services such as domain-specific portals for particular user communities.
With an interest in exploring the sustainability and extensibility of the protocol, the Andrew W. Mellon Foundation announced the funding of seven service provider (harvesting) projects in July 2001. The Digital Library Federation has generously provided organizational support for these OAI activities. Several of the projects focus on aggregating materials of possible interest to scholars seeking cultural materials.
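
The exchange between a data provider and a service provider described above can be illustrated with a minimal Python sketch. It is not taken from any of the panel projects. It assumes the OAI-PMH 2.0 ListRecords verb with unqualified Dublin Core (the oai_dc metadata prefix). The base URL, record identifier, title, and resumption token in the sample are hypothetical, and the response is embedded in the code so that the sketch runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical response, shaped like what a data provider returns for ListRecords.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-05-29T12:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://archive.example.org/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:archive.example.org:item-1</identifier>
        <datestamp>2003-01-15</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Hypothetical finding aid</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, resumption_token=None):
    """Build the harvest request a service provider sends to a data provider."""
    if resumption_token:
        # A resumption token replaces all other arguments except the verb.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            # Selective harvesting: only records changed since this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) extracted from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI_NS + "record"):
        header = rec.find(OAI_NS + "header")
        records.append({
            "identifier": header.findtext(OAI_NS + "identifier"),
            "datestamp": header.findtext(OAI_NS + "datestamp"),
            # Deleted records carry no metadata, so this list may be empty.
            "titles": [t.text for t in rec.iter(DC_NS + "title")],
        })
    token_el = next(root.iter(OAI_NS + "resumptionToken"), None)
    token = token_el.text.strip() if token_el is not None and token_el.text else None
    return records, token


if __name__ == "__main__":
    print(build_list_records_url("http://archive.example.org/oai", from_date="2003-01-01"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL over HTTP and keep requesting with the returned resumption token until the data provider stops supplying one. The value-added services mentioned above would then be built on the accumulated records.
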
One of the Mellon-funded projects is the Cultural Heritage Repository at the University of Illinois. Emory University, home to two of the seven original Mellon-funded projects, also focuses on cultural materials. The UCLA Digital Library Program hosts the Sheet Music Consortium project. These projects provide the OAI-PMH backdrop for this panel session, which outlines details of the technology from the service provider perspective and, ultimately, the utility of the portals to end users.
DIGITAL LIBRARY FEDERATION
The Digital Library Federation (DLF) is a consortium of libraries and related agencies that are pioneering the use of electronic information technologies to extend their collections and services. Through its members, the DLF provides leadership for libraries broadly by identifying standards and “best practices” for digital collections and network access, coordinating leading-edge research and development in libraries’ use of electronic information technology, and helping start projects and services that libraries need but cannot develop individually. One such supported project is the Open Archives Initiative. The DLF support takes two forms. Financially, the DLF provides some organizational support for OAI. The DLF is also responsible for organizing the Mellon-funded OAI projects that are evaluating the OAI technical framework.
EMORY UNIVERSITY
Emory University Library, in collaboration with SOLINET/ASERL, has developed two scholarly portal services based on metadata harvested from a variety of archives and digital library projects. The MetaArchive.Org and AmericanSouth.Org portals represent the first national projects to deploy the ARC software developed at Old Dominion University and the Open Digital Library (ODL) software developed at the Virginia Tech Digital Library Research Laboratory.
These two portals, operating under the umbrella name of the MetaScholar Initiative (http://MetaScholar.Org), are designed to engage scholars, librarians, and archivists in the process of making a large body of metadata useful in scholarly research. Twenty institutions (primarily located in the geographic South) have initially contributed metadata to the central union database underlying both systems. ASERL participants established OAI-PMH data provider systems to dynamically serve updated metadata from their local digital archives. Many libraries in the MetaArchive project are smaller four-year institutions, which will use metadata conversion services provided by Emory to make their local finding aids and archival databases available for harvesting. Initially, metadata was collected representing primary research materials from a set of subject domains including the culture and history of the American South, papers of major political figures, and religious institutional records. An interdisciplinary team, including faculty from many research institutions around the country, was tasked with evaluating the American South portal. This Scholarly Design Team was chaired by Dr. Charles Reagan Wilson of the University of Mississippi, editor of the Encyclopedia of Southern Culture. The team was also charged with exploring several new approaches to scholarly communication.
SHEET MUSIC CONSORTIUM
Johns Hopkins University, Indiana University, and the UCLA Digital Library Program have joined forces to create a virtual catalog of sheet music in the United States called the Sheet Music Consortium. This project uses the OAI-PMH to harvest descriptive metadata from local databases and make it accessible through a searchable OAI repository hosted by the UCLA Digital Library Program. Consortium member institutions have chosen to catalog their sheet music in different ways, but a very large proportion of the original sheets in participating collections have been digitized, allowing users direct access to the music and, in many cases, to covers and advertisements that offer evidence of the cultural context in which the songs were published. The first phase of the project aims to establish the service as a gateway to these US collections. It contains over 60,000 records from Indiana University, Johns Hopkins University, the UCLA Music Library’s Archive of Popular American Music, and the Library of Congress.
UNIVERSITY OF ILLINOIS
The University of Illinois has developed a vertical, domain-specific digital library portal designed to search metadata describing manuscript archives and digital cultural heritage information resources. Metadata describing non-digital resources and resources of restricted availability is included along with metadata describing publicly available digital objects. The Illinois team has explored “best practices” for using the OAI-PMH to reveal resources contained in hierarchical metadata structures, such as archival finding aids encoded in Encoded Archival Description (EAD). Materials in the Illinois Cultural Heritage Repository include harvested metadata from the Library of Congress American Memory Heritage Collections and from the Online Archive of California. The American Memory and Online Archive of California resources are made searchable via their metadata alongside other materials gathered from over 39 institutions. The total number of discrete searchable items is over 2 million, representing public and research libraries, museums, archives, historical societies, and digitization projects focusing on cultural heritage materials.
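
To illustrate why hierarchical finding aids need special handling before harvesting, the sketch below flattens a small EAD-style hierarchy into item-level records. It is not the Illinois team's method and should not be read as their "best practices". It assumes the DTD-based EAD element names (archdesc, dsc, c01, c02, did, unittitle). The sample finding aid is hypothetical, and so is the choice to store the parent path in a Dublin Core style relation field.

```python
import xml.etree.ElementTree as ET

# Hypothetical finding aid using DTD-style EAD element names (no XML namespace).
SAMPLE_EAD = """<ead>
  <archdesc level="collection">
    <did><unittitle>Hypothetical Family Papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
        <c02 level="file">
          <did><unittitle>Letters, 1861-1865</unittitle></did>
        </c02>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>"""


def is_component(tag):
    """True for EAD component elements: <c> or numbered <c01>, <c02>, ..."""
    return tag == "c" or (tag.startswith("c") and tag[1:].isdigit())


def flatten_components(ead_xml):
    """Turn nested components into flat records that each carry their parent path."""
    root = ET.fromstring(ead_xml)
    archdesc = root.find("archdesc")
    collection_title = (archdesc.findtext("did/unittitle") or "").strip()
    records = []

    def walk(node, path):
        for child in node:
            if not is_component(child.tag):
                continue
            title = (child.findtext("did/unittitle") or "").strip()
            records.append({
                "title": title,
                "level": child.get("level"),
                # Context the item would lose if the hierarchy were discarded.
                "relation": " / ".join(path),
            })
            walk(child, path + [title])

    dsc = archdesc.find("dsc")
    if dsc is not None:
        walk(dsc, [collection_title])
    return records


if __name__ == "__main__":
    for record in flatten_components(SAMPLE_EAD):
        print(record)
```

Each resulting record could be exposed as a separate OAI-PMH item, so that a search hit on a single file still shows which series and collection it belongs to. How much of that context to repeat in every record is a design decision this sketch does not settle.
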


Conference Info

In review

"Web X: A Decade of the World Wide Web"

Hosted at University of Georgia

Athens, Georgia, United States

May 29, 2003 - June 2, 2003

83 works by 132 authors indexed


Series: ACH/ICCH (23), ALLC/EADH (30), ACH/ALLC (15)

Organizers: ACH, ALLC

  • Keywords: None
  • Language: English
  • Topics: None