Center for Digital Scholarship - Brown University
Library and Information Services - Wheaton College
Humanities and Arts Network of Technological Initiatives - University of Virginia
Building a TEI Archiving, Publishing, and Access Service: The TAPAS Project
Flanders, Julia, Center for Digital Scholarship, Brown University, USA, Julia_Flanders@brown.edu
Hamlin, Scott, Library and Information Services, Wheaton College, USA, hamlin_scott@wheatoncollege.edu
Alvarado, Rafael, Humanities and Arts Network of Technological Initiatives, University of Virginia, USA, ontoligent@gmail.com
Mylonas, Elli, Center for Digital Scholarship, Brown University, USA, Elli_Mylonas@brown.edu
As the language of DH 2011’s ‘big tent’ suggests, in recent years the profile of digital humanities work has expanded to include many scholars and practitioners who draw on a multitude of digital technologies in research and educational contexts, without assuming all the roles required to implement these technologies. This new generation of digital humanists (though some are ‘new’ only to digital matters) is undertaking intellectually ambitious work with digital methods and tools, but their interest does not necessarily arise from a strong institutional history or infrastructure, or from personal expertise with digital methods. Rather, they are practicing scholars who are increasingly aware of the shifting stakes of technology for the humanities, and who want to explore what may be possible by working in a new way. As a result, their ambitions often outstrip what their own institutions can support: the available infrastructure of digital publishing, archiving, data curation, and repository services may be limited or absent. An individual scholar can gain expertise and achieve interesting results using the TEI Guidelines (http://www.tei-c.org) or GIS, but it is a slower and more challenging process for a university to develop the institutional infrastructure to support that expertise, in the way that traditional libraries (for instance) support traditional forms of humanities scholarship.
The TEI Archiving, Publishing, and Access Service (TAPAS, http://www.tapasproject.org) aims to address this gap by providing repository and publication services for small TEI projects. TAPAS began with a planning grant from the IMLS (TAPAS 2010), originally proposed by a group of small liberal-arts institutions including Wheaton College, Willamette University, Hamilton College, Vassar College, Mount Holyoke College, and the University of Puget Sound, and later joined by Brown University and the University of Virginia. This planning group conducted an intensive study of the profile of needs, and developed a specification for the TAPAS service. TAPAS is now operating under a two-year IMLS National Leadership Grant to Wheaton College and Brown University, which funds the development of the service. TAPAS has also received an NEH Digital Humanities Startup Grant, led by Wheaton College and the University of Virginia, which funds the development of the user interface. Hosted at Brown University, the TAPAS service will provide repository storage, data curation, and simple interfaces for data management and publication. It will also provide an API through which the TEI data can be accessed and remixed. The service thus aims to fill a crucial niche, enabling both a new type of publication and a new model for how scholarly publication is supported. All of these needs are particularly urgent in the liberal-arts community that is the central focus of TAPAS, but they are also strongly evident in the humanities academy more broadly, at a national and international level.
This project takes place within a landscape already well populated with large-scale infrastructural projects (Hedges 2009), such as TextGrid (http://www.textgrid.de/), DARIAH (http://www.dariah.eu/), CLARIN (http://www.clarin.eu/external/), and the Canadian Writing Research Collaboratory (CWRC, http://www.cwrc.ca/). Projects of this kind must all confront a central set of strategic concerns and design challenges, including questions about how much uniformity to impose upon the data, how to accommodate variation, how to create interoperability layers and tools that can operate meaningfully across multiple data sets (DARIAH 2011a, DARIAH 2011b), and how to manage issues of sustainability (of both the data and the service itself). TAPAS is distinctive within this landscape because of its focus on a single form of data (TEI-encoded research materials) and also because of its initial emphasis on serving an underserved constituency (scholars at smaller or under-resourced institutions) rather than on providing an infrastructure that can operate comprehensively. TAPAS is thus able to tackle the questions above in a highly focused way.
The proposed poster will address several key areas that will occupy our attention in the early phases of the TAPAS project:
Architecture and system design. The TAPAS service is built on a Fedora repository, and the user interaction will be managed as a set of modular layers using tools like Drupal. The design of these layers needs to take into account information about how scholars need to interact with the service for activities such as:
creating new project records
uploading new data files, uploading revised versions of existing data files
creating metadata for data files, updating the metadata for existing files
configuring options for dissemination, publication, and other modes of access and discovery, such as interface choices, stylesheets, and information to be exposed via APIs.
The poster will provide a detailed look at the internal architecture and the ways that standards like RDF and METS are used to organize information and enable flexible deployment of repository data.
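As a purely illustrative sketch of the interactions listed above (creating a project record, uploading new and revised files, attaching metadata, and exposing an inventory in a standard format), the following Python models a project in memory and serializes its file list as a minimal METS document. The class names, field names, and the `upload` and `to_mets` functions are hypothetical and do not describe the actual TAPAS implementation or its Fedora object model; only the METS and XLink namespaces and element names come from the METS standard itself.

```python
from dataclasses import dataclass, field
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


@dataclass
class DataFile:
    """One TEI file in a project, with its revision history (hypothetical model)."""
    name: str
    versions: list = field(default_factory=list)   # raw bytes, oldest first
    metadata: dict = field(default_factory=dict)   # e.g. {"title": "..."}


@dataclass
class ProjectRecord:
    """A project record holding files and dissemination options (hypothetical)."""
    project_id: str
    files: dict = field(default_factory=dict)      # file name -> DataFile
    options: dict = field(default_factory=dict)    # e.g. {"stylesheet": "..."}

    def upload(self, name, content, metadata=None):
        """Add a new file, or append a revised version of an existing one."""
        data_file = self.files.setdefault(name, DataFile(name))
        data_file.versions.append(content)
        if metadata:
            data_file.metadata.update(metadata)
        return len(data_file.versions)  # 1-based number of the version just stored


def to_mets(project):
    """Serialize the project's file inventory as a minimal METS document."""
    ET.register_namespace("mets", METS_NS)
    ET.register_namespace("xlink", XLINK_NS)
    mets = ET.Element(f"{{{METS_NS}}}mets", {"OBJID": project.project_id})
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    file_grp = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    struct_map = ET.SubElement(mets, f"{{{METS_NS}}}structMap")
    struct_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div")
    for index, name in enumerate(sorted(project.files), start=1):
        file_id = f"FILE{index}"
        file_el = ET.SubElement(
            file_grp,
            f"{{{METS_NS}}}file",
            {"ID": file_id, "MIMETYPE": "application/tei+xml"},
        )
        # The file name stands in for a real repository URL in this sketch.
        ET.SubElement(
            file_el,
            f"{{{METS_NS}}}FLocat",
            {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": name},
        )
        ET.SubElement(struct_div, f"{{{METS_NS}}}fptr", {"FILEID": file_id})
    return ET.tostring(mets, encoding="unicode")
```

Calling `upload` twice with the same file name appends a second version rather than overwriting the first, which mirrors the distinction drawn above between new and revised data files. A production repository would of course persist these objects rather than hold them in memory.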
TEI schema development and the challenge of eclecticism. TAPAS plans to accept a broad range of TEI data, but will also need to identify different classes of data that share specific properties, such as genre or the presence of certain encoding features, to determine what kinds of interface tools will or will not be appropriate for a given data set. TAPAS will also use various forms of validation to help contributors ensure the consistency and quality of their data prior to upload. The design and use of schemas within the TAPAS ecology – extending from training, through data creation and management, to long-term data curation – is complex and will be an important focus of the project’s research. The poster will provide a detailed view of the different roles that schemas of various kinds will play in this ecology, and the principles guiding their design and use.
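To make the idea of pre-upload checking and feature-based classification concrete, here is a minimal sketch in Python using only the standard library. It is not schema validation in the TEI sense (no RELAX NG or Schematron is applied); it only tests well-formedness, the TEI namespace on a `TEI` root element, and the presence of a `teiHeader`, then reports which of a few feature classes appear in the file. The function name `check_tei`, the feature classes, and the choice of elements that signal them are assumptions made for illustration, not part of the TAPAS specification. Documents rooted in `teiCorpus` are not handled here.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical feature classes, keyed by TEI elements that signal them.
FEATURE_ELEMENTS = {
    "verse": ("lg", "l"),
    "drama": ("sp", "speaker"),
    "named_entities": ("persName", "placeName"),
}


def check_tei(xml_text):
    """Return (problems, features) for a candidate TEI document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as error:
        # Nothing further can be checked if the file does not parse.
        return [f"not well-formed: {error}"], set()

    problems = []
    if root.tag != f"{{{TEI_NS}}}TEI":
        problems.append("root element is not TEI in the TEI namespace")
    if root.find(f"{{{TEI_NS}}}teiHeader") is None:
        problems.append("missing teiHeader")

    # Record each feature class for which at least one signal element occurs.
    features = set()
    for feature, names in FEATURE_ELEMENTS.items():
        if any(root.find(f".//{{{TEI_NS}}}{name}") is not None for name in names):
            features.add(feature)
    return problems, features
```

A service could use the returned feature set to decide which interface tools to offer for a given data set, for example enabling a verse-oriented display only when the `verse` class is detected.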
Designing a hosted service. Although TAPAS was prompted by the needs of individual scholars, its implementation as a hosted service means that it also plays an important aggregative role. The TAPAS collection of TEI data has the potential not only to serve as an important corpus of TEI data (of value, for instance, to those interested in the historiography of digital humanities, or in studying how the TEI is used) but also to provide important inter-project connections that may benefit the individual TAPAS contributors and their readers. In addition, designing TAPAS as a hosted service raises a number of issues concerning long-term data curation, rights, and the fiscal sustainability of the service itself. The poster will examine these issues with a particular focus on:
the membership and sustainability model for the service
the handling of intellectual property rights
the design of cross-project tools for searching, exploration, and visualization
User interface. Because TAPAS is intended to support scholars who – although they may be expert users of TEI – are not necessarily experts in working with repositories and XML publication, the design of the user interface will be critical in making the TAPAS service approachable. In addition, because some users may be managing very large numbers of files, the user interface will need to provide productive, intuitive ways of visualizing one’s data from a management standpoint as well as a publication standpoint.
Funding
This project is made possible by a grant from the U.S. Institute of Museum and Library Services.
References
DARIAH (2011a). Technical Work: Conceptual Modelling. DARIAH Work Package 8. http://www.dariah.eu/index.php?option=com_content&view=article&id=31&Itemid=35.
DARIAH (2011b). Technical Work: Technical Reference Architecture. DARIAH Work Package 7. http://www.dariah.eu/index.php?option=com_content&view=article&id=30&Itemid=34.
Hedges, M. (2009). Grid-enabling Humanities Datasets. Digital Humanities Quarterly 3(4). http://www.digitalhumanities.org/dhq/vol/3/4/000078/000078.html.
TAPAS (2010). Roadmap. http://www.tapasproject.org/roadmap.
TextGrid (2010). Roadmap Integration Grid/Repository. TextGrid, September 2010. http://www.textgrid.de/fileadmin/TextGrid/reports/TextGrid_R121_v1.0.pdf.
Hosted at Universität Hamburg (University of Hamburg)
Hamburg, Germany
July 16, 2012 - July 22, 2012
196 works by 477 authors indexed
Conference website: http://www.dh2012.uni-hamburg.de/
Series: ADHO (7)
Organizers: ADHO