Sustainability?! Four Paradigms for Humanities Data Centers

poster / demo / art installation
Authorship
  1. Patrick Sahle

    Cologne Center for eHumanities - Universität zu Köln (University of Cologne)

  2. Simone Kronenwett

    Data Center for the Humanities (DCH) - Universität zu Köln (University of Cologne)

  3. Jonathan Blumtritt

    Data Center for the Humanities (DCH) - Universität zu Köln (University of Cologne)

Work text
Introduction
The Digital Humanities are about creating knowledge. Methods and tools, formats and data structures are designed to generate valuable digital research data and accessible resources. However, these products of scholarship are created within the framework of temporary projects. So what happens when projects come to an end? What about the sustainability and future accessibility of results that were produced at public expense? One common and obvious answer to these questions is ‘long-term preservation’. The archiving of digital data has been a subject of research for many years and will lead to generic methodological and technical solutions as well as institutionalized data archives. But is this really the full answer? Typically, the goal of research projects in the humanities is not only to generate ‘records’, ‘data objects’ or ‘collections’ that follow standardized models and formats and are thus easily stored together and reused by others. Rather, research in the humanities is characterized by individual, local approaches and data models that lead to very specific databases and, more importantly, to publications with their own logic of presentation, access and usability.

Data, Resources, and Data Centers
Within the humanities, there is an important difference between ‘data’ and ‘resources’. This distinction has to be taken into account: often data is usable only within its context and is made accessible through specific web presentations. But who will take care of these ‘resources’ in the long run? On the one hand, long-term maintenance of sometimes idiosyncratic ‘living systems’ cannot be ensured by individual scholars, and not even by academic research departments. On the other hand, generic libraries and archives, including data archives, cannot be expected to provide expertise on data models and standards that are specific to the humanities. For this data, as well as for the particular ‘resources’, dedicated research data management is needed, together with workflows and business models for keeping the presentational systems running. A comprehensive solution to these problems must be built upon institutions that can make a permanent commitment. These institutions could be called ‘data centers for the humanities’ and can be attached to other institutions such as libraries, archives, computing centers, academic departments or faculties, or digital humanities centers.

Four Paradigms for Humanities Data Centers
These data centers have to provide specialized research data management for humanities research projects, ideally right from the beginning, to ensure archivability, accessibility, reusability, maintainability and visibility in the long term. The wide range of their tasks and goals can be described by analogy with four paradigms that we already know from our traditional research and information ecosystem:

1.) Archive paradigm. Research data has to be archived in order to be permanently preserved and secured. As pure ‘data’, as ‘bits and bytes’, research products in the humanities are not very specific and may thus be stored in generic data archives on which a data center may rely as part of the more basic infrastructure. Here, the integrity of the data has to be ensured, data may be converted into different formats, and data objects may be delivered for reuse when needed.

2.) Library paradigm. A library maintains a descriptive catalog of its holdings, cares for unique call numbers and keeps data ready for direct access. In terms of digital infrastructures this means caring for metadata, persistent identifiers and technical interfaces (such as APIs), and the integration of data and metadata into dedicated portals that allow browsing and searching.

3.) Museum paradigm. The common use case for digital output in the humanities is not to harvest or integrate third-party data via APIs. Rather, it is to consult a digital publication that is approachable and readable for humans. Digital research objects have to be presented. Often very specialized websites and portals are created within research projects and have to be kept alive. As in museums, important holdings are presented in a permanent exhibition. But the exhibition may also be changed, reconfigured and redone over the course of time.

4.) Workshop paradigm. Digital libraries evolve into virtual research environments. More and more, the presentation of digital objects and active work on the data coincide. When research platforms are handed over from a project to a data center, the work should not have to stop. In an ideal world a data center will also maintain current research environments and keep them alive and working. With either generic or dedicated tools and interfaces, a data center will provide important components for the ongoing editing, enrichment and processing of digital research data.
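
As an illustration of the integrity requirement named under the archive paradigm, the following minimal sketch records a checksum for every file of a deposited project and later reports the files whose content no longer matches. It is not a description of any existing archive system; the function names (sha256_of, build_manifest, changed_files) and the choice of SHA-256 are assumptions made for this example only.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file below root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed or altered since the
    manifest was recorded (hypothetical helper for this sketch)."""
    current = build_manifest(root)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

A data center could store such a manifest next to the deposited data at ingest and rerun changed_files at regular intervals; an empty result would indicate that the ‘bits and bytes’ are unchanged. Metadata, identifiers and presentation, as described under the other three paradigms, are outside the scope of this sketch.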

From Theory to Practice
The issues raised here have to be addressed on a methodological and a theoretical level. But the problem of short-term projects that become orphaned and whose data is at risk is acute. Therefore, the University of Cologne founded a Data Center for the Humanities in 2012. This paper will report on the theoretical background, further concepts and plans as well as the first practical steps that have already been taken.

References
Ball, A. (2012). Review of Data Management Lifecycle Models (version 1.0). REDm-MED Project Document redm1rep120110ab10. Bath/UK: University of Bath. opus.bath.ac.uk/28587.

Blanke, T., M. Hedges (2013). Scholarly Primitives. Building Institutional Infrastructure for Humanities e-Science. In: Future Generation Computer Systems. 29/2. 654-661. linkinghub.elsevier.com/retrieve/pii/S0167739X11001178.

Büttner, S., H.-C. Hobohm, and L. Müller (eds) (2011). Handbuch Forschungsdatenmanagement. Bad Honnef: Bock+Herchen.

Burrows, T. (2011). Sharing Humanities Data for e-Research. Conceptual and Technical Issues. In: Sustainable data from digital research. Humanities perspectives on digital scholarship. Proceedings of the conference held at the University of Melbourne. 12-14th December 2011. Sydney: Custom Book Centre. ses.library.usyd.edu.au/handle/2123/7938.

Data Center for the Humanities (DCH). www.dch.uni-koeln.de.

Hügi, J., R. Schneider (2013). Digitale Forschungsinfrastrukturen in den Geistes- und Geschichtswissenschaften. Genf: Haute école de gestion de Genève.

Molloy, L. (2011). Oh, the Humanities! A Discussion About Research Data Management for the Arts and Humanities Disciplines. In: JISC MRD - Evidence Gathering 2011. mrdevidence.jiscinvolve.org/wp/2011/12/16/oh-the-humanities-a-discussion-about-research-data-management-for-the-arts-and-humanities-disciplines/.

Neuroth, H., S. Strathmann, and A. Oßwald et al. (eds) (2012). Langzeitarchivierung von Forschungsdaten. Eine Bestandsaufnahme. Boizenburg: Werner Hülsbusch.

Sahle, P., S. Kronenwett (2013). Jenseits der Daten: Überlegungen zu Datenzentren für die Geisteswissenschaften am Beispiel des Kölner ‘Data Center for the Humanities’. In: LIBREAS. Library Ideas. 23. http://libreas.eu/ausgabe23/09sahle/.

Van den Eynden, V., L. Corti, and M. Woollard et al. (2011). Managing and Sharing Data 2011: Best Practice for Researchers. Colchester: UK Data Archive. www.data-archive.ac.uk/media/2894/managingsharing.pdf.

Thaller, M. et al. (2011). DA-NRW: a distributed architecture for long-term preservation. In: Proceedings of the 1st International Workshop on Semantic Digital Archives. www.danrw.de/wp-content/uploads/paper13.pdf.

UK Data Archive. www.data-archive.ac.uk.

See the discussion on the Digital Medievalist Mailing List. Subject: ‘How to make your data live forever (and maybe your project)’. 21-06-2013. www.digitalmedievalist.org/index.html.

There are different data management lifecycle models. An overview is given in A. Ball (2012). Review of Data Management Lifecycle Models (version 1.0). REDm-MED Project Document redm1rep120110ab10. Bath/UK: University of Bath. opus.bath.ac.uk/28587/.

Conference Info

ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO