Digital Humanities Data Curation Institutes: Challenges and Preliminary Findings

poster / demo / art installation
  1. Megan Senseney

    University of Illinois, Urbana-Champaign

  2. Trevor Muñoz

    University of Maryland, College Park

  3. Julia Flanders

    Northeastern University

  4. Ali Fenlon

    University of Illinois, Urbana-Champaign


1. Introduction

The growth of digital humanities research has made the curation of DH research data a priority for humanities scholars and their institutions. Data curation “addresses the challenges of maintaining digital information produced in the course of research in a manner that preserves its meaning and usefulness as potential input for future research” 1. More fully integrating data curation into digital research involves fluency with topics such as publication and information sharing practices, descriptive standards, metadata formats, and the technical characteristics of digital data. This poster presents lessons learned from a series of workshops on digital humanities data curation conducted in June and October 2013 with funding from the National Endowment for the Humanities Institutes for Advanced Topics in the Digital Humanities program.
2. Project Background

The Digital Humanities Data Curation (DHDC) institute series is a collaborative initiative led by the Maryland Institute for Technology in the Humanities (MITH) in cooperation with the Women Writers Project (WWP) at Northeastern University and the Center for Informatics Research in Science and Scholarship (CIRSS) at the University of Illinois’ Graduate School of Library and Information Science (GSLIS). Institutes are designed to offer humanities scholars with all levels of expertise a grounding in data curation practices and problems.
3. Workshop Participants and Community Response

Two of the three planned institutes have been conducted: the first in June 2013 at GSLIS and the second in October 2013 at MITH. The team received 111 and 136 applications for the first and second institutes, respectively. Each institute was designed to accommodate 20 participants; acceptance rates ranged from 15% to 18%. For an overview of participant demographics across institutes, see Fig. 1.

Fig. 1: Participant demographics from DHDC Institutes
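
The acceptance rates quoted above can be checked with a few lines of arithmetic. The sketch below assumes that exactly 20 applicants were accepted to each institute (the text states only that each was designed to accommodate 20); the function and variable names are illustrative and not part of the project.

```python
def acceptance_rate(accepted: int, applications: int) -> float:
    """Return the acceptance rate as a percentage."""
    return 100.0 * accepted / applications


# Assumption: every one of the 20 planned seats was filled at each institute.
SEATS = 20

# Application counts as reported in the text.
APPLICATIONS = {
    "June 2013 (GSLIS)": 111,
    "October 2013 (MITH)": 136,
}

rates = {name: acceptance_rate(SEATS, n) for name, n in APPLICATIONS.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

Under that assumption, the two rates round to the 18% and 15% figures reported for the first and second institutes.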
4. Participant Feedback and Lessons Learned

To make the curriculum responsive to participant needs, the project team conducted evaluation surveys following each workshop and analyzed materials generated from workshop activities, note-taking, social media interactions, and direct feedback. Responses thus far have expressed enthusiasm for instructors and the workshop’s overall framework, while also recommending the addition of more hands-on activities and a greater focus on tools, metadata, and infrastructure issues. Participants have consistently ranked the following topics as very valuable for digital humanists: conceptual frameworks for data curation, understanding the nature of digital objects, types of metadata, collections as curation, and data and the law. To address participant feedback following the first institute, the project team revised the initial curriculum by increasing the number of group exercises, adding two lectures on metadata, and including a hands-on session on item deposit and retrieval in Islandora. The schedule for each institute is available at
5. Significance of Digital Humanities for Data Curation

One key outcome of the institutes thus far has been a set of emerging insights into the special challenges of data curation in a digital humanities context, which was an important goal of the original proposal. These insights have been particularly evident in three areas of discussion. First, each institute has featured a discussion of roles that has revealed the diversity of job descriptions and professional identities among data curators in digital humanities. Second, the featured case studies have demonstrated the challenging nature of digital humanities data with respect to format, anticipated future usage, and the methodological texture that must be captured and documented. Finally, participant questions have shown a need for resources to guide data curators in working with data specific to digital humanities: for example, crowd-sourced transcriptions, standoff annotation, and data in experimental formats.
6. Conclusion

The overwhelming response to DHDC’s calls for applications indicates the need to conduct data curation training for digital humanists sustainably and at a larger scale. While slide sets, notes, and resource lists are currently available online through the project's GitHub wiki, project materials will soon be revised for broader impact and integrated into the affiliated DHCuration Guide, a community resource that extends beyond the project in promoting a community of scholars focused on discipline-specific curation practices and skills.

1. Muñoz, T., & Renear, A.H. (2011). Issues in humanities data curation. Discussion paper circulated at the Palo Alto Summit on Humanities Data Curation, Stanford, CA, June 23, 2011. Available online.


Conference Info


ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014


Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO