Tracing the Complex History of Manuscripts Using Linked Data

workshop / tutorial
  1. Toby Nicolas Burrows

    Oxford University

  2. Kevin Page

    Oxford University

  3. David Lewis

    Oxford University

  4. Emma Cawlfield

    University of Pennsylvania

  5. Laura Cleaver

    Trinity College Dublin

  6. Jouni Tuominen

    University of Helsinki


Medieval and Renaissance manuscripts, like other cultural heritage objects, have lengthy and complex histories of production and ownership, involving events as varied as sales, gifts, and even outright theft. Research into these histories can reveal a great deal about the changing value placed on such objects over the centuries, not just in a monetary sense but also in terms of their cultural significance and meaning. Manuscript studies has been a very active interdisciplinary research area in recent decades, and researchers and curators have been intensely involved in the development of digital resources.
This workshop will focus on using Linked Data methods to aggregate complex data relating to the history and provenance of manuscripts, and to address large-scale research questions in this field through analysis and visualization. The presenters have extensive expertise across manuscript provenance research, database development, and Linked Data technologies, and include members of a major international project in this field: “Mapping Manuscript Migrations” (MMM), one of the “Digging into Data” projects funded in 2017.
The workshop will be of considerable interest to the DH community for its focus on Linked Data and on practical methods for transforming, aggregating and reconciling complex existing digital resources to a common data model, whilst preserving the integrity of the data sources. It will also be of particular value for DH researchers, curators, and librarians interested in new approaches to documenting, analysing and visualizing the histories of cultural heritage objects, especially manuscripts.
The workshop will begin by reviewing the field of manuscript history and provenance, and identifying research questions which are representative of the types of research being carried out. It will also examine existing digital resources for this field, including the Schoenberg Database of Manuscripts (University of Pennsylvania), Bibale and Medium (Institut de recherche et d’histoire des textes), and library databases like the Bodleian Library’s new manuscript catalogue. This introductory session will conclude with an overview of the place of Linked Data in the humanities, including the selection and use of ontologies for data modelling.
The middle session of the workshop will focus on the practicalities of mapping a range of data structures into compatible RDF, including pipelines for transforming standard relational database models (like the Schoenberg Database of Manuscripts) and TEI-XML documents (like the Bodleian Library catalogue). The content will include discussion of the levels of complexity involved, the types of difficulties likely to be encountered, and suggested approaches to problem resolution. Participants will have the opportunity to carry out an introductory hands-on exercise in data analysis for transformation. The session will also evaluate reconciliation processes suitable for matching vocabularies for persons and places across a variety of source datasets.
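As an illustration of the relational-to-RDF step described above, the following minimal sketch maps one database row to N-Triples lines using only the Python standard library. It is not the MMM pipeline: the `https://example.org/` namespace, the `Manuscript` class, the `ownedBy` property, and the column names `id`, `title` and `owner_id` are all hypothetical and chosen only for the example. A real mapping would use the classes and properties of the selected ontology.

```python
from urllib.parse import quote

BASE = "https://example.org/"  # hypothetical namespace, not the MMM one
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(*parts):
    """Build an IRI under BASE from path segments, percent-encoding each one."""
    return BASE + "/".join(quote(str(p), safe="") for p in parts)


def literal(value):
    """Serialize a value as a plain N-Triples string literal."""
    escaped = (
        str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def row_to_triples(row):
    """Map one relational row (a dict) to a list of N-Triples lines."""
    ms = iri("manuscript", row["id"])
    triples = [
        # Hypothetical class standing in for an ontology class.
        (ms, RDF_TYPE, f"<{iri('vocab', 'Manuscript')}>"),
        (ms, RDFS_LABEL, literal(row["title"])),
    ]
    # A NULL foreign key simply produces no ownership triple.
    if row.get("owner_id") is not None:
        owner = iri("person", row["owner_id"])
        triples.append((ms, iri("vocab", "ownedBy"), f"<{owner}>"))
    return [f"<{s}> <{p}> {o} ." for s, p, o in triples]


if __name__ == "__main__":
    sample = {"id": 42, "title": "Book of Hours", "owner_id": 7}
    print("\n".join(row_to_triples(sample)))
```

Minting IRIs from the source's own primary keys keeps each triple traceable to the record it came from, which is one simple way to preserve the integrity of the data sources during aggregation.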
The final session of the workshop will provide participants with a hands-on opportunity to work with aggregated manuscript provenance data, through the user interface developed for the MMM project. This will include browsing and searching the aggregated data, and creating a range of visualizations based on the data. Participants will be asked to test a variety of research questions against the data and to provide structured feedback on the functionality and usability of the interface and on the data structures.
The workshop will end with an invited reflection and response from an eminent manuscript researcher coordinating a major new international investigation.
Detailed programme of the workshop
• Overview:

Manuscript history and provenance: research questions, existing digital resources and projects
Linked Data in the humanities: why and how, RDF
Selection of ontologies: CIDOC-CRM, FRBR

• Data mapping and transformation:

Transformation pipeline: relational database to RDF [SDBM and Bibale]
Transformation pipeline: bespoke XML to RDF [Medium]
Transformation pipeline: TEI documents to RDF [Bodleian]
Hands-on exercise - data analysis
Reconciliation processes: vocabularies, identifiers
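
The reconciliation step listed above can be sketched, in a much simplified form, as grouping person records from different sources under a normalized name key. This is an assumption-laden illustration and not the method used by any of the projects named here: the sample records and source names are invented, and exact matching on a normalized key is a crude stand-in for matching against shared authority identifiers with manual review.

```python
import unicodedata
from collections import defaultdict


def normalize_name(name):
    """Reduce a name to a comparison key.

    Diacritics, case and punctuation are removed and tokens are sorted,
    so that "Surname, Forename" and "Forename Surname" yield the same key.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_marks.casefold())
    return " ".join(sorted(cleaned.split()))


def group_candidates(records):
    """Group (source, local_id, name) records that share a normalized key.

    Only keys with more than one record are returned, as candidate matches.
    """
    groups = defaultdict(list)
    for source, local_id, name in records:
        groups[normalize_name(name)].append((source, local_id))
    return {key: members for key, members in groups.items() if len(members) > 1}


if __name__ == "__main__":
    # Invented sample records from three hypothetical source datasets.
    records = [
        ("source_a", 1, "Phillipps, Thomas"),
        ("source_b", "p9", "Thomas Phillipps"),
        ("source_c", 3, "Jean de Berry"),
    ]
    for key, members in group_candidates(records).items():
        print(key, members)
```

Sorting the tokens makes the key insensitive to name order but also raises the risk of false matches, so the output is best treated as a list of candidates for review and not as confirmed identities.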

• Aggregated data in action: hands-on exercises

Browsing and searching the data
Visualizing the data
Testing – research questions
Feedback on the interface and the data structures

• Response and summing-up
Workshop leaders
Dr Toby Burrows is a Senior Researcher in the Oxford e-Research Centre at the University of Oxford who is coordinating a major international Digging into Data project (2017-19) focused on manuscript provenance data. He is a medieval studies researcher and librarian with considerable experience in designing and managing digital humanities projects. His current research interests and areas of expertise include digital technology and the humanities (particularly the relevance of e-research, Linked Open Data and ontologies); medieval manuscripts (particularly their curation, description and digitization); and the history of cultural heritage collecting, especially in the 19th and 20th centuries.
Emma Cawlfield Thomson, University of Pennsylvania, is the Project Coordinator for the Schoenberg Database of Manuscripts. She has worked on the Schoenberg Database for over seven years, in roles evolving from data entry and quality control to instructional design and user support. In her training as a librarian, she specialized in digital librarianship and information architecture. Her current research interests revolve around the linked data environment and its application in authority metadata management.
Dr Laura Cleaver, School of Histories and Humanities, Trinity College Dublin, is an expert in the art and architecture of the High Middle Ages, concentrating particularly on medieval manuscripts. Her current interests include the illumination of histories, medieval diagrams, and the trade in medieval manuscripts in the early twentieth century. She is the recipient of a European Research Council Consolidator Grant, beginning in May 2019, for a major project to analyse the collecting of medieval manuscripts in the early twentieth century.
David Lewis is a Research Associate at the Oxford e-Research Centre, University of Oxford. Having trained as a musicologist, he focuses his research on ways that computers and computational approaches can help musicology and other musical activities. He has worked on online resources for instrumental music, music theory and work catalogues. His current research at the Centre explores uses of Linked Data to support and extend the exploration and sharing of musical information and research.
Dr Kevin Page is a senior researcher and associate member of faculty at the University of Oxford e-Research Centre, where he applies Linked Data to the Digital Humanities through several research projects including 'Unlocking Musicology', 'Digital Delius', 'Mapping Manuscript Migrations', and 'Workset Creation for Scholarly Analysis'. As Technical Director of Oxford Linked Open Data (OXLOD) he works with collections across the Gardens, Libraries, and Museums of the University, and has participated in standards activities including the W3C Linked Data Platform (LDP) working group and the Linked.Art editorial board.
Dr Jouni Tuominen, University of Helsinki, is a coordinating researcher at the Helsinki Centre for Digital Humanities (HELDIG), and a researcher at the Department of Computer Science, Aalto University. His research interests include ontology repositories and services, linked data publishing methods, ontology models for legacy data, and tooling for digital humanities. He has collaborated with museums, libraries, and archives on their collection cataloging and cultural heritage data publishing processes.