Migrating Charters into the TEI P5

poster / demo / art installation
  1. 1. Sean M. Winslow

    Zentrum für Informationsmodellierung (ZIM) (Center for Information Modelling) - Karl-Franzens Universität Graz (University of Graz)

  2. 2. Martina Bürgermeister

    Zentrum für Informationsmodellierung (ZIM) (Center for Information Modelling) - Karl-Franzens Universität Graz (University of Graz)

  3. 3. Georg Vogeler

    Zentrum für Informationsmodellierung (ZIM) (Center for Information Modelling) - Karl-Franzens Universität Graz (University of Graz)

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

Research projects often create domain-specific annotation vocabularies. This poster will present approaches to the modelling and migration of encoded charter data that arose during the migration of the Charters Encoding Initiative (CEI: www.cei.lmu.de) to be compliant with the current version of the Text Encoding Initiative (TEI P5: www.tei-c.org/ ). It is part of a project to migrate and enhance encoded charter descriptions from the virtual charter platform monasterium.net in order to provide a well documented, reusable environment that prolongs the data life cycle (cf. Buddenbohm et al. 2016; Wissik, Ďurčo 2015; Büttner et al. 2011; Flanders, Muñoz 2015). The migration follows established principles of data sustainability and interoperability (cf. Cremer 2015 et al.). It relies on the ingest of data from Monasterium.net into the GAMS (Geisteswissenschaftliches Asset Management System), a Fedora-Commons-based certified trusted digital repository at the University of Graz that handles preservation and publication, and also provides benefits like data visibility, unique handle references (handle.net), and the provision of interoperable data via OAI-PMH (Stigler / Steiner 2018).
To achieve this, a new data model extension had to be developed in order to support both scholarly needs and the careful curation of data. The project evaluated which concepts from the charter domain are of wider importance. The new TEI P5 extension for charter-specific data, based upon the existing CEI, has to structure the data in a context-neutral manner that supports encoding diverse historical periods and regions using diplomatic TEI markup (Vogeler 2018), including Ethiopian royal acts (Wion 2018), Nepalese charters from Mustang (Ramble 2018), and early modern grants of arms from (e.g.) Marburg ( 1581-12-14_Marburg ). This justifies a data model extension in order to support both scholarly needs and the careful curation of data. It includes elements new to the TEI to model:

Authenticating features like signatures, fingerprints, and notarial signa.
Span-level features describing the conventional elements of documentary instruments.
Person/Organization level legal actors.
Status of documents as originals or copies and their relationship with other textual witnesses.

Domain-specific annotation can be achieved additionally through the creation of structured ontologies, e.g. describing methods of authentication, types of manuscript additions, and heraldic blazonry to support the typing and subtyping of data. This enhances the possibility of semantic use in the principle of Linked Open Data (LoD).
Faced with heterogeneous data from a variety of sources (direct-entry from archives, web scraping, digitization of catalogues, and carefully hand-crafted born-digital description), the project involves a series of transformations where charter encoding is re-imagined and the CEI is re-modelled and transformed to the TEI P5 in a context-sensitive manner (see Ambrosio et al. 2014). This entails:

Cleaning and rationalizing existing data in a more interoperable and cross-project comparable manner.
Mapping of existing data to new standards and transformation into current, standards-compliant TEI P5 structure.

The new data model will be tested through the development of facet-based search and predefined queries and visualization based upon scholarly needs of target audience of diplomatics, legal history, and art history scholars. All of this is part of a long-term project to develop tools that enable end users such as archives and individual scholars, as well as the repository monasterium.net, to describe historic legal data in a structured manner that is semantically interoperable with other historical data.


Ambrosio, Antonella; Sébastien Barret; Georg Vogeler (eds.) (2014). Digital diplomatics The computer as a tool for the diplomatist? Beihefte zum Archiv für Diplomatik, Schriftgeschichte, Siegel- und Wappenkunde 14. Böhlau.

Buddenbohm, Stefan; Claudia Engelhardt; Ulrike Wuttke (2016). Angebotsgenese für ein geisteswissenschaftliches Forschungsdatenzentrum. In:
Zeitschrift für digitale Geisteswissenschaften. 2016. http://zfdg.de/2016_003 DOI: 10.17175/2016_003

Büttner, Stephan; Hans-Christoph Hobohm; Lars Müller (ed.) (2011).
Handbuch Forschungsdatenmanagement, Bad Honnef: Bock + Herchen, 2011. www.forschungsdatenmanagement.de

Cremer, Fabian; Claudia Engelhardt und Heike Neuroth (2015). Embedded Data Manager – Integriertes Forschungsdatenmanagement: Praxis, Perspektiven und Potentiale. In: BIBLIOTHEK - Forschung und Praxis 2015; 39(1), 13-31. DOI: 10.1515/bfp-2015-0006

Digital Curation Centre, University of Edinburgh: “ The DCC Curation Model .”

Flanders, Julia; Trevor Muñoz (2015). An Introduction An Introduction to Humanities Data Curation. https://guide.dhcuration.org/contents/intro/

Renear, Allen H. (2004). Text encoding. In:
A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, John Unsworth. Oxford: Blackwell, 2004. http://www.digitalhumanities.org/companion/

Smith, Abby (2004). Preservation. In:
A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, John Unsworth. Oxford: Blackwell, 2004. http://www.digitalhumanities.org/companion/.

Stigler, Johannes, Elisabeth Steiner (2018). GAMS – Eine Infrastruktur zur Langzeitarchivierung und Publikation geisteswissenschaftlicher Forschungsdaten. In: Mitteilungen der VÖB 71,1 (2018), 207-206. DOI: 10.31263/voebm.v71i1.1992

Vogeler, Georg (2018). Digital Diplomatics: The Evolution of a European Tradition or a Generic Concept?. In: Cubelic, Simon , Michaels, Axel und Zotter, Astrid (eds.):
Studies in Historical Documents from Nepal and India, Heidelberg: Heidelberg University Publishing, 2018. 85-109. DOI: 10.17885/heiup.331.454

Wion, Anaïs (2018). “The TEI-XML Architecture of Ethiopian Manuscript Archives: Respecting the Integrity of Primary Sources and Asserting Editorial Choices”, in: Comparative Oriental Manuscript Studies Bulletin 4/1, 2018, 33-38. https://www.aai.uni-hamburg.de/en/comst/pdf/bulletin4-1/33-38.pdf

Wissik Tanja, Matej Ďurčo (2015). Research Data Workflows: From Research Data Lifecycle Models to Institutional Solutions. CLARIN 2015 Selected Papers. Linköping Electronic Conference Proceedings, No. 123. 94-107. http://www.ep.liu.se/ecp/123/008/ecp15123008.pdf

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.