Implementing the Assertive Edition for Historians – Some samples

paper, specified "short paper"
Authorship
  1. Georg Vogeler

    Karl-Franzens Universität Graz (University of Graz)

Work text


The “assertive edition”
Historians have long considered historical documents as information carriers. Editing documents for historical research thus often meant creating modernized or abridged texts and offering a wide range of tools to access the content of the texts (abstracts, rich indices or commentaries). In the digital world this approach has led to a type of digital scholarly edition which still lacks an accepted term: drawing on the Semantic Web it could be named “semantic edition”, in the context of “fact-checking” it might be termed “factual edition”, or, framed in terms of logical reasoning, the term could be “assertive edition” (Vogeler, 2018). Scholarly editions in this field range from early adopters of OWL like the Henry III Fine Rolls (Ciula et al., 2008; Ciula, Veira, 2007), through extensive RDF representations of indices as in sandrart.net (Schreurs et al., 2007-2012) or Burckhardtsource (http://burckhardtsource.org/), to infrastructures based on the integration of text and database as in Symogih (Beretta, Vernus, 2012; Beretta, 2015; http://symogih.org/). Projects like Old-Bailey-Online (Howard, 2015; https://www.oldbaileyonline.org/) conceptualised the information part as relational databases, while recent activities in the context of the Historical Atlas of Poland project created a resource labelled “edition” as a database of annotations to images of the original documents (Borek et al., 2018; Słoń, Zachara-Związek, 2018; https://atlas.ihpan.edu.pl/gis/agad_wschowa_i2_pub/index.php).

The main change in recent years in these digital editions is the extension of the concept of database representation from annotations similar to printed indices (Poupeau 2006) to representations of the full information conveyed. In particular, the simple representation of information in RDF statements is attractive and supports this effort. RDF additionally has the advantage of being an accepted data exchange standard, which is flexible and expressive enough to cover a wide range of projects. On the other hand, the TEI has been extended to include formal descriptions of persons, places, names and basic forms of knowledge organisation. Classic relational database systems are powerful tools to create datasets representing complex knowledge, and the recent debate on graph technologies is advancing the discussion about the best-suited technical solutions. The paper will give an insight into a group of projects trying to integrate this wide range of technologies.
The DH repository and publication platform GAMS (Stigler / Steiner, 2018) in Graz holds several projects which can be considered “semantic”/“factual”/“assertive” editions: the scholarly editions of the annual accounts of the City of Basel (Burghartz, 2015), the Urfehdebuch Basel (Burghartz et al., 2017; Pollin, Vogeler, 2017), or the Cantus-Network (Klugseder, Praßl, 2018). Others are in preparation (Klug: Tegernseer Wirtschaftsbuch; Klug et al.: Cooking Receipts of the Middle Ages; Wareham: British Hearth Tax Online; Haug-Moritz: Imperial Diet 1576, https://reichstagsakten-1576.uni-graz.at/). These resources raise several questions: Which assertions made by the texts should be represented? How well does the XML structure help to extract assertions? How can data representation, text, and image be linked? How can an efficient workflow be organised?

Which content should be represented?
All projects come from specific domains: accounting, criminal history, history of liturgy, taxation, political institutions, and dietary history. This leads to the development of individual ontologies. In the case of accounts the projects could benefit from an international activity in the development of an ontology for historical accounting (MEDEA, http://medea.hypotheses.org/; DEPCHA, http://gams.uni-graz.at/context:depcha). The criminal records used a “local” ontology which should be harmonised with the wider field of criminal history research. The Cantus Network reused identifiers from liturgical studies (Cantus Database, http://cantus.uwaterloo.ca/) and started to work on an ontology for liturgical feasts based on the reference work by Hermann Grotefend (1898). Cooking receipts can reuse food ontologies built for modern purposes or generic data from Wikidata. For the imperial diet the historians in the project suggested working on an ontology representing forms of communication, as they are considered core to the interpretation of the political event. The experiences with these ontologies demonstrate that a core functionality in the implementation of assertive editions is the possibility to model information which goes beyond the flat text, i.e. information that can hardly be extracted with standard information extraction methods.

How to link between data representation, text, and image?
The GAMS system integrates TEI transcriptions with a IIIF image server. This makes it easy to link between text and images. The relationship between assertions and text is handled by a local ontology considering the data entries as a kind of factoid (Bradley, Short, 2005), which carries a link to the source. In practice the textual fragments supporting the formal assertions carry identifiers by which they can be stored together with the data with a local <g2o:textualContent> property (http://gams.uni-graz.at/o:gams-ontology/#textualContent). This approach is similar to Web Annotation, so it seems suitable to reuse the W3C Web Annotation Vocabulary (https://www.w3.org/TR/annotation-model/). This would even allow expressing the full range of relationships between the main representations of the text (digital image, transcription, database).
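As a minimal sketch of what such a link could look like in the W3C Web Annotation model, the following Python snippet builds one annotation whose body is a formal assertion and whose targets are a TEI fragment and an image region. All URIs, the xml:id value, the pixel region and the function name make_annotation are hypothetical illustrations and are not taken from GAMS; only the JSON-LD keys and the selector types come from the Web Annotation Data Model.

```python
import json

# JSON-LD context defined by the W3C Web Annotation Data Model.
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def make_annotation(anno_id, assertion_uri, tei_uri, xml_id, canvas_uri, region):
    """Link one assertion (body) to a TEI fragment and an image region (targets)."""
    x, y, w, h = region
    return {
        "@context": ANNO_CONTEXT,
        "id": anno_id,
        "type": "Annotation",
        "motivation": "linking",
        # Body: the formal assertion (e.g. an RDF resource in the data layer).
        "body": {"id": assertion_uri},
        "target": [
            {
                # Target 1: the text fragment in the transcription.
                "type": "SpecificResource",
                "source": tei_uri,
                "selector": {
                    "type": "XPathSelector",
                    "value": f"//*[@xml:id='{xml_id}']",
                },
            },
            {
                # Target 2: a rectangular region of the page image.
                "type": "SpecificResource",
                "source": canvas_uri,
                "selector": {
                    "type": "FragmentSelector",
                    "conformsTo": "http://www.w3.org/TR/media-frags/",
                    "value": f"xywh={x},{y},{w},{h}",
                },
            },
        ],
    }


# Hypothetical identifiers, for illustration only.
annotation = make_annotation(
    anno_id="http://example.org/anno/1",
    assertion_uri="http://example.org/data/payment-1",
    tei_uri="http://example.org/edition/accounts-1535.xml",
    xml_id="e1",
    canvas_uri="http://example.org/iiif/accounts-1535/canvas/p1",
    region=(120, 340, 800, 60),
)
print(json.dumps(annotation, indent=2))
```

With one annotation per assertion, text, image and data stay in separate resources and only the annotation records how they relate to each other.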

How to organise an efficient workflow?
Centring the technical solution on well-established standards helps to create a framework of different tools. While Oxygen-XML is a main tool for XML encoding and plug-ins like Ediarum (www.bbaw.de/en/telota/software/ediarum) help in transcription and basic markup, tools like Transkribus (https://transkribus.eu/) are used in the Cooking Receipt project to create topographically precise transcriptions. In the Cantus project a workflow based on the conversion of Word files to TEI was chosen. With the @ana attribute, the TEI offers a powerful method to extract assertions with XSLT. For the criminal records, the accounts and the hearth tax documents, TEI customizations are used, which combine the text structure (e.g. in simple <p> tags) with the interpretative level (with @ana as URI references). XSLT transformations during ingest into GAMS extract the RDF representation of the text. The DEPCHA project demonstrates that this workflow can be used as a hub between various input forms.
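The extraction step can be illustrated with a small sketch. The projects use XSLT for this; the Python below is a substitute chosen for illustration, not the GAMS transformation. It walks a TEI fragment, takes every element that carries both @xml:id and @ana, and emits N-Triples lines: one rdf:type statement per URI in @ana, plus the element's text as a literal. The sample TEI, the base URI, the example.org vocabulary (including its textualContent property) and the function name tei_to_ntriples are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URI and vocabulary, used only for this illustration.
BASE = "http://example.org/edition/"
EX = "http://example.org/ontology#"

# Invented sample entry; not taken from any of the editions.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <p xml:id="e1" ana="http://example.org/ontology#Payment">Item paid to the carpenter
      <measure xml:id="e1m" ana="http://example.org/ontology#Amount">5 lb</measure></p>
    <p xml:id="e2">A paragraph without interpretation.</p>
  </body></text>
</TEI>"""


def tei_to_ntriples(tei_xml):
    """Return N-Triples lines for every element that has both @xml:id and @ana."""
    lines = []
    for el in ET.fromstring(tei_xml).iter():
        ana = el.get("ana")
        xml_id = el.get(XML_ID)
        if not ana or not xml_id:
            continue  # purely structural markup carries no assertion
        subject = BASE + xml_id
        # @ana may hold several whitespace-separated URI references.
        for class_uri in ana.split():
            lines.append(f"<{subject}> <{RDF_TYPE}> <{class_uri}> .")
        # Keep the supporting text, with whitespace normalised and escaped.
        text = " ".join("".join(el.itertext()).split())
        literal = text.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{subject}> <{EX}textualContent> "{literal}" .')
    return lines


for line in tei_to_ntriples(SAMPLE_TEI):
    print(line)
```

The paragraph without @ana produces no statements, which reflects the separation the text describes: structural markup stays in the transcription, and only interpreted elements reach the data layer.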

Conclusion
The experiences in the projects show that DH provides methods for historians to create rich editions including formal representations of the information conveyed. The GAMS infrastructure does this by combining a set of tools around standards like IIIF, TEI and RDF. Still, standards for data exchange do not yet exist in many humanities domains. Future work will have to focus on creating reusable domain ontologies, which the data-for-history initiative (http://dataforhistory.org/) is promoting. It remains to be evaluated whether W3C Web Annotation can become the major link between the various representations of texts in digital scholarly editions. All of this can form the assertive edition many historians dream of.

Bibliography

Beretta, F. (2015). The symogih.org project and TEI. Encoding structured historical data in XML texts. In
Text Encoding Initiative Conference and Members’ Meeting 2015. Connect, Animate, Innovate., Oct 2015, Lyon, France,
https://halshs.archives-ouvertes.fr/halshs-01251915.

Beretta, F., and Vernus, P. (2012). Le projet SyMoGIH et la modélisation de l'information. Une opération scientifique au service de l'histoire. In
Les Carnets du LARHRA 1: 81-107.

Borek, A., Gochna, M., Myrda, G., Słomski, M., Słoń, M., and Związek, T. (2018). Technical and Methodological Foundations of Digital Indexing of Medieval and Early Modern Books of Entries. In
DSH 66: in print.

Bradley, J., and Short, H. (2005). Texts into databases. The Evolving field of New-style Prosopography. In
LLC 20, suppl. 1: 3-24.

Burghartz, S. (ed.) (2015).
Jahrrechnungen der Stadt Basel 1535 bis 1610 – digital. Basel, Graz: ZIM.
http://gams.uni-graz.at/context:srbas.

Burghartz, S., Calvi, S., and Vogeler, G. (eds.) (2017).
Urfehdebücher der Stadt Basel – digitale Edition. Graz: ZIM.
http://gams.uni-graz.at/context:ufbas.

Ciula, A., and Veira, J. M. (2007). Implementing an RDF/OWL Ontology on Henry the III Fine Rolls, abstract. In
OWLED 2007, Innsbruck, 6-7 June 2007.
http://webont.org/owled/2007/PapersPDF/submission_6.pdf.

Ciula, A., Spence, P., and Veira, J. M. (2008). Expressing complex associations in medieval historical documents. The Henry III Fine Rolls Project. In
LLC 23: 311-325.

Grotefend, H. (1891-1898).
Zeitrechnung des deutschen Mittelalters und der Neuzeit, 2 vols., Hannover: Hahn.

Howard, S. (2015). Bloody Code. Reflecting on a Decade of the Old Bailey Online and the Digital Futures of our Criminal Past. In
Law, Crime and History 5,1:
http://www.pbs.plymouth.ac.uk/solon/journal/vol.5%20issue1%202015/Howard%20Bloody%20Code.pdf.

Pollin, Ch., and Vogeler, G. (2017). Semantically Enriched Historical Data. Drawing on the Example of the Digital Edition of the "Urfehdebücher der Stadt Basel". In Adamou, A., Daga, E., and Isaksen, L. (eds.).
WHiSe 2017. Workshop on Humanities in the Semantic Web. Proceedings of the Second Workshop on Humanities in the Semantic Web (WHiSe II), co-located with 16th International Semantic Web Conference (ISWC 2017),
http://ceur-ws.org/Vol-2014/paper-03.pdf.

Poupeau, G. (2006). De l'index nominum à l'ontologie. Comment mettre en lumière les réseaux sociaux dans les corpus historiques numériques? In
Digital Humanities 2006. The First ADHO International Conference: Conference Abstracts. Université Paris-Sorbonne: 161-164.

Klugseder, R., and Praßl, F. K. (eds.) (2018).
Cantus Network. libri ordinarii of the Salzburg metropolitan province. Graz/Wien: ZIM.
http://gams.uni-graz.at/context:cantus.

Schreurs, A., Blüm, C., and Wübbena, Th. (eds.) (2007-2012).
Sandrart.net. Eine netzbasierte Forschungsplattform zur Kunst- und Kulturgeschichte des 17. Jahrhunderts.
http://sandrart.net.

Słoń, M., and Zachara-Związek, U. (2018). The Court Records of Wschowa (1495–1526). Digital Edition. In
Studia Geohistorica 6: 204-222.

Stigler, J., and Steiner, E. (2018). GAMS - an Infrastructure for the Long-Term Preservation and Publication of Research Data from the Humanities. In
Mitteilungen der VÖB 71,1: 66-75.
https://doi.org/10.31263/voebm.v71i1.1992.

Vogeler, G. (2019). The ‘assertive edition’. In
International Journal of Digital Humanities 2, forthcoming, preprint:
https://hcommons.org/deposits/item/hc:21373/.

Conference Info

ADHO - 2019
"Complexities"

Hosted at Utrecht University

Utrecht, Netherlands

July 9, 2019 - July 12, 2019

Series: ADHO (14)

Organizers: ADHO