Towards Paleographic Linked Open Data (PLOD): A general vocabulary to describe paleographic features

poster / demo / art installation
Authorship
  1. Timo Homburg

    Fachhochschule Mainz (Mainz University of Applied Sciences)

Work text

Introduction

Digital paleography has been emerging as a field of research since the beginning of the new century. Paleographers describe how a text has been displayed and collect information such as writing styles, contextual information and the epoch. Ciula (2017) gives a good summary of digital paleography and its practices. Challenges concerning digital paleography are discussed by Hassner et al. (2015) for writing systems and more broadly in Stokes (2015). The main result of these publications is that the community of digital paleographers, which was mainly focused on a few particular languages, seeks to broaden its scope and to research more global models to represent paleographic features such as writing style, typographic features and anomalies in texts. This calls for a unified representation of paleographic features, to which this publication would like to contribute by suggesting a core vocabulary for the paleographic description of texts.

Related Work

Paleographic features such as scribal shifts and scribal alterations can to a certain extent be represented in TEI/XML (Wittern et al., 2009) as highlighting annotations. For systematic non-alphabetic languages, encodings can be developed to represent the shape of the characters, e.g. for cuneiform and Egyptian hieroglyphics (Homburg, 2019; van den Berg, 1997). In the linguistic community, linguistic linked open data (LLOD) (McCrae et al., 2016) is very present and allows for tools such as BabelNet (Navigli & Ponzetto, 2012), a multilingual semantic network to translate words and texts based on semantic content. This publication suggests creating such a semantic network for paleographic descriptions in order to formalize this part of research.

Naturally, such a task cannot be done by a single scholar from a single field, so the publication begins by suggesting vocabulary contents that document the structure of systematic scripts like cuneiform and model relations between components of parts of scripts, the so-called core vocabulary. Structured scripts in particular (cf. fig. 1) expose a variety of encoding schemes, e.g. for Chinese (Bishop & Cook, 2003) and Japanese (Apel, 2014). Those encodings have been created to model fonts, but form an ideal basis for creating the proposed unified vocabulary. Similar to the OLiA ontologies (Chiarcos & Sukhareva, 2015), the author suggests extending this core vocabulary, which represents the structure of the script (figure 3), with language/script-specific extensions. The concept will be presented using the example of cuneiform and Egyptian hieroglyphics and builds upon the vocabulary shown in figure 2.

Figure 1. Example of the PaleoCodage encoding (Homburg, 2019): PaleoCodage represents a machine-readable way to describe highly structured scripts such as cuneiform. This structure and the relations to other similar characters and scripts can be modelled using the proposed core vocabulary, while the language-specific vocabularies would describe writing styles, shapes and stylistic characteristics of the respective script.

Vocabulary describing two cuneiform glyphs connected to character representations, their finding spot, their assigned epoch, an assigned glyph sense (which may be distinct from the character/word sense) and possible serializations in SVG and as a PaleoCode. For other languages other encodings are possible.

Excerpt: PaleoCodage Vocabulary describing a cuneiform sign. The sign’s structure is described using a PaleoCode which could itself be described using semantic relations. These relations allow the structure of script subelements to be modelled, with extensions for paleographic features.
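
To make the idea of a core vocabulary with script-specific extensions more concrete, the following is a minimal sketch in Python of how one glyph description might be expressed as linked-data triples. It is not the author's vocabulary: every IRI, class name and property name (`CORE`, `CUNEIFORM`, `representsCharacter`, `findSpot`, `epoch`, `glyphSense`, `paleoCode`) is a hypothetical placeholder chosen for illustration, and the example values, including the PaleoCode string, are dummies rather than real data. The SVG serialization mentioned in the caption is left out for brevity.

```python
from typing import List, NamedTuple

# All IRIs below are hypothetical placeholders, not published PLOD terms.
CORE = "http://example.org/plod/core#"
CUNEIFORM = "http://example.org/plod/cuneiform#"
DATA = "http://example.org/plod/data/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str                 # an IRI, or literal text if is_literal is True
    is_literal: bool = False


def extension_axioms() -> List[Triple]:
    """Attach the (assumed) script-specific class to the (assumed) core class."""
    return [Triple(CUNEIFORM + "CuneiformGlyph", RDFS_SUBCLASS_OF, CORE + "Glyph")]


def describe_glyph(glyph_id: str, character_id: str, find_spot: str,
                   epoch: str, glyph_sense: str, paleocode: str) -> List[Triple]:
    """Build triples for one glyph: character, find spot, epoch, sense, PaleoCode."""
    glyph = DATA + glyph_id
    return [
        Triple(glyph, RDF_TYPE, CUNEIFORM + "CuneiformGlyph"),
        Triple(glyph, CORE + "representsCharacter", DATA + character_id),
        Triple(glyph, CORE + "findSpot", find_spot, True),
        Triple(glyph, CORE + "epoch", epoch, True),
        Triple(glyph, CORE + "glyphSense", glyph_sense, True),
        Triple(glyph, CORE + "paleoCode", paleocode, True),
    ]


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize triples as N-Triples lines (plain string literals only)."""
    lines = []
    for t in triples:
        if t.is_literal:
            escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
            obj = '"' + escaped + '"'
        else:
            obj = "<" + t.obj + ">"
        lines.append("<{}> <{}> {} .".format(t.subject, t.predicate, obj))
    return "\n".join(lines)


if __name__ == "__main__":
    # Dummy values for illustration only; none of them come from the abstract.
    graph = extension_axioms() + describe_glyph(
        glyph_id="glyph1",
        character_id="U12000",
        find_spot="example find spot",
        epoch="example epoch",
        glyph_sense="example sense",
        paleocode="PLACEHOLDER-PALEOCODE",
    )
    print(to_ntriples(graph))
```

Keeping the script-independent terms in one namespace and the cuneiform-specific class in another mirrors the OLiA-style layering described above: a further script could, under the same assumptions, add its own subclass of the core glyph class without touching the core terms.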

Conference Info

In review

ADHO - 2020
"carrefours / intersections"

Hosted at Carleton University, Université d'Ottawa (University of Ottawa)

Ottawa, Ontario, Canada

July 20, 2020 - July 25, 2020

475 works by 1078 authors indexed

Conference cancelled due to coronavirus. Online conference held at https://hcommons.org/groups/dh2020/. Data for this conference were initially prepared and cleaned by May Ning.

Conference website: https://dh2020.adho.org/

References: https://dh2020.adho.org/abstracts/

Series: ADHO (15)

Organizers: ADHO