The Latin of Matthew of Cracow (c. 1345–1410) - a corpus-based study of his language and style

poster / demo / art installation
  1. Jagoda Chmielewska

    Institute of Polish Language - Polish Academy of Sciences


The poster illustrates the problems that arise when attempting to describe the language of an author who flourished at the turn of the Middle Ages and the modern era. It shows how a corpus-based linguistic analysis can tell us more about an author's language and style than traditional stylometric techniques. It also presents an example of authorship attribution research on an anonymous medieval treatise, carried out with the help of digital tools.
Matthew of Cracow (c. 1345–1410) is considered one of the greatest theologians of the Eastern European Middle Ages and one of the most important representatives of Polish Scholasticism. He was born in Cracow, probably into a family of German ancestry. After completing his initial education in his native city, he went to Prague, where he entered Charles University and became a professor of theology. He was involved in the establishment of the University of Heidelberg and the University of Chelmno (Poland), and later in the project of bringing the Academy of Cracow to life. He died on March 5, 1410, in Heidelberg. Matthew was a follower of the via moderna trend of Scholasticism and is known as a zealous reformer of the Catholic Church. He is the author of more than 70 treatises, of which the most famous are Rationale operum divinorum, De squaloribus Curiae Romanae and Opuscula Theologica (Nuding, 2007).
The poster will present the results of an analysis of Matthew's idiolect. It will focus on language change between the late Middle Ages and the early Renaissance, as one of the main goals of the study is to trace two technolects used by authors living in this period, namely Scholastic and Humanistic Latin. Scholastic Latin is defined as a variant of late Medieval Latin, the language of science and the university community in use between the 12th and 14th centuries (Herren, 1996: 124). Its main characteristics are excessive formalism, the use of impersonal narration and the pursuit of semantic precision to the detriment of aesthetic and literary values. As a result, Scholastic Latin became a highly technical language, employing countless theological and philosophical terms (Bourgain and Hubert, 2005: 62). With the rise of the Renaissance, the language of scientific treatises underwent a significant transformation. Imitation of the works of Virgil and Cicero led scholars to change their own way of writing. From then on they used the idiomatic structures, the lexis and the rhetorical and stylistic techniques characteristic, at least in their opinion, of the Classical authors (Knight and Tilg, 2015: 4; Tunberg, 1996: 130). Many features of Matthew's idiolect seem to follow this new tendency, and it is very important to establish how great an influence it had on the style of his writings.
The research is based on a corpus of Matthew's works which includes five editions published between 1930 and 2011 and consists of over 60,000 words. It contains both his original treatises, namely De contractibus, De squaloribus Curiae Romanae, Sermones de sanctis, Lectura super Beati Immaculati and Opuscula Theologica, and an anonymous medieval work often attributed to Matthew, Ars Moriendi.
The present study uses methods developed in computational stylometry and corpus linguistics, including, among others, concordance, collocation and keyword analysis. The software employed for this purpose is the TXM platform, an open-source environment based on CQP and R that provides tools for NLP pre-processing and quantitative analysis, together with a clear metadata model (Heiden, 2010). Thanks to TreeTagger coupled with a Latin language model file, TXM makes it possible to annotate Latin text with PoS and lemma tags, and as a result allows advanced queries to be performed. In addition to frequency lists for any token property (type, lemma, PoS), the built-in CQP search engine makes it possible to generate lists for syntactic constructions specific to Classical and/or Medieval Latin, such as the accusative with infinitive or the possessive dative construction. The R package stylo (Eder et al., 2013), on the other hand, makes it possible to trace similarities between the anonymous Ars Moriendi treatise and Matthew's original works.
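The kind of similarity comparison mentioned above can be illustrated with a small sketch. The abstract does not say which distance measure was used with stylo, so the example assumes Burrows's Delta, one widely used measure in authorship attribution, and reimplements it in plain Python rather than calling stylo itself. The function names, the `n_words` parameter and the toy tokens are hypothetical and are not taken from the corpus or from any library.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in one tokenised text."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocabulary]


def burrows_delta(known_texts, disputed_tokens, n_words=50):
    """Return {label: Delta} between the disputed text and each known text.

    known_texts maps a label to a non-empty list of tokens.
    Assumption: z-scores use the population standard deviation computed
    over all compared texts, including the disputed one.
    """
    labels = list(known_texts)
    texts = [known_texts[label] for label in labels] + [disputed_tokens]

    # 1. Most frequent words across the whole comparison set.
    pooled = Counter()
    for tokens in texts:
        pooled.update(tokens)
    vocabulary = [word for word, _ in pooled.most_common(n_words)]

    # 2. Relative frequency table: one row per text, one column per word.
    freqs = [relative_frequencies(tokens, vocabulary) for tokens in texts]

    # 3. Standardise each column (word) to z-scores.
    z_scores = [[] for _ in texts]
    for column in zip(*freqs):
        mu = mean(column)
        sigma = pstdev(column)
        for row, value in zip(z_scores, column):
            row.append((value - mu) / sigma if sigma > 0 else 0.0)

    # 4. Delta = mean absolute difference of z-scores; lower means closer.
    disputed_z = z_scores[-1]
    return {
        label: mean(abs(a - b) for a, b in zip(z, disputed_z))
        for label, z in zip(labels, z_scores)
    }


# Toy usage with invented token lists (not real corpus data).
known = {
    "text_A": "et in et non et in".split(),
    "text_B": "sed quod sed quia sed quod".split(),
}
disputed = "et in et non et in".split()
scores = burrows_delta(known, disputed)
closest = min(scores, key=scores.get)
```

In practice the token lists would come from the lemmatised corpus, and the result would be read as a ranking of candidate texts by stylistic proximity, not as proof of authorship.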
In this way the poster shows how Digital Humanities methods can support the traditional analysis of a writer's language and style and help to ascertain the authorship of various texts from different periods.


Bourgain, P. and Hubert, M. C. (2005). Le latin médiéval. Turnhout: Brepols.

Eder, M., Kestemont, M. and Rybicki, J. (2013). Stylometry with R: a suite of tools. Digital Humanities 2013: Conference Abstracts. Lincoln (NE): University of Nebraska-Lincoln, pp. 487-89.

Heiden, S. (2010). The TXM Platform: Building Open-Source Textual Analysis Software Compatible with the TEI Encoding Scheme. 24th Pacific Asia Conference on Language, Information and Computation 2010: Conference Abstracts. Sendai: WU, pp. 389-98.

Herren, M. W. (1996). Latin and the vernacular languages. In Mantello, F. A. C. and Rigg, A. G. (eds.), Medieval Latin: An Introduction and Bibliographical Guide. Washington DC: Catholic University of America Press, pp. 122-29.

Knight, S. and Tilg, S. (2015). The Oxford Handbook of Neo-Latin. Oxford University Press.

Nuding, M. (2007). Matthäus von Krakau. Tübingen: Mohr Siebeck.

Tunberg, T. O. (1996). Humanistic Latin. In Mantello, F. A. C. and Rigg, A. G. (eds.), Medieval Latin: An Introduction and Bibliographical Guide. Washington DC: Catholic University of America Press, pp. 128-35.


Conference Info


ADHO - 2016
"Digital Identities: the Past and the Future"

Hosted at Jagiellonian University, Pedagogical University of Krakow

Kraków, Poland

July 11, 2016 - July 16, 2016

Series: ADHO (11)

Organizers: ADHO