Modelling the Interpretation of Literary Allusion with Machine Learning Techniques

poster / demo / art installation
  1. Neil Coffee

    Department of Classics - University at Buffalo, State University of New York (SUNY)

  2. James Gawley

    Department of Classics - University at Buffalo, State University of New York (SUNY)

  3. Christopher W. Forstall

    Department of Classics - University at Buffalo, State University of New York (SUNY)

  4. Walter J. Scheirer

    Harvard University

  5. David Johnson

    Dept. of Computer Science and Engineering - University at Buffalo, State University of New York (SUNY)

  6. Jason Corso

    Dept. of Computer Science and Engineering - University at Buffalo, State University of New York (SUNY)

  7. Brian Parks

    Dept. of Computer Science - University of Colorado at Colorado Springs

A Computational Perspective on Allusion
Most literary allusion, the deliberate evocation by one text of a passage in another, is based upon text reuse. Yet most instances of textual similarity are not meaningful literary allusions. The goal of the Tesserae project is to automatically detect allusion in a corpus of literary texts, primarily Classical Latin poetry. We begin with a large set of textual parallels, and then attempt to model which of these instances of text reuse are meaningful literary allusions and which are not, according to a group of human readers. While initial attempts with a few basic textual features have proven surprisingly effective, here we employ a more complex feature set and machine learning techniques drawn from the field of computer vision in an attempt to improve the results. Novel applications of machine learning, beyond the well-known but constrained textual classification tasks of attribution and categorization, have the potential to be transformative for complex analysis tasks in the Digital Humanities.
Benchmark Data
As an illustration, we consider textual parallels between Book 1 of Lucan’s Bellum Civile and the entirety of Vergil’s Aeneid (Coffee et al. 2012). Our benchmark dataset comprises a list of 3,400 pairs of sentences that share at least two different words. Each of these pairs has been read and graded for its literary significance by a group of students and faculty working in small teams. These annotator rankings range from 1 (no literary significance) to 5 (pointed literary allusion).
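The candidate-selection rule described above (sentence pairs sharing at least two different words) can be sketched as follows. This is a minimal illustration, not the Tesserae implementation: the function names `tokens` and `candidate_pairs` are hypothetical, matching is done on exact lowercase word forms only, and the sample strings are toy input rather than quotations from either poem.

```python
import re


def tokens(sentence):
    """Lowercase a sentence and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def candidate_pairs(source_sentences, target_sentences, min_shared=2):
    """Return (i, j, shared_words) for each sentence pair that shares
    at least `min_shared` distinct words (assumption: exact-form matching)."""
    target_tokens = [tokens(t) for t in target_sentences]
    pairs = []
    for i, source in enumerate(source_sentences):
        source_tok = tokens(source)
        for j, target_tok in enumerate(target_tokens):
            shared = source_tok & target_tok
            if len(shared) >= min_shared:
                pairs.append((i, j, sorted(shared)))
    return pairs


# Toy input, not real lines from Lucan or Vergil.
source = ["bella arma virum", "per campos"]
target = ["arma virum cano", "litora campos"]
print(candidate_pairs(source, target))
```

Only the first toy pair is kept, since the second pair shares a single word. A list produced this way is what the human readers would then grade from 1 to 5.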
Learning Relevant Features
Earlier work showed that high-ranked parallels could be distinguished from the others with modest accuracy using only word frequency, distance between words, and the presence of exact form matching versus differently-inflected forms of the same word (Gawley et al. 2012). Nevertheless, others have recommended more sophisticated approaches to this problem (Bamman and Crane 2008). Here we consider an expanded feature set including bi-gram frequency, frequency of individual words, character-level n-grams and edit distances. Our goal is to learn relevant combinations of features in the presence of often incomplete data.
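Two of the features named above, character-level n-grams and edit distance, can be illustrated with a short sketch. The abstract does not say over which units these features are computed or how they are normalized, so the choices here (trigram Jaccard overlap, plain Levenshtein distance) and the function names are assumptions made for illustration.

```python
def char_ngrams(text, n=3):
    """Return the set of character n-grams in a string."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_jaccard(a, b, n=3):
    """Jaccard overlap of the character n-gram sets of two strings."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    union = grams_a | grams_b
    return len(grams_a & grams_b) / len(union) if union else 0.0


def edit_distance(a, b):
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Two inflected forms of the same Latin noun differ by a small edit distance.
print(edit_distance("arma", "armis"), ngram_jaccard("arma", "armis"))
```

Features of this kind give a graded measure of similarity between differently-inflected forms, where the earlier feature set recorded only whether the forms matched exactly.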
Recent work by members of our team has developed new methods for tuning machine learning with support vector machines (Scheirer et al. 2012) and random forests (Xiong et al. 2012). Random forest is of particular interest, providing robust feature selection that shows promise for literary analysis (Tabata 2012). The problem of missing data is prevalent in all areas of literary study, but is not well addressed by existing algorithms in common use by digital humanists. This is especially true for ancient texts, where we often find a significant gap in the manuscript tradition. Using principled strategies for imputation and marginalization, we reduce the impact of missing data on the results.
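The abstract does not specify which imputation strategy is used. As a rough illustration of the general idea only, the sketch below applies column-mean imputation, the simplest common form, to a feature table in which missing values are marked `None`. The function name and the data layout are assumptions, and this is not presented as the authors' method.

```python
def impute_column_means(rows):
    """Replace each None with the mean of the observed values in its column.
    A column with no observed values falls back to 0.0 (an arbitrary choice)."""
    n_cols = len(rows[0])
    means = []
    for col in range(n_cols):
        observed = [row[col] for row in rows if row[col] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[col] if value is None else value
             for col, value in enumerate(row)]
            for row in rows]


# Each row is one sentence pair; each column is one feature.
features = [[1.0, None],
            [3.0, 4.0],
            [None, 8.0]]
print(impute_column_means(features))
```

Marginalization takes the other route: rather than filling a gap with an estimate, the model averages over the values the missing feature could take.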
Results and Implications
Our ability to learn the difference between high-ranked parallels (ranks 4 & 5) and low-ranked parallels (ranks 1 & 2) for Bellum Civile and the Aeneid is strong: random forest achieves an average AUC score between 82% and 83%, while linear SVMs yield an average score of 81.5%. This suggests that quantifiable patterns do exist across allusions, which can be captured algorithmically. In this ongoing research we seek a more successful model of literary significance that will allow our software to put interesting allusions at the top of the list; at the same time, we hope it will also cast new light on the underlying structures of our experience of literature.
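For readers unfamiliar with the AUC measure, the sketch below shows how it can be computed for this kind of task. It assumes that rank-3 parallels are set aside, which the high/low split implies but the text does not state, and it uses made-up classifier scores. The function names are hypothetical.

```python
def binarize(ranked_scores):
    """Map (annotator_rank, classifier_score) pairs to labels and scores:
    ranks 4-5 become 1, ranks 1-2 become 0, rank 3 is dropped (assumption)."""
    labels, scores = [], []
    for rank, score in ranked_scores:
        if rank == 3:
            continue
        labels.append(1 if rank >= 4 else 0)
        scores.append(score)
    return labels, scores


def auc(labels, scores):
    """Area under the ROC curve: the probability that a randomly chosen
    positive outscores a randomly chosen negative, with ties counted as half."""
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for five graded parallels.
labels, scores = binarize([(5, 0.9), (3, 0.6), (1, 0.2), (4, 0.4), (2, 0.5)])
print(auc(labels, scores))
```

An AUC of 0.5 corresponds to chance and 1.0 to a perfect ordering, so a score in the low 80s means that a high-ranked parallel is scored above a low-ranked one in roughly four out of five such comparisons.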
An interactive demonstration of the Tesserae allusion detection tool accompanies this poster.
Bamman, D., and G. Crane (2008). The Logic and Discovery of Textual Allusion. LaTeCH.
Coffee, N., J.-P. Koenig, S. Poornima, C. W. Forstall, R. Ossewaarde, and S. Jacobson (2012). Intertextuality in the digital age. Transactions of the American Philological Association, 142(2).
Gawley, J., C. W. Forstall, and N. Coffee (2012). Evaluating the literary significance of text re-use in Latin poetry. DHCS.
Scheirer, W. J., A. Rocha, J. Parris, and T. E. Boult (2012). Learning for meta-recognition. IEEE T-IFS 7.
Tabata, T. (2012). Approaching Dickens’ style through random forests. DH.
Xiong, C., D. Johnson, R. Xu, and J. J. Corso (2012). Random forests for metric learning with implicit pairwise position dependence. ACM SIGKDD.

Conference Info


ADHO - 2013
"Freedom to Explore"

Hosted at University of Nebraska–Lincoln

Lincoln, Nebraska, United States

July 16, 2013 - July 19, 2013

Series: ADHO (8)

Organizers: ADHO