Shared Tasks in Computational Literary Studies: Guideline Creation, Annotation and Text Generation for the Analysis of Narrative Levels

paper, specified "short paper"
  1. 1. Svenja Simone Guhr

    Technische Universität Darmstadt (Technical University of Darmstadt)

  2. 2. Nils Reiter

    Universität zu Köln (University of Cologne)

  3. 3. Sina Zarrieß

    University of Bielefeld

  4. 4. Evelyn Gius

    Technische Universität Darmstadt (Technical University of Darmstadt)

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

With this short presentation we would like to give an update on our ongoing shared tasks for text annotation. In natural language processing, shared tasks are well established and highly productive frameworks for bringing together researchers. Organizers define a research task and the conditions for a competition-like setting in which others can submit their approaches to the task. We believe that shared tasks are a productive way of collaboration for the digital humanities and thus we hope that our presentation encourages others to organize shared tasks. We started to design our shared tasks on the identification of narrative levels in 2017. Here, we report on the successful conclusion of our first shared task on guideline creation and give an outlook on the subsequent task of automated annotation.
The goal of our shared task is the annotation of narrative levels. Automated segmentation of prose texts into meaningful units remains challenging, but would enable more robust and meaningful subsequent processing tasks, such as authorship determination, computation of topics, sentiment analysis, or procedures for determining semantic similarities, because they can be better connected to literary studies theories and assumptions (e.g., the characteristics property of an author might be a certain rhythm in their texts). In addition, any content-related analysis of narrative texts (e.g., about the characters or the plot) greatly benefit from knowledge of the level structure. Narrative levels are segments of prose texts determined by a largely stable expression of their fictional world in terms of space, time, and events, as well as the mediation of the events by a narrator.

First Shared Task on the Systematic Annotation of Narrative Levels
In an initial shared task, which we had announced in a presentation at DH2017 (Reiter et al., 2017), eight teams of linguists, computational linguists and literary scholars developed guidelines for the annotation of narrative levels. These reflect the diverse, heterogeneous theories of narrative levels. (Cf. Gius et al. (2019) for the setting and the guidelines of the initial round and Gius et al. (2021) for the guidelines of the second round and a final overall reflection on the shared task.)
Each of the submitted guidelines had strengths and weaknesses that emerged in the evaluation (Gius et al., 2019; Gius et al., 2021).

Second Shared Task on the Automatic Annotation of Literary Texts and Generation of Training Data
A second shared task with a different focus is currently in preparation. It uses the level-annotated texts from the first shared task as the basis for creating a training and test set for developing automated level recognition methods. In addition to automating level recognition, it also addresses the automated generation of training data, since manual annotation is expensive and time-consuming.
The second shared task will consist of two connected but independently solvable tracks, and it will focus on the English language.

Track One: Automating Generation of Level-Annotated Training Data
Track participants address the task to develop procedures for the automatic generation or synthesis of texts that are organized in terms of different narrative levels. We expect participants to experiment with different rule-, template- and language-model/AI-based techniques to assemble texts that display typical patterns of narrative level changes. The resulting texts will be used as training data for a BERT-based baseline system that automatically learns to detect narrative levels. Hence, the objective of narrative level generation is not primarily to produce high-quality, aesthetically pleasing text, but rather to obtain data of sufficient quantity and quality for supporting machine learning techniques in narrative level detection.

Track Two: Automatic Detection of Narrative Levels in Literary Prose
The second track builds on experiments on automatic detection of narrative level boundaries using a pre-trained BERT-model to detect artificially generated level boundaries that we conducted in 2021 (Reiter et al., 2022).
Participants will use annotated training data (some of which will be generated in the first track) to develop systems for automatic detection of narrative levels in texts.

The evaluation of the first track will be done via the performance of the BERT-based recognition model, using the generated data as training data. Thus, the generated texts are measured on whether they contain well-recognizable narrative levels that are useful for training. The second track will be evaluated using manually annotated gold data from the first shared task in 2020.


Devlin, J., Chang, M.-W., Lee, K., Toutanova, K. (2019). BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. DOI 10.48550/arXiv.1810.04805.

Gius, E., Willand, M., Reiter, N. (2021). On Organizing a Shared Task for the Digital Humanities – Conclusions and Future Paths.
Journal of Cultural Analytics, 6(4). DOI 10.22148/001c.30697.

Gius, E., Reiter, N., Willand, M. (2019). A Shared Task for the Digital Humanities Chapter 2: Evaluating Annotation Guidelines.
Journal of Cultural Analytics, 4(3). DOI 10.22148/16.049.

Reiter, N., Sieker, J., Guhr, S., Gius, E., Zarrieß, S. (2022). Exploring Text Recombination for Automatic Narrative Level Detection.
Proceedings of the 13th Conference on Language Resources and Evaluation (LREC 2022).

Reiter, N., Gius, E. Strötgen, J., Willand, M. (2017). A Shared Task for a Shared Goal: Systematic Annotation of Literary Texts.
Book of Abstracts. DH 2017. Montreal, Canada: ADHO.

Reiter, N., Willand, M., Gius, E. (2019). A Shared Task for the Digital Humanities Chapter 1: Introduction to Annotation, Narrative Levels and Shared Tasks.
Journal of Cultural Analytics, 4(3), DOI 10.22148/16.048.

Ryan, M.-L. (1991). Possible Worlds,
Artificial Intelligence, and Narrative Theory. Bloomington: Indiana University Press.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website:

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO