Rule-based Speaker Identification for Speech, Thought and Writing in German Literary Texts

Henny Sluyter-Gäthje

University of Potsdam

1. Introduction
The analysis of characters and their interactions is fundamental to the study of storyworlds created in literary texts. A character’s voice in a storyworld can be expressed through speech, thought or writing (STW), and its representation can take different forms depending on how faithful it is to the original utterance (Genette, 1983: 171-173). Four types can be distinguished: direct (most faithful), indirect, reported (least faithful) and free indirect (a mixture of direct and indirect) STW.
Automatic processing of storyworlds builds on the identification of STW units and their attribution to the characters who produce them. While successful approaches to recognizing STW units exist, most speaker attribution systems are limited to direct or indirect speech. In this work, we develop rule-based speaker identification systems for German literary texts that attribute not only speech but also thought and writing, and that cover reported and free indirect representations in addition to direct and indirect ones.
2. Related Work
The task of speaker attribution can be divided into two subtasks: the identification of speakers (finding the textual mention of a speaker) and the resolution of speakers (resolving the textual mention to a speaker entity). This work is concerned with speaker identification. Early approaches to speaker attribution mostly relied on pattern matching (e.g. Krestel et al., 2008). Elson and McKeown (2010) presented a first machine learning (ML) approach, which formed the basis for follow-up work (direct: O’Keefe et al., 2012; He et al., 2013; Yeung and Lee, 2017; Ek et al., 2018; indirect: Pareti et al., 2013; Newell et al., 2018). Krug et al. (2016) experimented with a rule-based approach to attributing direct speech units in German literary texts, which outperformed their ML approaches. Similarly, Muzny et al. (2017) built a state-of-the-art system for English literature that attributes speakers of direct speech in a rule-based way.
3. Approach
This work builds on the approaches of Krug et al. (2016) and Muzny et al. (2017) by adapting and extending the rules they presented. Additionally, we formulate new rules. All rules are compiled, manually evaluated and improved iteratively with the help of the Corpus Redewiedergabe (Brunner et al., 2020a). For each representation type (direct, indirect, reported and free indirect) we build one system with a different set and a different order of rules; a rule can only be applied once. Similar to related work, our systems rely heavily on linguistic annotations (see Figure 1) and on predefined word lists (e.g. to identify animate nouns). A final evaluation was performed on a held-out test set extracted from the Corpus Redewiedergabe. The full pipeline of the systems, including the recognition of STW units (Brunner et al., 2020b), is shown in Figure 1. The systems are publicly available alongside an extensive description of the rules.
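The idea of one ordered rule set per representation type can be illustrated with a minimal Python sketch. It is not the published implementation: the `Token` and `STWUnit` classes, the two rules, the word lists and the rule orders are hypothetical stand-ins chosen for illustration, and the annotations are assumed to come from an upstream tagger and parser.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Token:
    text: str
    pos: str   # part-of-speech tag, e.g. "NOUN", "PROPN", "VERB"
    dep: str   # dependency relation to the head, e.g. "nsubj"
    head: int  # index of the head token within the sentence


@dataclass
class STWUnit:
    stw_type: str        # "direct", "indirect", "reported" or "free_indirect"
    tokens: List[Token]  # annotated sentence containing the unit and its frame


# Hypothetical word lists; the real systems use their own predefined lists.
SPEECH_VERBS = {"sagte", "dachte", "schrieb", "rief"}
ANIMATE_NOUNS = {"Mutter", "Vater", "Lehrer", "Frau", "Mann"}

# A rule returns the index of the speaker mention, or None if it does not fire.
Rule = Callable[[STWUnit], Optional[int]]


def subject_of_speech_verb(unit: STWUnit) -> Optional[int]:
    """Pick the grammatical subject of a speech, thought or writing verb."""
    for i, tok in enumerate(unit.tokens):
        if tok.dep == "nsubj" and unit.tokens[tok.head].text.lower() in SPEECH_VERBS:
            return i
    return None


def first_animate_mention(unit: STWUnit) -> Optional[int]:
    """Fallback: pick the first proper noun or listed animate noun."""
    for i, tok in enumerate(unit.tokens):
        if tok.pos == "PROPN" or tok.text in ANIMATE_NOUNS:
            return i
    return None


# One system per representation type, each with its own rule set and order
# (the orders shown here are assumptions, not the published ones).
SYSTEMS: Dict[str, List[Rule]] = {
    "direct": [subject_of_speech_verb, first_animate_mention],
    "indirect": [subject_of_speech_verb],
}


def identify_speaker(unit: STWUnit) -> Optional[str]:
    """Try each rule of the matching system once, in order; first hit wins."""
    for rule in SYSTEMS.get(unit.stw_type, []):
        idx = rule(unit)
        if idx is not None:
            return unit.tokens[idx].text
    return None


# "Komm her, sagte die Mutter." (punctuation omitted for brevity)
example = STWUnit("direct", [
    Token("Komm", "VERB", "ccomp", 2),
    Token("her", "ADV", "advmod", 0),
    Token("sagte", "VERB", "root", 2),
    Token("die", "DET", "det", 4),
    Token("Mutter", "NOUN", "nsubj", 2),
])
print(identify_speaker(example))
```

In this sketch the output is a textual mention only; mapping that mention to a character entity would be the separate resolution step.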

Figure 1: Pipeline of the speaker identification systems for the annotation of raw text.

4. Results

Author                            STW type       Performance range  STW medium                Domain      Language
--------------------------------  -------------  -----------------  ------------------------  ----------  --------
Pareti et al. (2013)              Direct         85–91              Speech                    News        English
                                  Indirect       74–79
                                  Mixed          65–81
Krug et al. (2016)                Direct         78.4               Speech                    Literature  German
Muzny et al. (2017)               Direct         76–85              Speech                    Literature  English
This work (gold STW annotations)  Direct         63.91              Speech, Thought, Writing  Literature  German
                                  Indirect       82.2
                                  Reported       71.38
                                  Free indirect  50.0

Figure 2: Comparison of the accuracies of speaker attribution systems that were used in setups comparable to this work. Accuracy ranges are indicated as some systems were applied to different data sets with varying success. Maximum values are marked in bold.

As shown in Figure 2, our systems achieve the best performance for attributing indirect, reported and free indirect STW. The direct system could be improved, for example in its handling of conversational patterns. The full pipeline achieves comparable performance.
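For readers who want to reproduce a per-type comparison of this kind, the following Python sketch computes accuracy separately for each STW type. It assumes that accuracy is the share of STW units whose identified speaker mention matches the gold annotation; the function name and the toy records are hypothetical and are not taken from the Corpus Redewiedergabe.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# (STW type, gold speaker mention, predicted speaker mention or None)
Record = Tuple[str, str, Optional[str]]


def accuracy_by_type(records: Iterable[Record]) -> Dict[str, float]:
    """Return the percentage of correctly identified speakers per STW type."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for stw_type, gold, predicted in records:
        total[stw_type] += 1
        if predicted == gold:
            correct[stw_type] += 1
    return {t: 100.0 * correct[t] / total[t] for t in total}


# Toy records for illustration only; these are not results from this work.
toy_records = [
    ("direct", "Mutter", "Mutter"),
    ("direct", "Anna", "Anna"),
    ("direct", "Vater", "Vater"),
    ("direct", "Lehrer", None),
    ("indirect", "er", "er"),
]
print(accuracy_by_type(toy_records))  # {'direct': 75.0, 'indirect': 100.0}
```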
5. Future Work
The pipeline itself could be improved (e.g. by extending the predefined word lists), and the systems could be tested on and eventually adapted to another domain. For comparative purposes, neural networks that use semantic word representations could be trained for the task of speaker identification. Finally, the systems could be extended to also resolve speakers. In their current form, the systems can already be used for analyses in the field of Computational Literary Studies, e.g. to address gender-related research questions (cf. Schumacher and Flüh, 2020).

Bibliography

Akbik, A., Vollgraf, R. and Blythe, D. (2018). Contextual String Embeddings for Sequence Labeling. In 27th International Conference on Computational Linguistics. COLING 2018. Santa Fe, New Mexico, USA: Association for Computational Linguistics, pp. 1638–49.

Brunner, A., Engelberg, S., et al. (2020a). Corpus REDEWIEDERGABE. In Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC’20). LREC. Marseille, France: European Language Resources Association, pp. 803–12.

Brunner, A., Duyen, N., et al. (2020b). To Bert or Not to Bert–Comparing Contextual Embeddings in a Deep Learning Architecture for the Automatic Recognition of Four Types of Speech, Thought and Writing Representation. In Proceedings of the 16th Conference on Natural Language Processing (KONVENS 2020). Konvens. Zurich, Switzerland.

Ek, A. et al. (2018). Identifying Speakers and Addressees in Dialogues Extracted from Literary Fiction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). LREC. Miyazaki, Japan.

Elson, D. and McKeown, K. (2010). Automatic Attribution of Quoted Speech in Literary Narrative. In Twenty-Fourth AAAI Conference on Artificial Intelligence. AAAI. AAAI Press, pp. 1013–9.

Genette, G. (1990). Narrative Discourse: An Essay in Method. 1. publ., 4. print. Ithaca: Cornell University Press.

He, H., Barbosa, D. and Kondrak, G. (2013). Identification of Speakers in Novels. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics. Sofia, Bulgaria: Association for Computational Linguistics, pp. 1312–20.

Krestel, R., Bergler, S. and Witte, R. (2008). Minding the Source: Automatic Tagging of Reported Speech in Newspaper Articles. In Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC’08). Marrakech, Morocco: European Language Resources Association.

Krug, M. et al. (2016). Attribuierung Direkter Reden in Deutschen Romanen Des 18.-20. Jahrhunderts. Methoden Zur Bestimmung Des Sprechers Und Des Angesprochenen. In DHd 2016, Modellierung - Vernetzung - Visualisierung, Die Digital Humanities Als Fächerübergreifendes Forschungsparadigma, Konferenzabstracts. DHd. Leipzig, Germany, pp. 124–30.

Muzny, F. et al. (2017). A Two-Stage Sieve Approach for Quote Attribution. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics. Valencia, Spain: Association for Computational Linguistics, pp. 460–70.

Newell, C., Cowlishaw, T. and Man, D. (2018). Quote Extraction and Analysis for News. In Proceedings of KDD Workshop on Data Science Journalism and Media (DSJM). New York, NY, USA: Association for Computing Machinery.

O’Keefe, T. et al. (2012). A Sequence Labelling Approach to Quote Attribution. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Jeju Island, Korea: Association for Computational Linguistics, pp. 790–9.

Pareti, S. et al. (2013). Automatically Detecting and Attributing Indirect Quotations. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing. Seattle, Washington, USA: Association for Computational Linguistics, pp. 989–99.

Schumacher, M. and Flüh, M. (2020). M*w Figurengender Zwischen Stereotypisierung Und Literarischen Und Theoretischen Spielräumen: Genderstereotypen Und -Bewertungen in Der Literatur Des 19. Jahrhunderts. In DHd 2020, Spielräume, Digital Humanities Zwischen Modellierung Und Interpretation, Konferenzabstracts. Paderborn, Germany, pp. 162–6.

Sennrich, R., Volk, M. and Schneider, G. (2013). Exploiting Synergies between Open Resources for German Dependency Parsing, Pos-Tagging, and Morphological Analysis. In Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013. Shoumen, Bulgaria: INCOMA Ltd, pp. 601–9.

Yeung, C. Y. and Lee, J. (2017). Identifying Speakers and Listeners of Quoted Speech in Literary Works. In Proceedings of the Eighth International Joint Conference on Natural Language Processing. Taipei, Taiwan: Asian Federation of Natural Language Processing, pp. 325–9.

Conference Info

ADHO - 2022

"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website: https://dh2022.adho.org/

Series: ADHO (16)

Organizers: ADHO
