Quantifying narrative perspective in Ancient Greek: Narrator language and character language in Thucydides

paper, specified "long paper"
  1. 1. Leopold Hess

    Radboud University

  2. 2. Corien Bary

    Radboud University

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

Narrative perspective is the fascinating, but poorly understood in linguistic terms, phenomenon whereby literary texts often present events through the eyes, or minds, of a character in the story. (1), from Jane Austen’s Emma, for example, is presented from within Emma’s consciousness:
(1) The hair was curled, and the maid sent away, and Emma sat down to think and to be miserable. – It was a wretched business, indeed! – Such an overthrow of everything she had been wishing for. – Such a development of everything most unwelcome! – Such a blow for Harriet! – That was the worst of all.
Narrative perspective is even more difficult to investigate in texts written in ancient languages, such as Ancient Greek, for at least two important reasons: one, the absence of the stylistic device of Free Indirect Discourse which is one the main ways of manipulating narrative perspective in modern literature (Bary and Maier, 2014); two, the impossibility of querying native speaker intuitions about fine nuances of meaning and use. In response to these difficulties, in this paper we present a quantitative approach to studying narrative perspective in the Attic historian Thucydides, an author who today is still highly esteemed for his dramatic uses of perspective shifts. His manipulation of perspective is perceived as very subtle and nuanced and classicists up to today have tried to get a grip on it. (Recent studies include Bakker, 1997; Gribble, 1998; Grethlein, 2013a, 2013b; Allan 2013, 2018.) We provide a new quantitative approach to this question based on statistical analyses of the distribution of vocabulary in direct speech reports (“character text”) and outside of such reports (“narrator text”). The grounding of narrative perspective in the use of language at the level of lexical choice has been observed in narratology and (text) linguistics (cf. Fludernik, 1993; Sanders, 1994; Dancygier, 2012; Eckardt, 2015, to mention but a few important studies), but quantitative studies are rare. The most important contribution for Ancient Greek in this respect is constituted by de Jong’s (2001 and 2004) seminal narratological studies of Homer. As one element of her analysis, de Jong contrasts “character language” with “narrator language” in the Homeric epics. “Character language” comprises expressions that appear predominantly in speeches and rarely in narrator text. They are typically charged, evaluative or emotive, words. Their scarcity outside speeches contributes to the impression of Homer’s narrator as impersonal and objective. When words from character language do - infrequently - appear in narrator text, they still are to be interpreted as conveying a character’s perspective, and they are essential to the creation of certain narrative effects. This may happen in indirect discourse, as ἀλείτης ‘the sinner’ in (2), but also outside speech representation, as ἐνηής ‘gentle’ in (3):
(2) φάτο γὰρ τίσεσθαι
‘he (Menelaos) thought to himself that he would take revenge upon the
Homer, Iliad 3.28

(3) κλαίοντες δ᾿ ἑτάροιο
ἐνηέος ὀστέα λευκὰ ἄλλεγον
‘weeping, they (the Greeks) collected the white bones of their
gentle companion Patroclus’
Homer, Iliad 23.252-3

Both ἀλείτης ‘the sinner’ and ἐνηής ‘gentle’ are to be evaluated from the character’s rather than narrator’s perspective, as de Jong (2004) argues on the basis of their distribution in the text.
We apply a similar analysis to Thucydides. However, we proceed in a highly automated way: rather than deciding first which words are potentially interesting evaluative and emotive words and then counting their occurrences in narrator and character text (as de Jong did), we identify character language by looking at the relative frequencies of all words within and outside character speeches and identifying those whose distribution is the most skewed. They need not correspond only to highly charged vocabulary that a narratologist conducting a manual analysis would identify beforehand. This procedure makes it possible to uncover even subtler perspectival effects achieved by the narrator with the use of expressions that could intuitively seem entirely ‘descriptive’ or non-perspective-sensitive.
For our analysis we preprocessed the text in the following way. We created a lemmatized version of the text, so that we could retrieve frequencies of lemmas rather than inflected wordforms. The lemmatization was done with the help of GLEM, operating by combination of lexicon look-up and memory-based learning, which has been found to out-perform, for Ancient Greek, lemmatizers using only one of these components (Bary et al., 2017). We divided the text of Thucydides’ Histories into separate files corresponding to chapters in modern editions, and segregated those files into three subsets: CT - character text, containing passages of direct speech, QT containing (purported) quotations made by Thucydides from existing documents and literary sources, and NT - all the rest, i.e. narrator text. We disregarded QT as passages in which lexical choice was not in Thucydides’ full control. (CT, by contrast, contains speeches which, even if based on real utterances, were written up and stylized by the historian.) Splitting
Histories into separate chapter-files and two subsets meant treating the text of Thucydides as a corpus divided into two sub-corpora, which allowed the application of corpus-like methods.

Our investigation proceeded in two steps. First, we calculated relative frequencies of all lemmas in the sub-corpora and compared those that occurred in both, looking to identify those that are importantly more frequent in CT than NT, but do sometimes appear in NT, as these would be, by hypothesis, the character language words that may contribute to narrative perspective. We assessed the differences in distribution of lemmas between CT and NT using log-likelihood ratio (cf. Dunning, 1993; Rayson and Garside, 2000) and ranked them according to how strongly skewed the distribution was in favor of CT. Lemmas that exhibit a most skewed distribution can be called “character language” words; they are those words that appear predominantly in character speech, and as such they are the precise object of our interest here. We then proceeded to categorize and analyze the NT occurrences of the top ten of highest-ranked lemmas (excluding proper names, function words, extremely rare and extremely frequent words) focusing on their role with respect to narrative perspective in the textual context in which they appear.
Table 1: Top 10 character language lemmas (content words only)






dikaios ‘just’



adikeō ‘do wrong’



agathos ‘good’



khrē ‘should be’



kindunos ‘danger’



amunō ‘defend’



aretē ‘virtue’



isos ‘equal’



aiskhros ‘shameful’



paskhō ‘suffer’


Second, we hypothesized that character language lemmas may cluster together in the text in passages that are especially perspectivally rich or dramatic. To test that we treated each chapter of NT as an individual document and ranked them based on how many occurrences of most salient character language lemmas they contain, relative to size (a set of such lemmas was identified through application of additional filters excluding extremely rare and extremely frequent words; additionally the lemmas were weighted based on the LL ratio of their CT vs NT frequency). We then studied the content of the highest-ranking chapters with respect to their narrative mode and perspectival effects.
We found that character language lemmas in NT (i.e. in Thucydides narrator text) tend to occur predominantly in indirect reports, both at the level of individual occurrences and chapters - this is to be expected, as indirect reports attribute thoughts and words to the perspective of a character rather than that of the narrator. More importantly, we also found that where character language appears in NT outside of reportative contexts it is also most often used to express a perspective or mode different than that of the default objective narrator - either the perspective of a character (in context that can be described as “focalized” in narratological terms) or that of Thucydides commenting on the events or on general truths. Furthermore, we found that our automated and quantitative approach allowed us to identify as character language words that would not
prima facie stand out as evaluative or subjective in any way, but are in fact used by Thucydides in a perspectivally “charged” way. One prominent example is κίνδυνος
kindynos, meaning ‘danger’, as in the following passage, where it refers to Nikias’ reasoning and therefore his perception of potential dangers.

(4) ...ὁ δὲ τὰ κατὰ τὸ στρατόπεδον διὰ φυλακῆς μᾶλλον ἤδη ἔχων ἢ δι᾿ ἑκουσίων
κινδύνων ἐπεμέλετο.
[Nikias] attended to the affairs of his army, keeping it from this time on the defensive to avoid any unnecessary
Thuc. 7.8.3

In conclusion, we saw that quantitatively identifiable “character language” contributes to narrative perspective in Thucydides text, and our method allowed us to identify otherwise elusive aspects of the historian’s narrative style. Moreover, our findings are important for linguistic theories of perspective-sensitivity, suggesting that it may be a matter of pragmatics and patterns of use at least as much as of semantics.
As much as being a study of Thucydides, this paper provides a proof of concept for a method of addressing narratological questions with the use of quite simple, but powerful quantitative corpus-based techniques.

Allan, R. (2013). History as presence. Time, tense and narrative modes in Thucydides. In Tsakmakis, A. and Tamiolaki, M. (eds)
Thucydides between History and Literature, Berlin, Boston: De Gruyter.

Allan, R. (2018). Herodotus and Thucydides: distance and immersion. In van Gils, L. W., de Jong, I. and Kroon, C. H. M. (eds)
Textual Strategies in Greek and Latin War Narrative, Leiden: Brill.

Bakker, E. (1997). Verbal aspect and mimetic description in Thucydides. In Bakker, E. (ed)
Grammar as Interpretation: Greek Literature in its Linguistic Contexts, Leiden: Brill.

Bary, C. and Maier, E. (2014). Unembedded Indirect Discourse.
Proceedings of Sinn und Bedeutung 18.

Bary, C., Berck, P. and Hendrickx, I. (2017). A Memory-Based Lemmatizer for Ancient Greek.
Proceedings of the 2nd International Conference on Digital Access to Textual Cultural Heritage.

Dancygier, B. (2012).
The Language of Stories: a Cognitive Approach, Cambridge: Cambridge University Press.

Dunning, T. (1993). Accurate methods for the statistics of surprise and coincidence,
Computational Linguistics 19.1.

Eckardt, R. (2015). The Semantics of Free Indirect Discourse: How Texts Allow Us to Mind-read and Eavesdrop, Brill.
Fludernik, M. (1993).
The Fiction of Language and the Languages of Fiction, London, New York: Routledge.

Grethlein, J. (2013a).
Experience and Teleology in Ancient Historiography: “future pasts” from Herodotus to Augustine, Cambridge: Cambridge University Press.

Grethlein, J. (2013b). The presence of the past in Thucydides. In Tsakmakis, A. and Tamiolaki, M. (eds)
Thucydides between History and Literature, Berlin, Boston: De Gruyter.

Gribble, D. (1998). Narrator interventions in Thucydides,
The Journal of Hellenic Studies 118.

de Jong, I. (2001).
A Narratological Commentary on the Odyssey, Cambridge: Cambridge University Press.

de Jong, I. (2004).
Narrators and Focalizers. The Presentation of the Story in the Iliad, Bristol: Bristol Classical Pres.

Maier, E. (2015). Quotation and Unquotation in Free Indirect Discourse,
Mind & Language 30.

Rayson, P. and Garside, R. (2000). Comparing corpora using frequency profiling,
Proceedings of CompareCorpora ’00.

Sanders, J. (1994).
Perspective in narrative discourse. PhD thesis. Tilburg University.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

In review

ADHO - 2019

Hosted at Utrecht University

Utrecht, Netherlands

July 9, 2019 - July 12, 2019

436 works by 1162 authors indexed

Series: ADHO (14)

Organizers: ADHO