Genre Classification in English Poetry with Lexical and Prosodic Features

paper, specified "short paper"
  1. 1. Wenyi Shang

    School of Information Sciences, University of Illinois at Urbana-Champaign, United States of America

  2. 2. Ted Underwood

    School of Information Sciences, University of Illinois at Urbana-Champaign, United States of America; Department of English, University of Illinois at Urbana-Champaign, United States of America

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

There has been an abundance of research on the lexical (Bergel et al., 2016) and prosodic (Anttila and Heuser, 2016) features of poetic texts. Some recent attempts such as Šeļa et al. (2020) combine the two feature sets to model the association between poetic meter and meaning. In this project, we ask how the relative importance of lexical and prosodic features vary across different forms and genres. We pose this question by classifying different categories of English poetry with lexical and prosodic features separately. Then we compare the results of the parallel experiments, to ask which categories have a stronger lexical or prosodic character.
We collected all 37,700 English poems that are tagged with a certain “genre” from the Chadwyck-Healey Literature Collections ( (Note that our use of the term “genre” is drawn from our source; scholars might well characterize some of these genres as “forms” or “modes.”) Since many of the 44 total genres only have a few cases, we kept the poems of the top 8 genres only, which consist of 30,704 (81.44%) poems. Next, we trained a lexical classifier and a prosodic classifier and extracted features for each of them. For the lexical classifier, the features were extracted by transforming the texts into an array of word frequencies, and a grid search was conducted to select the algorithm and number of features used for classification. Optimization is achieved when the random forest algorithm and top 500 features are used. As for the prosodic classifier, we used the Python library “Poesy” ( to obtain “foot type”, “feet number”, and “rhyme style” of each poem. Once again, the best performance is achieved when the random forest algorithm is used.
We then conducted classification experiments based on a random train-test split (70% training vs 30% testing) and compared the performances of the two classifiers on an 8-genre classification task and a 2-class classification task, where we classified each of the top 8 forms against all others (e.g., “sonnet” vs “non-sonnet”, “ballad” vs “non-ballad”).

Figure 1. Confusion matrix of models trained with different feature sets (8-genre classification)
In the 8-genre classification experiment, while the lexical and the prosodic classifier have similar overall performances, their results on different genres significantly differ: the lexical classifier classifies ballads and metrical psalms very well, but easily confuses heroic couplets with sonnets and epigrams, while the prosodic classifier classifies heroic couplets very well but easily confuses ballads and metrical psalms with lyrics.

Figure 2. F1 score of the classification experiments of different genres trained with different feature sets (2-class classification)
The results of 2-class classification demonstrate a similar pattern: sonnets and heroic couplets are better distinguished from other forms when using classifiers trained with prosodic features, while ballads and metrical psalms are better distinguished from other forms when using classifiers trained with lexical features. These results preliminarily suggest that different genres are distinguished with different features: while ballads and metrical psalms are distinguished by the diction they use, sonnets and heroic couplets are defined by prosodic features.
We also examined the performance of both classifiers on poems by different authors. Here, we used the “hold-out-one” strategy: we selected all poems by authors with at least 30 poems in the dataset and trained the classifiers with all poems except those written by an author, and tested the results on the poems written by that author.

Figure 3. Performances of classifiers on poems of different poets
In figure 3, most authors of translated works are in the bottom of the figure (their works cannot be easily classified by prosodic features). We also see preliminary evidence that there might be a correlation between poetic prominence and prosodic regularity: the prominent poets mostly appear in the upper half (their works can be well classified by prosodic features). It is likely that the prominent poets continuously stick to certain prosody conventions for each genre, while the translated poems use very different prosody in the same genre, which is understandable as prosodic pattens are easily lost in translation. The accuracy of classification with lexical features does not show a consistent pattern. However, two influential early modern authors, Spencer and Shakespeare, appear in the top-right corner (their works can be very well classified by both lexical and prosodic features), perhaps indicating that they set the standard for following authors in both the diction and prosody used in different genres.
To further explore these observations, in the future we will further investigate the relationships between the accuracy of prediction and factors such as the date of the poem, the poets’ canonicity, and their nationality. Additionally, we will also take word order into consideration and use N-gram in supplement of single-word frequencies.
The findings above already tell us a lot of things about the history of English poetry like the roles of lexical and prosodic features in poetry genre classification, and how such roles differ in different genres. However, the most important potential contribution of this project is to distinguish the concepts of “genre” and “form”: if some poetic categories are best identified by prosody, and others by “content” (represented by lexical features), it might be possible to disentangle “form” from other aspects of “genre.” For example, it is observed that heroic couplets and epigrams use similar diction, what distinguish them as two “genres” is their “forms”; in contrast, ballads and lyrics are in similar “forms”, and they are considered as different “genres” because of the words they use.
This could reinforce the argument of King (2021) that “genre” operates on a larger scale than “form”. Furthermore, in addition to understanding how the transformation of poetic “forms” was related to narratives of culture (Martin, 2012), insights can also be gained on the evolution of the roles of “content” and “form” in defining “genres” of poems by different poets and in different periods and locations. While various follow-up experiments need to be done, there is preliminary evidence that the works of prominent poets tend to be close to a prosodic prototype for a genre.


Anttila, A. and Heuser, R. (2016). Phonological and Metrical Variation across Genres. In Hansson, G. Ó., Farris-Trimble, A., McMullin, K. and Pulleyblank D. (eds),
Proceedings of the 2015 Annual Meetings on Phonology. Washington, D.C.: Linguistic Society of America.

Bergel, G., Howe, C. J. and Windram, H. F. (2016). Lines of Succession in an English Ballad Tradition: The Publishing History and Textual Descent of The Wandering Jew’s Chronicle.
Digital Scholarship in the Humanities,
31(3), 540–562.

King, R. S. (2021). The Scale of Genre.
New Literary History,
52(2), 261–284.

Martin, M. (2012).
The Rise and Fall of Meter: Poetry and English National Culture, 1860–
1930. Princeton: Princeton University Press.

Šeļa, A., Orekhov, B. and Leibov, R. (2020). Weak Genres: Modeling Association Between Poetic Meter and Meaning in Russian Poetry. In Karsdorp, F., McGillivray, B., Nerghes, A. and Wevers, M. (eds),
Proceedings of the Workshop on Computational Humanities Research (CHR 2020). CEUR Workshop Proceedings, pp. 12–31.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

In review

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

361 works by 945 authors indexed

Held in Tokyo and remote (hybrid) on account of COVID-19

Conference website:

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO