Deciphering Lyrical Topics in Music and Their Connection to Audio Feature Dimensions Based on a Corpus of Over 100.000 Metal Songs

poster / demo / art installation
  1. 1. Isabella Czedik-Eysenberg

    Universität Wien (University of Vienna)

  2. 2. Oliver Wieczorek

    Otto-Friedrich-University of Bamberg

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

As audio and text features provide complementary layers of information on songs, a combination of both data types has been shown to improve automatic classification of high-level attributes in music such as genre, mood and emotion (Neumayer and Rauber, 2007; Laurier et al., 2008; Hu and Downie, 2010; Kim, 2010). Multi-modal approaches interlinking these features offer insights into possible relations between lyrical and musical information (see Nichols et al., 2009; McVicar et al., 2011; Yu et al. 2019).
Therefore, we examine the connection between audio features and the lyrical content of metal music by combining automated extraction of high-level music properties and quantitative text analysis on a large corpus of music from this genre. Sound dimensions like loudness, distortion and particularly “hardness”/“heaviness” play an essential role in defining the sound of metal music (Reyes, 2008; Mynett, 2013; Herbst, 2017). Topics typically ascribed to metal lyrics contain sadness, death, freedom, nature, occultism or unpleasant/disgusting objects and are often labeled as brutal, dystopian or satanistic (Bayer, 2016; Cheung and Feng, 2019; Podoshen et al., 2014).
By combining both audio feature and text analysis, we (1) offer a comprehensive overview of the lyrical topics present within the metal genre and (2) are able to address whether or not levels of "hardness” and other music dimensions are associated with the occurrence of brutal (and other) textual topics.
Methodically, our research builds on a previous examination (Czedik-Eysenberg, 2018), in which ratings were obtained for 212 music stimuli by 40 raters each. Based on this music perception study, prediction models for the automatic extraction of high-level audio feature dimensions like “hardness” and “darkness” had been trained via machine learning methods.
Now, in our most recent complementary step, we programmed a web crawler to automatically retrieve metal music lyrics from, resulting in a sample of 152.916 song texts. After cleaning procedures, our subsample included 124.288 texts. We applied latent Dirichlet allocation (Blei et al., 2003) on the remaining subsample to construct a probabilistic topic model. Log-perplexity and log UMass coherence were used as goodness-of-fit measures evaluating topic models ranging from 10 to 100 topics. We then examined the most salient and most typical words for each topic. Additionally, within the topic configuration, we identified a latent dimension where philosophical and brutal topics oppose texts with shallow content.
Results of the investigation for relations between high-level audio features and lyrical topics will be presented and the topic models will be displayed to visitors during the poster session for interactive exploration.


Bayer, G. (2016). Images of human-wrought despair and destruction: Social critique in British apocalyptic and dystopian metal. In Gerd Bayer (ed),
Heavy metal music in Britain. Routledge, pp. 101-22.

Blei, D. M., Ng, A. Y.,
Jordan, M. I. (2003). Latent dirichlet allocation.
Journal of machine Learning research, 3: 993-1022.

Cheung, J. O., & Feng, D. (2019). Attitudinal meaning and social struggle in heavy metal song lyrics: a corpus-based analysis.
Social Semiotics: 1-18.

Czedik-Eysenberg, I., Reuter, C., & Knauf, D. (2018).
Decoding the sound of ‘hardness’ and ‘darkness’ as perceptual dimensions of music.
Book of Abstracts. Graz, pp. 112-13.

Herbst, J.-P. (2017). Historical development, sound aesthetics and production techniques of the distorted electric guitar in metal music.
Metal Music Studies, 3(1): 23-46.

Hu, X., & Downie, J. S. (2010). Improving mood classification in music digital libraries by combining lyrics and audio.
Proceedings of the 10th annual joint conference on Digital libraries. ACM, pp. 159-68.

Kim, Y. E., Schmidt, E. M., Migneco, R., Morton, B. G., Richardson, P., Scott, J.,
Speck, J.A.
Turnbull, D. (2010). Music emotion recognition: A state of the art review.
ISMIR, pp. 937-52.

Laurier, C., Grivolla, J., & Herrera, P. (2008). Multimodal music mood classification using audio and lyrics.
Seventh International Conference on Machine Learning and Applications. IEEE, pp. 688-93.

McVicar, M., Freeman, T., & De Bie, T. (2011). Mining the Correlation between Lyrical and Audio Features and the Emergence of Mood.
ISMIR, pp. 783-88.

Mynett, M. (2013).
Contemporary metal music production. Dissertation, University of Huddersfield.

Neumayer, R., & Rauber, A. (2007). Integration of text and audio features for genre classification in music information retrieval.
European Conference on Information Retrieval, pp. 724-27.

Nichols, E., Morris, D., Basu, S.,
Raphael, C. (2009). Relationships between lyrics and melody in popular music.
ISMIR, pp. 471–76.

Podoshen, J. S., Venkatesh, V., & Jin, Z. (2014). Theoretical reflections on dystopian consumer culture: Black metal.
Marketing Theory, 14(2): 207-27.

Reyes, I. (2008).
Sound, Technology, and interpretation in Subcultures of Heavy Music Production. Dissertation, Pittsburgh University.

Yu, Y., Tang, S., Raposo, F., & Chen, L. (2019). Deep cross-modal correlation learning for audio and lyrics in music retrieval.
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 15(1): 20-21.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.