Utrecht University
Meertens Instituut - Royal Netherlands Academy of Arts and Sciences (KNAW)
Utrecht University
This study explores the relation between repetition and popularity in Dutch historical songs. Previous studies of song lyrics have shown that contemporary songs stand a greater chance of reaching #1 on the Billboard hit chart when their lyrics are more repetitive (Nunes, Ordanini, and Valsesia 2015; Alexander 1996; Ellis et al. 2015; Bradlow and Fader 2001). This preference for repetitive structures is a well-known cognitive bias (see e.g. Rubin 1995), yet little is known about whether similar preferences were at play in historical popular song lyrics. The current study addresses this question by quantitatively modelling the relationship between popularity and various forms of repetition in the lyrics of a large-scale collection of historical songs and, subsequently, relating our findings to observations from the modern era. While we acknowledge that musical repetition also affects a song’s popularity and may therefore influence our results, we focus in this study on textual repetition.
As our object of scrutiny, we investigate a large sample of historical song lyrics from the Dutch Song Database (Nederlandse Liederenbank), hosted at the KNAW Meertens Institute. This database contains descriptions of over 175,000 Dutch songs, from the Middle Ages up to the twentieth century. We focus on material from 1550–1750, since Grijp (1991, 29) characterized the 17th century as the golden age of the Dutch song, and analyze a sample of approximately 22,000 song lyrics with their available metadata. All songs have been encoded in TEI-compliant XML, which provides both metadata (e.g. publication date, geographical location, melody, classification category) and the actual lyrics of a song.
Investigating the interaction between popularity and repetition in historical songs poses two important challenges. The first challenge is to establish a ranking of songs that reflects their contemporary popularity: after all, what is the historical equivalent of the modern Billboard hit chart? We address this problem by approximating an early modern hit chart, in which the popularity of a historical song is defined as the interaction of several variables. Inspired by the studies of Farmer and Lesser (2005a, 2005b, 2013) on the ‘structure of popularity’ of early modern print sources, we define the popularity of a song as the interaction of:
(i) the number of reprints of a song in a fixed time period (measured with Gries’ dispersion measure DP (Gries 2008)); and
(ii) the geographical distribution of a song’s places of print, as a reflection of its either local or widespread popularity (Farmer and Lesser 2005a, 2005b, 2013).
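As a rough illustration of the dispersion component, the sketch below computes Gries’ DP as half the summed absolute difference between observed and expected shares across parts. Treating the "parts" as time bins of reprint counts, and the names `gries_dp`, `counts` and `part_sizes`, are assumptions made for this example; the abstract does not specify how reprints are binned or how DP is combined with geographical distribution.

```python
def gries_dp(counts, part_sizes=None):
    """Gries' DP: 0.5 * sum(|observed share - expected share|) over parts.

    counts     -- occurrences (here: hypothetical reprint counts) per part
    part_sizes -- relative size of each part; equal sizes are assumed if omitted
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one occurrence")
    if part_sizes is None:
        part_sizes = [1] * len(counts)  # assumption: equally sized time bins
    size_total = sum(part_sizes)
    return 0.5 * sum(
        abs(c / total - s / size_total) for c, s in zip(counts, part_sizes)
    )


# Hypothetical reprint counts over four equally long periods.
evenly_reprinted = gries_dp([2, 2, 2, 2])  # spread evenly over time
single_burst = gries_dp([8, 0, 0, 0])      # all reprints in one period
```

Values near 0 indicate reprints spread evenly over the periods, while values approaching 1 indicate reprints concentrated in a single period.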
The second challenge involves measuring repetition. Repetition in text can be measured along various dimensions, such as words, lines, letters, consecutive onsets and n-grams. In this study, we quantitatively estimate a song’s degree of repetitiveness using a variety of information-theoretical measures. More specifically, we employ different methods of text compression, such as (i) the Shannon entropy and (ii) the Lempel-Ziv-Welch (LZW) algorithm. Drawing inspiration from prior work by Alexander (1996), we measure the repetitiveness of words using the Shannon entropy, which estimates the degree of uniformity in a message and can be expressed as follows:

$$H(A) = -\sum_{i=1}^{n} p_i \log p_i$$

where $p_i$ represents the relative frequency of item $i$ in collection $A$ with $n$ items. To control for differences in document length, we use the normalized entropy (Yang et al. 2016):

$$H_{\text{norm}}(A) = \frac{H(A)}{\log n}$$
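The entropy measures described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: whitespace tokenization, base-2 logarithms, reading n as the number of distinct words, and the function names are all assumptions made for the example.

```python
import math
from collections import Counter


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the relative frequencies of the tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def normalized_entropy(tokens):
    """Entropy divided by its maximum, log2(n), for n distinct items."""
    n = len(set(tokens))
    if n < 2:
        return 0.0  # fewer than two distinct items: no uncertainty to normalize
    return shannon_entropy(tokens) / math.log2(n)


# Hypothetical lyric fragments, tokenized on whitespace.
repetitive = "la la la la la la di la".split()
varied = "each word in this line is new".split()
```

Under these assumptions, a fragment that repeats a few words yields a normalized entropy well below 1, whereas a fragment in which every word occurs once yields exactly 1.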
The second compression method is the LZW algorithm (Welch 1984), which, put simply, incrementally encodes 8-bit data (e.g. ASCII characters) as fixed-length 12-bit codes. We compute the LZW score for song fragments as the number of characters in fragment x divided by the number of codes used to encode x.
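A minimal sketch of such an LZW score is given below. Several details are assumptions not stated in the abstract: the text is encoded as UTF-8 bytes (so "characters" are counted as bytes, which coincides with characters for ASCII), the dictionary simply stops growing once all 4096 twelve-bit codes are in use, and the function names are illustrative.

```python
def lzw_codes(data, max_codes=4096):
    """Encode bytes with LZW and return the list of emitted integer codes.

    The table starts with all 256 single-byte strings; 12-bit codes allow at
    most 4096 entries, after which no new entries are added (an assumption).
    """
    table = {bytes([i]): i for i in range(256)}
    codes = []
    current = b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate  # keep extending the longest known match
        else:
            codes.append(table[current])
            if len(table) < max_codes:
                table[candidate] = len(table)
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_score(text):
    """Number of input bytes divided by the number of LZW codes emitted."""
    data = text.encode("utf-8")
    if not data:
        return 0.0
    return len(data) / len(lzw_codes(data))
```

With this definition, a fragment without any repeated substring scores 1.0, and the score rises as more of the text can be encoded by codes for previously seen sequences.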
Using the compression methods described above as predictors, we model the relationship between popularity and repetition with regression models. Situated at the intersection of several disciplines, the current study aims to contribute to literary and computational research, and also offers insight into the human brain’s appreciation of repetitive and non-repetitive patterns.
Bibliography
Alexander, Peter J. 1996. ‘Entropy and Popular Culture: Product Diversity in the Popular Music Recording Industry’. American Sociological Review 61 (1): 171–74.
Bradlow, Eric T., and Peter S. Fader. 2001. ‘A Bayesian Lifetime Model for the “Hot 100” Billboard Songs’. Journal of the American Statistical Association 96 (454): 368–81.
Ellis, Robert J., Zhe Xing, Jiakun Fang, and Ye Wang. 2015. ‘Quantifying Lexical Novelty in Song Lyrics’. ISMIR, 694–700.
Farmer, Alan B., and Zachary Lesser. 2005a. ‘Structures of Popularity in the Early Modern Book Trade’. Shakespeare Quarterly 56 (2): 206–13.
———. 2005b. ‘The Popularity of Playbooks Revisited’. Shakespeare Quarterly 56 (1): 1–32.
———. 2013. ‘What Is Print Popularity? A Map of the Elizabethan Book Trade’. In The Elizabethan Top Ten: Defining Print Popularity in Early Modern England, edited by Emma Smith and Andy Kesson, 19–54. Material Readings in Early Modern Culture. Surrey, England: Routledge. http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=597438&site=ehost-live.
Gries, Stefan Th. 2008. ‘Dispersions and Adjusted Frequencies in Corpora’. International Journal of Corpus Linguistics 13 (4): 403–37. https://doi.org/10.1075/ijcl.13.4.02gri.
Nunes, Joseph C., Andrea Ordanini, and Francesca Valsesia. 2015. ‘The Power of Repetition: Repetitive Lyrics in a Song Increase Processing Fluency and Drive Market Success’. Journal of Consumer Psychology 25 (2): 187–99.
Rubin, David C. 1995. Memory in Oral Traditions: The Cognitive Psychology of Epic, Ballads, and Counting-out Rhymes. New York: Oxford University Press.
Welch, Terry A. 1984. ‘A Technique for High-Performance Data Compression’. Computer 17 (6): 8–19.
Yang, Yu-Guang, Peng Xu, Rui Yang, Yi-Hua Zhou, and Wei-Min Shi. 2016. ‘Quantum Hash Function and Its Application to Privacy Amplification in Quantum Key Distribution, Pseudo-Random Number Generation and Image Encryption’. Scientific Reports 6 (January): 19788.
Hosted at Utrecht University
Utrecht, Netherlands
July 9, 2019 - July 12, 2019
Conference website: http://staticweb.hum.uu.nl/dh2019/dh2019.adho.org/index.html
References: http://staticweb.hum.uu.nl/dh2019/dh2019.adho.org/programme/book-of-abstracts/index.html
Series: ADHO (14)
Organizers: ADHO