Fifty Shades of Twilight: Computationally Comparing Collocations in Twilight and 50 Shades of Grey

paper, specified "short paper"
  1. Barbara Bordalejo

    University of Lethbridge

  2. Joris J. Van Zundert

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)

  3. Julia Neugarten

    Huygens Institute for the History of the Netherlands (Huygens ING) - Royal Netherlands Academy of Arts and Sciences (KNAW)


We present the results of measuring collocation similarity between Twilight (Meyer 2005) and 50 Shades of Grey (James 2011). 50 Shades began as Twilight fanfiction (Brennan and Large 2014). We use these texts as a case study for analyzing the transformative effects of fanfiction on the narratives that fans call “canon”. Tosenberger (2014:17) asserts that “fanfiction is given life by what other spaces don’t allow, it […] fills those spaces with stories for which the canon has neither room nor desire.” Fanfiction is a narrative space for exploring non-normative topics and perspectives. Twilight narrates the romance of a teenage girl and her vampire boyfriend. 50 Shades amplifies the mostly unconsummated sexual tension in Twilight and eliminates the novel’s supernatural elements. In 50 Shades, the male protagonist is dangerous not because he is a vampire, but because of his S/M inclinations. Our challenge is to model and quantify these transformations computationally.

Paris (2016) classifies 50 Shades as “mommy porn” while Twilight has been called “abstinence porn” (Seifert 2008). In its vocabulary and collocations, Twilight seems like a non-explicit model for 50 Shades. To test this, we made an educated guess and initially selected four terms: “soft”, “hard”, “gaze”, and “stare”. We hypothesize that words collocating with “soft” and “hard” differ between the texts: in Twilight, Edward’s skin is hard, while in 50 Shades Christian’s penis is hard. Additionally, we expect the subjects and objects of stares and gazes to differ between the texts, with looks conveying love or longing in Twilight and sexual desire in 50 Shades. For each occurrence of these terms we compute the pointwise mutual information (PMI) of collocated words in a 9-token context window. PMI compares how often two words actually co-occur with how often they would be expected to co-occur given the frequencies of the individual words (Bouma 2009:3). The window size was based on mean sentence length. A baseline for comparison was computed with the same measure for the YA novels Eleanor & Park (Rowell 2012), The Fault in Our Stars (Green 2012), and Shiver (Stiefvater 2009). We used Linguistic Inquiry & Word Count (LIWC; Pennebaker et al. 2015) to calculate, within the PMI results, the percentage of words related to a specific domain as defined by LIWC’s dictionaries. We then compared the percentage of words associated with the selected terms between the books, and compared the LIWC results for the PMI data to the LIWC results for the books as a whole.
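The PMI step described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation used for the study: the function name window_pmi, whitespace tokenization, the symmetric placement of the 9-token window around the target (four tokens on either side), and the estimate of the joint probability as the number of target occurrences whose window contains the word, divided by the number of tokens, are all choices made here for the sake of the example. The score follows the standard definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))).

```python
import math
from collections import Counter


def window_pmi(tokens, target, window=9):
    """Return {word: PMI(target, word)} for words seen near `target`.

    Assumption: the window is centred on the target, so window=9 means
    four tokens to the left and four to the right.
    """
    half = (window - 1) // 2
    n = len(tokens)
    unigram = Counter(tokens)
    pair = Counter()

    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - half), min(n, i + half + 1)
        # Count each context word once per occurrence of the target.
        context = set(tokens[lo:i] + tokens[i + 1:hi])
        context.discard(target)
        for word in context:
            pair[word] += 1

    p_x = unigram[target] / n
    scores = {}
    for word, count in pair.items():
        p_xy = count / n          # assumed estimate of the joint probability
        p_y = unigram[word] / n
        scores[word] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy input invented for illustration; it is not taken from the novels.
tokens = ("her skin was soft and cold . he spoke in a soft voice . "
          "the stone was hard and cold").split()
scores = window_pmi(tokens, "soft")
for word, score in sorted(scores.items(), key=lambda kv: -kv[1])[:5]:
    print(f"{word}\t{score:.2f}")
```

Words that never fall inside a window around the target receive no score at all, which is why a separate decision is still needed about which scored collocates count as significant; the sketch deliberately leaves that threshold open.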

Among the tokens identified by PMI as significantly collocated with the target words, more words from the LIWC category “perception” occur around the term “hard” in 50 Shades than in Twilight (9% vs. 7%). Twilight shows more perception terms around “soft” (15% vs. 12% in 50 Shades). Thus, perceptions are more frequently described as “hard” in 50 Shades and more frequently as “soft” in Twilight. In Twilight, more verbs occurred around “soft” (20%) than in 50 Shades, where 13% of significantly collocated tokens for “soft” were verbs. This suggests that more “soft” actions are taken in Twilight than in 50 Shades, which would fit our hypothesis. In 50 Shades, the word “stare” occurred more frequently near words relating to biological features or processes (9%) than in Twilight (3%). Similarly, “gaze” occurred near words relating to biological processes more often in 50 Shades (7%) than in Twilight (5%). It thus appears that biological processes and parts of the body are often looking and being looked at in both texts, but more frequently in 50 Shades.
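The percentages reported above are shares of collocated tokens that fall into a dictionary category. A minimal Python sketch of that kind of calculation is given below. It is an approximation under stated assumptions: the function category_share, the word set PERCEPTION, and both collocate lists are hypothetical and invented for illustration. The word set is not LIWC’s actual dictionary, and the sketch uses exact, case-insensitive matching only.

```python
def category_share(collocates, category_words):
    """Percentage of collocates that appear in a category word list."""
    if not collocates:
        return 0.0
    hits = sum(1 for word in collocates if word.lower() in category_words)
    return 100.0 * hits / len(collocates)


# Hypothetical stand-in for a dictionary category; NOT LIWC's word list.
PERCEPTION = {"see", "touch", "feel", "cold", "warm", "look"}

# Invented collocate lists for two unnamed texts; not results from the study.
text_a_collocates = ["cold", "skin", "stone", "feel", "marble"]
text_b_collocates = ["feel", "breath", "voice", "touch",
                     "look", "bed", "hand", "floor"]

share_a = category_share(text_a_collocates, PERCEPTION)
share_b = category_share(text_b_collocates, PERCEPTION)
print(f"text A: {share_a:.1f}%  text B: {share_b:.1f}%")
```

Computing the same share once for the collocates of a target word and once for all tokens of a book gives the two figures that are contrasted when PMI-based results are compared with a text as a whole.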

Our analysis confirms that Twilight is non-explicit: it scores 0.01% in LIWC’s sexuality and swearing categories. In the LIWC categories relating to the social and to perception, the texts score similarly. Our results seem to confirm the hypothesis that Twilight can be regarded as the non-explicit counterpart to 50 Shades. As a next step, we intend to examine the difference in gender-related words in the texts: 4% of words in Twilight and 5% in 50 Shades were male-related, with only 1% female-related words in both. Intuitively this makes sense, as both are first-person narratives by female narrators focused on their male love interests. However, male-related words were less frequent in the PMI results for the selected terms than in the texts as a whole.

By combining PMI and LIWC results, we developed a method to compare collocations of specific words between texts. This method is a step towards digital hermeneutics, the possibility of “interpreting with digital machines” (Romele, Severo, and Furia 2020:73). During our presentation, we will present more detailed results and baseline comparisons, consider ways to improve their evaluation, and discuss possible next steps such as an analysis of word embeddings.


Bouma, G. (2009). Normalized (Pointwise) Mutual Information in Collocation Extraction. Von Der Form Zur Bedeutung: Texte Automatisch Verarbeiten / From Form to Meaning: Processing Texts Automatically: Proceedings of the Biennial GSCL Conference 2009. Tübingen: Gunter Narr Verlag, pp. 31–39.

Brennan, J. and Large, D. (2014). ‘Let’s get a bit of context’: ‘Fifty shades’ and the phenomenon of ‘pulling to publish’ in ‘twilight’ fan fiction. Media International Australia, Incorporating Culture & Policy, (152): 27–39 doi: 10.3316/informit.567849615699510. (accessed 9 December 2021).

Green, J. (2012). The Fault in Our Stars. 1st ed. New York: Dutton Books.

James, E. L. (2011). Fifty Shades of Grey. 1st ed. Waxahachie: The Writer’s Coffee Shop.

Meyer, S. (2005). Twilight. London: Atom, Little, Brown Book Group.

Paris, L. (2016). Fifty Shades of Fandom: The Intergenerational Permeability of Twilight Fan Culture. Feminist Media Studies, 16(4): 678–92 doi: 10.1080/14680777.2016.1193297. (accessed 9 December 2021).

Pennebaker, J. W., Boyd, R. L., Jordan, K. and Blackburn, K. (2015). The development and psychometric properties of LIWC2015. University of Texas at Austin. (accessed 9 December 2021).

Romele, A., Severo, M. and Furia, P. (2020). Digital hermeneutics: from interpreting with machines to interpretational machines. AI & Society, 35(1): 73–86 doi: 10.1007/s00146-018-0856-2. (accessed 9 December 2021).

Rowell, R. (2012). Eleanor & Park. London: Orion Books.

Seifert, C. (2008). Bite Me! (Or Don’t): ‘Twilight’ Has Created a New YA Genre: Abstinence Porn. Bitch Media (accessed 9 December 2021).

Stiefvater, M. (2009). Shiver. New York: Scholastic Press.

Tosenberger, C. (2014). Mature Poets Steal: Children’s Literature and the Unpublishability of Fanfiction. Children’s Literature Association Quarterly, 39(1): 4–27 doi: 10.1353/chq.2014.0010. (accessed 9 December 2021).

Conference Info

ADHO - 2022
"Responding to Asian Diversity"

Tokyo, Japan

July 25, 2022 - July 29, 2022

Held in Tokyo and remote (hybrid) on account of COVID-19

Contributors: Scott B. Weingart, James Cummings

Series: ADHO (16)

Organizers: ADHO