"The Mold Thats Branded On M Soul": A Computational Approach to Racialized Voice in Jean Toomer's "Kabnis"

paper, specified "long paper"
  1. Jonathan Dick

    University of Toronto

  2. Adam Hammond

    University of Toronto



Scholars of the Harlem Renaissance have long struggled with the generic status of Jean Toomer’s Cane, particularly as a result of the hybrid forms found in the forty-nine-page closing story, “Kabnis.” In order to narrow the possible field of inquiry, critics have opted to read Cane as an experiment in autobiography, one that either avows or disavows African American identity (Kodat, 2000: 1). What strikes us about such criticism is the way these readings of “Kabnis” inadvertently (or, perhaps, advertently) insist on indexing the text along binary oppositions of black versus white, even though “Kabnis” itself resists such easy racial classification: not only is its eponymous character described in mixed racial epithets (metaphors, for instance, like “lemon face” [Cane, 2011: 111]), but his speech, while occasionally dialectal, straddles the line between an African American Vernacular English (AAVE) register and the register of a standardized English. Because the extent of Kabnis’s racialization therefore depends on how a reader interprets his features and his voice, literature scholars need to establish new methods through which to analyze this text’s phonemic registers.


Computational analysis provides us with one method to measure the reception of race in Toomer’s work. In particular, literature scholars can turn to the developing field of sound studies, using visualization tools such as Sonic Visualiser (Cannam et al., 2010) alongside quantitative software like Gentle and Drift (Ochshorn and Hawkins, 2019; Ochshorn, 2019) to ascertain how readers embody or avoid “doing” what scholars have variously called “black voice” in performances of “Kabnis” (Holmes, 2004). Sonic Visualiser was developed at the Centre for Digital Music, Queen Mary, University of London, and provides musicologists with a flexible apparatus with which to annotate, track, and edit audio recordings. Gentle, a forced aligner, and Drift, a pitch trace exploration tool, are under development by Robert Ochshorn with support from Marit MacArthur’s ACLS Digital Innovation Fellowship; taken together, they yield nonsemantic data from which a MATLAB script can calculate prosodic measures that reflect the salient features of performative speech (MacArthur et al.). Though neither tool can directly map dialect, both can analyze the nonsemantic aspects of speech that AAVE adopts by virtue of its phonology: irregular metrical patterns, higher pitch ranges, and a greater overall tonal expressivity (Rickford, 1999: 5). Applying them to recordings of “Kabnis” then presents critics with the opportunity to witness firsthand not only what a reader might modify when approaching Toomer’s language, but also how they might modify it.
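As a rough illustration of the kind of prosodic measure such a pipeline can produce, the Python sketch below computes a reader's pitch range in octaves from a pitch trace. It is a minimal sketch under stated assumptions, not the MATLAB script mentioned above and not Drift's actual output format: the input is assumed to be a plain list of fundamental-frequency values in Hz with 0 marking unvoiced frames, and the function name `pitch_range_octaves` and all sample values are hypothetical.

```python
import math


def pitch_range_octaves(f0_hz):
    """Return the span between the lowest and highest voiced pitch, in octaves.

    Assumption: f0_hz is a sequence of fundamental-frequency estimates in Hz,
    one per analysis frame, where 0 (or None) marks an unvoiced frame.
    """
    voiced = [f for f in f0_hz if f]  # drop unvoiced frames (0 or None)
    if len(voiced) < 2:
        raise ValueError("need at least two voiced frames")
    # An octave is a doubling of frequency, hence the base-2 logarithm.
    return math.log2(max(voiced) / min(voiced))


# Hypothetical pitch traces for two short excerpts (all values are invented).
northern = [0, 110.0, 120.0, 0, 130.0, 115.0]
southern = [0, 100.0, 150.0, 0, 200.0, 120.0]

print(round(pitch_range_octaves(northern), 3))
print(round(pitch_range_octaves(southern), 3))
```

A raw minimum and maximum are sensitive to pitch-tracking errors, so an analysis of real recordings would likely trim outliers (for example, by using percentiles) before taking the ratio.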

To demonstrate the usefulness of sonic analysis for literary studies, we use as our case study the 2013 audiobook of Cane produced by Dreamscape Media and performed by Sean Crisden. Crisden, a self-identified African American, seems, on first listen, to embody a distinctly racial character. Yet as we progressed through his performance, it became clear that Crisden ultimately avoids the phonemic features distinct to AAVE, resulting in a portrayal of Kabnis that eludes clear raciality. It is difficult to describe the terms of this avoidance without quantitative measures. What prosodic features indicate the affectation of a racially charged voice? What prosodic features indicate the avoidance of a racially charged voice? How can variations in pitch and rhythm be used to advance arguments about identity performance? These ambiguous and deeply difficult questions can begin to be parsed through the literary application of sound studies software.

Results and discussion

To establish a baseline for our analysis, it is important to register subjective responses to Crisden’s performance. When we listened to the Dreamscape audiobook, we recognized that Crisden adopts two distinct registers in characterizing Kabnis: the first is what might be identified as a “Northern” voice, while the second is a “Southern” voice. These two voices are distinct on the page, with Toomer representing the former in a standardized English and the latter in an ambiguous vernacular. Yet Crisden does not fully abide by this textual difference, opting, in his performance, to enunciate “Southern” Kabnis’s words in a standardized English register, even when certain letters are elided on the page. The distinction he makes between the “Northern” and “Southern” voice therefore occurs instead at the level of the nonsemantic. For instance, in our subjective response to the audiobook, we noted Crisden’s “Southern” voice as more expressive than his “Northern” voice; it seems to have a wider pitch range, very few long pauses, and a conversational cadence which sees him playing more with assonance and internal rhyme (e.g. “mold thats branded on m soul” [Cane, 2011: 151]). Likewise, his “Southern” voice exhibits “upspeak” tendencies, a trait many dialectologists associate with vernacular languages (Rickford, 1999: 5).

Qualifying these responses computationally, however, makes clear that such subjective claims are only partially true. As shown in Figure 1, although “Southern” Kabnis has a greater intonation range than “Northern” Kabnis, his expressivity metrics are quite low. While his average pause length of 0.30 seconds might indicate a conversational approach to diction, his words per minute (WPM) and rhythmic complexity suggest an attention to formality and enunciation at odds with the varied metrical patterns inherent in AAVE. AAVE is characterized by phoneme cluster reduction: certain sounds, most particularly consonants, are dropped from the ends or beginnings of words, resulting in a vocalization in which sentence components are elided (e.g. “telling y about” would be pronounced telling y’about). If Crisden abided by this phoneme cluster reduction rule, then his performance would have a higher rhythmic complexity factor overall, as these elisions would result in an unpredictable cadence. Likewise, his WPM would also be higher, since phoneme cluster reduction necessitates quicker speech. While we might therefore subjectively interpret Crisden’s performance of “Southern” Kabnis as demonstrating the phonological features of AAVE, computational analysis reveals that the opposite is in fact true: Crisden, at the nonsemantic level, avoids the phonology of black English by regulating his WPM and by enunciating his words despite their being written without certain end sounds.
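To make the two timing measures discussed above concrete, the following Python sketch derives words per minute and average pause length from word-level timings of the sort a forced aligner produces. It is illustrative only and does not reproduce the prosodic script used for Figure 1: the tuple layout `(text, start_sec, end_sec)`, the 0.1-second pause threshold, the function name `speech_rate_and_pauses`, and every timing value are assumptions made for the example.

```python
def speech_rate_and_pauses(words, min_pause=0.1):
    """Compute words per minute and mean pause length from aligned words.

    Assumptions: `words` is a list of (text, start_sec, end_sec) tuples in
    reading order, and any silence of at least `min_pause` seconds between
    consecutive words counts as a pause. Both conventions are hypothetical.
    """
    if not words:
        raise ValueError("no aligned words given")
    # Measure from the onset of the first word to the offset of the last.
    duration = words[-1][2] - words[0][1]
    if duration <= 0:
        raise ValueError("non-positive duration")
    wpm = len(words) * 60 / duration
    # Silence between the end of each word and the start of the next one.
    gaps = [nxt[1] - cur[2] for cur, nxt in zip(words, words[1:])]
    pauses = [g for g in gaps if g >= min_pause]
    mean_pause = sum(pauses) / len(pauses) if pauses else 0.0
    return wpm, mean_pause


# Invented timings for "telling y about": a carefully enunciated reading
# versus a reading in which the three words run together.
enunciated = [("telling", 0.0, 0.5), ("y", 0.75, 1.0), ("about", 1.25, 1.75)]
elided = [("telling", 0.0, 0.4), ("y", 0.4, 0.5), ("about", 0.5, 0.9)]

for label, reading in (("enunciated", enunciated), ("elided", elided)):
    wpm, mean_pause = speech_rate_and_pauses(reading)
    print(label, round(wpm, 1), round(mean_pause, 2))
```

With these invented timings the run-together reading yields the higher WPM and no measurable pauses, which is the direction of difference the argument above expects from phoneme cluster reduction.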

We can further locate this de-vernacularization at the metrical level by turning to visualization software like Sonic Visualiser. Figure 2 presents a wave-form annotation of stress patterns found in brief excerpts of speech from “Northern” Kabnis and “Southern” Kabnis respectively, in which orange lines indicate unstressed syllables and red lines indicate stressed syllables. In both cases, Crisden alters the metrical pattern inherent in Toomer’s writing to iambicize it, thereby emphasizing syllabic regularity despite the text’s very irregular rhythms. A scansion of the former excerpt as it exists in Cane would, for instance, view “bull-neck” as a spondaic foot and “and a” as pyrrhic; Crisden, however, alters this rhythm by reading “neck” as an unstressed syllable and “and” as a stressed one, giving way to a cadence that is ordered and controlled. He continues this same metrical regulation in the second excerpt by placing a stress on “on,” which leads, in turn, to a line that is forcefully iambicized despite its initial two trochaic substitutions. As with his previous modulation of WPM and articulatory patterns, the consequence of Crisden’s metrical regulation is a sort of phonological standardization, a taming which forces “Southern” Kabnis to speak in no distinctly vernacular fashion. The result is a negotiation of Kabnis’s identity that de-vernacularizes the racial potential encoded into his speech. And what are the implications of such a choice?
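The scansion contrast described above for “bull-neck and a” can be approximated numerically. The Python sketch below encodes each syllable as stressed (`S`) or unstressed (`u`) and reports how often adjacent syllables alternate in stress. This alternation rate is a simple stand-in chosen for illustration, not a measure taken from the analysis itself, and the function name `alternation_rate` is hypothetical; the two stress strings are transcribed from the description in the text.

```python
def alternation_rate(stresses):
    """Fraction of adjacent syllable pairs whose stress values differ.

    `stresses` is a string of "S" (stressed) and "u" (unstressed) marks, one
    per syllable. 1.0 means strict alternation, as in perfectly regular
    iambic or trochaic verse; lower values mean runs of like-stressed
    syllables, such as spondees and pyrrhics.
    """
    if len(stresses) < 2:
        raise ValueError("need at least two syllables")
    if set(stresses) - {"S", "u"}:
        raise ValueError("use only 'S' and 'u'")
    pairs = list(zip(stresses, stresses[1:]))
    return sum(a != b for a, b in pairs) / len(pairs)


# "bull-neck and a": spondee plus pyrrhic as scanned on the page, versus
# the performed reading with "neck" unstressed and "and" stressed.
as_written = "SSuu"
as_performed = "SuSu"

print(alternation_rate(as_written))
print(alternation_rate(as_performed))
```

On this four-syllable fragment the performed reading alternates at every step while the page scansion does not, which is one way of expressing the regularization the paragraph describes. A fuller treatment would need foot boundaries and the surrounding syllables to distinguish iambic from trochaic alternation.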

Figure 1. Prosodic analyses of three excerpts of “Kabnis” through Gentle and Drift: one passage which is distinctly “Northern,” one that is distinctly “Southern,” and one in a neutral Narrator’s voice. Miller’s script calculates twelve prosodic measures, though only six are salient for our purposes; these six measures are listed in the top row. All numbers are rounded to the nearest thousandth for concision’s sake.

Figure 2. Excerpt of wave-form visualizations from Kabnis’s speeches, the former from act 1, the latter from act 5. This excerpt has been annotated in Sonic Visualiser to demonstrate Crisden’s stress patterns.


Though it is impossible, of course, to claim with any authority Crisden’s intentions, the effect of his de-vernacularization reminds scholars of the value judgements that are placed on languages and on the people who speak them. For “there really is little if anything that […] distinguishes racial prejudice and linguistic prejudice,” writes sociolinguist Sonja L. Lanehart (2001: 2): “The people are not separate from the language.” Applied to “Kabnis,” this sentiment suggests that how individuals choose to interpret Toomer’s eponymous character (and how we, as scholars, choose to interpret portrayals of this character) depends on how they view the construct of race as it operates not only within the early twentieth century, but also during the period of interpretation. Perhaps, by rendering “Kabnis” without certain vernacular traits, Crisden is demonstrating a sort of racial prejudice that manifests, covertly, in his linguistic prejudice. Or perhaps, instead, his de-vernacularization is an attempt at equalization: an attempt, in other words, to reduce stigma by demonstrating that the people society often views as deviant are more similar than might once have been thought. That in our subjective analyses we saw in the “Northern” and “Southern” voice a distinct, dialectal difference further bespeaks a culturally codified perception of racial speech: that African American Vernacular is socially thought to “sound” a certain way, compared with “standard” Englishes, these “sounds” made clear through MacArthur et al.’s prosodic measures. Whatever angle a literary critic chooses to take in their argument, computational analysis provides them with both the necessary quantitative data to analyze text-in-performance and a means of reflexively considering their own stance on linguistic difference.


Cannam, Chris, Christian Landone, and Mark Sandler. (2010). Sonic Visualiser: An Open Source Application for Viewing, Analysing, and Annotating Music Audio Files. Developed at the Centre for Digital Music, Queen Mary, University of London.

Holmes, David G. (2004). Revisiting Racialized Voice: African American Ethos in Language and Literature. Southern Illinois University Press.

Kodat, Catherine Gunther. (2000). “To ‘Flash White Light from Ebony’: The Problem of Modernism in Jean Toomer’s Cane.” In Twentieth Century Literature, vol. 46, no. 1: 1-19.

Lanehart, Sonja L. (2001). “State of the art in AAE Research.” In Sociocultural and Historical Contexts of African American English, ed. Sonja L. Lanehart. John Benjamins: 1-20.

Ochshorn, Robert. (2019). Drift: A Pitch Tracking Algorithm. Drift is currently under development with support from Marit MacArthur’s ACLS Digital Innovation Fellowship in 2015-16 and a NEH Digital Humanities Advancement grant in 2018 and 2019.

Ochshorn, Robert, and Max Hawkins. (2019). Gentle: A Robust Yet Lenient Forced Aligner Built on Kaldi. http://lowerquality.com/gentle/. Gentle is currently under development with support from Marit MacArthur’s ACLS Digital Innovation Fellowship in 2015-16 and a NEH Digital Humanities Advancement grant in 2018 and 2019.

Rickford, John R. (1999). African American Vernacular English: Features, Evolution, Educational Implications. Blackwell.

Toomer, Jean. (2011). Cane. Boni & Liveright.

Toomer, Jean. (2013). Cane. Narrated by Sean Crisden, Dreamscape Media. Audiobook.


Conference Info

In review

ADHO - 2019

Hosted at Utrecht University

Utrecht, Netherlands

July 9, 2019 - July 12, 2019

436 works by 1162 authors indexed

Series: ADHO (14)

Organizers: ADHO