Access(ed) Poetry. The Graph Poem Project and the Place of Poetry in Digital Humanities

Chris Tanasescu (MARGENTO); Diana Inkpen; Vaibhav Kesarwani; Brian Paget

Authorship

1. Chris Tanasescu (MARGENTO)

Université d'Ottawa (University of Ottawa)

2. Diana Inkpen

Université d'Ottawa (University of Ottawa)

3. Vaibhav Kesarwani

Université d'Ottawa (University of Ottawa)

4. Brian Paget

Université d'Ottawa (University of Ottawa)

Original URL

https://dh2017.adho.org/abstracts/623/623.pdf

Work text

The Graph Poem Project at the University of Ottawa develops tools for the computational analysis of poetry and applies graph theory and network graph applications to structuring, analyzing, and visualizing poetic corpora. The concept involves generating network graphs (multigraphs) in which the vertices are poems and the edges represent commonalities between poems in terms of subject, diction, form, style, and other criteria. Computational analysis of the network graph can extract useful information about a specific corpus. For instance, identifying cut vertices shows which poems play a crucial role in the connectivity of the network and therefore in the cohesion of the corpus: removing such a poem-node disconnects the network, so the corpus without that poem would become disarticulated and divided. Similarly, cut edges signal connections between poems that are of paramount importance to the connectivity of the entire network. Identifying cliques of nodes, on the other hand, shows how poems within a corpus are clustered, which parts of a specific oeuvre, collection, or anthology are more self-contained than others, and which poems belong together across or within divides such as authorship, school, period, or region. More generally, representing corpora as network multigraphs makes it possible to analyze and visualize both small and significantly large datasets of poems (so far up to hundreds of thousands) and to reach conclusions about particular poems in a corpus, about the corpus as a whole, and about how it compares to other corpora.
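The graph notions mentioned above (cut vertices, cut edges, and cliques) can be illustrated on a toy multigraph. The following is a minimal sketch in plain Python and not the project's own code: the six poems, the feature labels, and the function names are all hypothetical, and the algorithms are the textbook ones (a depth-first search with low-link values for cut vertices and cut edges, and basic Bron-Kerbosch for maximal cliques).

```python
from collections import defaultdict

# Hypothetical toy corpus: each tuple is (poem_a, poem_b, shared_feature).
# Two poems may be linked by several parallel edges (a multigraph).
EDGES = [
    ("P1", "P2", "diction"),
    ("P1", "P2", "meter"),
    ("P2", "P3", "subject"),
    ("P1", "P3", "rhyme"),
    ("P3", "P4", "subject"),
    ("P4", "P5", "diction"),
    ("P4", "P6", "meter"),
    ("P5", "P6", "rhyme"),
]


def cut_vertices_and_edges(edges):
    """Return (cut_vertices, cut_edge_ids); edge ids are indices into `edges`."""
    adj = defaultdict(list)
    for edge_id, (a, b, _feature) in enumerate(edges):
        adj[a].append((b, edge_id))
        adj[b].append((a, edge_id))
    disc, low = {}, {}
    cut_vertices, cut_edges = set(), set()

    # Recursive DFS: fine for a toy graph; a very large corpus would need
    # an iterative version to stay within Python's recursion limit.
    def dfs(node, parent_edge):
        disc[node] = low[node] = len(disc)
        children = 0
        for neighbor, edge_id in adj[node]:
            if edge_id == parent_edge:
                continue  # skip only the edge we arrived by, not its parallels
            if neighbor in disc:
                low[node] = min(low[node], disc[neighbor])
            else:
                children += 1
                dfs(neighbor, edge_id)
                low[node] = min(low[node], low[neighbor])
                if low[neighbor] > disc[node]:
                    cut_edges.add(edge_id)
                if parent_edge is not None and low[neighbor] >= disc[node]:
                    cut_vertices.add(node)
        if parent_edge is None and children > 1:
            cut_vertices.add(node)

    for start in list(adj):
        if start not in disc:
            dfs(start, None)
    return cut_vertices, cut_edges


def maximal_cliques(edges):
    """Return all maximal cliques as frozensets (basic Bron-Kerbosch)."""
    neighbors = defaultdict(set)
    for a, b, _feature in edges:
        if a != b:  # parallel edges collapse; self-loops are ignored
            neighbors[a].add(b)
            neighbors[b].add(a)
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & neighbors[v], x & neighbors[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(neighbors), set())
    return cliques


if __name__ == "__main__":
    vertices, edge_ids = cut_vertices_and_edges(EDGES)
    print("cut vertices:", sorted(vertices))
    print("cut edges:", [EDGES[i] for i in sorted(edge_ids)])
    print("maximal cliques:", sorted(sorted(c) for c in maximal_cliques(EDGES)))
```

In this toy corpus the two triangles of poems are held together by a single "subject" link, so that link is the only cut edge and its two endpoints are the cut vertices. The double link between P1 and P2 is not a cut edge, because removing one of the parallel edges leaves the other in place; this is why the search tracks edge identifiers rather than parent vertices.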

This is one of the first respects in which the issue of access becomes critically important for our research, for developing and testing our tools, and for virtually any contemporary digital project in poetry: access to databases and the ability to convert existing files into formats that are compatible with computational processing and analysis. The paper will focus on existing databases, review previous work in the field and analyze the kind of data it has employed, and examine the premises and prospects for big data and/or data-intensive research in poetry (one of the main tenets of our project) as well as the arguable absence of such approaches in the published research in the area so far. Terms like “crawl” and “rip” as applied to poetry available online, and issues related to digitizing and/or computationally analyzing poetry in print, will also be placed under close scrutiny. We will also report on our own solutions and results and compare them to those of other projects in digital humanities (DH), digital literary studies, and computational linguistics/analysis.

Our research, tools, publications, and future work will therefore be presented comparatively, in the wider context of current trends, theories, and debates in three areas. The first is DH in general and text analysis in particular, where we consider, for instance, the potential relevance of the Graph Poem Project to at least some of the issues raised in the “Forum: Text Analysis at Scale” section of Debates in the Digital Humanities 2016 (eds. Lauren F. Klein and Matthew K. Gold), as well as literature-related tools and projects such as Syuzhet or heureCLEA. The second is natural language processing (NLP), where we review other projects that have dealt with computational literary analysis and poetry processing in particular, but also other works that have focused on various issues in computational linguistics, such as syntax or trope processing and analysis. The third is digital pedagogy, where we relate our project to other poetry-related digital pedagogical projects (such as those reviewed by Chuck Rybak, for instance, in Digital Pedagogy in the Humanities).

“Access” will therefore turn out to encompass quite a few different meanings and phenomena. In terms of text and data analysis, we will review existing debates and commentaries, and contribute our own, on the politics of developing poetry archives or assembling collections/anthologies, and on the related issues of representationality, especially with regard to gender, race, minority, and so forth. In other words, the question will be: what access do certain authors and their works have to our research and tools, and what are their chances of being represented in the graph poem? Then, in terms of existing tools, accessibility refers either to the monetary aspects of acquiring or upgrading certain apps or to accessing the source code and/or repurposing it, and the discussion will center on the political and cultural implications of such options. Access also means the accessibility of our tools to the user, in the sense of how easy or how complicated it is for ‘anybody’ to read our approach, use our apps, and employ our results. Why would they ever do so, and how are we trying to ‘seduce’ our audience? The paper will hence examine the “compromises” we have made in that respect (for instance, picking 2 or 3 dimensions as rhyme sub-types when employing a certain API to display rhyme scores as scatter graphs) and the efforts we are making to convey the potential significance for poetry of sometimes arcane mathematical theories and algorithms. On yet another level, access means success, namely the measure of our project's access to levels of meaning (in Adrian Liu's terms and not only) that may be deemed ‘deeply’ or ‘typically’ poetic. This analysis will combine notions of genre- or craft-related subtlety with pragmatic computational concerns, such as operationally consolidating numeric and nonnumeric features and commonalities.

A more prosaic question to address will then be what access poetry really has to DH-related research, interest, and funding. Again, the data related to our own project will be compared to that of other projects, and the analysis will focus on the place of poetry in significant DH publications, periodicals, and platforms.

Full text license: CC BY 4.0

Conference Info

ADHO - 2017

"Access/Accès"

Hosted at McGill University, Université de Montréal

Montréal, Canada

Aug. 8, 2017 - Aug. 11, 2017

Conference website: https://dh2017.adho.org/

References: http://web.archive.org/web/20170802132745/https://www.conftool.pro/dh2017/sessions.php

Series: ADHO (12)

Organizers: ADHO
