Linked Data and Literature: Encoding the Facts in Fiction

Katherine Faith Lawrence

Authorship

1. Katherine Faith Lawrence

King's College London

Summary
Is it possible to tell a story to a computer so that it can process the events that have happened? Can we computationally differentiate the Scotland of Tam Lin, Macbeth, Harry Potter and Brave while still acknowledging the shared concept of 'Scotland'? Or deal with Watson's wound being in both his leg and his shoulder in Conan Doyle's Sherlock Holmes stories? This workshop will provide a theoretical and practical introduction to the modelling of narrative elements for computational processing and look at the advantages and limitations of annotating stories in this way. Drawing on structuralist theory, philosophy, computer science and media studies, this workshop is aimed at researchers who work with narratives, especially fictional (or debatably fictional) narratives, and who are interested in how linked data techniques could open new possibilities for analysis and distant reading.

Workshop Description:
Prof. Hendler, one of the luminaries of the semantic web/linked data movement, illustrated the potential power of the semantic web by positing the following question as an example of a query that the technology should be able to answer: "what was that movie with the short henchman who decapitates a statue with his bowler hat?". This type of query is comparatively simple for a human to understand: many will immediately be able to name the film (and more will be able to narrow it down to a Bond movie). From the perspective of computational analysis, however, the question is very complex. Students of comparative literature or mythology may dream of being able to search, if not for short men in dangerous bowler hats, then for stories where the world is created from someone's body parts (listed by body part) or for the moment when a spell is broken by a kiss. This workshop will enable attendees to take the first steps towards creating systems which will allow for this type of semantic querying.
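To make the shape of such a query concrete, the following is a minimal sketch in plain Python rather than a real linked data toolkit. It stores a handful of subject-predicate-object statements and matches a pattern with variables against them. Every identifier in it (ex:FilmA, ex:hasEvent, ex:instrument and so on) is a hypothetical placeholder invented for illustration and is not taken from OntoMedia or any other published ontology.

```python
# Toy illustration only: all identifiers are hypothetical placeholders,
# not terms from OntoMedia or any other published ontology.
FACTS = {
    ("ex:FilmA", "rdf:type", "ex:Film"),
    ("ex:FilmA", "ex:hasCharacter", "ex:Henchman1"),
    ("ex:Henchman1", "ex:role", "ex:Henchman"),
    ("ex:Henchman1", "ex:stature", "ex:Short"),
    ("ex:FilmA", "ex:hasEvent", "ex:Event1"),
    ("ex:Event1", "rdf:type", "ex:Decapitation"),
    ("ex:Event1", "ex:agent", "ex:Henchman1"),
    ("ex:Event1", "ex:patient", "ex:Statue1"),
    ("ex:Statue1", "rdf:type", "ex:Statue"),
    ("ex:Event1", "ex:instrument", "ex:Hat1"),
    ("ex:Hat1", "rdf:type", "ex:BowlerHat"),
    # A second film with a short henchman but no matching event.
    ("ex:FilmB", "rdf:type", "ex:Film"),
    ("ex:FilmB", "ex:hasCharacter", "ex:Henchman2"),
    ("ex:Henchman2", "ex:role", "ex:Henchman"),
    ("ex:Henchman2", "ex:stature", "ex:Short"),
}


def match(patterns, facts, binding=None):
    """Yield each variable binding that satisfies every pattern.

    Terms beginning with '?' are variables; all other terms must match exactly.
    """
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for fact in facts:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(first, fact):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from match(rest, facts, candidate)


# "The movie with the short henchman who decapitates a statue with his bowler hat"
QUERY = [
    ("?film", "rdf:type", "ex:Film"),
    ("?film", "ex:hasCharacter", "?who"),
    ("?who", "ex:role", "ex:Henchman"),
    ("?who", "ex:stature", "ex:Short"),
    ("?film", "ex:hasEvent", "?event"),
    ("?event", "rdf:type", "ex:Decapitation"),
    ("?event", "ex:agent", "?who"),
    ("?event", "ex:patient", "?victim"),
    ("?victim", "rdf:type", "ex:Statue"),
    ("?event", "ex:instrument", "?tool"),
    ("?tool", "rdf:type", "ex:BowlerHat"),
]

films = sorted({b["?film"] for b in match(QUERY, FACTS)})
print(films)  # ['ex:FilmA']
```

The point of the sketch is that the question only becomes answerable once the event, its agent, its target and its instrument have each been recorded as separate, typed statements; a keyword search over the text would not distinguish the two films.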

From narrative structuralists to TVTropes, from the Bechdel-Wallace Test presenting a thinking point on representation to the role of specific events such as transformations, social situations or a climactic kiss, stories give us a series of moments which are of interest both to researchers and to the wider world. The addition of computational techniques to the study of narratives has resulted in a breadth of distant reading which would previously have been beyond the reach of a single researcher. Linguistic analysis of the text is now relatively commonplace, with easy-to-use tools available to extract statistical information about its composition. More advanced techniques bring natural language processing, annotation, word stemming and synonym matching into play, allowing researchers to reveal the structure of the telling of the story more efficiently than before. This collected and processed data can also drive subject indexes, making digital texts more accessible to the researcher. However, the limitation of such techniques is that they work at a surface level and barely brush the semantics or structure of the story. Encoding the story elements as linked data is one way to address this issue.

This workshop will focus on fictional narratives because they present numerous challenges beyond those shared with nonfictional narratives, including the malleable nature of reality and the question of how we can deal with the idea of truth within fiction. Since modelling is often seen as problematic because it is reductive in nature, the workshop will address the role of the model and the tension between the need to formalise the data so that it can be processed computationally and the inaccuracy and loss of information that are inherent in that process.
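One common way to picture truth that holds only within a particular story is to attach a source to every statement, so that contradictory claims can sit side by side without either being discarded. The sketch below illustrates this with the Watson example from the summary. It is an assumption about how such data might be laid out, not a description of the workshop's own model, and the property names (woundLocation, profession) are hypothetical.

```python
from collections import defaultdict

# Each statement is a quad: (subject, predicate, object, source).
# Property names are hypothetical placeholders chosen for this illustration.
QUADS = [
    ("Watson", "woundLocation", "shoulder", "A Study in Scarlet"),
    ("Watson", "woundLocation", "leg", "The Sign of the Four"),
    ("Watson", "profession", "doctor", "A Study in Scarlet"),
    ("Watson", "profession", "doctor", "The Sign of the Four"),
]


def sources_by_value(subject, predicate, quads):
    """Map each asserted value to the set of stories that assert it."""
    grouped = defaultdict(set)
    for s, p, o, source in quads:
        if s == subject and p == predicate:
            grouped[o].add(source)
    return dict(grouped)


def is_contested(subject, predicate, quads):
    """Return True when the sources disagree about the value."""
    return len(sources_by_value(subject, predicate, quads)) > 1


print(sources_by_value("Watson", "woundLocation", QUADS))
print(is_contested("Watson", "woundLocation", QUADS))  # True
print(is_contested("Watson", "profession", QUADS))     # False
```

Keeping the source alongside each statement means the model never has to decide which account is "true"; it records who says what and leaves the interpretation to the researcher.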

We will discuss the effect that levels of granularity and expectation have on model use in narrative study and the concept of the computer as an unreliable narrator.

Using examples and hands-on activities, this workshop will take attendees through the steps needed to extract and define story elements in a meaningful semantic way. While the exercises will focus on the OntoMedia ontology, other ontologies such as the Proppian Fairytale Markup Language (PftML) will be introduced, and attendees who have already begun work in this area will be encouraged to share their experiences and models. While the examples will be text based, the standoff nature of linked data allows us to apply the same techniques to fiction in many forms of media and, indeed, across multiple media. Stories have existed and do exist in every format that humans have created, from the earliest oral tradition to the latest Hollywood blockbuster and everything in between. They also do not exist in a vacuum. Intertextuality is an important part of narrative, so being able to link between stories expressed through different media is vital. The workshop will give attendees the opportunity to consider ways of dealing with, and linking between, story and character variants.
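As a rough sketch of what linking between variants could look like, the Python below gives each story its own 'Scotland' entity and ties all of them to one shared concept. The statements live apart from the source texts, in the standoff manner described above. The identifiers and the property names variantOf, appearsIn and contains are hypothetical stand-ins chosen for this illustration; they are not quoted from OntoMedia or PftML.

```python
# Hypothetical identifiers and property names throughout; "variantOf" is an
# illustrative stand-in, not a term quoted from OntoMedia or PftML.
TRIPLES = [
    ("scotland/tam-lin", "variantOf", "scotland"),
    ("scotland/macbeth", "variantOf", "scotland"),
    ("scotland/harry-potter", "variantOf", "scotland"),
    ("scotland/brave", "variantOf", "scotland"),
    ("scotland/tam-lin", "appearsIn", "Tam Lin"),
    ("scotland/macbeth", "appearsIn", "Macbeth"),
    ("scotland/harry-potter", "appearsIn", "Harry Potter"),
    ("scotland/brave", "appearsIn", "Brave"),
    # Facts that hold only for one story's version of the place.
    ("scotland/tam-lin", "contains", "Carterhaugh"),
    ("scotland/macbeth", "contains", "Dunsinane"),
    ("scotland/harry-potter", "contains", "Hogwarts"),
    ("scotland/brave", "contains", "DunBroch"),
]


def variants_of(concept, triples):
    """List every story-specific entity linked to the shared concept."""
    return sorted(s for s, p, o in triples if p == "variantOf" and o == concept)


def places_in(variant, triples):
    """List the places recorded for one story's variant only."""
    return sorted(o for s, p, o in triples if s == variant and p == "contains")


def share_concept(a, b, triples):
    """Return True when two entities are variants of at least one common concept."""
    def parents(entity):
        return {o for s, p, o in triples if s == entity and p == "variantOf"}
    return bool(parents(a) & parents(b))


print(variants_of("scotland", TRIPLES))
print(places_in("scotland/macbeth", TRIPLES))  # ['Dunsinane']
print(share_concept("scotland/brave", "scotland/tam-lin", TRIPLES))  # True
```

With this layout a query for Hogwarts never leaks into Macbeth's Scotland, yet a query for everything related to 'Scotland' can still reach all four stories through the shared concept.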

Intangible culture, such as that represented in fiction, is increasingly recognised as an important part of our heritage. In promoting ways for researchers to record, publish, share and analyse this data in open ways, this workshop encapsulates the conference theme of digital cultural empowerment. It also asks us to think about the way in which we classify media content and how the push to identify and filter by content may have unexpected repercussions on how, when and why we annotate narratives.

Attendees will be asked to bring laptops for use during the practical sessions but will have the option of working in pairs. Short texts will be provided for use in the practical sessions and it is not expected that attendees will need to install any software prior to the workshop. While the workshop will work with linked data and ontological models, the focus of the modelling discussion will be on the theoretical side and knowledge of OWL and other modelling languages will not be required. Some familiarity with XML and RDF may be helpful as attendees may be asked to work with source code but the workshop is intended as an introduction and no knowledge/ability beyond basic computer use will be assumed.

Presenter:

Dr K Faith Lawrence,

King's College London

faith.lawrence@kcl.ac.uk

Dr Lawrence is a Research Associate at the Department of Digital Humanities, King's College London, where she works as a researcher and developer on a number of projects. Her research background centres on online communities, narrative and the semantic web. Her thesis, 'The Web of Community Trust - Amateur Fiction Online: A Case Study in Community Focused Design for the Semantic Web', investigated user-centred design for emergent technologies through the case study of online fiction archives and author communities. This work focused on fan fiction communities, both in terms of how they currently interact with technology and how that interaction may evolve in the future with the development of Web 2.0 and the semantic web. One important facet of this work was an investigation into the description of narrative and content elements within textual, visual, aural and multimedia works. She is one of the co-founders of the OntoMedia ontology for describing narrative in heterogeneous media.

Target Audience: narratologists, literary scholars, historians, folklorists, media scholars, oral historians

Expected number of participants: 20 - 30

Outline of Content:

Morning:

Welcome and Introduction (30 mins)
Presentation: The Good, the Bad and the Ugly of Modelling Narrative Elements in Fiction (45 mins)
Group Activity: Annotating Little Red Riding Hood I: Identifying the Narrative Elements (30 mins)
[Break]
Report back and Discussion (30 mins)
Presentation: And Then Something Happened: Granularity and Defining Events (45 mins)

Afternoon:

Group Discussion: Extracting Events: Examples (30 mins)
Group Activity: Little Red Riding Hood II: Typing the Elements (30 mins)
Presentation: Can we handle the truth? Variation, Intertextuality and Unreliable Narrative (45 mins)
[Break]
Group Activity: Little Red Riding Hood III: Linking the Elements (30 mins)
Presentation: Exploring the data (30 mins)
Wrap up (15 mins)

Length: 1 Day

Full text license: This text is republished here with permission from the original rights holder.

Conference Info

Complete

ADHO - 2014

"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO
