Scanning Between the Lines: The Search for the Semantic Story

panel / roundtable
  1. K. Faith Lawrence

    Royal Irish Academy

  2. Paolo Battino

    Royal Irish Academy

  3. Paul Rissen

    British Broadcasting Corporation (BBC)

  4. Michael O. Jewell

    Goldsmiths - University of London

  5. Tarcisio Lancioni

    University of Siena

Work text

The panel will present three projects which
are exploring the use of metadata to describe
the narrative content of media. Computer-
assisted textual analysis is now a well
known and important facet of scholarly
investigation (Potter, 1991; Burrows, 2004;
Yang, 2005); however, it relies heavily on
statistical approaches in which the computer
uses character matching to identify recurring
strings. Although pattern recognition for image
and audio search is growing more sophisticated
(Downie, 2009), the techniques for annotation
of multimedia are subject to the same
limitations as those for text in that they cannot
go beyond the shape or the waveform into
the meaning that those artifacts of expression
convey.
This limitation has been addressed in a
number of different ways, for example through
traditional categorisation and with the use of
keywords for theme and motif annotation. New
techniques using natural language processing
software such as GATE (Auvil et al., 2007) and IBM's
LanguageWare (1641 Depositions Project) have taken this
further, allowing a deeper level of meaning to be
inferred from the text through basic entity and
relationship recognition. The use of ontologies
to support this annotation opens the way for
more precise search, retrieval and analysis using
the techniques developed in conjunction with
semantic and linked web architecture.
The three papers being presented in this panel
will address the application of these techniques
to both textual and audio-visual media and
consider annotation not just of the documents
themselves but of the ideas contained within
them, and how this information might be presented
to the user to best effect.
The first paper in this panel by Dr Michael
O. Jewell, Goldsmiths College, University of
London, focuses on the annotation of narrative
in scripts and screenplays. This paper presents
the combination of TEI and RDF annotations
as a methodology for opening the encoded
data up for inference-enhanced exploration and
augmentation through linked-data resources.
The second paper from Paul Rissen, BBC, and
Dr K Faith Lawrence, Royal Irish Academy,
presents work being done at the BBC in
the annotation of the narratives within their
audio/visual archives. This paper discusses the
initiatives within the BBC to make their content
more accessible and to allow more personal
interaction with the material. Through the use
of ontology, the events contained in the media
object are exposed to exploration, analysis and
visualisation.
The final paper by Paolo Battino, Royal Irish
Academy, continues the visualisation theme to
discuss how narrative annotations might be
presented to assist in analysis of texts. Using
the example of folktales, this paper considers
the graphical representation of plotlines and
the possible issues and challenges inherent
for visualisation in moving from syntactic to
semantic description.
Auvil, L., Grois, E., Llorà, X., Pape, G., Goren, V., Sanders, B., Acs, B., McGrath, R. E. (2007). 'A Flexible System for Text Analysis with Semantic Networks'. Proceedings of Digital Humanities 2007.

Burrows, J. (2004). 'Textual Analysis'. A Companion to Digital Humanities. Schreibman, S., Siemens, R., Unsworth, J. (eds.). Oxford: Blackwell Publishing Ltd.

Downie, J. S., Byrd, D., Crawford, T. (2009). 'Ten Years of ISMIR: Reflections on Challenges and Opportunities'. International Society for Music Information Retrieval Conference.

Potter, R. G. (1991). 'Statistical Analysis of Literature: A Retrospective on Computers and the Humanities, 1966–1990'. Computers and the Humanities, 401-429.

Yang, H-C., Lee, H-C. (2005). 'Automatic Category Theme Identification and Hierarchy Generation for Chinese Text Categorization'. Journal of Intelligent Information Systems.

Semantic Screenplays:
Preparing TEI for Linked Data
Jewell, Michael O.
Goldsmiths College, University of London
Scripts, whether for radio plays, theatre,
or film, are a rich source of data. As
well as cast information and dialogue, they
may include performance directions, locations,
camera motions, sound effects, captions, or
entrances and exits. The TEI Performance Texts
module provides a
means to encode this information into an
existing screenplay, together with more specific
textual information such as metrical details.
Meanwhile, Linked Data has become a major
component of the Semantic Web. This is a set
of best practices for publishing and connecting
structured data on the Web, which has led to
the creation of a global data space containing
billions of assertions, known as the Web of Data
(Bizer et al., 2009). Some of the most prominent
datasets in this space include DBpedia, with
more than 100 million assertions relating to
(amongst others) people, places, and films;
LMDB (Linked Movie Database), with over three
million filmic assertions; and LinkedGeoData,
which has almost two billion geographical
assertions.
In this paper, we propose a means to support
Linked Data in TEI, thus benefitting from the
wealth of information available on top of that
which is provided by TEI. We describe the
augmentation of TEI documents with RDFa
(Resource Description Framework in Attributes)
to complement the annotated content with
URIs and class information, and thence the
transformation of this document into triples
using our open source tei2onto conversion tool.
Finally, we provide some case studies that make
use of the resultant triples, and show how
their compliance with the OntoMedia ontologies
(Lawrence et al., 2006) allows for powerful
research possibilities.
1. Annotating TEI
1.1. Cast Lists
The first, and simplest, step to adding RDFa
attributes to a TEI document begins with the
cast list. Listing 1 shows a simple example
of a castItem element for the role of Jeffrey
Beaumont in Blue Velvet, portrayed by Kyle
MacLachlan. The about attribute specifies the
object to which the element relates: the actor
element refers to the DBpedia entry for Kyle
MacLachlan, while the role refers to an object
residing within the Blue Velvet namespace,
created specifically for this screenplay. The
attribute defines the predicate that
relates the content of the element to the object
- in this case, it is the name of the actor
or character. When processed by
actors are specified as
objects, which are
subclasses of the Friend of a Friend (FOAF)
ontology’s Agent class, and roles as Character objects.
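
As a rough illustration of the cast-list annotation described above, the following minimal sketch reads an annotated castItem and collects one (subject, predicate, literal) triple per annotated element. It is not the tei2onto tool and does not reproduce Listing 1. Several names are assumptions made for the example: the predicate-carrying attribute is called property, as in the RDFa specification; the namespace http://example.org/bluevelvet# stands in for the Blue Velvet namespace; the FOAF name predicate and the DBpedia URI are illustrative; and the function cast_item_triples is hypothetical. Typing of actors and roles into ontology classes is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated cast item. The URIs, the 'property' attribute name
# and the FOAF 'name' predicate are assumptions made for this sketch.
CAST_ITEM = """
<castItem xmlns="http://www.tei-c.org/ns/1.0">
  <role xml:id="jeffrey"
        about="http://example.org/bluevelvet#jeffrey"
        property="http://xmlns.com/foaf/0.1/name">Jeffrey Beaumont</role>
  <actor about="http://dbpedia.org/resource/Kyle_MacLachlan"
         property="http://xmlns.com/foaf/0.1/name">Kyle MacLachlan</actor>
</castItem>
"""


def cast_item_triples(xml_text):
    """Return (subject, predicate, literal) tuples for annotated elements.

    An element contributes a triple only if it carries both an 'about'
    attribute (the subject URI) and a 'property' attribute (the predicate
    URI); its text content becomes the literal object.
    """
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter():  # document order, root element first
        subject = element.get("about")
        predicate = element.get("property")
        if subject and predicate:
            literal = (element.text or "").strip()
            triples.append((subject, predicate, literal))
    return triples


if __name__ == "__main__":
    for subject, predicate, literal in cast_item_triples(CAST_ITEM):
        print(f'<{subject}> <{predicate}> "{literal}" .')
```

Keeping the subject URI and the predicate on the element that already holds the name means the TEI markup stays readable as TEI, while a separate pass can lift the same information into triples for use alongside external Linked Data.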
The conversion script then analyses
for who attributes. These refer to the
attributes in the role elements, and thus it is
possible to determine the cast present in a
scene and the entity speaking a line. The former
may be found via the involves predicates of the
event, while the latter is represented with the
predicate. An OntoMedia
Social event is created for each element, with the
attributes describing


Conference Info


ADHO - 2010
"Cultural expression, old and new"

Hosted at King's College London

London, England, United Kingdom

July 7, 2010 - July 10, 2010

142 works by 295 authors indexed


Series: ADHO (5)

Organizers: ADHO

  • Keywords: None
  • Language: English
  • Topics: None