How Do You Visualize a Million Links?

paper
Authorship
  1. 1. Susan Brown

    University of Alberta, University of Guelph

  2. 2. Jeffery Antoniuk

    University of Alberta

  3. 3. Michael Bauer

    Computer Science - Western University (University of Western Ontario)

  4. 4. Jennifer Berberich

    Computer Science - Western University (University of Western Ontario)

  5. 5. Milena Radzikowska

    Communications - Mount Royal University (Mount Royal College)

  6. 6. Stan Ruecker

    Humanities Computing - University of Alberta

  7. 7. Terence Yung

    Communications - Mount Royal University (Mount Royal College)

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

In the past quarter century, established methods
of literary history have been severely contested.
On the one hand, syncretic, single-author
histories have become problematic as a result
of a combination of the expanded literary canon
and a range of theoretical challenges. On the
other, a demand for historicized overviews that
reflect the radical recent reshaping in all fields
of literary study has produced large numbers
of both collectively written histories and
encyclopedias or companions. Literary history
thus tends towards compilations in which
specialists treat their particular fields, at the cost
of integration or of coherence. Meanwhile, the
primary materials are increasingly available in
digital form, and literary historical scholarship
itself is increasingly produced digitally, whether
as versions of established forms such as
journal articles, or in resources that invoke the
potential for new kinds of analysis. Major digital
initiatives over the past decades have focused
almost exclusively on digital resource creation:
the increasingly pressing question is how to use
this expanding body of materials to its fullest
potential.
In this project, we investigate how literary
historical analysis can be extended using various
forms of visualization, using the experimental
Orlando Project as our test bed.
Orlando:
Women's Writing in the British Isles from
the Beginnings to the Present
is recognized
as the most extensive and detailed resource
in its field and as a model for innovative
scholarly resources. Composed of 1,200 critical
biographies plus contextual and bibliographical
materials, it is extensively encoded using
an interpretive Extensible Markup Language
(XML) tagset with more than 250 tags for
everything from cultural influences, to relations
with publishers, or use of genre or dialect.
The content and the markup together provide
a unique representation of a complex set of
interrelations of people, texts, and contexts.
These interrelations and their development
through time are at the heart of literary inquiry,
and having those relations embedded in the
markup, and hence processable by computer,
offers the opportunity to develop new forms
of inquiry into, and representations of, literary
history. Such new opportunities of scale are
often invoked using Greg Crane’s seminal
question, “What can you do with a million
books?” (2006).
We need to be able to ask big, complex questions
while remaining grounded in particularities,
and we need new ways of representing
answers to those questions. This requires new
tools for scholarly research that can access,
investigate, and present new aspects of the
human story and history. In this context, we
contend that the scholarly interface requires
not only experimentation but also careful
assessment to see what works to make digital
materials of real value to humanities scholars.
As argued by Ramsay (2003), Unsworth
(2006), and others, using computers to do
literary research can contribute to hermeneutic
or interpretive inquiry. Digital humanities
research has inherited from computational
science a leaning towards systematic knowledge
representation. This has proved serviceable in

2
some humanities activities, such as editing, but
digital methods have far more to offer the
humanities than this. As Drucker and Nowviskie
have argued, “The computational processes that
serve speculative inquiry must be dynamic
and constitutive in their operation, not merely
procedural and mechanistic” (431).
The
Orlando
encoding system, devised for
digital rather than print textuality, facilitates
collaboratively-authored research structured
according to consistent principles. The encoding
creates a degree of cross-referencing and textual
inter-relation impossible with print scholarship
—not simply hyperlinking but relating separate
sections of scholarly text in ways unforeseen
even by the authors of the sections. It
represents a new approach to the integration
of scholarly discourse, one which allows
the integrating components to operate in
conjunction with, rather than in opposition
to, historical specificity and detail (Brown et
al. 2006c). However, the search-and-retrieval
model of the current interface for
Orlando
,
while user-friendly in that it resembles first-
generation online research tools, cannot exploit
this encoding to the fullest. Search interfaces
only find what the user asks for, whereas
visualization enables exploration and discovery
of patterns and relationships that one might not
be able to search for.
For instance, the current interface permits
users to search for authors by the number
of children they had, and thus to explore
the relationship between literary production
and reproduction. The quantity of material in
Orlando
makes it difficult to see overall patterns
amongst the results. Recent experiments with
the Mandala browser have demonstrated that
visualization permits one to see both interesting
anomalies (e.g. in lives which have demanded
the use both of the childlessness tag and
the children tag), and larger patterns, such
as the non-correlation between high literary
productivity and childlessness or small family
size (Brown et al. 2008). These preliminary
investigations confirm Moretti’s argument
(2005) that visual representations enable kinds
of literary historical inquiry that are not
supported by conventional search interfaces.
Orlando
has the added advantage of making it
possible to dive back into the source material to
see the specifics from which the representation
is produced.
In addition to the Mandala experiments, we
have also been working on a set of designs for
visually summarizing relationships in a manner
that allows interactive exploration (e.g. Fig.
1). Building on the large body of previous
literature in network visualization (e.g. Barabási
2002; Watts 2003; Christakis and Fowler
2009), we are experimenting with new visual
representations for networks of people.
Fig. 1: One of several concepts for summarizing relationships
among authors in Orlando. Here the authors on the path
of connection are shown as coloured circles, where size is
frequency and a unique colour is assigned to each author.
Interviews and observations of users at a
recent hackfest provided some excellent insights
into the sorts of interfaces that are likely to
appeal to users wanting to explore embedded
relationships in a body of texts in an open-
tended way. While point-to-point visualization
was considered to have some value, more
excitement was generated by open-ended
interface sketches, such as a visualization that
resembles a cityscape, even when these were
much less representational than conventional
interfaces for the humanities.
At the same time, Brown and Bauer have been
working on a visualization tool that illustrates
the challenges facing the project of representing
all of
Orlando
’s semantically interrelated data
through a graphical representation based on
nodes and edges. It highlights the difficulty of
providing prospect when dealing with a large
and complexly structured data set, since the
full set of relationships even of a moderate
subset of the 1200 writers becomes unreadable,
with over 16 million edges in the graph. We

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

Complete

ADHO - 2010
"Cultural expression, old and new"

Hosted at King's College London

London, England, United Kingdom

July 7, 2010 - July 10, 2010

142 works by 295 authors indexed

XML available from https://github.com/elliewix/DHAnalysis (still needs to be added)

Conference website: http://dh2010.cch.kcl.ac.uk/

Series: ADHO (5)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None