Collaborative tool-building with Pliny: a progress report

John Bradley

Authorship

1. John Bradley

King's College London

Work text

This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

In the early days the Digital Humanities (DH) focused on
the development of tools to support the individual scholar
to perform original scholarship, and tools such as OCP and
TACT emerged that were aimed at the individual scholar. Very
little tool-building within the DH community is now aimed
generally at individual scholarship. There are, I think, two
reasons for this:
• First, the advent of the Graphical User Interface (GUI)
made tool building (in terms of software applications that
ran on the scholar’s own machine) very expensive. Indeed,
until recently, the technical demands it put upon developers
have been beyond the resources of most tool developers
within the Humanities.
• Second, the advent of the WWW has shifted the focus
of much of the DH community to the web. However, as a
result, tool building has mostly focused on not the doing of
scholarly research but on the publishing of resources that
represent the result of this.
DH’s tools to support the publishing of, say, primary sources,
are of course highly useful to the researcher when his/her
primary research interest is the preparation of a digital edition.
They are not directly useful to the researcher using digital
resources. The problem (discussed in detail in Bradley 2005)
is that a signifi cant amount of the potential of digital materials
to support individual research is lost in the representation in
the browser, even when based on AJAX or Web 2.0 practices.
The Pliny project (Pliny 2006-7) attempts to draw our attention
as tool builders back to the user of digital resources rather than
their creator, and is built on the assumption that the software
application, and not the browser, is perhaps the best platform
to give the user full benefi t of a digital resource. Pliny is not
the only project that recognises this. The remarkable project
Zotero (Zotero 2006-7) has developed an entire plugin to
provide a substantial new set of functions that the user can do
within their browser. Other tool builders have also recognised
that the browser restricts the kind of interaction with their
data too severely and have developed software applications
that are not based on the web browser (e.g. Xaira (2005-7),
WordHoard (2006-7), Juxta (2007), VLMA (2005-7)). Some of
these also interact with the Internet, but they do it in ways
outside of conventional browser capabilities.
Further to the issue of tool building is the wish within the
DH community to create tools that work well together.
This problem has often been described as one of modularity
– building separate components that, when put together, allow the user to combine them to accomplish a range of
things perhaps beyond the capability of each tool separately.
Furthermore, our community has a particularly powerful
example of modularity in Wilhelm Ott’s splendid TuStep
software (Ott 2000). TuStep is a toolkit containing a number
of separate small programs that each perform a rather abstract
function, but can be assembled in many ways to perform a
very large number of very different text processing tasks.
However, although TuStep is a substantial example of software
designed as a toolkit, the main discussion of modularity in the
DH (going back as far as the CETH meetings in the 1990s)
has been in terms of collaboration – fi nding ways to support
the development of tools by different developers that, in fact,
can co-operate. This is a very different issue from the one
TuStep models for us. There is as much or more design work
employed to create TuStep’s framework in which the separate
abstract components operate (the overall system) as there is
in the design of each component itself. This approach simply
does not apply when different groups are designing tools
semi-independently. What is really wanted is a world where
software tools such as WordHoard can be designed in ways
that allow other tools (such as Juxta) to interact in a GUI, on
the screen.
Why is this so diffi cult? Part of the problem is that traditional
software development focuses on a “stack” approach. Layers
of ever-more specifi c software are built on top of moregeneral
layers to create a specifi c application, and each layer in
the stack is aimed more precisely at the ultimate functions the
application was meant to provide. In the end each application
runs in a separate window on the user’s screen and is focused
specifi cally and exclusively on the functions the software was
meant to do. Although software could be written to support
interaction between different applications, it is in practice still
rarely considered, and is diffi cult to achieve.
Pliny, then, is about two issues:
• First, Pliny focuses on digital annotation and note-taking
in humanities scholarship, and shows how they can be used
facilitate the development of an interpretation. This has
been described in previous papers and is not presented
here.
• Second, Pliny models how one could be building GUIoriented
software applications that, although developed
separately, support a richer set of interactions and
integration on the screen.
This presentation focuses primarily on this second theme,
and is a continuation of the issues raised at last year’s poster
session on this subject for the DH2007 conference (Bradley
2007). It arises from a consideration of Pliny’s fi rst issue
since note-taking is by its very nature an integrative activity
– bringing together materials created in the context of a large
range of resources and kinds of resources.
Instead of the “stack” model of software design, Pliny is
constructed on top of the Eclipse framework (Eclipse 2005-
7), and uses its contribution model based on Eclipse’s plugin
approach (see a description of it in Birsan 2005). This approach
promotes effective collaborative, yet independent, tool
building and makes possible many different kinds of interaction
between separately written applications. Annotation provides
an excellent motivation for this. A user may wish to annotate
something in, say, WordHoard. Later, this annotation will need
to be shown with annotations attached to other objects from
other pieces of software. If the traditional “stack” approach
to software is applied, each application would build their own
annotation component inside their software, and the user
would not be able to bring notes from different tools together.
Instead of writing separate little annotation components inside
each application, Eclipse allows objects from one application to
participate as “fi rst-class” objects in the operation of another.
Annotations belong simultaneously to the application in which
they were created, and to Pliny’s annotation-note-taking
management system.
Pliny’s plugins both support the manipulation of annotations
while simultaneously allowing other (properly constructed)
applications to create and display annotations that Pliny
manages for them. Furthermore, Pliny is able to recognise
and represent references to materials in other applications
within its own displays. See Figures I and II for examples of
this, in conjunction with the prototype VLMA (2005-7) plugin
I created from the standalone application produced by the
VLMA development team. In Figure I most of the screen
is managed by the VLMA application, but Pliny annotations
have been introduced and combined with VLMA materials.
Similarly, in fi gure II, most of the screen is managed by Pliny
and its various annotation tools, but I have labelled areas on
the screen where aspects of the VLMA application still show
through.
Figure I: Pliny annotations in a VLMA viewer Figure II: VLMA objects in a Pliny context
This connecting of annotation to a digital object rather than
merely to its display presents some new issues. What, for
example, does it mean to link an annotation to a line of a
KWIC display – should that annotation appear when the same
KWIC display line appears in a different context generated as
the result of a different query? Should it appear attached to
the particular word token when the document it contains is
displayed? If an annotation is attached to a headword, should it
be displayed automatically in a different context when its word
occurrences are displayed, or only in the context in which the
headword itself is displayed? These are the kind of questions of
annotation and context that can only really be explored in an
integrated environment such as the one described here, and
some of the discussion in this presentation will come from
prototypes built to work with the RDF data application VLMA,
with the beginnings of a TACT-like text analysis tool, and on
a tool based on Google maps that allows one to annotate a
map.
Building our tools in contexts such as Pliny’s that allow for a
complex interaction between components results in a much
richer, and more appropriate, experience for our digital user.
For the fi rst time, perhaps, s/he will be able to experience
the kind of interaction between the materials that are made
available through applications that expose, rather than hide, the
true potential of digital objects. Pliny provides a framework
in which objects from different tools are brought into close
proximity and connected by the paradigm of annotation.
Perhaps there are also paradigms other than annotation that
are equally interesting for object linking?
References
Birsan, Dorian (2005). “On Plug-ins and Extensible
Architectures”, In Queue (ACM), Vol 3 No 2.
Bradley, John (2005). “What you (fore)see is what you get:
Thinking about usage paradigms for computer assisted text
analysis” in Text Technology Vol. 14 No 2. pp 1-19. Online at
http://texttechnology.mcmaster.ca/pdf/vol14_2/bradley14-
2.pdf (Accessed November 2007).
Bradley, John (2007). “Making a contribution: modularity,
integration and collaboration between tools in Pliny”. In
book of abstracts for the DH2007 conference, Urbana-
Champaign, Illinois, June 2007. Online copy available at http://
www.digitalhumanities.org/dh2007/abstracts/xhtml.xq?id=143
(Accessed October 2007).
Eclipse 2005-7. Eclipse Project Website. At http://www.eclipse.
org/ (Accessed October 2007).
Juxta (2007). Project web page at http://www.nines.org/tools/
juxta.html (accessed November 2007).
Ott, Wilhelm (2000). “Strategies and Tools for Textual
Scholarship: the Tübingen System of Text Processing
Programs (TUSTEP)” in Literary & Linguistic Computing, 15:1
pp. 93-108.
Pliny 2006-7. Pliny Project Website . At http://pliny.cch.kcl.ac.uk
(Accessed October 2007).
VLMA (2005-7). VLMA: A Virtual Lightbox for Museums and
Archives. Website at http://lkws1.rdg.ac.uk/vlma/ (Accessed
October 2007).
WordHoard (2006-7). WordHoard: An application for close
reading and scholarly analysis of deeply tagged texts. Website
at http://wordhoard.northwestern.edu/userman/index.html
(Accessed November 2007).
Xaira (2005-7). Xaira Page. Website at http://www.oucs.ox.ac.
uk/rts/xaira/ (Accessed October 2007).
Zotero (2006-7). Zotero: A personal research assistant. Website
at http://www.zotero.org/ (Accessed September 2007).

Full text license: This text is republished here with permission from the original rights holder.

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

Complete

ADHO - 2008

Hosted at University of Oulu

Oulu, Finland

June 25, 2008 - June 29, 2008

135 works by 231 authors indexed

Conference website: http://www.ekl.oulu.fi/dh2008/

Series: ADHO (3)

Organizers: ADHO

Collaborative tool-building with Pliny: a progress report

1. John Bradley

ADHO - 2008