Homer Multitext - Nine Year Update

C.W. Blackwell; D.N. Smith

Work text

This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

In 2000, Gregory Nagy, in a review of Martin West’s
1998 edition of the Homeric Iliad, contrasted the traditional
critical editions of West and others with the ancient
diorthosis (a word translated, perhaps loosely, as
“edition”) of the Alexandrian scholar Aristarchus: “I
submit that Aristarchus’ ancient edition of the Iliad, if it
had survived in its original format, would in many ways
surpass West’s present edition. It would be a more useful—
and more accurate—way to contemplate the Iliad
in its full multiformity.”
This assertion was the origin of the Homer Multitext, an
effort to bring together a comprehensive record of the
Homeric tradition in a digital library. This paper will describe
the state of this project as it approaches the end
of its first decade; this paper is not a progress report, but
a description of an infrastructure for both technological
and what we might call semantic integration. In
both its collection of data and its development of tools
and infrastructure, the HMT has focused not on building
a single-purpose application to support a particular
theoretical approach, but on defining a long-term generic
digital library expressly intended to encourage reuse of
its contents, services, and tools.
Homeric scholarship in the 18th and 19th centuries was
firmly based on the texts that survived through a manuscript
tradition, most notably the great Byzantine codices
of the Iliad and its ancient commentaries, the Homeric
scholia. The 20th century saw the increasing recovery
and publication of even older fragments on papyrus, but
otherwise moved from a manuscript-based scholarship
toward the scholarship of the critical edition. So, for example,
the 1870s and 1880s saw the publication by W.
Dindorf and E. Maas of editions of scholia organized
by manuscript, the A and B manuscripts from Venice,
and the T manuscript from London; in 1901 D. Comparetti
edited a photo-facsimile edition of the A manuscript
from Venice (part of an ambitious series of facsimile editions
that was unfortunately abandoned, incomplete, for
lack of interest and funding). The 20th century saw the
publication of a critical edition of the Iliad by T.W. Allen
in the 1930s, and another by M. West in the 1990s, as
well as a voluminous edition of the Homeric scholia by
H. Erbse in the 1960s. In the cases of both the poetic text
of the Iliad and the scholiasts’ commentaries, these
20th-century publications are works of selection and aggregation,
seeking to present a unified text of these ancient
works, representing the best judgement of the editors.
Subsequently, however, scholarly assumptions in many
circles about the nature of the poem and its commentaries
have changed, and the range of questions that scholars
would ask of these texts has expanded. The very existence
of variation in the text has become a matter of
historical interest (rather than a problem to be removed).
The precise relationship between text and commentary,
as expressed on the pages of individual manuscripts,
holds promise to shed light on the tradition that preserved
these texts, the nature of the texts in antiquity, and therefore
their fundamental nature. We have found that the
20th-century model of critical text-plus-apparatus is incapable
of answering many of these new questions.
The best scholarly environment for addressing these
questions would be a digital library of facsimiles and
accompanying diplomatic editions. This library should
also be supplemented by other texts of related interest
such as non-Homeric texts that include relevant comments
and quotations, and by other collections of data and
indices. This is why we focus both on collecting data and on
building a scalable, technologically agnostic infrastructure
for publishing collections of data, images, texts, and
extensions to these types. This infrastructure accomplishes
retrieval and linking through abstract citation.
(This work is complemented by, and has been progressing
in collaboration with, the work on Homer in the Papyri,
which is also building a collection of diplomatic
editions, to be supplemented by translations and commentaries,
on papyrus fragments of epic poetry.)
We have presented aspects of this collection and infrastructure
at previous meetings of the Digital Humanities
Conference. This paper will summarize the project’s
goals, but focus on recent developments, specifically
the ongoing publication of the Homeric Scholia, developments
in our network services (specifically the third
version of the Canonical Text Service and the RefIndex
service, both of which now exist as Java Servlets and as
Python applications running in the Google AppEngine),
and our end-user application, a web-based interface to
this library called “Pandect”.
Neel Smith has been compiling editions of the Homeric
scholia according to “new” principles that closely follow
the real evidence for these ancient commentaries. Smith’s
edition acknowledges that each of the Byzantine codices
in effect contains many discrete texts. The Venetus A, for
example, contains a text of Proclus’s Chrestomathy, the
text of the Iliad, summaries of the books of the Iliad, at
least four distinct scholiastic texts (as identified by their
placement in discrete locations on each folio) and later
notes and emendations. By describing these contents as
separate texts, and by using a system of canonical citation
to refer to portions of each text, and by using indices
to associate these texts with the collection of folio sides
that constitutes this manuscript, we can approach
the Venetus A as both a single artifact and as a notional
“library” of texts. By virtue of our FRBR-like citation
format, the CTS-URN, we can make general statements
about passages from any of these texts, while also retaining
the ability to treat each instantiation separately,
as when a scholion appears in almost (but not quite) the
same form on the A manuscript and the T manuscript.
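
The distinction between a notional passage and its instantiation in one manuscript can be illustrated with a short sketch. The Python below is not HMT code; it assumes the colon-delimited URN layout urn:cts:namespace:textgroup.work.version:passage, and the class name CtsUrn, its methods, and the version labels msA and msT are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class CtsUrn:
    """A parsed citation. Field and method names are illustrative assumptions."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]

    @classmethod
    def parse(cls, urn: str) -> "CtsUrn":
        parts = urn.split(":")
        if len(parts) not in (4, 5) or parts[:2] != ["urn", "cts"]:
            raise ValueError(f"not a CTS URN: {urn!r}")
        # This sketch handles at most three levels: textgroup.work.version
        levels = parts[3].split(".")
        if len(levels) > 3 or not all(levels):
            raise ValueError(f"unsupported work hierarchy in {urn!r}")
        textgroup, work, version = levels + [None] * (3 - len(levels))
        passage = parts[4] if len(parts) == 5 and parts[4] else None
        return cls(parts[2], textgroup, work, version, passage)

    def notional(self) -> "CtsUrn":
        """Drop the version, leaving a citation of the work in general."""
        return replace(self, version=None)

    def same_passage(self, other: "CtsUrn") -> bool:
        """True when both URNs cite the same passage of the same work."""
        return self.notional() == other.notional()


# Hypothetical version labels for two manuscripts of the Iliad
in_a = CtsUrn.parse("urn:cts:greekLit:tlg0012.tlg001.msA:1.1")
in_t = CtsUrn.parse("urn:cts:greekLit:tlg0012.tlg001.msT:1.1")
print(in_a == in_t, in_a.same_passage(in_t))
```

Comparing the notional forms lets a general statement about Iliad 1.1 apply to both manuscripts, while the full URNs still distinguish the two witnesses.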
The services that make this possible have been in development
for years and are now ready for use in production.
We have implementations of the Canonical Text
Services protocol—for discovery and retrieval of text by
means of arbitrary citations—in Java as a Servlet, to be
run under Tomcat or Jetty, and as a Python application
that can run in Google’s AppEngine space. We likewise have
both implementations of the RefIndex service, which permits generic access to indices
that allow simple pairings between texts (at any level
from the text-group, or author, level down to the citation
level or a specified substring), objects in a collection
(such as a collection of morphological data, a lexicon, or
a collection of manuscript folios), or images.
These two implementations allow us to offer these services
through local servers, for the greatest flexibility, or
through Google’s service, for global access and greatest
reliability.
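
How a client might use the two services can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: the request and urn query parameters are assumed from the CTS request style, the endpoint http://example.org/cts is a placeholder, and PairIndex, with its folio identifiers, is a hypothetical in-memory stand-in for a RefIndex-style index of simple pairings.

```python
from collections import defaultdict
from urllib.parse import urlencode


def get_passage_url(base_url: str, urn: str) -> str:
    """Build a GetPassage request for a CTS endpoint (parameter names assumed)."""
    return base_url + "?" + urlencode({"request": "GetPassage", "urn": urn})


class PairIndex:
    """Hypothetical stand-in for an index pairing text citations with objects."""

    def __init__(self) -> None:
        self._pairs = defaultdict(set)

    def add(self, text_urn: str, target: str) -> None:
        self._pairs[text_urn].add(target)

    def lookup(self, urn: str) -> list:
        """Return targets paired with `urn` or with any citation beneath it."""
        found = set()
        for text_urn, targets in self._pairs.items():
            # A citation lies beneath `urn` when it continues past it
            # at a "." or ":" boundary, so 1.1 does not match 1.10.
            if text_urn == urn or text_urn.startswith((urn + ".", urn + ":")):
                found |= targets
        return sorted(found)


index = PairIndex()
# Illustrative pairings only, not actual manuscript data
index.add("urn:cts:greekLit:tlg0012.tlg001.msA:1.1", "folio:12r")
index.add("urn:cts:greekLit:tlg0012.tlg001.msA:1.10", "folio:12v")
index.add("urn:cts:greekLit:tlg0012.tlg001.msA:2.1", "folio:24r")

print(index.lookup("urn:cts:greekLit:tlg0012.tlg001.msA:1"))
print(get_passage_url("http://example.org/cts",
                      "urn:cts:greekLit:tlg0012.tlg001.msA:1.1"))
```

Because the lookup matches on the citation hierarchy, the same index answers queries at the text-group level, the book level, or the single-line level.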
Finally, Pandect is our open-source web-based application
for accessing the materials of the HMT. Its main
function is to mediate between the user and the network
services of the HMT, and as such it should be an entirely
generic tool, useful for any other project that implements
the CTS protocol.
By virtue of this citation+service approach to our digital
library, Pandect can discover texts, data, and images; it
can also provide basic navigation and manipulation. It
is not, however, merely a multi-column viewer for text
and images. Each instance of the application is based
on a Scenario, which defines relationships among
texts, collections, and images. The Scenario
might know that a collection contains data about manuscript
folios, and that these in turn are related to images
and to XML texts. The user’s experience, then, consists
of navigating a digital library in which each object in
view knows its relationship to all others. Navigation of
one object will percolate across all others. At any given
point, the user’s current view (for example, a Homeric
text, a scholion, and images of two folios) is preserved,
can be addressed, and can be exported as a simple XML
expression of a directed graph (using the GraphML schema).
Because each of these objects is identified through
canonical citation, these digraphs capture the relationships
among scholarly objects; they can serve not only as
bookmarks to the state of a browser, but as independent
objects of analysis, aggregation, or manipulation.
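
A view of this kind can be written out as GraphML with very little code. The sketch below is illustrative only: the function name view_to_graphml and the identifiers passed to it are hypothetical, and it records only bare nodes and directed edges, making no claim to match Pandect's actual export.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def view_to_graphml(edges) -> str:
    """Serialize (source, target) citation pairs as a directed GraphML graph."""
    edges = list(edges)
    ET.register_namespace("", GRAPHML_NS)
    root = ET.Element(f"{{{GRAPHML_NS}}}graphml")
    graph = ET.SubElement(
        root, f"{{{GRAPHML_NS}}}graph", {"id": "view", "edgedefault": "directed"}
    )
    # Emit each cited object once, in order of first appearance
    seen = []
    for pair in edges:
        for node_id in pair:
            if node_id not in seen:
                seen.append(node_id)
                ET.SubElement(graph, f"{{{GRAPHML_NS}}}node", {"id": node_id})
    for source, target in edges:
        ET.SubElement(
            graph, f"{{{GRAPHML_NS}}}edge", {"source": source, "target": target}
        )
    return ET.tostring(root, encoding="unicode")


# Illustrative identifiers: a line of text, the folio it sits on, an image
view = [
    ("urn:cts:greekLit:tlg0012.tlg001.msA:1.1", "folio:12r"),
    ("folio:12r", "image:12r"),
]
print(view_to_graphml(view))
```

Since every node id is itself a citation, a file produced this way can be merged with others or queried without reference to the browser session that created it.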
After nine years of development, we hope to make the
case that the Homer Multitext has not only produced a
large body of valuable data, but also a robust body of
source code that could be broadly useful to the community
of digital humanists. Its approach to primary sources—
favoring diplomatic editions and facsimiles wherever
possible—intends to invite the widest possible scope
for re-use of its data. Its emphasis on simple indexing
rather than complex and specialized internal markup
is based on the assumption that it requires less knowledge
to integrate texts with simple markup and simple,
documented indices, than to disaggregate an elaborately
marked-up text that embeds links to other digital objects.
The HMT’s emphasis on canonical citation ensures that
its contents can continue to interrelate with each other,
can be abstracted, and can be re-used into the future. And our
emphasis on services defined by documented protocols
should allow the HMT to advance in functionality and
reliability, and should allow other projects to draw on the
HMT’s contents through a consistent interface that is independent
of any specific technological implementation.

Full text license: This text is republished here with permission from the original rights holder.

ADHO - 2009