Centre for Computing in the Humanities - King's College London
Like many people all over the world today, I’m
involved in a project to transform a scholarly work
from book form to online form. There is no longer any serious doubt about the value of such efforts, especially
when the book in question is a reference work; but
everyday scholarly experience shows that there are many questions about what sort of online form is best, and in a few cases reviewers have written analyses that detail the obstacles that can stand in the way of the most advanced
applications (see, for example, Needham, “Counting
Incunables”; Daunton; Williams and Baker). One lesson of such reviews is how different the nature and uses of our various reference works are. All the same, some
reflections on the general questions involved can be productive; this talk seeks to cover the relatively familiar
issues of searching, and the less frequently discussed question of results display and organization.
Our book is the Index of English Literary Manuscripts
covering the period from 1450 to 1700, originally
compiled by Peter Beal and published in four volumes from 1980 to 1993. Its aim was to catalogue all known literary manuscripts of a selection of writers; it was
organized by author and work, not by the contents of manuscripts as is the norm. Those working on more modern
material expect to encounter manuscripts that are in an author’s own hand; but most of those in the Index are copies by other hands, often combined in miscellanies
with works of many other authors. Apart from facilitating
work on the individual authors who were covered, the Index also spurred work on the nature of textual
transmission in early modern Britain, where scribal
publication continued to be important despite the advent of print (see Love).
In making the case for the value of an online version of such a work, a standard argument points to the improved
access that can be provided. It is less expensive to
distribute the completed work online than as a set of hefty books, and it is also easier to find certain kinds of information in a searchable online publication. But as we know from the World Wide Web, even badly-done and inaccurate online resources can be put to use, so we need to look to some goal beyond this absolutely minimal one. As the analyses I’ve cited by Needham and others show,
many uses that scholars can imagine for indexes and
catalogues turn out not to be supported in online versions.
In their transformation into online resources, many
books go through a process of atomization into individual items, which may then be reunited in various ways:
so that information once accessible only through the
sequential order of the book, or through manually
constructed aids such as indexes, might now be available in many ways. But, of course, these alternative routes depend on the data and on the machinery used to work with it: problems with searches involving difficult forms of information such as dates are by now very familiar,
and they stem from inconsistency of practice, lack of
sophistication in search machinery, and problems in
dealing appropriately with uncertainty.
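One common way of handling uncertain dates in a search is to store each date as a range of possible years and to match on overlap rather than equality. The following is only a minimal sketch under stated assumptions: the date forms it accepts ("1623", "c. 1630", "1620s", "1620-1640"), the ten-year margin for "circa", and the names DateRange, parse_date and search are all hypothetical and are not taken from the Index or from any existing system.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DateRange:
    """An uncertain date expressed as the earliest and latest possible year."""
    earliest: int
    latest: int

    def overlaps(self, start: int, end: int) -> bool:
        # Two closed intervals overlap when each starts before the other ends.
        return self.earliest <= end and start <= self.latest


def parse_date(text: str, circa_margin: int = 10) -> DateRange:
    """Turn a date string into a range (hypothetical date forms only)."""
    t = text.strip().lower()
    m = re.fullmatch(r"(\d{4})\s*[-–]\s*(\d{4})", t)   # "1620-1640"
    if m:
        return DateRange(int(m.group(1)), int(m.group(2)))
    m = re.fullmatch(r"(\d{4})s", t)                   # "1620s"
    if m:
        start = int(m.group(1))
        return DateRange(start, start + 9)
    m = re.fullmatch(r"c\.\s*(\d{4})", t)              # "c. 1630"
    if m:
        year = int(m.group(1))
        # The margin for "circa" is an assumption, not an established rule.
        return DateRange(year - circa_margin, year + circa_margin)
    if re.fullmatch(r"\d{4}", t):                      # "1623"
        return DateRange(int(t), int(t))
    raise ValueError(f"unrecognised date form: {text!r}")


def search(records, start: int, end: int):
    """Return labels of records whose date range overlaps [start, end]."""
    return [label for label, date_text in records
            if parse_date(date_text).overlaps(start, end)]


# Placeholder records, invented purely for illustration.
records = [("item 1", "1623"), ("item 2", "c. 1630"), ("item 3", "1650s")]
print(search(records, 1635, 1645))
```

Matching on overlap means that an item dated "c. 1630" is still found by a search for 1635 to 1645, where a comparison on a single stored year would miss it; whether that is the right behaviour depends on how much uncertainty the cataloguer intended.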
In our case the information was compiled not only
following the usual sort of guidelines, but by only one person, so there is at least some chance of doing a reasonable job on this score. A further question is how well we can provide not just search results but an orderly view of the information from a different perspective. By this I mean
a display that is organized and focused following the desired point of view, rather than merely being a set of search results. We are familiar with the way that a search on minimally-structured full-text resources produces a set of results that need some working through: the scope of the resulting passages is often not clearly delimited and you need to read around to figure out for yourself what the relevant piece of text is. The sequence of such results is also approximate: experienced scholars learn not to pay too much attention to it. These are systems that do not focus on exactly the results called for, and do not organize the results in the best manner; and improving
on these matters in a full-text system is difficult. In a
catalogue, or anything else that is based on more readily atomized information, there should be more scope for building new perspectives and not just lists of results, but this is another problem that has often proven difficult to solve (see Needham, “Copy Description”). This is more than a mere question of agreeable presentation: as work in the field of visualization has shown, finding ways to present known data such that our minds can work on it is a powerful method, and one that we need more of in the humanities.
The printed Index of English Literary Manuscripts was an extreme example of a resource that offered only one perspective on information of interest from many
perspectives, since there was never an index to offer
alternative ways into the information. As examples will show, though, it is not always straightforward to build a display from a new perspective: in our case we are trying to preserve the original author/work perspective but also offer an organization by manuscript. That calls for more than just a reordering of entries: we find that the material
within entries needs (at the least) to be rearranged,
because even its organization expresses a perspective on the material. There may be a limit to the flexibility of this sort of catalogue, but an awareness of the issue when it’s being built can help us improve the design.
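The difference between the two perspectives can be illustrated with a small sketch. It assumes, hypothetically, that each catalogue entry has been atomized into a record holding an author, a work, a manuscript and a starting folio; the Entry type, its field names and the placeholder values are inventions for illustration and do not reflect the actual structure of the Index.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One atomized catalogue entry (hypothetical fields)."""
    author: str
    work: str
    manuscript: str
    first_folio: int


def by_author_work(entries):
    """The original perspective: entries grouped under author and work."""
    grouped = defaultdict(list)
    for e in sorted(entries, key=lambda e: (e.author, e.work, e.manuscript)):
        grouped[(e.author, e.work)].append(e)
    return dict(grouped)


def by_manuscript(entries):
    """The new perspective: entries grouped by manuscript, in folio order."""
    grouped = defaultdict(list)
    for e in sorted(entries, key=lambda e: (e.manuscript, e.first_folio)):
        grouped[e.manuscript].append(e)
    return dict(grouped)


# Placeholder entries, invented purely for illustration.
entries = [
    Entry("Author A", "Work 1", "MS X", 10),
    Entry("Author A", "Work 1", "MS Y", 3),
    Entry("Author B", "Work 2", "MS X", 2),
]

for manuscript, items in by_manuscript(entries).items():
    print(manuscript, [(e.first_folio, e.author, e.work) for e in items])
```

Regrouping of this kind only reorders whole entries. It does not address the harder point made above, that the material inside each entry is itself written from the author/work point of view and would need rearranging before a manuscript-centred display reads naturally.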
Beal, Peter, compiler, Index of English Literary
Manuscripts (London: Mansell, 1980–1993), 4
Daunton, Martin, “Virtual Representation: The History of Parliament on CD-ROM”, Past and Present 167, May 2000, 238–261.
Love, Harold, Scribal Publication in Seventeenth-Century England (Oxford: Clarendon Press, 1993).
Needham, Paul, “Copy Description in Incunable
Catalogues”, Papers of the Bibliographical Society of America 95:2, June 2001, 173–239.
Needham, Paul, “Counting Incunables: the IISTC
CD-ROM”, Huntington Library Quarterly 61:3–4, 1998, 457–529.
Williams, William Proctor, and William Baker,
“Caveat Lector. English Books 1475–1700 and the Electronic Age”, Analytical and Enumerative
Bibliography NS 12:1, 2001, 1–29.
Hosted at Université Paris-Sorbonne, Paris IV (Paris-Sorbonne University)
July 5, 2006 - July 9, 2006
The effort to establish ADHO began in Tuebingen, at the ALLC/ACH conference in 2002: a Steering Committee was appointed at the ALLC/ACH meeting in 2004, in Gothenburg, Sweden. At the 2005 meeting in Victoria, the executive committees of the ACH and ALLC approved the governance and conference protocols and nominated their first representatives to the ‘official’ ADHO Steering Committee and various ADHO standing committees. The 2006 conference was the first Digital Humanities conference.
Conference website: http://www.allc-ach2006.colloques.paris-sorbonne.fr/