Chasing DTDs. The Digital Edition of the ‘Repertorium Biblicum Medii Aevi’

  1. Sabine Harwardt

    Universität Trier

  2. Stefan Buedenbender

    Universität Trier

The Repertorium Biblicum Medii Aevi, edited between 1950 and 1980 by Friedrich Stegmüller and Klaus
Reinhardt in 11 volumes, is a major reference work for studies in various medieval disciplines such as
theology, history, philology, and philosophy. In its approximately 12,000 catalogue entries, which list almost
24,000 commentaries, the ‘Repertorium’ includes all commentaries on the Bible written up to
1500. The individual volumes, published by the ‘Consejo Superior de Investigaciones Científicas’ (Madrid), are
divided into Apocrypha, Commentaries by known and unknown authors, Supplements, and Glossa
ordinaria. Thus, the ‘Repertorium’ covers the largest part of the medieval commentary tradition of the Bible.
In July 2002, a team of philologists and computer scientists started to prepare a digitized version of
the ‘Repertorium Biblicum Medii Aevi’ taking into account scholarly demands in order to open up new
possibilities for tackling this enormous amount of valuable data.
In the printed edition of the ‘Repertorium Biblicum’, the extensive material is structured by the registers of
the incipits (beginnings) of a given commentary. However, apart from these incipits, the ‘Repertorium’ also
contains short biographies of its authors, bibliographies of research literature, dates, commented works, and
the manuscript tradition as well as the tradition of early prints.
The electronic edition of the ‘Repertorium’ will enable users to search within these and further
categories such as specific manuscripts, libraries and archives, the biblical books commented on, types of
commentary, specific authors, or dates of special importance. Moreover, many supplements and corrections
collected since the publication of the printed ‘Repertorium Biblicum’ will be integrated into the digital
‘Repertorium’, so that scholars will be able to work with the updated material on-line.
Data input is currently under way and will be completed in January 2003. The 2nd volume of the
‘Repertorium Biblicum’ has already been digitized and partially marked up in order to analyze its
structure and to develop tagging routines that can easily be adapted to the structure of the other volumes.
The structure of the ‘Repertorium Biblicum’ is very complex. It can be described as an entry catalogue of
authors and/or of titles which are given in alphabetical order. In general, an entry has three different parts: 1.
work, 2. chronological data and short bibliography, and 3. commentaries of that specific work. An entry can
consist of part 1 and part 2 only; if part 3, which can be repeated several times, is given at all, part 2 can be
omitted. To insert the markup, these three parts, together with the logical order of an entry, have to be
distinguished carefully.
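The three-part entry model described above can be sketched as a small data structure. This is only an
illustration of one possible reading of the rules (part 1 is always present, part 2 is optional, part 3 may
occur zero or more times, and an entry needs at least part 2 or one part 3). The class and field names are
hypothetical and are not taken from the project's internal tag-set.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Commentary:
    """Part 3: quotations (incipits/explicits) plus lists of manuscripts and editions."""
    quotations: List[str]
    witnesses: List[str] = field(default_factory=list)


@dataclass
class Entry:
    """One catalogue entry; all names here are illustrative assumptions."""
    work: str                                   # part 1
    chronology: Optional[str] = None            # part 2 (dates, short bibliography)
    commentaries: List[Commentary] = field(default_factory=list)  # part 3, repeatable

    def is_valid(self) -> bool:
        # Assumed rule: part 1 is required, and at least one of
        # part 2 or part 3 must accompany it.
        if not self.work:
            return False
        return self.chronology is not None or len(self.commentaries) > 0


# Placeholder values only, not data from the 'Repertorium'.
with_part_2 = Entry(work="work (placeholder)", chronology="dates (placeholder)")
with_part_3 = Entry(
    work="work (placeholder)",
    commentaries=[Commentary(quotations=["incipit (placeholder)"])],
)
print(with_part_2.is_valid(), with_part_3.is_valid())
```

Whether an entry consisting of part 1 alone should be accepted is not stated in the description; the sketch
rejects it, which is an assumption.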
Part 3 is the most complex part to encode. It consists of two subdivisions: quotations
taken from a specific work in the form of incipits or explicits, and sometimes lengthy lists of manuscripts
and editions of the work quoted. Whereas all entries follow strict alphabetical order and show dictionary
patterns (macrostructure), the information on manuscripts given in part 3 rather resembles a repository guide
or an archival finding aid (microstructure), so two different logical structures have to be combined.
On the basis of a thorough document analysis, we fixed criteria for the markup and encoded the 2nd
volume of the ‘Repertorium’ using an internally defined tag-set. The encoding was carried out with
TUSTEP programs. Designed to cope with mass data, TUSTEP offers very powerful and highly
specific algorithms for searching large amounts of text for recurring patterns. We can then, within
TUSTEP, develop our own procedures to automatically copy, modify and sort these patterns according to
our individual requirements. In the future, the internal tag-set could easily be replaced by a standardized DTD
such as TEI or MASTER.
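Replacing an internal tag-set by standardized tags can be pictured as a simple renaming pass. The following
is a minimal sketch in Python rather than TUSTEP, and it does not reproduce the project's actual routines.
The internal tag names (inc, exp, lit) are hypothetical, and the target names are only illustrative choices;
a real conversion would also have to handle attributes and the nesting rules of the chosen DTD.

```python
import re

# Hypothetical internal tag names mapped to illustrative target element names.
TAG_MAP = {
    "inc": "incipit",
    "exp": "explicit",
    "lit": "bibl",
}

# Matches simple start and end tags without attributes, e.g. <inc> or </inc>.
_TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)>")


def rename_tags(text, tag_map=TAG_MAP):
    """Rename every tag found in tag_map; leave unknown tags untouched."""
    def swap(match):
        slash, name = match.group(1), match.group(2)
        return "<" + slash + tag_map.get(name, name) + ">"
    return _TAG.sub(swap, text)


# Placeholder content only.
print(rename_tags("<inc>text (placeholder)</inc>"))
```

A one-to-one renaming like this only works where both tag-sets agree on element order and nesting, which
is exactly the point at which the candidate DTDs differ.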
Since the development of a new DTD is a very expensive and time-consuming process, we wanted to check
whether existing DTDs could easily be applied to the ‘Repertorium’, taking into account
its varying structures. In encoding the ‘Repertorium’, the markup of its dictionary structure is as important as
the markup of its archival finding aid patterns. We therefore examined four candidates: the TEI DTD, which has been
successfully applied in various projects at Trier University (cf. Panel Session “Into the Depths of Data.
Methods of Subject Specific Content Retrieval” at the ALLC/ACH-Conference 2002 in Tübingen); DocBook,
with its rich options for bibliography encoding; and, especially with regard to the archival character of the
‘Repertorium’, EAD (Encoded Archival Description) and MASTER.
Having implemented various types of markup, we find that DocBook, which is geared towards
complex technical writing and simple bibliographies, appears too restricted for encoding the ‘Repertorium
Biblicum’. With TEI, the ‘Repertorium’ could be fully encoded, but only by using a large number of
specially defined attributes. This could, however, complicate the markup itself and make it harder to understand.
It is not always possible to describe the structure of the ‘Repertorium’ according to the DTDs
created specifically for the markup of archival material or manuscript archives; for example, the ‘Repertorium’
gives the relevant bibliographical data after the incipits are quoted. While EAD partially covers the criteria of
alphabetical order combined with various quotations, its logic does not match the listing of manuscripts in the
third part of a ‘Repertorium’ entry. The definition of MASTER’s <msDescription> tag, which would fully cover the
complex structure of the repository guide part, does not allow several manuscripts to be listed after quotations;
it requires a strict sequence of manuscripts instead.
At the moment, TEI, MASTER, and EAD are being benchmarked against each other by encoding the
2nd volume according to these standards. MASTER seems to be most appropriate for encoding; however,
some questions remain. Our paper will discuss the final decision for MASTER confronting the three DTDs’
So far, the 2nd volume has been encoded according to an internal markup derived from the document
analysis. As mentioned above, this markup could easily be replaced by tags compliant with one of the possible
DTDs (MASTER, TEI, EAD, DocBook). By the time of the conference, the majority of the ‘Repertorium
Biblicum’ will be fully encoded (probably according to MASTER), and it will also be partly accessible for varied
on-line research. Questions to be discussed when presenting the ‘Repertorium Biblicum’ may focus on
problems of the applicability of defined frameworks like TEI to complex documents with an inconsistent and
often irregular structure.

