Recent Projects and Products based on Open Source Software from Maryland Institute for Technology in the Humanities (MITH)

poster / demo / art installation
  1. Susan Schreibman

    Maryland Institute for Technology in the Humanities (MITH) - University of Maryland, College Park

  2. Amit Kumar

    Maryland Institute for Technology in the Humanities (MITH) - University of Maryland, College Park

Work text

This poster session will be devoted to three projects developed at MITH over the past year and a half which use open source software:
· Early Americas Digital Archive
· Steinschneider Bibliographic Database
· Versioning Machine – Version 2.0
One of the things that binds these three projects together is that they are built using freely available open source software. All three projects are XML/TEI-based. Two of the projects, the Early Americas Digital Archive (EADA) and the Steinschneider Bibliographic Database, utilize eXist, a native XML database, for document delivery. The third project, the Versioning Machine (VM), is a MITH product developed to display multiple versions of deeply encoded text. The VM debuted at ACH/ALLC 2002.
These projects represent three different approaches to project design. EADA is a text-based archive featuring full text that is lightly encoded, along with a robust metainformation scheme developed to facilitate searching beyond what was present in the text. The architecture of The Steinschneider Bibliographic Database utilizes a hybrid model of both XML and relational databases. These full-text archives are designed in a similar way in that a public web presence is delivered dynamically via the database. We also construct a separate administrative interface in which project PIs can upload data. The Versioning Machine, on the other hand, was developed as a product that provides an environment for encoders to display their TEI-encoded texts through an interface generated by XSLT and JavaScript.
The Projects in Detail
Early Americas Digital Archive is a full-text archive based on a TEI DTD. Its goal is to make available primary materials written in or about the Americas between 1492 and 1820. While the markup within the <body> of each text is lightly encoded (by and large only structural encoding), the <teiHeader> contains a rich controlled vocabulary whose fields can be combined to perform very specific searches. For example, it is possible to search for all prose texts written about the Chesapeake area. The archive uses a servlet container in tandem with the eXist XML database. Both the servlet container and the XML database are open source software. The archive was launched in October 2003, and has received a positive response from the community for which it was designed.
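A combined header search of this kind can be pictured as a filter over the controlled-vocabulary terms in each <teiHeader>. The following is only a minimal Python sketch under stated assumptions, not EADA's actual implementation: the archive runs its queries through eXist, and the sample headers, the scheme names "genre" and "region", the file names, and the functions header_terms and find_texts are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical headers for illustration only. The scheme names "genre" and
# "region" and the file names are assumptions, not EADA's actual vocabulary.
SAMPLE_DOCS = {
    "text-a.xml": """<TEI.2><teiHeader><profileDesc><textClass>
        <keywords scheme="genre"><term>prose</term></keywords>
        <keywords scheme="region"><term>Chesapeake</term></keywords>
      </textClass></profileDesc></teiHeader>
      <text><body><p>Sample text.</p></body></text></TEI.2>""",
    "text-b.xml": """<TEI.2><teiHeader><profileDesc><textClass>
        <keywords scheme="genre"><term>poetry</term></keywords>
        <keywords scheme="region"><term>Chesapeake</term></keywords>
      </textClass></profileDesc></teiHeader>
      <text><body><p>Sample text.</p></body></text></TEI.2>""",
    "text-c.xml": """<TEI.2><teiHeader><profileDesc><textClass>
        <keywords scheme="genre"><term>prose</term></keywords>
        <keywords scheme="region"><term>New England</term></keywords>
      </textClass></profileDesc></teiHeader>
      <text><body><p>Sample text.</p></body></text></TEI.2>""",
}


def header_terms(xml_source):
    """Collect the header's controlled-vocabulary terms, grouped by scheme."""
    root = ET.fromstring(xml_source)
    terms = {}
    for keywords in root.iter("keywords"):
        scheme = keywords.get("scheme")
        for term in keywords.iter("term"):
            if term.text:
                terms.setdefault(scheme, set()).add(term.text.strip())
    return terms


def find_texts(docs, **criteria):
    """Return the names of documents whose header matches every criterion."""
    matches = []
    for name, source in docs.items():
        terms = header_terms(source)
        if all(value in terms.get(scheme, set())
               for scheme, value in criteria.items()):
            matches.append(name)
    return sorted(matches)
```

In this sketch, find_texts(SAMPLE_DOCS, genre="prose", region="Chesapeake") selects only the document whose header carries both terms, which mirrors the way lightly encoded body text can still support precise searches when the header is richly described.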
The Steinschneider Bibliographic Database is the most ambitious project we have undertaken to date. It is a three-year project now in its second year. The goal of the project is to translate from the German, and to update, Moritz Steinschneider’s definitive 1885 history and catalogue of Hebrew translations of philosophy, science, medicine, and belles lettres, mostly from Arabic and Latin: Die Hebraeischen Übersetzungen des Mittelalters und die Juden als Dolmetscher (The Hebrew Translations of the Middle Ages and the Jews as Interpreters). The translated text is being encoded in TEI and linked through unique IDs to items in one of two bibliographies the editors are compiling (one for printed texts and the other for manuscript sources). The three editors, one in the US, one in Israel, and one in Germany, all need access to the database (which is housed on the MITH server) for data entry and retrieval. Editors enter data into the bibliographies through web forms which generate TEI-conformant XML. The XML data is being delivered through eXist. To speed up searches, we are also utilising a relational database to store name and bibliographic keys, which are then referenced by the TEI-encoded text.
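The hybrid model can be illustrated by pulling the keys out of an encoded passage into a relational table, so that a key lookup does not have to scan the XML. This is a sketch under stated assumptions rather than the project's actual schema: the TEI fragment, the key values (N001, B001, M001), the use of the key and target attributes, the table key_index, and both function names are hypothetical, and SQLite stands in for whichever relational database the project uses.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical TEI fragment. The key values (N001, B001, M001) and the use of
# the "key" and "target" attributes are assumptions made for illustration.
TEI_FRAGMENT = """<div>
  <p><name key="N001">Translator One</name> is cited in
     <ref target="B001">a printed edition</ref>.</p>
  <p><name key="N002">Translator Two</name> and
     <name key="N001">Translator One</name> both appear in
     <ref target="M001">a manuscript source</ref>.</p>
</div>"""


def build_key_index(conn, tei_source):
    """Record which paragraph each name or bibliographic key occurs in."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS key_index (ref_key TEXT, paragraph INTEGER)"
    )
    root = ET.fromstring(tei_source)
    for number, para in enumerate(root.iter("p"), start=1):
        for element in para.iter():
            ref_key = element.get("key") or element.get("target")
            if ref_key:
                conn.execute(
                    "INSERT INTO key_index VALUES (?, ?)", (ref_key, number)
                )
    conn.commit()


def paragraphs_for(conn, ref_key):
    """Return the paragraph numbers that reference the given key."""
    rows = conn.execute(
        "SELECT DISTINCT paragraph FROM key_index "
        "WHERE ref_key = ? ORDER BY paragraph",
        (ref_key,),
    )
    return [row[0] for row in rows]
```

With an in-memory connection from sqlite3.connect(":memory:"), calling build_key_index and then paragraphs_for(conn, "N001") returns the paragraphs that mention that name key. The XML remains the authoritative text, and the table serves only as a fast lookup structure alongside it.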
The goal of the VM was to create an online environment for critically editing texts which exist in multiple versions or witnesses. When the Versioning Machine was first developed, encoders needed to utilise TEI’s parallel segmentation method to encode the separate witnesses. The XSLT then parsed the TEI text, giving the illusion of individually encoded texts. In response to feedback on our initial release, the VM project team decided to create an alternative encoding method, released in Version 2.0, which allows witnesses to be encoded discretely. This method allows projects to use already encoded TEI documents in the VM architecture; in this case, the XSLT binds the witnesses together. There are also other improvements to the software that we will be discussing, including a more robust image applet which allows users to view all the images associated with a witness set in one window.
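To make the parallel segmentation method concrete: variant readings are stored inline in <app> elements, each <rdg> naming the witnesses it belongs to, and the text of a single witness is recovered by keeping only the readings attributed to it. The VM does this in XSLT; the Python below is a stand-in sketch, not the VM's code. The sample line, the witness sigla A and B, and the function witness_text are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical line encoded with parallel segmentation. The sigla "A" and "B"
# are assumptions for illustration; a reading may list several witnesses.
PARALLEL_LINE = (
    '<l>The <app><rdg wit="A">quick</rdg><rdg wit="B">swift</rdg></app>'
    ' fox <app><rdg wit="A B">jumps</rdg></app></l>'
)


def witness_text(element, witness):
    """Rebuild the text of one witness from a parallel-segmented element."""
    parts = [element.text or ""]
    for child in element:
        if child.tag == "app":
            # Keep only the readings attributed to the requested witness.
            for rdg in child.findall("rdg"):
                if witness in rdg.get("wit", "").split():
                    parts.append(witness_text(rdg, witness))
        else:
            # Other inline elements are passed through unchanged.
            parts.append(witness_text(child, witness))
        # Text following the child belongs to every witness.
        parts.append(child.tail or "")
    return "".join(parts)
```

For the sample line, witness_text(ET.fromstring(PARALLEL_LINE), "A") yields "The quick fox jumps", and witness "B" yields "The swift fox jumps". The discrete method introduced in Version 2.0 works in the opposite direction: each witness is its own document, and the stylesheet aligns them for display.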
Amit Kumar, MITH’s programmer, and Susan Schreibman, Assistant Director, will use the poster session to discuss project architecture and management with ACH/ALLC participants. In the case of the Versioning Machine, we also want the opportunity to discuss improvements to the software and to get informal feedback from ACH/ALLC participants who have downloaded it.


Conference Info



Hosted at Göteborg University (Gothenburg)

Gothenburg, Sweden

June 11, 2004 - June 16, 2004

105 works by 152 authors indexed

Series: ACH/ICCH (24), ALLC/EADH (31), ACH/ALLC (16)

Organizers: ACH, ALLC