A humanities scholar contemplating computer-aided research on texts in digital format must pay careful attention to the source of those texts, lest inaccuracies and corruptions mar the analysis. Likewise, a teacher wishing to employ electronic texts in the classroom needs to check that the text of the work in question meets certain standards of reliability. In response to these needs a number of well-funded projects have emerged, aimed at creating high-profile literary or linguistic databases, of which the electronic Oxford English Dictionary is probably the best known. A number of equally ambitious projects are currently underway, for example Jerome McGann's Rossetti Archive at Virginia and Peter Robinson's Canterbury Tales Archive at Cambridge. In these cases the undertakings have had the scholarly leadership and institutional funding to support the creation of databases that will withstand close scrutiny.
Most of the etexts now residing on Internet archives, however, have emerged from the efforts of modest, low-budget operations, and to date, little concerted action has gone into addressing the problems and issues faced by these smaller projects. Some of the currently archived texts were originally put together as part of a specific study, and thus have been subject to various emendations called for by the requirements of the project itself. Others have been scanned or typed into digital form by etext enthusiasts and lack the rigorous (and tedious) checking necessary to eliminate errors introduced in the act of entry. In this second group the print analogue used as text source may itself be an inexpensive reprint or abridged "popular" edition, lacking the editorial rigor now applied to the creation of most scholarly editions. Furthermore, a significant number of texts have been systematically stripped of all marks of editorial intervention in preparation for mass circulation.
Through the well-known work of the Text Encoding Initiative, and more recently the MLA committee to establish guidelines for the preparation of electronic texts, the humanities computing community has begun to produce a practical framework of common standards around which one might design a humanities database project. The TEI P3 in particular offers a large number of quite useful solutions to encoding problems, and the MLA guidelines will establish techniques for ensuring textual rigor. With work on these aspects of electronic text development well underway, we must next face the task of incorporating the emerging standards into small-scale projects, ones that frequently lack the recommended resources necessary to fully implement these standards.
This panel seeks to explore ways in which humanities database projects can maintain the high level of textual and intellectual integrity achieved by well-funded organizations while working within the limitations of time and financial resources experienced by most small-scale undertakings. The speakers will discuss their current projects, the problems they faced in maintaining high standards on low budgets, and the administrative and technological innovations they developed in response to those problems. Drawing upon practical examples stressing process, they will illustrate database-specific implementations of the TEI P3, conversion of data compiled in smaller, proprietary applications to the broad SGML platform, synthetic integration of existing utilities and applications for text processing, the relative feasibility of product-delivery systems, and the challenge to database designers when dealing with widely varied classes of data. They will additionally offer suggestions on strategies for organizing administrative structures that enhance modest human resources, as well as techniques for fund-raising.
Hosted at University of Bergen
June 25, 1996 - June 29, 1996
147 works by 190 authors indexed
Scott Weingart has a print abstract book that needs to be scanned; certain abstracts are also available on the dh-abstracts GitHub page. (https://github.com/ADHO/dh-abstracts/tree/master/data)
Conference website: https://web.archive.org/web/19990224202037/www.hd.uib.no/allc-ach96.html