Metadata and Electronic Catalogues : Multilingual Resources for Scientific Medieval Terminology

paper, specified "short paper"
Authorship
  1. Andrej Bojadziev

    St. Kliment Ohridski University of Sofia

Work text
The Metadata and Electronic Catalogues project
(2004–06), a component of the Repertorium of Old
Bulgarian Literature and Letters, is designed to create electronic catalogues and authority files that will serve as integrated repositories of terminological information that has been developed and applied successfully in already existing projects in the realm of medieval Slavic languages,
literatures, and cultures. One innovative feature of this project is that beyond serving as a central repository for such information, it will expand the organizational
framework to support a multilingual superstructure along the lines of I18N initiatives elsewhere in the world of electronic text technology in general and humanities computing in particular.
This project is based on distinguishing the meanings of particular terms and notions that appear in the text of
medieval written documents both within the primary
corpus and in comparison to established scholarly
terminology (for example, medieval Slavic writers used
several terms—sometimes systematically and sometimes
not—to identify what modern scholars might call,
variously, a sermon, a homily, an instruction, etc.). This orientation is designed to support the development and implementation of software tools for finding multilingual
counterparts to both original (medieval) and scholarly (modern) terminology, and for sorting, searching, and mining this data in a way that is independent of both the entry and the query languages.
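As a minimal sketch of what such language-independent lookup might involve, the following Python fragment models an authority file as a list of concepts, each with a language-neutral identifier and labels in several languages. The identifiers, the language codes, and the sample labels are illustrative assumptions, not data or structures taken from the project.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One authority-file entry: a neutral id plus labels keyed by language code."""
    concept_id: str
    labels: dict = field(default_factory=dict)


# Hypothetical sample entries; ids, codes, and labels are illustrative only.
AUTHORITY = [
    Concept("genre.homily", {"en": ["homily"], "bg": ["беседа"]}),
    Concept("genre.sermon", {"en": ["sermon"], "bg": ["слово"]}),
    Concept("genre.instruction", {"en": ["instruction"], "bg": ["поучение"]}),
]


def build_index(concepts):
    """Map every label, in any language, to the ids of the concepts it names."""
    index = {}
    for concept in concepts:
        for labels in concept.labels.values():
            for label in labels:
                index.setdefault(label.casefold(), set()).add(concept.concept_id)
    return index


def counterparts(term, target_lang, concepts=AUTHORITY):
    """Return {concept id: labels in target_lang} for a term entered in any language."""
    index = build_index(concepts)
    by_id = {c.concept_id: c for c in concepts}
    ids = sorted(index.get(term.casefold(), set()))
    return {cid: by_id[cid].labels.get(target_lang, []) for cid in ids}
```

Because the query is resolved to a concept identifier before any labels are returned, neither the language in which the entry was recorded nor the language of the query affects the result.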
The project presumes the possibility (unique for such
initiatives in Slavic humanities computing, and uncommon
in humanities computing in general) of linking the standardized terminological apparatus for description, study, edition, and translation of medieval texts, on the one hand, to authoritative lists of key-words and terms used in bibliographic descriptions, on the other. This will allow the integration of scholarly meta-data and bibliographic references under a single unified framework. Another aim of the project is to create a mechanism for allowing the extraction of different types of indices based upon the imported documents even when the languages of
encoding may vary, so that, for example, Serbian-language descriptions of medieval Slavic (not necessarily Serbian) manuscripts could be collated with Bulgarian-language descriptions of other medieval Slavic manuscripts
in a way that would enable automated systems (such as the plectogram-generating tool described in David J.
Birnbaum’s proposal for our panel) to recognize when entries that differ textually are nonetheless to be treated as the same. The primary manuscript description texts are encoded in a TEI-based XML format in the context of the broader Repertorium initiative, and their utility for the type of multilingual authority files, bibliographic
databases, and other broad reference resources illustrates
the multipurposing that is characteristic of XML
documents in the humanities, but on a broader scale than is usually envisioned (that is, going beyond the more common tasks of transformation into multiple presentation formats or exploitation in structured searching).
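The collation described above can be illustrated with a short Python sketch that builds a single index from descriptions written in different languages. It assumes that each description records its language and a list of genre labels, and that a mapping from language-specific labels to neutral concept identifiers is available; the manuscript identifiers, labels, and mapping are hypothetical and are not drawn from the Repertorium data.

```python
from collections import defaultdict

# Hypothetical mapping from (language, label) to a neutral concept id.
TERM_TO_CONCEPT = {
    ("bg", "житие"): "genre.vita",
    ("sr", "житије"): "genre.vita",
    ("bg", "поучение"): "genre.instruction",
    ("sr", "поука"): "genre.instruction",
}

# Hypothetical descriptions in two languages of encoding.
DESCRIPTIONS = [
    {"ms": "MS-A", "lang": "bg", "genres": ["житие", "поучение"]},
    {"ms": "MS-B", "lang": "sr", "genres": ["житије"]},
    {"ms": "MS-C", "lang": "sr", "genres": ["поука", "житије"]},
]


def build_genre_index(descriptions, term_to_concept=TERM_TO_CONCEPT):
    """Group manuscripts by concept id; collect labels that cannot be resolved."""
    index = defaultdict(list)
    unresolved = []
    for desc in descriptions:
        for label in desc["genres"]:
            concept = term_to_concept.get((desc["lang"], label.casefold()))
            if concept is None:
                # Keep unknown labels visible rather than guessing a match.
                unresolved.append((desc["ms"], desc["lang"], label))
            else:
                index[concept].append(desc["ms"])
    return dict(index), unresolved
```

Entries that differ textually (here, the Bulgarian and Serbian labels) end up under the same key, which is the property an automated comparison tool would rely on. Reporting unresolved labels separately is a design choice of this sketch, so that gaps in the authority file surface instead of silently dropping data.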
The Metadata and Electronic Catalogues project is based at the Institute of Literature, Bulgarian Academy of Sciences, where the working team consists of Anisava Miltenova (Institute of Literature, director of the project), Ana Stojkova (Institute of Literature), Andrej Bojadzhiev (Sofia University), Margaret Dimitrova (Sofia University), and Svetla Koeva (Institute of Bulgarian Language, BAS).
The outcome of the project is twofold:
1. The development of a database that will provide the terminological and bibliographic apparatus that will support the study of the medieval Slavic texts for both research and educational purposes.
2. The development of an on-line query interface that will enable this database also to serve as a sort of
independent encyclopedic reference to medieval Slavic written culture.
The Repertorium Initiative: Computer Processing of Medieval Manuscripts
Anissava Miltenova
Institute of Literature, Bulgarian Academy of Sciences
The application of computer technologies to store, publish and—most importantly—investigate written sources belongs to the most promising tasks at the boundary
between the technical sciences and the humanities. The Repertorium Initiative was founded in 1994 at the
Department of Old Bulgarian Literature of the Institute of Literature of the Bulgarian Academy of Sciences in collaboration with the University of Pittsburgh (US). The Repertorium is a universal database that incorporates
archeographic, paleographic, codicological, textological,
and literary-historical data concerning the original and translated medieval texts distributed through Slavic
manuscripts between the eleventh and the seventeenth centuries. These data include both parts of actual texts and the results of their scientific investigation, with particular attention to the study of manuscript typology, a traditional aspect of philological scholarship that has been reinvigorated by the introduction, through the Repertorium Initiative, of computational methodologies.
The descriptions and examples of real texts are encoded in XML (Extensible Markup Language), a standard that embeds “markup” within natural language texts. The markup tags demarcate certain parts of the texts (elements) and signal what the data represents, simplifying the identification and
extraction of data from the text not just during conversion
for rendering (the most common procedure in humanities
projects), but also during data-mining for analysis. The most recent model of description of manuscripts in an XML format derived from the TEI (Text Encoding
Initiative) guidelines has been developed by Andrei
Bojadzhiev (Sofia University), following five main
principles, formulated in the context of the project by David J. Birnbaum in 1994:
1. Standardization of document file formats;
2. Multiple use (data should be separated from processing);
3. Portability of electronic texts (independence of local platforms);
4. Necessity of preservation of manuscripts in electronic form;
5. The well-structured division of data according to
contemporary achievements in textology, paleography
and codicology.
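How markup simplifies the extraction of data for analysis, and not only for rendering, can be sketched with Python's standard xml.etree.ElementTree module. The fragment below is a simplified, hypothetical description: its element names (msDesc, msIdentifier, msItem, title, origDate) follow TEI conventions, but namespaces are omitted and the structure is not the Repertorium's actual model.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical manuscript description (not the Repertorium schema).
SAMPLE = """
<msDesc>
  <msIdentifier><idno>MS-A</idno></msIdentifier>
  <msContents>
    <msItem n="1">
      <title type="genre">vita</title>
      <title type="main">Text One</title>
    </msItem>
    <msItem n="2">
      <title type="genre">homily</title>
      <title type="main">Text Two</title>
    </msItem>
  </msContents>
  <history>
    <origin><origDate notBefore="1350" notAfter="1400">14th c.</origDate></origin>
  </history>
</msDesc>
"""


def extract_items(xml_text):
    """Pull the shelfmark, date range, and item list out of one description."""
    root = ET.fromstring(xml_text.strip())
    date = root.find("history/origin/origDate")
    items = [
        {
            "n": item.get("n"),
            "genre": item.findtext("title[@type='genre']"),
            "title": item.findtext("title[@type='main']"),
        }
        for item in root.iter("msItem")
    ]
    return {
        "shelfmark": root.findtext("msIdentifier/idno"),
        "not_before": int(date.get("notBefore")),
        "not_after": int(date.get("notAfter")),
        "items": items,
    }
```

The same source file could equally be passed to a stylesheet for display; here the data are separated from processing, in the spirit of the second principle above, and returned as plain structures ready for counting or comparison.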
The working team in the Institute of Literature has already developed a digital library of over 350 electronic documents.
Since its inception as a joint Bulgarian-US project over ten years ago, the Repertorium Initiative has expanded to include a joint Bulgarian-British project describing
Slavic manuscripts in the collection of the British Library
(London), as well as a project with the University of
Gothenburg (Sweden) concerning the study of late medieval
Slavic manuscripts with computer tools. The Repertorium Initiative has grown not only in terms of its geography
and its participants; it has also come to include a unique
set of possibilities for linking the primary data to a
standardized terminological apparatus for the description,
study, edition, and translation of medieval texts, as well as to key words and terms used in the bibliographic
descriptions. This combination of structured descriptions
of primary sources with a sophisticated network of
descriptive materials permits, for example, the extraction of different types of indices that go well beyond traditional
field-based querying.
In recognition of the ground-breaking achievements of The Repertorium Initiative, its directors and principal researchers were appointed in 1998 by the International
Committee of Slavists (the most important such international association) to head a Special Commission for the Computer Processing of Slavic Manuscripts and Early Printed Books. Other evidence of the project's achievements includes the organization of three international conferences (Blagoevgrad 1994, Pomorie 2002, Sofia 2005) and the publication by the Bulgarian Academy of Sciences of three anthologies (1995, 2000, 2003). The Internet presence of the Repertorium Initiative is located at http://clover.slavic.pitt.edu/~repertorium/ .
Because the Repertorium Initiative goes beyond manuscript studies in seeking to provide a broad and encyclopedic
source of information about the Slavic medieval
heritage, it also incorporates such auxiliary materials as bibliographic information and other authority files. In this capacity the Repertorium Initiative is closely coordinated
with three other projects: the project for Authority
Files, which defines the terms and ontology necessary for medieval Slavic manuscript studies; Libri Slavici, a joint undertaking of the Bulgarian Academy of Sciences and the University of Sofia in the field of bibliography on medieval written heritage; and a project for identifying the typology of the content of manuscripts and texts with the aid of computational tools. All three of these share the common structure of the TEI documents and use a common XSLT
(Extensible Stylesheet Language Transformations)
library for transforming documents to a variety of formats
(including XML, HTML [Hypertext Markup Language], and SVG [Scalable Vector Graphics]) thus providing a sound base for the exchange of information and for
electronic publishing.
The relationship among these projects could be
described in the following way:
1. The Repertorium Initiative is innovative from both philological and technological perspectives in its
approach to the description and edition of medieval texts. It takes its metadata for description from its Authority Files and its bibliographic references from the Libri Slavici.
2. The Authority Files project gathers its preliminary information on the basis of descriptions and prepares guidelines in the form of authority lists for the use of the metadata by researchers.
3. Libri Slavici accumulates its data from various
sources, including descriptions and authority files, and shares common metadata with both of them.
4. Visualization of typology introduces radically new non-textual
representations of manuscript structures. This
development demonstrates that computers have done more than provide a new way of performing such
traditional tasks as producing manuscript descriptions.
Rather, the production of electronic manuscript
descriptions has enabled new and innovative
philological perspectives on the data.
The future of the Repertorium Initiative lies in continuing to integrate into a single network full-text databases of medieval Slavic manuscripts, electronic descriptions of codices, and electronic reference books with terminology. While preserving the cultural heritage of European libraries and archives, it provides for data and metadata search and retrieval on the basis of paleographic, linguistic, textological, historical, and other cultural characteristics. The connections among the different subprojects thus lead to a digital library that is suitable for use by a wide community of specialists and, at the same time, continues to inspire related new projects and initiatives.
References
Miltenova, A., Boyadzhiev, A. (2000). “An Electronic Repertory of Medieval Slavic Literature and Letters: a Suite of Guidelines”. In: Medieval Slavic Manuscripts and SGML: Problems and Perspectives. Sofia: “Marin Drinov” Publishing House, 44-68.
Miltenova, A., Boyadzhiev, A., Radoslavova, D. (2003). “A Unified Model for the Description of the Medieval Manuscripts?”. In: Computational Approaches to the Study of Early and Modern Slavic Languages and Texts. Sofia: “Boyan Penev” Publishing Center, 113-135.

Conference Info

ACH/ALLC / ACH/ICCH / ADHO / ALLC/EADH - 2006

Hosted at Université Paris-Sorbonne, Paris IV (Paris-Sorbonne University)

Paris, France

July 5, 2006 - July 9, 2006

The effort to establish ADHO began in Tuebingen, at the ALLC/ACH conference in 2002: a Steering Committee was appointed at the ALLC/ACH meeting in 2004, in Gothenburg, Sweden. At the 2005 meeting in Victoria, the executive committees of the ACH and ALLC approved the governance and conference protocols and nominated their first representatives to the ‘official’ ADHO Steering Committee and various ADHO standing committees. The 2006 conference was the first Digital Humanities conference.

Conference website: http://www.allc-ach2006.colloques.paris-sorbonne.fr/

Series: ACH/ICCH (26), ACH/ALLC (18), ALLC/EADH (33), ADHO (1)

Organizers: ACH, ADHO, ALLC
