Translation Studies and XML: Biblical Translations in Byzantine Judaism, a Case Study

poster / demo / art installation
  1. Eleonora Litta Modignani Picozzi, King's College London
  2. Elena Pierazzo, King's College London
  3. Julia G. Krivoruchko, Cambridge University


All definitions of translation describe this process as involving
two or more texts running in parallel that are considered to
be, in a sense, equivalent to each other. When producing a
translation, a source text is divided into syntactic units and each
of them is then translated. The translation can be either literal,
i.e. it mirrors the structure of the original text very closely, or
free, i.e. it ignores the original structure and translates freely.
Because languages diverge greatly in their syntax, the structure
of a language A cannot be fully mapped onto a language B, since
the outcome in B may be incomprehensible. Besides, cultures
differ greatly as to the degree of freedom/literalness tolerated
in translation.
In dealing with translations compiled in antiquity and the
Middle Ages, we are in a sense trying to discover how a specific
culture understood the source text, the grammar of the source
language and the usability of the translation product.
Even though a number of XML-based projects involving
translated texts have so far been brought to the attention
of the community,1 a model able to describe the precise
relationship between source and target texts is still required.
Such issues have been dealt with at the Centre for Computing
in the Humanities (King’s College, London) in relation to a
research project involving Biblical translations. The analysis
process resulted in an encoding model that could solve
problems specifically linked to this project. Yet the model has
the potential to be generalized and adapted to work in other
translation-based projects.
The Biblical translations are by far the most studied in the
world. The text of the Hebrew Bible used in the Hellenistic period
was written down in a purely consonantal script without
vowels, which left a large margin for differing interpretations.
In addition to this, the Hebrew manuscript tradition was far
from being homogenous. As a result, a number of translations
of the Hebrew Bible into Greek emerged, some of them
differing substantially from each other.
Until recently, the assumption was that Jews abandoned
their use of Greek Biblical translations, since these were
adopted by the Church. In particular, they were supposed to
ignore the Septuagint, which was recognized as a canonical
and authoritative text by Eastern Churches. However,
the manuscripts found in the Cairo Genizah have shaken this
assumption. During the 20th century, new Biblical translations
made by Jews into Greek during the Ottoman and Modern periods
were discovered.2
The Greek Bible in Byzantine Judaism (GBBJ) Project3 aims to
gather textual evidence for the use of Greek Bible translations
by Jews in the Middle Ages and to produce a corpus of such
translations. Unfortunately, the majority of GBBJ evidence
consists not of continuous texts, but of anonymous glossaries
or single glosses mentioned by medieval commentators.4 The
function of the continuous texts at our disposal is also unclear.
Further challenges arise from the peculiarities of the writing
system used by the Byzantine translators. From approximately
the 7th-8th century AD, Jews stopped using the Greek
alphabet and switched back to the Hebrew one. In
order to unambiguously represent the Greek phonetics,
the Hebrew alphabet was often supplied with vowel signs
and special diacritics. Some manuscripts contain neither or
use them inconsistently. In order to decode the writing, an
IPA reconstruction is therefore essential. However, Hebrew
writing occasionally gives a better reflection of the contemporary
medieval pronunciation of the Greek language. As far as
the linguistic structure is concerned, while in general Greek
Jewish Biblical translations use the grammar and lexicon of
mainstream Greek, in some cases the translators invent
lexical items and employ unusual forms and constructions,
trying to calque the maximal number of grammatical features
from one language into another. A few of the resulting forms are
difficult or even impossible to understand without knowing
the Hebrew source. To trace the features transferred and the
consistency of the transfer, the features must be tagged.
Therefore, lemmatization and POS-tagging of both the source
and the target texts constitute an essential component of the
research project.
The two main outcomes of the project will be a printed and a
digital edition; the latter will allow users to query and browse
the corpus displayed in parallel verses from the source and
the target texts. The target will be readable in two alphabets:
Hebrew and transliterated Greek.
In designing the encoding model, we have tried to follow TEI
standards as much as possible, e.g. elements for the description
of metadata, editorial features and transcription of primary
sources have been devised according to the P5 guidelines from the
beginning. Yet as far as the textual architecture is concerned,
TEI P5 does not include a model that would fit the project’s
needs, hence we have chosen to start working on the encoding
model on the basis of a custom-made DTD rather than a
TEI-compliant schema. As a result, the tailored encoding model
is simpler for the encoders to apply. However, all the elements
have been mapped onto TEI P5 for interchange and processing purposes.
The encoding model works on three different layers:
a) Structural layer. In any translation study, at least two
texts need to be paired: the source and the target. The
GBBJ project focuses on the target text, and therefore it
constitutes our main document. This text was described
as consisting of segments, within biblical verses, defined
on palaeographical grounds. Since the Greek texts we are
dealing with are written in Hebrew characters, they have
to be transliterated into Greek characters and normalized
within what we called the <lexicalTranscription>. The
translational units are then compared to their presumed
sources (the Masoretic Text, in our case) in <coupledPairs>
and finally matched with the translations from other Biblical
traditions (Septuagint and Hexaplaric versions), called
<equivalents>. The latter are further connected to their
respective apparatuses. It is also possible to compare
the version of a Hebrew lexeme as it appears in the
manuscript with the Masoretic Text or its apparatus.
<verse MT="Koh 2:19" LXX="Koh 2:19">
<segment type="translation">
<transcription language="Greek" …
<seg> יִקְׁשֹוניִי </seg>
<GreekWord dictionaryForm="γινώσκω" …
<verb person="third"
number="singular" tense="present"
voice="active" mood="indica…
<target><ref intRef="Koh2-…
<source><ref exRef="MT_…
<LXX exRef="LXX_Eccl2-19_3"></LXX>
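The fragment above is cut off at the right margin, so it cannot be parsed as printed. As a minimal sketch of how such a record might be read programmatically, the Python below parses a well-formed stand-in that reuses only the element names, attribute names and values that are fully visible in the sample. The nesting and the closing tags are assumptions made for illustration, and the truncated attributes are simply left out; this is not the project's actual encoding.

```python
import xml.etree.ElementTree as ET

# Well-formed stand-in for the truncated sample. Names and values are taken
# from the visible parts of the sample; the nesting is an assumption.
SAMPLE = """
<verse MT="Koh 2:19" LXX="Koh 2:19">
  <segment type="translation">
    <GreekWord dictionaryForm="γινώσκω">
      <verb person="third" number="singular" tense="present" voice="active"/>
    </GreekWord>
    <LXX exRef="LXX_Eccl2-19_3"/>
  </segment>
</verse>
"""

def describe_words(xml_text):
    """Return one dict per GreekWord: verse reference, lemma, verb features."""
    verse = ET.fromstring(xml_text)
    rows = []
    for word in verse.iter("GreekWord"):
        verb = word.find("verb")
        rows.append({
            "verse": verse.get("MT"),
            "lemma": word.get("dictionaryForm"),
            "features": dict(verb.attrib) if verb is not None else {},
        })
    return rows

rows = describe_words(SAMPLE)
print(rows)
```

Keeping the lemma in an attribute and the morphology in a child element, as the sample appears to do, means a query for a lemma and a query for a grammatical feature can be written independently of each other.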
In order to keep the document as semantically clean and
coherent as possible, we have devised a way of externalising
connected information in several “side” files. Each of them
contains a different Biblical text: the source (MT), the
Septuagint and Hexaplaric variants. The respective apparatuses
are encoded in separate documents and connected to the main
GBBJ document through a link in the <equivalents> element
within the <coupledPairs> section. Establishing a relationship
between the GBBJ target text and other Greek translations is
important not only for diachronic linguistic purposes, but also
for the study of the textual history of the GBBJ translations.
For these external files, we have devised a specific but simple
DTD (based on TEI) which allows the parallel connection with
the main text.
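The linking between the main document and a side file could be resolved along the following lines. This is a sketch under stated assumptions: the <text> and <unit> element names, the id attribute and the placeholder reading are hypothetical, since the side-file DTD is not shown here; only the <coupledPairs>, <equivalents> and <LXX> names, the exRef attribute and the identifier LXX_Eccl2-19_3 come from the text and sample above.

```python
import xml.etree.ElementTree as ET

# Hypothetical side file: <text>, <unit> and the id attribute are assumptions.
LXX_SIDE_FILE = """
<text tradition="LXX">
  <unit id="LXX_Eccl2-19_3">placeholder reading</unit>
</text>
"""

# Main-document fragment; the second reference is deliberately dangling.
MAIN_FRAGMENT = """
<coupledPairs>
  <equivalents>
    <LXX exRef="LXX_Eccl2-19_3"/>
    <LXX exRef="LXX_missing_id"/>
  </equivalents>
</coupledPairs>
"""

def index_by_id(side_file_xml):
    """Map each id in a side file to its element for direct lookup."""
    root = ET.fromstring(side_file_xml)
    return {el.get("id"): el for el in root.iter() if el.get("id")}

def resolve_links(main_xml, side_index):
    """Pair every exRef in the main document with its target text, or None."""
    root = ET.fromstring(main_xml)
    resolved = {}
    for el in root.iter():
        ref = el.get("exRef")
        if ref is not None:
            target = side_index.get(ref)
            resolved[ref] = target.text if target is not None else None
    return resolved

links = resolve_links(MAIN_FRAGMENT, index_by_id(LXX_SIDE_FILE))
print(links)
```

Returning None for an unresolved reference, rather than raising an error, lets a validation pass list every dangling link in one run.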
b) Linguistic layer. In translation studies it is important
to analyse not only the semantic relationship between
words, but also their morphological correspondence.
Lemmatisation and POS-tagging have therefore
been envisaged for both the GBBJ document (within
<lexicalTranscription>) and the external files, connected
via IDs. Each segment in the source text can be paired
both semantically and morphologically with any of its
counterparts, allowing complex queries and the generation
of correspondence tables.
c) Editorial and codicological layer. The GBBJ text derives
directly from a primary source, which means that
information needs to be given on all editorial elements:
expansions of abbreviations, integrations, corrections, etc.
The physical structure of the document was also described
on a very granular level including column breaks, line breaks,
folio breaks, marginal notes, change of hands, spaces and
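The correspondence tables mentioned under the linguistic layer can be illustrated with a small sketch. It assumes that the paired source and target analyses have already been extracted from the XML into dictionaries; the tag names and all the data below are invented purely for illustration and make no claim about the corpus.

```python
from collections import Counter

# Hypothetical paired analyses: each tuple couples the tags of a source form
# with the tags of its rendering. Tag names and values are invented.
PAIRS = [
    ({"pos": "verb", "form": "imperfect"}, {"pos": "verb", "tense": "present"}),
    ({"pos": "verb", "form": "imperfect"}, {"pos": "verb", "tense": "future"}),
    ({"pos": "verb", "form": "imperfect"}, {"pos": "verb", "tense": "present"}),
    ({"pos": "noun", "number": "plural"}, {"pos": "noun", "number": "plural"}),
]

def correspondence_table(pairs, source_key, target_key):
    """Count how often each source value is rendered by each target value."""
    table = Counter()
    for source, target in pairs:
        # Skip pairs that lack either feature (e.g. nouns have no tense).
        if source_key in source and target_key in target:
            table[(source[source_key], target[target_key])] += 1
    return table

table = correspondence_table(PAIRS, "form", "tense")
for (src, tgt), count in sorted(table.items()):
    print(f"{src:<10} -> {tgt:<8} {count}")
```

A table of this kind makes the consistency of a translator's choices directly countable: a source feature that always maps to one target feature suggests systematic calquing, while a spread across several suggests freer rendering.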
The present case study demonstrates the wide range of possible
applications of an XML framework to translation studies. The
Greek Bible in Byzantine Judaism Project presents a number
of problems that are likely to be encountered in other similar
projects, such as an alphabet not suited to a specific language
and the existence of wide comparable corpora of translational
traditions. Although some of the solutions found are specific
to the research project, the approach and the conceptual
model used here may be reused and adapted within the digital
humanities community.
1 For example: The Emblem Project Utrecht (http://emblems.let., accessed 23/11/07), the English-Norwegian Parallel
Corpus (
enpc/, accessed 23/11/07).
2 For general discussion see Fernández Marcos (1998). Outline of the
problems and suggestions for future research: RASHI 1040-1990.
3 A three-year AHRC project based at the University of Cambridge
(Faculty of Divinity) and King’s College, London (Centre for
Computing in the Humanities); see project website at http://www. (accessed 7/3/2008).
4 For glossaries see De Lange (1996); Rueger (1959); Tchernetska,
Olszowy-Schlanger, et al. (2007). For a continuous text see Hesseling,

Burnard, Lou and Bauman, Syd (2007), TEI P5: Guidelines for
Electronic Text Encoding and Interchange, at
http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html (accessed
De Lange, N. R. M. (1993). The Jews of Byzantium and the
Greek Bible. Outline of the problems and suggestions for
future research. RASHI 1040-1990. Hommage à Ephraïm E.
Urbach. ed. G. Sed-Rajna. Paris, Éditions du Cerf: 203-10.
ID. (1996). Greek Jewish Texts from the Cairo Genizah, Tübingen:
Fernández Marcos, Natalio. (1998). Introducción a las
versiones griegas de la Biblia. Madrid: Consejo Superior de
Investigaciones Científicas, ch.;
Rueger, H. P. (1959). “Vier Aquila-Glossen in einem hebraischen
Proverben-Fragment aus der Kairo-Geniza.” Zeitschrift für die
Neutestamentliche Wissenschaft 50: 275-277.
Hesseling, D. S. (1901). “Le livre de Jonas.” Byzantinische
Zeitschrift 10: 208-217.
RASHI 1040-1990. Hommage à Ephraïm E. Urbach. ed. G. Sed-
Rajna. Paris, Éditions du Cerf: 203-10.
Tchernetska, N., Olszowy-Schlanger J., et al. (2007). “An Early
Hebrew-Greek Biblical Glossary from the Cairo Genizah.”
Revue des Études Juives 166(1-2): 91-128.


Conference Info


ADHO - 2008

Hosted at University of Oulu

Oulu, Finland

June 25, 2008 - June 29, 2008

Series: ADHO (3)

Organizers: ADHO

  • Language: English