OpenText.org: An Experiment in Internet-based Collaborative Humanities Scholarship

paper
Authorship
  1. Matthew Brook O'Donnell

    OpenText.org, University of Surrey Roehampton

  2. Stanley E. Porter

    OpenText.org, University of Surrey Roehampton

  3. Jeffrey T. Reed

    OpenText.org

Work text

The OpenText.org initiative seeks to harness the collaborative effort
and ideas of scholars of Hellenistic Greek, particularly the Greek of
the New Testament, through the medium of the Internet. Modelled upon
the Open Source Software (OSS) movement, it seeks to actively nurture
the involvement of humanities scholars in the process of corpus
building and annotation, as well as the analysis of texts and the
development of tools for this analysis.

New Testament scholars have begun to realize the importance and
potential of computer resources both for traditional exegetical
analysis and for newer interpretative models such as discourse
analysis. Currently there are a number of grammatically and lexically
annotated texts available and accompanying software for
concordance-based search and retrieval. These texts and tools have
been developed by a small and devoted number of practitioners over
the past 25 years. These developers have tended to follow a closed
method of development, which Eric Raymond (1999) has
described as the 'cathedral style' of project building. This has
limited the degree of participation for interested New Testament
scholars, who have tended to fill the role of product consumers. The
OSS movement adopts a different viewpoint on the role of users,
seeking to facilitate their role as co-developers. This goal is
achieved through open access to regular updates of source code and
the use of Internet collaboration tools such as a mailing list,
newsgroup or bulletin board (Udell 1999; Preece 2000). The contention
of OSS enthusiasts, such as Raymond, is that software of quality
equal or superior to that of commercial closed source products can
result from such a process.

The OpenText.org project contends that the OSS model of development
can be adapted to the realm of textual annotation and analysis. These
are tasks that require detailed and time-consuming analysis, yet hold
long-term benefits for the whole scholarly community. Biblical
scholars have tended to work independently in their study of texts,
carrying out a great deal of linguistic and literary analysis
summarized in their publications but not accessible to other scholars
for future work. The Text Encoding Initiative has demonstrated the
position of textual encoding as a valuable academic discipline in and
of itself, rather than just a preparatory exercise (Sperberg-McQueen
1991; DeRose et al. 1990; Renear, Mylonas and Durand 1996). One of
the goals of OpenText.org is to develop a series of specification
documents to act as guidelines for the linguistic and literary
annotation of Hellenistic Greek texts. These specifications are
developed following the editorial process of the World Wide Web
Consortium. These documents define XML schemas that can be used by
scholars to mark up a particular text or section of text. Scholars are
then encouraged to contribute the resulting document(s) back into the
data repository, making them available for use and adaptation by
other scholars. We are also exploring the possibilities of on-line
annotation, allowing the logging, co-ordination and editorial review
of the work carried out by users. The eventual goal of this
arrangement is the full annotation (with linguistic, literary, text
critical and contextual information) of a large corpus of Hellenistic
Greek texts (O'Donnell 1999 and 2000). In addition, OpenText.org
draws upon the insights of corpus linguistics and functional
discourse analysis (Reed 1997; Porter and Reed 1999; Porter and
O'Donnell 2000) to provide a theoretical basis and systematic model
for the annotation and analysis of texts in a corpus.
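
As a minimal sketch of the kind of word-level markup such a
specification might define, the following Python fragment parses a
small annotated clause and retrieves its nouns. The element and
attribute names (clause, w, lemma, pos), the reference format and the
helper words_by_pos are hypothetical illustrations only, not the
OpenText.org schemas. The Greek is the opening of John 1:1 in a modern
accented form.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation of John 1:1a. The names clause, w, lemma and pos
# are assumptions made for illustration, not the OpenText.org schema.
SAMPLE = """\
<clause ref="John.1.1a">
  <w lemma="ἐν" pos="preposition">ἐν</w>
  <w lemma="ἀρχή" pos="noun">ἀρχῇ</w>
  <w lemma="εἰμί" pos="verb">ἦν</w>
  <w lemma="ὁ" pos="article">ὁ</w>
  <w lemma="λόγος" pos="noun">λόγος</w>
</clause>
"""


def words_by_pos(xml_text, pos):
    """Return (surface form, lemma) pairs for each word tagged with pos."""
    root = ET.fromstring(xml_text)
    return [
        (w.text, w.get("lemma"))
        for w in root.iter("w")
        if w.get("pos") == pos
    ]


# Collect every word annotated as a noun in the sample clause.
nouns = words_by_pos(SAMPLE, "noun")
print(nouns)
```

Because the annotation is held in attributes rather than in the text
content, a contributed document of this shape could be searched by
lemma or part of speech without altering the underlying Greek text.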

This paper will provide an overview of OpenText.org and an outline of
the principles behind the project. It will also describe and
demonstrate the progress of the project in harnessing the
collaborative potential of the virtual scholarly community during the
first nine months. In addition, it will analyse some of the key issues
faced in the project, such as the difficulties in overcoming the
individualistic practices of many humanities scholars, the fears
concerning the loss of intellectual property, the use of an XML
encoding scheme and the adoption of these schemes by non-technical
scholars, and issues of copyright of ancient texts and editions.
Aside from the sociological problems of building an on-line
collaborative community, two key problems have become clear. The
first concerns the legal and copyright issues surrounding both
printed and electronic editions. The OSS movement makes use of a
number of software licences (GPL, Apache, BSD) to protect the free
distribution and reuse of the software it produces. OpenText.org is
involved in producing new editions of Hellenistic texts, particularly
the New Testament according to Codex Sinaiticus. There is some
disagreement as to how 'open source' licences can be applied to
machine-readable texts. The second difficulty relates to the high
entry level set for participation in the project. It requires at
least three elements: (1) a reasonable facility in the Hellenistic
Greek language, (2) an acceptance and understanding of linguistics
(for the linguistic analysis and annotation of texts) and (3) comfort
with XML encoding. The first of these cannot easily be removed. The
development of encoding standards and specification documents
addresses the second. The use of XML editors and web-applications
with a clear user interface can partially address the third issue.

This paper supports the view that the OSS process, when properly
understood (including the different types of participants and the
roles they fulfil) and adapted, is highly applicable to
computer-based humanities projects.

References
----------

DeRose, S.J., D.G. Durand, E. Mylonas, and A.H. Renear, 'What is
Text, Really?', Journal of Computing in Higher Education 1.2 (1990):
3-26.

O'Donnell, M.B., 'The Use of Annotated Corpora for New Testament
Discourse Analysis: A Survey of Current Practice and Future
Prospects', in Porter and Reed (eds.) 1999: 71-116.

O'Donnell, M.B., 'Designing and Compiling a Register-Balanced Corpus
of Hellenistic Greek for the Purpose of Linguistic Description and
Investigation', in S.E. Porter (ed.), Diglossia and Other Topics in
New Testament Linguistics (JSNTSup, 193; Sheffield: Sheffield
Academic Press, 2000), pp. 255-97.

Porter, S.E. and M.B. O'Donnell, 'Semantics and Patterns of
Argumentation in the Book of Romans: Definitions, Proposals, Data and
Experiments', in S.E. Porter (ed.), Diglossia and Other Topics in New
Testament Linguistics (JSNTSup, 193; Sheffield: Sheffield Academic
Press, 2000), pp. 154-204.

Porter, S.E. and J.T. Reed (eds.), Discourse Analysis and Other
Topics in Biblical Greek (JSNTSup, 113; Sheffield: Sheffield Academic
Press, 1999).

Preece, J., Online Communities: Designing Usability, Supporting
Sociability (New York: John Wiley, 2000).

Raymond, E.S., The Cathedral and the Bazaar: Musings on Linux and Open
Source by an Accidental Revolutionary (Cambridge, MA: O'Reilly &
Associates, 1999).

Reed, J.T., A Discourse Analysis of Philippians: Method and Rhetoric
in the Debate over Literary Integrity (JSNTSup, 136; Sheffield:
Sheffield Academic Press, 1997).

Renear, A., E. Mylonas, and D. Durand, 'Refining our Notion of What
Text Really Is: The Problem of Overlapping Hierarchies', in S. Hockey
and N. Ide (eds.), Research in Humanities Computing 4: Selected
Papers from the ALLC/ACH Conference, Christ Church, Oxford, April
1992 (Oxford: Clarendon Press, 1996): 263-80.

Sperberg-McQueen, C.M., 'Text in the Electronic Age: Textual Study
and Text Encoding, with Examples from Medieval Texts', Literary and
Linguistic Computing, 6 (1991), pp. 34-46.

Udell, J., Practical Internet Groupware (Cambridge, MA: O'Reilly &
Associates, 1999).


Conference Info


ACH/ALLC / ACH/ICCH / ALLC/EADH - 2001

Hosted at New York University

New York, NY, United States

July 13, 2001 - July 16, 2001


Series: ACH/ICCH (21), ALLC/EADH (28), ACH/ALLC (13)

Organizers: ACH, ALLC