From lighthouse to framework: visualising digital scholarly editions with pathways and histories

paper, specified "long paper"
  1. 1. Nicholas John Hayward

    Loyola University, Chicago

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

From lighthouse to framework: visualising digital scholarly editions with pathways and histories
The Woolf Online project (1), now in its second phase of development at Loyola University Chicago’s Center for Textual Studies and Digitial Humanities, continues to address the following fundamental questions about the nature and development of literature:

­ how does a literary text come into being?
­ what kinds of influence are at work upon the writer during the process of initial composition, and thereafter?
The project has continued to investigate the various ways in which different recoverable histories of a particular text can be used to illuminate the process of its composition. By recording the history of a particular text, its cultural, political, and autobiographical contexts and their interaction, we can visually begin to answer some of the following important questions:

how is textual history related to other histories of a text?
­ what use does literary criticism make of textual and contextual histories?
With these goals in mind, we have developed and implemented digital tools and methods that allow the dynamic rendering of a literary text, which is then embedded within a contextual environment relevant to its known, original period of composition. For example, a user can view each day's writing for the Initial Holograph Draft within the context of Woolf's own diary entries and letters, newspapers of the day, and other historical data appropriate to the period. A user can compare different stages of the genetic edition of 'Time Passes', for example, and view earlier and later material that may have impacted upon or been influenced by the composition of 'Time Passes'. This material currently includes extracts from 'Sketch of the Past' and her 1905 journal. We have also delved in to the history of the Stephen Family, and the impact St Ives in Cornwall, and in particular Talland House, may have had upon the composition of this text.

As part of this onging development within the Center for Textual Studies and Digital Humanites, we have developed an extensible development and publication framework, called ‘Mojulem’ (2), for editing, publication, and visualisation of digital scholarly editions. Mojulem now allows us to build on the concept of ‘knowledge sites’, as suggested by Peter Shillingsburg (3) , supplementing a core publication framework with modules/plugins such as editors and image viewers. It is also able to host multiple projects within one installed framework, thereby enabling cross­project research and the option to aggregate specified data. Development of Mojulem, with the Woolf Online project as one of the ongoing working environments, has followed the need for four underlying core structures. These structures include CorPix, CorTex, CorCode, and CorForm, which are detailed as follows.

Manuscripts and printed texts materially unite the iconic and lexical, the autographic and allographic (4), whereas all digital representations separate these constituent elements into images and transcriptions. With Mojulem projects, the common default display reunites the image and transcription by mapping the one to the other at the pixel level. CorPix software currently includes eHinman (5) , Transparent (6) , TransparentOCR, Magnify, and Zoom. For example, pixel­level positioning and coordinate fixing is an inherent feature of both Transparent and TransparentOCR, within both editor and visualisation tools. eHinman is a digital adaptation of the original Hinman collator, and enables fade from the image of a page from one copy to another, thereby enabling a visual collation of multiple copies. Transparent is used within both the visualisation and editing stages of a project’s development, enabling an editor and user alike to view the image as the primary entry consideration for the project.

The CorTex is the stable resource containing the merged or compacted plain text transcriptions of the variant expressions of a work. It stores all information about text and variations, ready to be extracted for display of variation amongst versions; it is not necessary to recompute them. The CorTex is the entity to which all standoff properties (markup, annotations, links, etc.) points and on whose stability the system depends. It is as the source of each version’s text and variation from other texts. The stability and endurance of the CorTex is protected by multiplying duplicate copies locked with a digital signature, which verifies for each user that a CorTex copy is viable.

CorCode is the add­on value of analysis, argument, and explanation. Mojulem stores markup separately, as standoff properties, applying it as the user invokes it for the rendering of a specific item’s image or text within a given visualisation, such as a transparent view of a page of the Initial Holograph Draft of ‘To the Lighthouse’ (7) . To do this, Mojulem includes an editor which saves text and encoding separately, and filters for converting legacy, code­embedded transcriptions, including TEI encoded documents, into separate forms with markup analysed into properties, and filters for reversing this process.

A CorForm is a CSS stylesheet, containing special formatting rules, used to transform the overlapping properties of the CorCode into HTML. Each CorCode has a default CorForm, but other Corforms can be used in combination or as alternatives. Since a CorTex may have many CorCodes, and each CorCode many CorForms, structuring or formatting of the text can be attained by specifying some combination of already available resources, or by supplying new ones.

The CorPix, CorTex, CorCode, and CorForm are aggregated for a project within the Mojulem framework. Each such item is identified by a unique key, which is used as an index into the repository or database.

The development and combination of these four cores, within the modular and adaptable framework Mojulem, has allowed the second phase of the Woolf Online project to begin to approach the fundamental questions about the nature and development of literature, as outlined above. As such, this paper and presentation will outline and demonstrate visual methods and considerations for implementing such perceived networked histories. It will also discuss the different aspects of Mojulem, as outlined above, and ground such a discussion within the example of the ongoing Woolf Online project.

1 Project site is available at the following URL: 2 Our current test framework for the Woolf Online project, as of 6th March 2014, can be viewed at the following URL:

3 Shillingsburg, P. L. (2006). From Gutenberg to Google: Electronic Representations of Literary Texts: Cambridge University Press. 4 Nelson Goodman noted this distinction to separate art forms with a unique authentic form, for example painting and sculpture, from art forms which are performative with multiple copies, including writing and music.

5 An earlier stand­alone example can be viewed at the following URL:­dep/v6/

6 A test example page of the Initial Holograph Draft can be viewed at the following URLs: Image = =353&pos=21 and transcription = ntent=5136&pos=21

7 For example, Image = nt=385&pos=49 and transcription = ntent=5164&pos=49

8 An earlier TEI derived stand­alone example of this concept, using material from the Malory project, can be viewed at the following URL:

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info


ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

XML available from (needs to replace plaintext)

Conference website:

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO