Probing Digital Scholarly Curation through the Dynamic Table of Contexts

poster / demo / art installation
  1. Susan Brown

    University of Guelph

  2. Nadine Adelaar

    University of Alberta

  3. Teresa Dobson

    University of British Columbia

  4. Ruth Knechtel

    University of Alberta

  5. Andrew MacDonald

    McMaster University

  6. Brent Nelson

    University of Saskatchewan

  7. Ernesto Peña

    University of British Columbia

  8. Milena Radzikowska

    Mount Royal University (Mount Royal College)

  9. Geoff Roeder

    University of British Columbia

  10. Stan Ruecker

    IIT Institute of Design - Illinois Institute of Technology

  11. Stéfan Sinclair

    McGill University

  12. Jennifer Windsor

    University of Alberta

Work text
In this paper, we theorize about the role of the curator (or perhaps it would be more accurate to say “custodian” or even “collection designer”) in preparing electronic texts for use in an online digital environment. This scholarly role is becoming increasingly important to new forms of knowledge dissemination as a result of the growth in aggregating or mashing up existing digital content. The goal of these aggregations is to add value for particular purposes, as seen in initiatives such as the Journal of Digital Humanities, which aims to collect already published materials into quarterly thematic collections, and the related website Digital Humanities Now, which curates the feeds of other digital humanities websites on a weekly basis (Digital Humanities Now).

By analogy with the definition of a curator as “The officer in charge of a museum, gallery of art, library, or the like; a keeper, custodian” (“curator,” def. n. 6), the digital scholarly curator performs a similar role with respect to the circulation of digital content, though with a number of significant differences. Gallery or museum curators, for instance, have their own scholarly and professional training and preparation, often with respect to the proper handling and display of fine art and other valuable physical objects. Some feel that at least some curators should be considered artists themselves (e.g. Ventzislavov). There is also widespread acknowledgment that curators who work with digital materials require a different set of competencies. Melody Madrid, for example, reports a study that resulted in a list of 20 such competencies, divided into operational and managerial categories. Digital scholarly curation also harnesses social media technologies (e.g. crowdsourcing, folksonomies) to encourage a new level of user participation in the management and preservation of digital content (Poole). Not quite an editor, but acting as a mediator between the producers of this content and its audience through such activities as selecting and reframing it, the digital scholarly curator is in a distinctive position. Our discussion considers this role in relation to the Dynamic Table of Contexts (DToC) interface, a generalized tool for the dissemination of digital text that enables the designer of the collection to mediate specifically between the XML encoding of the text and the affordances that such encoding provides in the reading interface (Brown et al.; Dobson et al.).

The Dynamic Table of Contexts (Fig. 1) is a joint initiative between the Interface Design research team of the Implementing New Knowledge Environments (INKE) project, the Canadian Writing Research Collaboratory (CWRC), the University of Alberta Press, and the Voyant Tools project. The DToC currently leverages four principal components of a digital text: the actual text of the document, a table of contents, an index, and XML encoding of the document. The goal of the prototype is to provide an online reading environment where the table of contents provides a conventional overview of a book while at the same time incorporating the index terms and XML tags and the text to which they point (Ruecker et al.). The terms and tags can be selected and deselected, providing interactivity among these three means of accessing and navigating the content of the text or collection.

Fig. 1: Dynamic Table of Contexts Interface

The ability to leverage XML markup as part of the navigational interface is a distinguishing feature of the DToC. Customization of that affordance is enabled by what we call the DToC’s “curator mode,” which is distinct from the reading mode in that it allows technically adept superusers to create customized tag lists to serve as navigational aids alongside the index terms, as well as to determine the organization of the table of contents (login is not required; the curated view is expressed through a unique URL). In a print edition, and in particular for anthologies, the editor must decide which of the various alternatives will be used to organize a particular table of contents. For example, a collection of essays might be organized

alphabetically by title
alphabetically by author’s last name
by theme
or by some other principle, in order to ensure a certain kind of development or coherence from beginning to end.
A collection of poems might additionally be organized alphabetically by first line, for cases where the poems do not have titles. Other arrangements are also possible, for instance by geographic location, language, or genre. In the case of an instructor preparing a course pack, the arrangement would naturally correspond to the sequence in which the materials will be used in the class. Throughout the history of print, however, the possibilities for representing the same contents in multiple ways have been limited.

In addition to the selection and organization of the contents themselves, curators of DToC collections need to make choices with respect to how the encoding works in the interface. Given the generic and multi-purpose nature of XML encoding, particularly its use to structure a text, curators need to select which tags the DToC interface will display to the reader, and what user-friendly labels to use for those tags. For many purposes (although not all), the structural encoding can be set aside in favor of semantic encoding (when present). For choices among these tags, there is probably some golden mean that is most appropriate for generic use; the primary benefit of the Dynamic Table of Contexts is to allow variants for more specific uses. Tags that have rarely been used may already be covered by the index, or may be too insignificant to take up room on the list. On the other hand, where a tag has been used very heavily, including it may not be useful, since it would result in too great a density of hits.
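The trade-off between rarely used and heavily used tags can be illustrated with a short sketch. The Python below is not part of the DToC; the sample document, the set of structural tags, the function names, and the thresholds are all assumptions chosen for illustration. It counts how often each element occurs and keeps only the non-structural tags whose frequency falls within a chosen range.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names and content are placeholders.
SAMPLE = """<text>
  <div><head>Chapter One</head>
    <p><persName>Writer A</persName> wrote in <placeName>City X</placeName>.</p>
    <p><persName>Writer B</persName> replied to <persName>Writer C</persName>.</p>
  </div>
  <div><head>Chapter Two</head>
    <p>A later <title>Essay</title> by <persName>Writer D</persName>,
       printed in <placeName>City Y</placeName>.</p>
  </div>
</text>"""

# Assumed list of structural tags that a curator might set aside.
STRUCTURAL_TAGS = {"text", "div", "head", "p"}


def tag_counts(xml_string):
    """Count occurrences of every element name in the document."""
    root = ET.fromstring(xml_string)
    return Counter(el.tag for el in root.iter())


def candidate_tags(counts, min_hits, max_hits, skip=STRUCTURAL_TAGS):
    """Keep non-structural tags that are neither too rare nor too dense."""
    return sorted(
        tag for tag, n in counts.items()
        if tag not in skip and min_hits <= n <= max_hits
    )


counts = tag_counts(SAMPLE)
selected = candidate_tags(counts, min_hits=2, max_hits=3)
print(counts["persName"], counts["placeName"], counts["title"])  # 4 2 1
print(selected)  # ['placeName']
```

With these illustrative thresholds, the single title is treated as too rare and the four personal names as too dense. A curator would of course adjust such limits, or override them entirely, according to the anticipated readership.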

The kind of tag curation required also differs considerably depending on whether a standard schema or tagset has been applied throughout a collection, as in the Brown Women Writers Project’s use of the Text Encoding Initiative (TEI), or whether a highly customized version of the TEI has been adapted to the needs of a particular text. Finally, the curator can also label how tags will appear in the interface, so that readers are not asked to decipher cryptic forms such as <biblStruct>, but are instead presented with labels that are meaningful in the context for which the curated text or collection is being prepared. This functionality is particularly useful in cases where there is a nuanced difference between two similar tags that needs to be conveyed to the readers (Fig. 2).

Fig. 2: Dynamic Table of Contexts Curator Interface
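The relabeling of tags can be thought of as a simple mapping from element names to reader-facing labels. The following Python sketch is an assumption about how such a mapping might look, not a description of how the DToC stores it: the label wording and the function names are hypothetical. The TEI elements bibl and biblStruct serve as an example of two similar tags whose difference a label can make explicit.

```python
# Hypothetical curator-supplied labels; the wording is illustrative only.
TAG_LABELS = {
    "biblStruct": "Structured citation",
    "bibl": "Loose citation",
    "persName": "Person",
}


def display_label(tag, labels=TAG_LABELS):
    """Return the curated label, falling back to the raw tag name."""
    return labels.get(tag, tag)


def build_tag_list(tags, labels=TAG_LABELS):
    """Pair each selected tag with the label shown to the reader."""
    return [(tag, display_label(tag, labels)) for tag in tags]


tag_list = build_tag_list(["bibl", "biblStruct", "placeName"])
for tag, label in tag_list:
    print(f"<{tag}> is shown as: {label}")
```

Falling back to the raw tag name is one possible design choice; a curator might equally prefer that unlabeled tags be hidden from the reader altogether.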

In terms of training and qualifications, this means that the DToC curator needs to hypothesize how the target readers will be dealing with the material, and to use the curatorial functions to customize the view of the text in the DToC interface to meet the anticipated use case(s). For class instructors, the job will be simplified to a certain extent, since the course pack has been chosen purposefully for the class. In other situations, there may be multiple and possibly conflicting anticipated use cases. The curator also needs to be comfortable enough with XML not only to choose appropriate tags and rename them, but also, if needed, to specify XPath queries to locations in the document that cannot be identified through tag names alone. There will be considerable variation in XML expertise amongst curators, and a previous user study of ours on an earlier version of DToC indicated that a closer relationship to the encoding correlated with more positive experiences of the DToC interface (Dobson et al.).
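To illustrate why XPath may be needed, consider a document in which people and places share the same element name and are distinguished only by an attribute. The sketch below uses Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath; the sample markup, attribute values, and helper function are assumptions for illustration and do not reflect how the DToC itself evaluates queries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one generic <name> element, distinguished by @type.
SAMPLE = """<text>
  <div type="preface"><p>Thanks to <name type="person">Reader A</name>.</p></div>
  <div type="chapter">
    <p><name type="person">Writer B</name> visited <name type="place">City X</name>.</p>
    <p><name type="person">Writer C</name> stayed home.</p>
  </div>
</text>"""


def select_text(root, xpath):
    """Return the text content of every element matched by the query."""
    return ["".join(el.itertext()).strip() for el in root.findall(xpath)]


root = ET.fromstring(SAMPLE)

# The tag name alone cannot separate people from places.
all_names = select_text(root, ".//name")

# An attribute predicate narrows the selection to people.
people = select_text(root, ".//name[@type='person']")

# A longer path restricts the selection to people named within chapters.
chapter_people = select_text(
    root, ".//div[@type='chapter']/p/name[@type='person']"
)

print(all_names)       # ['Reader A', 'Writer B', 'City X', 'Writer C']
print(people)          # ['Reader A', 'Writer B', 'Writer C']
print(chapter_people)  # ['Writer B', 'Writer C']
```

The three queries show increasing specificity: the first matches on the tag name, the second adds an attribute condition, and the third also constrains where in the document structure the element occurs.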

The paper will frame our understanding of digital scholarly curation in relation to more traditional, historical understandings of curation and representation of contents and will demonstrate the curator mode in the DToC interface. Our previous user studies of the interface found both considerable confusion on the part of users with respect to the role of the encoding or tagging in the DToC (Brown et al.), and, among those who understood it, considerable emphasis on the importance of the XML markup and its ability to shape the reader experience in the interface. As one user said in mousing over the XML Markup pane: “Well, this seems to me the most relevant section, so whoever puts that together is pretty much the wizard in this Oz” (Dobson et al.). Our discussion will incorporate results of the next user study we are conducting on the DToC, which focuses on the curator mode and will probe these findings, since they go to the heart of the DToC’s affordances. Our aim will be to gain a fuller understanding of how users with a range of technical knowledge understand the role of markup in the DToC interface, and of how they understand the curatorial role as both readers and curators. This study will inform our understanding not only of the ways in which reading environments such as the DToC can effectively leverage XML encoding, but more generally of the ways in which the idea of curation is rapidly evolving within the online scholarly environment.

