Designing the next big thing: Randomness versus serendipity in DH tools

paper, specified "long paper"
Authorship
  1. Kim Martin

    Western University (University of Western Ontario)

  2. Anabel Quan-Haase

    Western University (University of Western Ontario)

Work text

A number of recent initiatives within the DH community promote the design, development, and implementation of digital tools aimed at speeding up, clarifying, or otherwise improving the research practices of humanities scholars. In 2013, the One Week | One Tool (OWOT) summer institute, funded by the National Endowment for the Humanities, resulted in the creation of Serendip-o-matic, a serendipity engine for digital research. This tool relies on users to feed it a selection of text or citations, from which it creates a list of keywords that it then uses to find related information. The documents returned are taken from the Digital Public Library of America (DPLA), Europeana, and Flickr [1]. The participants of the 2013 OWOT initiative are not alone in their quest to design a digital tool geared toward enhancing the chance encounter with information, resources, ideas, research materials, and even people. Tim Sherratt, the manager of Trove at the National Library of Australia, often includes an element of chance in the tools he designs for use in the humanities. For instance, in his tool Trove News Bot, Sherratt (2013) allows users to interact with a Twitter stream by sending tweets with directions (such as #luckydip), which will return random results from the National Archives of Australia’s digital collection [2]. Similar tools have been developed that introduce serendipity into the collections of the DPLA and the British Library.
One motivation for the development of digital tools aimed at enhancing serendipity in digital environments comes out of the need to redesign and recreate the complexity of the research environment found in library stacks and archival collections. It is often argued that this complexity may be lost in digital environments, which are highly predictable and primarily based on keyword search. To what extent serendipity is reduced in digital search is debatable. Nonetheless, this perception of loss directly affects how scholars, and in particular humanities scholars, adopt and use digital tools. A study of historians’ research practices suggests that these scholars are skeptical of conducting their research exclusively in digital environments because, in such environments, they lack the ability to encounter key resources (primary and secondary materials) that could have a major impact on their research findings [3]. In this study, the authors also found that historians were willing to experiment with digital tools if these could recreate opportunities for encountering information. Hence, scholars perceive the discovery of resources, browsing, and chance encountering as central elements of their research practice that can, and need to, be supported online.
Outside of academia, a number of tools have emerged that try to introduce serendipity into the online experience. What is less clear from the literature is how best to support this process, as a wide range of approaches have been suggested, ranging from interactions in social media [4] to exploration in non-search-related digital environments and information search in digital environments [5]. The approach most commonly taken is to introduce serendipity into the online information search experience; this is often done by introducing some element of randomness and thereby reducing the predictability of search results. An example of this approach is BananaSlug, which returns random results to a search query. Other approaches include reversing or modifying the order in which search results are presented online [6]. This would draw attention to a different set of items, because users commonly tend to investigate only the first and perhaps second pages of search results. All of these approaches aim at broadening “the search space, promoting encounters with items that might not, under existing algorithms, come to the attention of the user”. While the majority of digital tools aimed at promoting serendipity have emerged outside of the humanities, a series of tools have recently been developed with humanists in mind. These tools have garnered considerable attention in the field, but it remains unclear which elements of serendipity they support. Part of the problem is that the concept of serendipity is elusive [7] and difficult to pinpoint. Reducing it to the introduction of randomness, however, does not seem to be the best way to move forward, even though it is the approach most commonly utilized. A second, and perhaps more concerning, problem is that scholars need to first understand that serendipity is not a one-dimensional concept but, rather, includes a number of related facets, which need to inform tool design and implementation.
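The re-ranking approaches mentioned above can be illustrated with a minimal Python sketch. It is not taken from BananaSlug or from any of the tools discussed here; the function name `rerank` and its three modes are hypothetical, chosen only to show how reversing, shuffling, or interleaving a ranked list changes which items reach the first page of results.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def rerank(results: Sequence[T], mode: str = "shuffle",
           seed: Optional[int] = None) -> List[T]:
    """Return a new ordering of an already ranked result list.

    Hypothetical modes, for illustration only:
      "reverse"  puts the lowest-ranked items first
      "shuffle"  orders the items randomly (reproducible with a seed)
      "promote"  alternates top-ranked and bottom-ranked items, so that
                 deeply buried results appear near the start of the list
    """
    items = list(results)  # copy, so the caller's list is not modified
    if mode == "reverse":
        return items[::-1]
    if mode == "shuffle":
        rng = random.Random(seed)
        rng.shuffle(items)
        return items
    if mode == "promote":
        reordered: List[T] = []
        top, bottom = 0, len(items) - 1
        while top <= bottom:
            reordered.append(items[top])
            if top != bottom:  # avoid adding the middle item twice
                reordered.append(items[bottom])
            top += 1
            bottom -= 1
        return reordered
    raise ValueError(f"unknown mode: {mode!r}")
```

For a ranked list `[1, 2, 3, 4, 5]`, the "promote" mode returns `[1, 5, 2, 4, 3]`. As the paper argues, such reordering only widens the set of items a user sees; it does not by itself guarantee a serendipitous encounter.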
The present paper critically examines four DH tools that encourage serendipitous results and attempts to place these within current models of serendipity:
Serendip-o-matic (http://serendip-o-matic.com/)
Trove News Bot (https://github.com/wragge/trovenewsbot)
Mechanical Curator (http://mechanicalcurator.tumblr.com/)
DP.LA Bot (https://github.com/wragge/trovenewsbot).
As a basis for this examination, we have established the main facets of serendipity obtained from the extensive literature in Library and Information Science (LIS). Through this comparative study, we aim to accomplish two goals. First, there is a gap in understanding exactly what aspects of serendipity digital tools support. By merging the literature in LIS with tool design in DH, we hope to create greater clarity as to what aspects have been supported. Second, the results of the study will determine what future developments are needed to better support the work of humanists in digital environments.
Interviews with 20 history scholars inform the first phase of this study. These scholars indicated a desire for serendipitous encounters with material to remain a part of their research process after the integration of digital texts into their work. After discovering that historians were seeking new methods of information acquisition online, further interviews were conducted with DH scholars to see what methods they were using to browse information. The results of these two sets of qualitative data will be discussed and used to demonstrate a need for a serendipity tool within the DH community.
The second phase of this research is an in-depth exploration of the four information-discovery tools listed above. These tools will be examined in terms of Erdelez’s (2004) model of information encountering, outlined below [8]. After analyzing each tool carefully, follow-up interviews will be conducted with the creators of each tool to discuss their intentions for, and reflections upon, the use of the tool by humanist scholars.
A wide range of models of serendipity have been developed, relying on very different data sets and assumptions. Erdelez (2004) developed one of the first models and emphasized the experience of information encountering (IE), which she defined as a type of opportunistic acquisition of information. Erdelez (2004) utilized an experimental setting, where participants were asked to look for information related to a foreground problem and the researcher observed how they would react to information related to a background problem. As part of her model, Erdelez (2000) identified five elements [9]:
noticing: the perception of encountered information;
stopping: the interruption of the initial information seeking activity;
examining: the assessment of usefulness of the encountered information;
capturing: the extraction and saving of the encountered information for future use;
and returning: the reconnection with the initial information seeking task.
In Erdelez’s (2004) model, a person is primarily focusing on the information needs related to a foreground problem. However, cues related to another problem, a background problem, may catch the person’s attention. If the person notices the cues and stops to examine the newly encountered information, then there is an opportunity for discovering unexpected resources. It is this process of noticing, examining, and capturing that digital tools try to emulate or support.
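One way to make the five elements operational when assessing a tool is to treat them as an ordered checklist. The following Python sketch is an illustration only: the names `Element` and `missing_elements`, and the example tool, are hypothetical and do not come from Erdelez's work or from any of the tools examined in this paper.

```python
from enum import Enum
from typing import List, Set


class Element(Enum):
    """The five elements of information encountering, in model order."""
    NOTICING = "noticing"
    STOPPING = "stopping"
    EXAMINING = "examining"
    CAPTURING = "capturing"
    RETURNING = "returning"


def missing_elements(supported: Set[Element]) -> List[Element]:
    """List the elements a tool does not support, in model order."""
    # Iterating over an Enum class yields members in definition order.
    return [element for element in Element if element not in supported]


# Hypothetical tool that only surfaces items and lets users inspect them.
example_tool = {Element.NOTICING, Element.EXAMINING}
gaps = missing_elements(example_tool)
print([element.value for element in gaps])
```

Run as written, the script prints `['stopping', 'capturing', 'returning']`, the elements that this hypothetical tool leaves unsupported.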
Each of the four tools reflects one or more aspects of the serendipitous process as outlined by Erdelez (see Table 1).
Noticing Stopping Examining Capturing Returning
Serendip-o-matic ✓ ✓
Trove News Bot ✓ ✓ ✓ ✓
Mechanical Curator ✓ ✓ ✓ some
DP.LA Bot ✓ ✓ some
The tools listed above, with the exception of Serendip-o-matic, select materials randomly and then present these to followers on Twitter. Randomness, as we know, does not necessarily mean that serendipity will occur. These tools all provide links to places that users can go to receive extraneous materials, in the hope that something of interest will come their way.
Interestingly, the capturing element seems to be largely disregarded in these tools. Although the DH community is acutely aware of the need to capture digital documents and their associated metadata instantly with citation tools (e.g., Zotero), none of the examined tools includes this element in its framework. This leads the authors to conclude that future design could focus on capturing information and could introduce a method for saving documents, so that users can retrace their footsteps after returning to the initial task or foreground problem. Our critical analysis of various DH tools and how they support serendipity provides an opportunity to further enhance these tools, as well as a means to design additional tools that can impact the research practices of humanities scholars.
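As an illustration of what such a capturing feature might record, the following Python sketch stores each encountered item together with the foreground task the user was pursuing, so that the item can be found again after returning to that task. It is a sketch under stated assumptions rather than a description of any existing tool: the class names `CapturedItem` and `CaptureLog`, their fields, and the example values are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class CapturedItem:
    """One encountered resource plus the context in which it was found."""
    title: str
    url: str
    source: str           # collection the item came from (assumed field)
    foreground_task: str  # what the user was working on at the time
    note: str = ""
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class CaptureLog:
    """An in-memory log of captured items (hypothetical design)."""
    items: List[CapturedItem] = field(default_factory=list)

    def capture(self, title: str, url: str, source: str,
                foreground_task: str, note: str = "") -> CapturedItem:
        """Save an encountered item so it can be revisited later."""
        item = CapturedItem(title, url, source, foreground_task, note)
        self.items.append(item)
        return item

    def for_task(self, foreground_task: str) -> List[CapturedItem]:
        """Retrace what was captured while working on a given task."""
        return [i for i in self.items if i.foreground_task == foreground_task]


# Example values are placeholders, not real records.
log = CaptureLog()
log.capture("Untitled engraving", "https://example.org/item/1",
            source="Example Collection",
            foreground_task="chapter 2 literature search",
            note="possible illustration for the introduction")
```

Recording the foreground task alongside each item is the design choice that connects capturing to returning: calling `log.for_task("chapter 2 literature search")` retrieves everything set aside during that task.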
References

1. CHNM. (2013). Serendip-o-matic: Let your sources surprise you. One Week | One Tool. Retrieved October 31, 2013, from serendip-o-matic.com/about
2. Sherratt, T. (2013). Conversations with Collections. discontents. Retrieved October 31, 2013, from discontents.com.au
3. Martin, K., & Quan-Haase, A. (2013). Are e-books substituting print books? Tradition, serendipity, and opportunity in the adoption and use of e-books for historical research and teaching. Journal of the American Society for Information Science and Technology, 64(5), 1016–1028.
4. Bogers, T., & Björneborn, L. (2013). Micro-serendipity: Meaningful coincidences in everyday life shared on Twitter. In Proceedings of iConference (pp. 196–208).
5. Quan-Haase, A., Burkell, J., & Rubin, V. L. (n.d.). The role of serendipity in digital environments. In Encyclopedia of Information Science and Technology. IGI Global.
6. Jansen, B. J., Spink, A., & Saracevic, T. (2000). Real life, real users, and real needs: A study and analysis of user queries on the web. Information Processing & Management, 36(2), 207–227.
7. Merton, R. K. (2004). The travels and adventures of serendipity: a study in sociological semantics and the sociology of science. Princeton, N.J: Princeton University Press.
8. Erdelez, S. (2004). Investigation of information encountering in the controlled research environment. Information Processing & Management, 40(6), 1013–1025. doi:10.1016/j.ipm.2004.02.002
9. Erdelez, S. (2000). Towards understanding information encountering on the Web. In Proceedings of the 63rd annual meeting of the American Society for Information Science (pp. 363–371). Medford, N.J.: Information Today.


Conference Info


ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

XML available from https://github.com/elliewix/DHAnalysis (needs to replace plaintext)

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO