Accessing, navigating, and engaging with high-resolution document image collections using Diva.js

paper, specified "short paper"
Authorship
  1. Andrew Hankinson

    McGill University

  2. Laurent Pugin

    Université de Fribourg

  3. Ichiro Fujinaga

    McGill University

Work text
High-resolution page images are providing digital humanities researchers with unprecedented visual access to historically significant works located around the world. As libraries and archives continue digitizing their historical document collections, they are increasing the quality and resolution of their document imaging systems and producing images that, while unprecedented in their clarity and detail, are inconvenient to navigate and manipulate in traditional browser-based environments. Users find themselves waiting for large PDF files to download, or clicking endlessly through thumbnail after thumbnail to find a page image that contains materials of interest to them. These methods of document navigation and viewing have been in place since the infancy of the web browser, and are needlessly awkward given advances in creating asynchronous web applications.
The most common interface paradigm for browsing images online is the ‘image gallery’. To illustrate this type of interface we use the Early English Books Online (EEBO) interface, with which some readers may be familiar. In the EEBO image viewing interface, users navigate a document as if it were a series of independent images, viewing small thumbnails that, while efficient to download, make it impossible to see the actual content of the page (Figure 1). To examine any single page, the user must click on an image, bringing up a second view of the page that is optimized for viewing in the browser but may not be usable for close examination if the text on the page is too small. Should a user wish to examine any part of a page in particular detail, there may be an option to download a larger, high-resolution image, but the user must wait for this large image to download to their browser, which may take several minutes depending on the speed of the network connection and the size of the image. If the user waits for the full-quality image to download but then wishes to continue browsing the document on the next page, they must return to the smaller thumbnails and start the process again.

Fig. 1: Viewing page thumbnails in the EEBO collection.
An alternative to the image gallery mode of interaction is the use of a browser-based book reader component. Several of these systems are available for managing user interactions with page images. Perhaps the best-known purpose-built web-based document viewer is BookReader by the Internet Archive 1. Developed as part of the Open Library project, this software presents the user with a book metaphor, inviting them to 'turn' the pages of a book. While this provides a useful alternative to the image gallery mode of viewing, the IA BookReader requires that each page be represented by a complete image file, so zooming in and viewing a page in detail requires the user to wait while the entire image is downloaded, which can be slow and cumbersome depending on the size and resolution of each image. This is also true for PDF-based document image display, where a user is forced to accept a trade-off between viewing low-resolution versions of page images and waiting for extremely large PDF files to download before they can view any of the pages.
To optimize the viewing of large, high-resolution documents in a web browser we developed the Diva document image viewer. Diva features several methods for managing user interactions with document page images. Users interact with the full document by scrolling through it, as they might with a PDF file. However, in Diva all page images are composed of smaller tiles. These tiles are of a fixed size (256×256 pixels), and pages (and their tiles) that are outside of the user's viewport are not downloaded to the browser. This creates an ‘instant-on’ effect when viewing a document, since the user does not have to wait for the entire document to download, but just a small portion of it. As a user scrolls, new tiles are downloaded on demand. To view higher or lower resolution page images, users can 'zoom' between resolutions. Zooming in or out on an image downloads just the portion of the page that fits on the user's screen (Figures 2 and 3).
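The tile selection described above can be illustrated with a short sketch. This is not Diva's own code: the function names, the coordinate convention (pixel offsets measured from the top-left corner of the page), and the example page size are assumptions made for illustration. Only the 256-pixel tile size comes from the text.

```python
import math

TILE_SIZE = 256  # fixed tile edge in pixels, as stated in the text


def visible_tiles(page_width, page_height, view_left, view_top,
                  view_width, view_height, tile_size=TILE_SIZE):
    """Return the (row, col) indices of the tiles that overlap the viewport.

    All arguments are pixel values in page coordinates (an assumed convention).
    """
    # Clip the viewport to the page so off-page areas request nothing.
    left = max(0, view_left)
    top = max(0, view_top)
    right = min(page_width, view_left + view_width)
    bottom = min(page_height, view_top + view_height)
    if right <= left or bottom <= top:
        return []  # the viewport does not touch this page

    first_col, last_col = left // tile_size, (right - 1) // tile_size
    first_row, last_row = top // tile_size, (bottom - 1) // tile_size
    return [(row, col)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


def total_tiles(page_width, page_height, tile_size=TILE_SIZE):
    """Number of tiles needed to cover the whole page at this resolution."""
    return math.ceil(page_width / tile_size) * math.ceil(page_height / tile_size)


if __name__ == "__main__":
    needed = visible_tiles(2000, 3000, 0, 0, 800, 600)
    print(len(needed), "of", total_tiles(2000, 3000), "tiles requested")
```

For a hypothetical 2000×3000 pixel page and an 800×600 pixel viewport at the top-left corner, only a small fraction of the page's tiles is requested, which is the source of the ‘instant-on’ effect.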

Fig. 2: A zoomed-out view of a manuscript page.

Fig. 3: Zoomed-in detail of the lower-left corner of the page shown in Figure 2.
With the ubiquity of mobile devices, it is important to ensure that Diva functions on low-memory systems such as the iPad or iPhone. Diva.js uses several methods that are unique among document image viewers for optimizing memory usage and display. While a document may be several hundred pages long, Diva keeps just three pages in memory at any given point in time, dynamically adding and removing page elements from the browser as the user scrolls. This creates a fast and efficient browsing system for both mobile and desktop devices.
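A minimal sketch of such a three-page window follows. The text does not say which three pages are kept, so the choice here (the current page plus its immediate neighbours, shifted inward at the start and end of the document) is an assumption, as are the function names.

```python
def page_window(current, total_pages, window=3):
    """Indices of the pages to keep in memory around the current page.

    Assumption: the window is centred on the current page and is shifted
    inward at either end of the document so it never leaves the page range.
    """
    if total_pages <= 0:
        return []
    half = window // 2
    start = max(0, min(current - half, total_pages - window))
    end = min(total_pages, start + window)
    return list(range(start, end))


def update_loaded(loaded, current, total_pages, window=3):
    """Compare the loaded pages with the wanted window.

    Returns (wanted, to_add, to_remove), where to_add and to_remove are the
    page elements to create and to discard after a scroll.
    """
    wanted = set(page_window(current, total_pages, window))
    to_add = sorted(wanted - loaded)
    to_remove = sorted(loaded - wanted)
    return wanted, to_add, to_remove


if __name__ == "__main__":
    loaded = set(page_window(10, 300))           # pages 9, 10, 11
    loaded, added, removed = update_loaded(loaded, 11, 300)
    print("add", added, "remove", removed)
```

Scrolling forward by one page adds one page element and discards one, so memory use stays constant regardless of document length.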
Furthermore, we have built a number of image manipulation tools into Diva.js that allow users to engage with a document. Many documents, especially older manuscripts, feature faded inks or text that is written perpendicular to the captured page orientation (e.g., marginalia). Using Diva, users can manipulate brightness, contrast, and page rotation in their browser via a unique set of HTML5-based image manipulation tools, allowing them to enhance faded inks or rotate a page to read margin notes or tables (Figure 4). Other viewers that offer this functionality manipulate the image on the server and then send it back to the client, which is a high-latency operation. With browser-based image manipulation, users can see the results of their changes immediately.
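To make the kind of per-pixel operation involved concrete, the sketch below applies a common linear brightness and contrast adjustment and a 90° rotation to a small grid of 8-bit values. It is an illustration only: the formula is a widely used one and is not taken from Diva.js, and the function names and the list-of-rows image representation are assumptions.

```python
def clamp(value):
    """Round to the nearest integer and limit to the 8-bit range 0..255."""
    return max(0, min(255, int(round(value))))


def adjust_pixel(value, brightness=0, contrast=1.0):
    """Scale the distance from mid-grey (128) by `contrast`, then add `brightness`."""
    return clamp((value - 128) * contrast + 128 + brightness)


def adjust_image(pixels, brightness=0, contrast=1.0):
    """Apply the adjustment to every value in a list-of-rows image."""
    return [[adjust_pixel(v, brightness, contrast) for v in row] for row in pixels]


def rotate_90_clockwise(pixels):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*pixels[::-1])]


if __name__ == "__main__":
    faded = [[100, 110], [120, 130]]
    print(adjust_image(faded, contrast=2.0))   # values move away from mid-grey
    print(rotate_90_clockwise(faded))
```

Because each output value depends only on the corresponding input value, the work can be done on the client without a round trip to the server.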

Fig. 4: Manuscript page (in Arabic) rotated 90° to view perpendicular text on flyleaf. The controls on the left allow the user to manipulate brightness, contrast, rotation, zoom, and individual RGB colour channels.
We have used Diva.js as the presentation layer of a document image search system. When a user searches for a given word or phrase, the results are presented in situ on page images, highlighting the exact location on each page where their result occurs.
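Highlighting a result in place requires mapping the location of a match onto the page image at whatever zoom level the user is viewing. The sketch below assumes that match locations are stored as (x, y, width, height) boxes in full-resolution pixel coordinates and that each zoom level halves the resolution of the one above it. Both are assumptions for illustration and are not details given in this paper.

```python
def scale_box(box, max_zoom, zoom):
    """Scale an (x, y, width, height) box from full resolution to `zoom`.

    Assumes zoom level `max_zoom` is full resolution and that each lower
    level halves the image dimensions.
    """
    if not 0 <= zoom <= max_zoom:
        raise ValueError("zoom must lie between 0 and max_zoom")
    factor = 2 ** (max_zoom - zoom)
    x, y, width, height = box
    return (x / factor, y / factor, width / factor, height / factor)


if __name__ == "__main__":
    # Hypothetical match location on a full-resolution page image.
    match = (1024, 2048, 400, 80)
    print(scale_box(match, max_zoom=5, zoom=3))
```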
All components of Diva are free and open-source software, available on GitHub 2. Diva may be integrated into existing digital library systems. On the server side, Diva requires the IIP Image Server 3, a gigapixel image server that serves the page image tiles, and a standard web server, such as Apache or Nginx. Document images can be encoded as either multi-resolution JPEG 2000 or pyramid TIFF files. The JavaScript components of Diva.js will work in any modern web browser. These components manage the asynchronous communication between the user's browser, the web server, and the IIP Image Server. We have also built in a comprehensive API and plugin system that provides ‘hooks’ into the page loading and image manipulation systems.
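As an illustration of what a tile request to the image server might look like, the sketch below builds a URL using the FIF (image path) and JTL (resolution level and tile index) parameters of the IIP protocol, with tiles numbered left to right and top to bottom. The host name, endpoint path, image path, and function names are hypothetical, and the sketch reflects a reading of the IIP protocol rather than how Diva.js itself constructs its requests.

```python
import math
from urllib.parse import urlencode

TILE_SIZE = 256  # tile edge in pixels, as stated in the text


def tile_index(row, col, level_width, tile_size=TILE_SIZE):
    """Row-major tile index at a resolution level `level_width` pixels wide."""
    tiles_across = math.ceil(level_width / tile_size)
    return row * tiles_across + col


def tile_url(base_url, image_path, resolution, row, col, level_width):
    """Build a tile request URL from FIF and JTL query parameters."""
    query = urlencode(
        {"FIF": image_path,
         "JTL": f"{resolution},{tile_index(row, col, level_width)}"},
        safe="/,",  # keep the path separators and the JTL comma readable
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Hypothetical server address and image path.
    print(tile_url("http://example.org/fcgi-bin/iipsrv.fcgi",
                   "/srv/images/page_001.tif",
                   resolution=4, row=2, col=3, level_width=2000))
```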
In our presentation we will provide a demonstration of Diva.js and an overview of its background and development history. We will discuss several case studies in which we have employed Diva.js for viewing and searching large historical document image collections, and in which it is used to display page images captured at over 1,200 PPI. Zooming in on images at these resolutions allows users to view individual brush strokes, paper detail and condition, details of manuscript illuminations, and several other important document features that are lost in lower-resolution image displays. Finally, we will demonstrate several new features for highlighting and annotating places of interest on page images and for integrating page images with the output of optical character recognition software.
References

1. The Open Library. 2013. Internet Archive Bookreader. openlibrary.org/dev/docs/bookreader (accessed 7 March 2014).
2. Distributed Digital Music Archives and Libraries. 2014. Diva.js. github.com/DDMAL/Diva.js (accessed 7 March 2014).
3. Pillay, R., and D. Pitzalis. 2008. IIP Image Server. iipimage.sourceforge.net (accessed 7 March 2014).

Conference Info

ADHO - 2014
"Digital Cultural Empowerment"

Hosted at École Polytechnique Fédérale de Lausanne (EPFL), Université de Lausanne

Lausanne, Switzerland

July 7, 2014 - July 12, 2014

377 works by 898 authors indexed

Conference website: https://web.archive.org/web/20161227182033/https://dh2014.org/program/

Attendance: 750 delegates according to Nyhan 2016

Series: ADHO (9)

Organizers: ADHO