Googling Google Books: Integrated use of Fragmentary Information Display in Google Book Preview of Electronic Books

  1. Kirsten C. Uszkalo

    Simon Fraser University

  2. Teresa Dobson

    University of British Columbia

  3. Stan Ruecker

    University of Alberta

In previous articles and panels, the authors of this paper
have argued that human/textual interaction is an essential
missing aspect of the electronic book reading experience.
In order to fulfill the needs of the bibliophile, for
whom the touch, feel, portability, and engagement with
the paper book is part of the pleasure of reading, the electronic
book needs to be a tactile, multiplatform device
(Ruecker & Uszkalo 2007, 2008). Our investigation led
us to look at the user interface of current single platform
electronic reading technologies such as the Sony PRS 505
and Kindle and the more multifaceted iLiad v2 reader/
writer. We argued that e-paper makes these devices more
user-friendly and offers the potential for a lower environmental
footprint than paper books (de Grancy 2008),
but noted that their lack of positive form, function, and
feel when coupled with their proprietary software and
the cost of e-books remained major stumbling blocks to
the overall adoption of electronic reading devices (Sottong
2008, Milliot 2008a, 2008b). We concluded that the
numerous multifunction platforms used for electronic
reading, such as computer screens, phones, and iPods
will keep a single-function electronic book reader as a
specialty item.
In this phase, we turn our attention to a fuller consideration
of one of those platforms: Google Books. Since
Google Books has digitized and displays copyright material,
it provides the user with incomplete views of the
text. Our research question asks how the underlying
structure of Google Books’ display of snippets, incomplete
chapters, and limited views of text affects the way the
digitally efficacious student navigates through these documents,
and how it affects her reading and research
experience, comprehension, and synthesis of material.
The academic library’s widespread adoption of digital
resources has created the need to reconsider how students
are conducting research. Undertaking university-level
research assignments has always been a frustrating
skill for new students to acquire. Current critical consensus
suggests the necessity of teaching specific independent
research strategies to university students to help
them navigate and synthesize the massive amount of resources
available to them (Holz et al., 2008, Femster &
Gray 2008, Polack-Wahl & Anewalt 2006). In ebrary’s
Global Student E-book Survey, released June 2008, students
ranked e-books as being as “trustworthy” as their print
counterparts and acknowledged their use in
research assignments. Research using electronic documents
provides the opportunity to negotiate broad spans
of text; Liu (2005) argues that “screen-based reading
behavior is characterized by more time spent on browsing
and scanning, keyword spotting, one-time reading,
non-linear reading, and reading more selectively, while
less time is spent on in-depth reading, and concentrated
reading.” In terms of conducting academic research,
students can sift through information and collect numerous
snippets of information. Many students are already
used to engaging with multiple internet applications
like audio and video streams, instant messaging, and
file sharing to access information inside and outside the
educational setting (McGreal & Elliot 2004). Likewise,
they use tools like Twitter, texting, Facebook, RSS
feeds, keyword searches and indices to find and absorb
small fragments of information in meaningful ways. The
challenge emerges in synthesizing these snippets of information
outside of their original context. However
skilled the student might be in amassing information, the
information they gather using Google Books often lacks
context within the larger work. The student must synthesize
the snippets into their own linear arguments without
having read or followed the complete arguments in the
texts they are mining for quotes. Eshet-Alkalai argues
that creating “original academic work with the aid of
digital techniques for text reproduction, requires scholars
to master a special type of literacy” (98). This is
reproduction literacy, “the ability to create
a meaningful, authentic, and creative work or interpretation,
by integrating existing independent pieces of information,”
and it is a learned skill (98).
Although researchers have yet to see empirical evidence
that students can translate reproduction literacy into successful
academic argumentation, students themselves perceive
that they can. As such, the movement towards using Google
Books’ limited previews to quickly search for specific ideas
across texts not otherwise on hand appears to be a logical
extension of existing information finding practices, acting
as a center point between authoritative text and quick
nonlinear thinking and research strategies. Students are
trained to see books as an authoritative source, and they
carry the affordances of the printed book, understood in
St. Amant’s (1998) terms as relationships or properties of relationships
and perceived properties, into their interaction with
Google Books as part of a continued, trusted relationship.
In terms of internet efficacy (Eastin & LaRose,
2000), and the positive affordances of online learning
(Anderson 2004), the dynamic and fragmentary interaction
provided by Google Books’ platform seems familiar
and would not be seen as detrimental to this group of
users. However, users may consider just how much their
textual interaction is being controlled and delineated by
Google Books and how those limitations affect knowledge
gathering and comprehension.
The Canadian Association of Research Libraries Copyright
Committee argued in its recent survey, “Task
Group on E-Books,” that there “is a danger that research
libraries are adding e-books to their collections using
agreements that significantly reduce users’ rights,” including
textual access and reproduction. Issues of copyright
are what create different experiences for readers –
providing whole texts or stripping access down to sound
bites. According to Google Book Search help, for books
that are under copyright and are not part of its partner
programs, users are only able to see “basic information
about the book, similar to a card catalog, and, in
some cases, a few ‘snippets’” of “sentences of [their]
search terms in context.” These keyword results allow
students to feel critical engagement with texts, providing
a singular index across an otherwise flat and non-transferable
platform. In this way information is displayed in
vastly different amounts, dependent on copyright. On
some occasions lines of text appear as little scraps of
digital paper, or “snippets”; on other occasions, previews
allow readers to view a certain number of pages, or read
a random sequence of pages at a time, often missing a
few pages in between.
Google Books essentially operates under a shroud of secrecy,
despite partnerships with numerous universities
such as Cornell and Columbia, providing no practical information
about its underlying code. Despite this, there
are a few things we can assume about its digital creation
process based on common digitization techniques. It is
likely that, as part of the Google Books digitization process,
two things are created: the scanned image, which
is displayed to the user, and an XML-encoded text file
generated by an OCR (Optical Character Recognition) application,
which is not seen. This XML-encoded text file
contains all of the words in the document, but also stores
positional information for each word (where each word
appears physically on the page, including its height and
width). When users search in Google Books, they are
not searching the scanned image, but rather, are searching
the XML-encoded text file. Google Books then combines
the scanned image with the positional information
in the XML-encoded text file, and fakes the text highlighting
by drawing a transparent yellow box over the
word in the scanned image.
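The process we assume above can be illustrated with a minimal sketch. Because Google has published nothing about its format, everything in it is hypothetical: the element and attribute names (page, word, x, y, w, h), the sample page, and the function find_highlights are our own inventions for illustration. The sketch searches only the OCR text layer and returns the rectangles that a viewer could then draw, as translucent boxes, on top of the scanned image.

```python
import xml.etree.ElementTree as ET

# Hypothetical OCR output for one scanned page. The element and attribute
# names (page, word, x, y, w, h) are assumptions made for illustration;
# the real Google Books format is not public.
PAGE_XML = """
<page number="12" width="1200" height="1800">
  <word x="100" y="200" w="90" h="30">The</word>
  <word x="200" y="200" w="140" h="30">electronic</word>
  <word x="350" y="200" w="80" h="30">book</word>
  <word x="100" y="240" w="80" h="30">Book</word>
  <word x="190" y="240" w="120" h="30">history.</word>
</page>
"""


def find_highlights(page_xml, term):
    """Return one (x, y, w, h) box for every word matching the search term.

    The search runs against the OCR text only; the scanned image is never
    consulted. Matching is case-insensitive and ignores punctuation
    attached to the start or end of a word.
    """
    page = ET.fromstring(page_xml)
    wanted = term.lower()
    boxes = []
    for word in page.iter("word"):
        text = (word.text or "").strip(".,;:!?\"'()").lower()
        if text == wanted:
            boxes.append(tuple(int(word.get(k)) for k in ("x", "y", "w", "h")))
    return boxes


# Each returned box is where a viewer would paint a translucent rectangle
# over the page image to give the appearance of highlighted text.
for x, y, w, h in find_highlights(PAGE_XML, "book"):
    print(f"draw translucent yellow box at ({x}, {y}), size {w}x{h}")
```

The point of the sketch is the separation of layers: the reader sees only the image, while searching and highlighting depend entirely on a text file she never sees, so any OCR error in that file silently removes a passage from her search results.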
Textual engagement in Google Books is therefore radically
altered when the display truncates and limits the
amount of text a reader can access. The text is at the
same time both dynamic and static, neither simply file
nor page, and the reader is neither simply a researcher
nor a casual reader. Text display and reader interaction
remain in flux, dependent on how the text is being displayed
and how much the reader can view and search.
The reader might find her experience constructed from
small glimpses, large text chunks, or whole chapters,
but cannot know how the text will be displayed until she
loads it. Likewise, she cannot know when a blank page
will remove part of the text or whether subsequent pages will
appear. However, the successful keyword hit across an
otherwise blanked out book is seen to represent a successful,
albeit fragmentary and miniaturized, search.
Digitally savvy university students have begun to use
Google Books as a research tool. Although this platform
provides fragmentary keyword searches and limited
previews of text blocks, which radically alter the
user’s reading experience, anecdotal evidence suggests
students who are sufficiently familiar with short text display
communication feel they get enough information
from the limited preview to use these texts as sources
for research. Although Google Books’ textual display
has the potential to radically unsettle the reading experience,
it appears sufficient for entry-level engagement
among students who still see these displays as offering the
authority of the book and the ease of use of the electronic
text. More investigation of the way in which the Google
Books’ display facilitates research interactions among
university students is needed.
