Collaborative Indexing of Cultural Resources: Some Outstanding Issues

Jonathan Furner; Martha Smith; Megan Winget

Authorship

1. Jonathan Furner
OCLC Online Computer Library Center

2. Martha Smith
Information School - University of Washington

3. Megan Winget
School of Information and Library Science - University of North Carolina at Chapel Hill

In this paper, we argue that, although collaborative indexing of cultural resources shows promise as a method of improving the quality of people’s interactions with those resources, several important questions about the level and nature of the warrant for basing access systems on collaborative indexing are yet to receive full consideration by researchers in cultural informatics. Specifically, we suggest that system designers look to three cognate fields (classification research, iconographical analysis, and annotation studies) for pointers to criteria that may be useful in any serious estimation of the potential value of collaborative indexing to cultural institutions.
Collaborative indexing (CI) is the process by which the resources in a collection are indexed by multiple people over an ongoing period, with the potential result that any given resource will come to be represented by a set of descriptors that have been generated by different people. One community of researchers that has demonstrated heightened, ongoing interest in collaborative indexing is that which is active in cultural informatics, and specifically in the design and development of systems that provide patrons of cultural institutions such as libraries, archives, and museums with networked access to digital representations of the objects in institutions’ collections (see, e.g., Bearman & Trant, 2005).
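
As a rough illustration of the definition above, the following Python sketch models a CI record as the set of descriptors that different people have supplied for a resource over time. The names (`Tagging`, `build_records`), the lower-casing of descriptors, and the sample data are all hypothetical assumptions made for illustration; they are not drawn from any system discussed in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Tagging:
    """One act of indexing: a person assigns a descriptor to a resource."""
    resource_id: str
    indexer: str
    descriptor: str
    when: date


def build_records(taggings):
    """Group taggings by resource, mapping each descriptor to its contributors.

    Descriptors are trimmed and lower-cased before grouping; this light
    normalization is an assumption of the sketch, not a requirement of CI.
    """
    records = defaultdict(lambda: defaultdict(set))
    for t in taggings:
        records[t.resource_id][t.descriptor.strip().lower()].add(t.indexer)
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {rid: dict(terms) for rid, terms in records.items()}


# Hypothetical sample data: two people index the same painting at different times.
taggings = [
    Tagging("painting-001", "alice", "Boat", date(2005, 9, 1)),
    Tagging("painting-001", "bob", "boat", date(2005, 9, 14)),
    Tagging("painting-001", "bob", "storm", date(2005, 9, 14)),
    Tagging("print-042", "carol", "harbour", date(2005, 10, 2)),
]

records = build_records(taggings)
```

In this sketch the record for `painting-001` ends up representing the work of two people, while `print-042` reflects the choices of a single volunteer.
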
Justifying the collaborative-indexing approach
At a simple level, the quality of any cultural information system (or any component of a system such as an indexing mechanism) may be evaluated by considering its performance in relation to three imperatives, each of which corresponds to a separate aspect (cultural, political, economic) of the complex mission of contemporary cultural institutions.
1. How effectively does the system allow its users to find the resources in which they have an interest, and to derive optimal value from those resources once found? In order that patrons derive positive value from their experience of interacting with the objects preserved in institutions’ collections, they should be actively supported in their efforts to develop an interpretive understanding of those objects and the contexts in which they were produced, both by being given high-quality access to information about (including visual images of) objects, and by being encouraged to express their understanding and share their interpretations with others.
2. How broadly and inclusively does the system serve all sections of its parent institution’s public? Managers of cultural institutions commonly express a concern that the opportunity to derive positive value from the services offered by those institutions should be distributed justly among, and enjoyed equally by, members of all social groups.
3. How well does the system deliver maximal quality at minimal cost? An institution that consistently allows the costs incurred in the collection, preservation, interpretation, and provision of access to its resources to exceed the value of the benefits enjoyed by its public will not survive for long.
Justifications of the CI approach tend to proceed by drawing attention to the ways in which it can be viewed as responding to one or other of these three imperatives. Proponents commonly highlight several distinctive characteristics of CI in this regard:
(a) CI is distributed: No single person is required to index all resources; no single resource needs to be indexed by all people.
(b) CI is cheap: Indexers typically volunteer their efforts at no or low cost to collection managers.
(c) CI is democratic: Indexers are not selected for their expertise by collection managers, but are self-selected according to indexers’ own interests and goals.
(d) CI is empowering: People who might in the past have been accustomed to searching databases by attempting to predict the descriptors used by “experts” are now given the opportunity to record their own knowledge about resources.
(e) CI is collaborative: Any given record is potentially representative of the work of multiple people.
(f) CI is dynamic: The description of a given resource may change over time, as different people come to make their own judgments of its nature and importance.
All of these characteristics are relevant, in various combinations and to various degrees, to any estimation of the success with which CI-based systems are likely to meet the cultural, political, and economic imperatives described above. But each additionally raises issues of a more problematic nature than is typically admitted. Given the distributed nature of CI, for example, how can it be ensured that every resource attracts a “critical mass” of index terms, rather than just the potentially quirky choices of a small number of volunteers? Given the self-selection of indexers, how can it be ensured that they are motivated to supply terms that they would expect other searchers to use? Empirical, comparative testing of the utility of different prototypes (focusing, for example, on forms of interface for elicitation of terms, or on algorithms for the ranking of resources) is undoubtedly an essential prerequisite for the future development of successful CI-based systems (Bearman & Trant, 2005). But it is also important, we argue, that the results of prior research in a variety of cognate fields be taken into account when addressing some of the more problematic issues that we have identified.
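
The “critical mass” question can be made concrete with a small check. The Python sketch below flags resources that have been described by fewer than some minimum number of distinct people. The function name, the input shape, and the default threshold of three indexers are hypothetical choices made for illustration; the paper does not define a numerical threshold.

```python
def resources_below_critical_mass(resource_ids, assignments, min_indexers=3):
    """Return ids of resources indexed by fewer than `min_indexers` distinct people.

    resource_ids: every resource in the collection, including unindexed ones.
    assignments:  iterable of (resource_id, indexer, term) tuples.
    min_indexers: illustrative threshold only; an assumption of this sketch.
    """
    indexers = {rid: set() for rid in resource_ids}
    for rid, indexer, _term in assignments:
        indexers.setdefault(rid, set()).add(indexer)
    return sorted(rid for rid, people in indexers.items() if len(people) < min_indexers)


# Hypothetical sample data
resource_ids = ["a", "b", "c"]
assignments = [
    ("a", "alice", "boat"),
    ("a", "bob", "storm"),
    ("a", "carol", "sea"),
    ("b", "alice", "portrait"),
    ("b", "alice", "hat"),
]

under_indexed = resources_below_critical_mass(resource_ids, assignments)
```

Counting distinct people rather than terms is a deliberate choice here: resource `b` has two terms, but both come from the same volunteer, which is the situation the question above is concerned with.
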
Classification research
In classification research, for example, it has long been argued that indexers and searchers benefit from having the opportunity to browse or navigate for the terms or class labels that correspond most closely to the concepts they have in mind, rather than being required to specify terms from memory (see, e.g., Svenonius, 2000). Indexer-searcher consistency, and thus retrieval effectiveness, can be improved to the extent that a system allows indexers and searchers to identify descriptors by making selections from a display of the descriptors that are available to them, categorized by facet or field, and arranged in a hierarchy of broader and narrower terms so that the user can converge on the terms that they judge to be of the most appropriate level of specificity.
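
A minimal Python sketch of such a display structure follows: descriptors are grouped by facet, and each term points to its broader term, so that a user can move down to narrower terms or back up toward the top of the hierarchy. The vocabulary, facet names, and function names are hypothetical and serve only to illustrate the idea of browsing by broader and narrower terms.

```python
# Hypothetical mini-vocabulary: facet -> {term: broader term, or None for a top term}
BROADER = {
    "objects": {
        "vessels": None,
        "boats": "vessels",
        "sailboats": "boats",
        "rowboats": "boats",
    },
    "places": {
        "landscapes": None,
        "coasts": "landscapes",
        "harbours": "coasts",
    },
}


def narrower_terms(facet, term=None):
    """List the terms directly below `term` in a facet (top terms if term is None)."""
    return sorted(t for t, broader in BROADER[facet].items() if broader == term)


def path_to_top(facet, term):
    """Return the chain from `term` up through its broader terms to the top term."""
    path = [term]
    while BROADER[facet][path[-1]] is not None:
        path.append(BROADER[facet][path[-1]])
    return path
```

For example, `narrower_terms("objects", "boats")` lists the more specific choices under "boats", and `path_to_top("objects", "sailboats")` shows the broader terms a user could fall back to.
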
Current implementations of CI-based systems shy away from imposing the kind of vocabulary control on which classification schemes and thesauri are conventionally founded: the justification usually proceeds along the lines that indexers should be free, as far as possible, to supply precisely those terms that they believe will be useful to searchers in the future, whether or not those terms have proven useful in the past. Yet it remains an open question as to whether the advantages potentially to be gained from allowing indexers free rein in the choice of terms outweigh those that are known to be obtainable by imposing some form of vocabulary and authority control, by offering browsing-based interfaces to vocabularies, by establishing and complying with policies for the specificity and exhaustivity of indexing, and by other devices that are designed to improve indexer-searcher consistency.
Theories of iconographical interpretation
Another related subfield of library and information science is that which is concerned with the effective provision of subject access to art images (see, e.g., Layne, 1994), and commonly invokes the theory of iconographical interpretation developed by the art historian Erwin Panofsky (Panofsky, 1955). Current implementations of CI-based systems for art museums focus on eliciting generic terms for (what Panofsky calls) pre-iconographic elements, i.e., pictured objects, events, locations, people, and simple emotions, the assumption apparently being made that such terms are those that will be most useful to searchers (Jörgensen, 2003). There is very little evidence supplied by studies of the use of art image retrieval systems, however, to suggest either that pre-iconographic elements are indeed what non-specialists typically search for, or that generic terms lead non-specialist searchers to what they want. We do know from analyses of questions that visitors ask in museums that non-specialists typically do not have the specialist vocabulary to specify precisely what they are looking for (see, e.g., Sledge, 1995). This does not necessarily mean, however, that searchers always default to using pre-iconographic terms whenever they wish to get at more complex themes and ideas, nor that searches for higher-level elements using pre-iconographic terms will be successful. Further studies of the question-formulating and searching behavior of non-specialist art viewers and learners are clearly necessary.
Annotation studies
Researchers in the human-computer interaction (HCI) community are continuing to develop an agenda for work in the emerging subfield of annotation studies (see, e.g., Marshall, 1998), focusing on ways to improve interfaces that support annotation behavior of a variety of kinds, in a variety of domains. In this research, an annotation is commonly considered as evidence of a reader’s personal, interpretive engagement with a primary document, a form of engagement that is not so different from that which cultural institutions seek to encourage in their patrons. A cultural annotation system that allowed patrons not only to supply their own descriptions of an institution’s resources, but also to add comments and to build communities around personal collections, could be envisaged as a vital service that would help patrons interact with and interpret those resources, largely outside the authority and control of curators and other specialists. It remains an open question as to whether a system that allows patrons to supply their own descriptions of institutions’ resources is most appropriately evaluated as a tool for creating and accessing personal annotations, as a tool for sharing and accessing collaborative descriptions, as a retrieval tool pure and simple, or some combination of all three. Unfortunately, our understanding of the purposes and intentions of users of CI-based systems is still spotty, and further research in this area is necessary.
Conclusion
In general, we suggest that particular care needs to be taken by cultural institutions in examining and adjudicating between potentially conflicting motives for inviting patrons to provide basic-level descriptions of resources. Classification research shows us that simple assignment of single-word descriptors unsupported by vocabulary control or browsable displays of the semantic relationships among descriptors is not enough to guarantee effective access; theories of iconographical interpretation demonstrate how important it is that non-specialist indexers should not be led to assume that listing what one sees is somehow all that art-viewing and meaning-making involves; and annotation studies encourage us to consider how cultural institutions may go beyond simple systems for collaborative description, and develop more sophisticated systems for truly collaborative annotation that support deeper levels of interpretation and learning.
References
Bearman, D. and Trant, J. (2005). Social terminology enhancement through vernacular engagement: Exploring collaborative annotation to encourage interaction with museum collections. D-Lib Magazine, 11(9). http://www.dlib.org/dlib/september05/bearman/09bearman.html (accessed 7 November 2005).
Jörgensen, C. (2003). Image Retrieval: Theory and Research. Lanham, MD: Scarecrow Press.
Layne, S. S. (1994). Some issues in the indexing of images. Journal of the American Society for Information Science, 45: 583-588.
Marshall, C. C. (1998). Toward an ecology of hypertext annotation. In Proceedings of ACM Hypertext ’98 (Pittsburgh, PA, June 20-24, 1998), pp. 40-49. New York: ACM Press.
Panofsky, E. (1955). Iconography and iconology: An introduction to the study of Renaissance art. In Meaning in the Visual Arts. New York: Doubleday.
Sledge, J. (1995). Points of view. In D. Bearman (ed.), Multimedia Computing and Museums: Selected Papers from the Third International Conference on Hypermedia and Interactivity in Museums (ICHIM ’95 / MCN ’95; San Diego, CA, October 9-13, 1995), pp. 335-346. Pittsburgh, PA: Archives & Museum Informatics.
Svenonius, E. (2000). The Intellectual Foundation of Information Organization. Cambridge, MA: MIT Press.

Full text license: This text is republished here with permission from the original rights holder.

Conference Info

Complete

ACH/ALLC / ACH/ICCH / ADHO / ALLC/EADH - 2006

Hosted at Université Paris-Sorbonne, Paris IV (Paris-Sorbonne University)

Paris, France

July 5, 2006 - July 9, 2006

151 works by 245 authors indexed

The effort to establish ADHO began in Tuebingen, at the ALLC/ACH conference in 2002: a Steering Committee was appointed at the ALLC/ACH meeting in 2004, in Gothenburg, Sweden. At the 2005 meeting in Victoria, the executive committees of the ACH and ALLC approved the governance and conference protocols and nominated their first representatives to the ‘official’ ADHO Steering Committee and various ADHO standing committees. The 2006 conference was the first Digital Humanities conference.

Conference website: http://www.allc-ach2006.colloques.paris-sorbonne.fr/

Series: ACH/ICCH (26), ACH/ALLC (18), ALLC/EADH (33), ADHO (1)

Organizers: ACH, ADHO, ALLC
