Collex: Facets, Folksonomy, and Fashioning the Remixable web

Authorship
  1. 1. Bethany Nowviskie

    Applied Research in Patacriticism - University of Virginia

Work text
This plain text was ingested for the purpose of full-text search, not to preserve original formatting or readability. For the most complete copy, refer to the original conference program.

Collex is an online toolset designed to aid students and
scholars working in networked archives and federated
repositories of humanities materials: a sophisticated
COLLections and EXhibits mechanism for the semantic web.
It allows users to search, browse, annotate, and tag electronic
objects and to repurpose them in illustrated, interlinked essays
or exhibits. By saving information about user activity (including
the construction of annotated collections and exhibits) as
"remixable" metadata, the Collex system writes current practice
into the scholarly record and permits knowledge discovery
based not only on predefined characteristics or "facets" of digital
objects, but also on the contexts in which they are placed by a
community of scholars. Collex builds on the same semantic
web technologies that drive MIT's SIMILE project and social
bookmarking systems like Connotea and Zotero, but it also
brings folksonomy tagging to trusted, peer-reviewed scholarly
archives and features an integrated publication system. This
exhibits-builder is analogous to high-end digital "curation"
tools currently affordable only to large institutions like the
Smithsonian. Collex is free, generalizable, and open source and
is presently being implemented in a large-scale pilot project
under the auspices of NINES.
Collex is constructed with pragmatic scholarly needs in mind,
and under the assumption that "the general field of humanities
education and scholarship will not take the use of digital
technology seriously until one demonstrates how its tools
improve the ways we explore and explain our cultural
inheritance – until, that is, they expand our interpretational
procedures" (McGann, Radiant Textuality xii; my emphasis).
Collex facilitates primary interpretive gestures of exploration
and explanation in a broad and socially-networked manner, and
aims to form a locus for further expansion of interpretive
methods in digital humanities.
The first formal iteration of Collex (released in February 2007)
federates more than 60,000 digital objects of 19th-century
literature, art, culture and criticism from the most prominent
and acclaimed online journals, archives, and repositories in the
field. This pilot project forms the core of NINES, the Networked
Infrastructure for Nineteenth-century Electronic Scholarship,
a trans-Atlantic federation of scholars and of peer-reviewed
primary and secondary materials constituting a federated
collective. Endorsed by the NINES steering committee and
under development for the past year at ARP, the University of
Virginia's Applied Research in Patacriticism lab, Collex is both
the central clearing-house for NINES and the interpretive hub
around which we hope a vital community of scholars and
students will coalesce.
Humanists eager to develop new ways to integrate and explore
digital works currently lack crucial institutional and technical
resources. Even the best models that scholars of (for instance)
nineteenth century literature and culture now follow and imitate
— the Whitman Archive, Romantic Circles, the Rossetti Archive,
the William Blake Archive — are stand-alone projects that can
only be loosely integrated through web browsers, even when
shared through OAI protocols. As a consequence, what you see
now on the web is what you get: an agglomeration of sites and
projects whose content is atomized and whose scholarly and
educational value is indeterminate. While it is possible for
tech-savvy scholars, using ad-hoc tools and methods, to produce
and distribute annotated, re-organized, or selected versions of
existing online resources, they presently lack coordination
within a peer-reviewed digital publishing environment. Because
of this, their productions — personal web pages and online
course packets — are difficult to maintain, are not readily
interoperable or standards-compliant, and are easily dismissed
as heterogeneous grab-bags of links. NINES was founded to
work against this debilitating situation.
The inherent complexity of available resources is a further
obstacle to the penetration digital humanities into the
disciplines. Collex is designed to aid humanities scholars doing
research in complex digital collections like the Rossetti Archive
(its initial test case) or within federated research environments
like NINES. Such environments often stymie their users through
the sheer quantity of information made available to them in
top-level tables of contents, sitemaps, and idiosyncratic search
engines. Our tool operates under the assumption that the best
paths through a complex digital resource are those forged by
use and interpretation. A Collex approach works to assist
scholars in recording, sharing, and building on the interpretive
purposes to which they put their online teaching and research
environments.
Collex uses semantic web principles and technologies to explore
and develop the research potential of the digital scholarship
aggregated in NINES. Two critical concepts embodied in a
NINES environment shaped by the Collex application fall under
the rubrics widely known as "faceted classification" and
"folksonomy." Facets and folksonomies structure an approach
to descriptive metadata. They generate an evolving interface
between the fully-integrated peer-reviewed electronic resources that constitute NINES and the user communities that re-imagine
NINES content through interpretation, contextualization, and
critical and creative re-fashioning.
"Full integration" means that each of our NINES-participating
resources has contributed a package of metadata representing
all of the digital objects they wish to make browseable,
collectible, and available to users for re-purposing within
Collex. An important innovation of Collex lies in the way these
objects are defined by their contributing editors. Collex uses a
Dublin Core flavor of RDF, the "resource description
framework" of the semantic web, to define collectible "objects"
without limiting them to their expression as web pages. Where
other social bookmarking tools are designed to allow collection
and annotation of whole web pages, Collex allows contributors
of resources to make finer-grained distinctions, and users of
the system to build collections and exhibits more attuned to the
patterns of attention in humanities scholarship.
A clear example of interpretive modeling through object
definition is the Collexrepresentation of a book of poetry in the
Rossetti Archive. Using XSLT transformations, we have created
RDF metadata for intellectual and material objects at differing
levels in this book. One RDF object (typed as a secondary
resource, with supplemental genre and date identifiers)
expresses the editor's commentary on the book as a concept.
Another object, also articulated in metadata, expresses one
particular edition of the book. Within that high-level expression,
each page of the book has been shared with Collex users as a
collectible object, as has each poem on each page. Such fine
disambiguation ensures that Collex users can locate, annotate,
and exhibit objects specifically suited to the scholarship they
wish to perform – whether their attention is focused on
bibliographic, social, or textual matters. It also ensures that
archive maintainers have the fullest control, in the Collex
environment, over the use of their intellectual property and the
artifacts they minister.
Because RDF objects share a common (and relatively simple)
metadata scheme, they are discoverable through "facets" in the
Collex search and faceted-browsing interface. Faceted
classification is a non-hierarchical means of expressing
ontological relationships. Any given object will share a number
of facets with other objects – common dates, genres, authors,
etc. Exposing these facets makes it possible not only for users
to manually "drill down" into certain categories or explore
lateral relationships, it also opens possibilities for algorithmic
serendipity in research. In other words, Collex can exploit
formally-expressed facets to offer more options and avenues
to users interested in a particular object: "more like this" – more
objects in the repository sharing one or more attributes with a
researcher's subject of attention. Even more interesting is the
ability of Collex to record and analyze user activity, and to
translate the products of user interaction into RDF objects within
the system itself. In this way, in addition to "more like this,"
Collex can suggest to recent collectors of a particular object
that they view the published collections and exhibits into which
other users have placed the object, or objects like it. Because
this content can be expressed as subscription-based RSS feeds,
a web service, or an API, it is possible for the maintainers of
scholarly resources to patch into Collex directly from their
individual web page or listserv interfaces, offering information
about user annotations and re-mediations for any given object
without requiring users to visit Collex at all.
All Collex activity takes place within the ordinary web-browsing
environment that scholars presently use to access digital
resources, and will require nothing in the way of plugins or
downloads. The overhead (in terms of initial metadata
production) for contributors of resources to the federated
collections in which Collex can operate has also been kept
purposely low, and is thoroughly compatible with Open
Archives protocols. We predict that both of these factors –
combined with the strong endorsement and example of NINES
– will facilitate the adoption of Collex into day-to-day practices
of humanities scholars in networked research and publishing
environments.
Screenshots
A Collex sidebar list view (user-collected objects in the "visual art" genre)
with the same constraint in the faceted browser A different constraints set, and a detail view of one object (for tagging,
annotation, and knowledge discovery) in the sidebar
Bibliography
"Access to the Literature: The Debate Continues." Nature Web
Focus (2004). <http://www.nature.com/nature/f
ocus/accessdebate/>
Broughton, V. "Faceted Classification as a Basis for Knowledge
Organization in a Digital Environment; the Bliss Bibliographic
Classification and the Creation of Multi-Dimensional
Knowledge Structures." New Review of Hypermedia and
Multimedia 7 (2001): 67-102.
Golder, Scott, and Bernardo Huberman. "The Structure of
Collaborative Tagging Systems." Information Dynamics Lab,
HP Labs (2005). <http://www.hpl.hp.com/researc
h/idl/papers/tags/tags.pdf>
Hammond, Tony. "Social Bookmarking Tools (I): A General
Overview." D-Lib Magazine (April 2005). <http://www.d
lib.org/dlib/april05/hammond/04hammond.ht
ml>
Lessig, Lawrence. Free Culture. . <http://www.free-c
ulture.cc>
Lynch, Clifford A.. "Institutional Repositories: Essential
Infrastructure for Scholarship in the Digital Age." ARL
Bimonthly Report 226 (2003).
Mathes, Adam. "Folksonomies - Cooperative Classification
and Communication Through Shared Metadata." Paper written
for LIS590CMC Computer Mediated Communication, Graduate
School of Library and Information Science at the University
of Illinois at Urbana-Champaign, December 2004. 2004. <ht
tp://www.adammathes.com/academic/computer
-mediated-communication/folksonomies.html>
McGann, Jerome. Radiant Textuality: Literature After the World
Wide Web. Palgrave MacMillan, 2001.
McGann, Jerome. "Culture and Technology: The Way We Live
Now, What is to Be Done?" Paper presented at the University
of Chicago, April 23, 2004. 2004. <http://www.nines.
org/about/bibliog/mcgann-chicago.pdf>
from="ROOT" targOrder="U"/>
MLA. "The Future of Scholarly Publishing: Report of the Ad
Hoc Committee on the Future of Scholarly Publishing."
Profession 2002 (2002): 172-186.
Nowviskie, Bethany. Collex: Semantic Collections and Exhibits
for the remixable Web . <http://www.patacriticism
.org/collex/about>
Nowviskie, Bethany, and Jerome McGann. NINES: A Federated
Model for Integrating Digital Scholarship . White paper version
available at <http://www.nines.org/about/9swhi
tepaper.pdf>
Unsworth, John. "Not-so-Modest Proposals: What Do We Want
Our System of Scholarly Communication to Look Like in
2010?" Paper presented at the CIC Summit on Scholarly
Communication, December 2, 2003. 2003. <http://www.
iath.virginia.edu/~jmu2m/CICsummit.htm>
Unsworth, John. "Tool Time, or 'Haven't We Been Here
Already?' Ten Years in Humanities Computing." Paper
presented at Transforming Disciplines: The Humanities and
Computer Science, NINCH conference, Washington, DC,
January 18, 2003. 2003. <http://www.iath.virginia
.edu/~jmu2m/carnegie-ninch.03.html>
Van de Sompel, H et. al. D-Lib Magazine (September 2004).

If this content appears in violation of your intellectual property rights, or you see errors or omissions, please reach out to Scott B. Weingart to discuss removing or amending the materials.

Conference Info

Complete

ADHO - 2007

Hosted at University of Illinois, Urbana-Champaign

Urbana-Champaign, Illinois, United States

June 2, 2007 - June 8, 2007

106 works by 213 authors indexed

Series: ADHO (2)

Organizers: ADHO

Tags
  • Keywords: None
  • Language: English
  • Topics: None