The Open Annotation Collaboration: A Data Model to Support Sharing and Interoperability of Scholarly Annotations

  1. Jane Hunter

    University of Queensland

  2. Timothy W. Cole

    University of Illinois, Urbana-Champaign

  3. Robert Sanderson

    Los Alamos National Laboratory

  4. Herbert Van de Sompel

    Los Alamos National Laboratory


This paper presents the outcomes to date
of the annotation interoperability component
of the Open Annotation Collaboration (OAC).
The OAC project is a collaboration
between the University of Illinois, the
University of Queensland, Los Alamos National
Laboratory Research Library, George Mason
University and the University of Maryland.
OAC has received funding from the Andrew
W. Mellon Foundation to develop a data
model and framework to enable the sharing
and interoperability of scholarly annotations
across annotation clients, collections, media
types, applications and architectures. The OAC
approach is based on the assumption that
clients publish annotations on the Web and
that the target, content and the annotation
itself are all URI-addressable Web resources.
By basing the OAC model on Semantic Web
and Linked Data practices, we hope to provide
the optimum approach for the publishing,
sharing and interoperability of annotations
and annotation applications. In this paper,
we describe the principles and components of
the OAC data model, together with a number
of scholarly use cases that demonstrate and
evaluate the capabilities of the model in different
1. Introduction and Objectives
Annotation is both a core and pervasive
practice for humanities scholarship. It is used
to organize, create and share knowledge.
Individual scholars use it when reading, as
an aid to memory, to add commentary, and
to classify documents. It can facilitate shared
editing, scholarly collaboration, and pedagogy.
Although a plethora of annotation clients
exists for humanities scholars (Hunter
2009), many of these tools are designed
for specific collection types, user requirements,
disciplinary applications or individual desktop
use. Scholars are also confronted with having to
learn different annotation clients for different
content repositories, have no easy way to
integrate annotations made on different systems
or created by colleagues using other tools, and
are often limited to simplistic and constrained
models of annotations. For example, many
existing tools only support the simplistic model
in which the annotation content comprises a
brief unformatted piece of text. Many tools
conflate the storage of the annotations and the
target being annotated.
Frameworks for annotation reference are
inconsistent, not coordinated, and frequently
idiosyncratic, and the constituent elements
of annotations are not exposed to the Web
as discrete addressable resources, making
annotations difficult to discover and re-use.
The lack of robust interoperable tools for
annotating across heterogeneous repositories
of digital content, and difficulties sharing or
migrating annotation records between users and
clients, are hindering the exploitation of digital
resources by humanities scholars. Hence the
goals of the Open Annotation Collaboration
(OAC) are:
To facilitate the emergence of a
Web and Resource-centric interoperable
annotation environment that allows
leveraging annotations across the boundaries
of annotation clients, annotation servers, and
content collections. To this end, an annotation
interoperability specification consisting of an
Annotation Data Model will be developed.
To demonstrate through implementations
an interoperable annotation environment
enabled by the interoperability specifications
in settings characterized by a variety
of annotation client/server environments,
content collections, and scholarly use cases.
To seed widespread adoption by deploying
robust, production-quality applications
conformant with the interoperable annotation
environment in ubiquitous and specialized
services and tools used by scholars (e.g.,
JSTOR, Zotero, and MONK).
In the remainder of this paper we describe
related efforts that have informed the
development of our Annotation Data Model.
We then describe the data model itself, which
lays a foundation for follow-on work involving
demonstrations and reference implementations
that exploit real-world repositories such
as JSTOR and Flickr Commons,
and leverage existing scholarly annotation
applications such as Zotero, Pliny and Co-Annotea.
2. Related Work
Despite the vast body of work regarding
annotation practice, annotation models, and
annotation systems, little attention has been
paid to interoperable annotation environments.
The few efforts in this realm to date comprise:
RDF-based Annotea developed by Kahan and
Koivunen (Kahan et al., 2001);
Agosti’s “A Formal Model of Annotations of
Digital Content” (Agosti and Ferro, 2007);
SANE (Scholarly Annotation Exchange);
OATS (Bateman et al., "OATS: The Open
Annotation and Tagging System").
An analysis of these existing models reveals
that, on the whole, they were not designed
to be Web-centric and resource-centric, or that
they have modeling shortcomings that prevent
any existing resource from being the content
or target of an annotation, or that deny an
annotation independent status as a resource
in its own right. We have identified further
requirements that these approaches fail to fully
support:
Resources of any media type can be
Annotation Content or Targets;
Annotation Targets or Content are frequently
segments of Web resources;
The Content of a single annotation may apply
to multiple Targets or multiple annotation
Contents may apply to a single Target;
Annotations can themselves be the Target of
further Annotations.
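Read together, these requirements act as constraints on how an annotation record is structured. The Python sketch below is an illustration only and is not part of any OAC specification: the class names, field names, media types and example.org URIs are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Resource:
    """Any Web resource, of any media type (hypothetical structure)."""
    uri: str
    media_type: str
    fragment: Optional[str] = None  # set when only a segment is meant


@dataclass(frozen=True)
class Annotation:
    """An annotation has its own URI, so it can be a Target itself."""
    uri: str
    content: Resource
    targets: Tuple[Resource, ...]  # one Content may apply to many Targets


def annotations_on(uri, annotations):
    """Return the URIs of all annotations that have `uri` as a Target."""
    return [a.uri for a in annotations
            if any(t.uri == uri for t in a.targets)]


image = Resource("http://example.org/img/map.jpg", "image/jpeg")
video = Resource("http://example.org/video/talk.ogv", "video/ogg",
                 fragment="t=10,20")
note = Resource("http://example.org/notes/1.txt", "text/plain")
reply = Resource("http://example.org/notes/2.html", "text/html")

# One Content applied to two Targets of different media types.
a1 = Annotation("http://example.org/anno/1", note, (image, video))

# An annotation whose Target is the first annotation.
a2 = Annotation("http://example.org/anno/2", reply,
                (Resource(a1.uri, "application/rdf+xml"),))
```

Because every entity is identified only by its URI, finding the annotations on an annotation is the same lookup as finding the annotations on an image.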
3. The OAC Data Model
By exploiting the Web- and Resource-
centric approach to modelling annotations,
we leverage existing standards and facilitate
the interoperability of annotation applications.
In the OAC model, an Annotation is an
Event initiated at a date/time by an author
(human or software agent). Other entities
involved in the event are the Content of
the Annotation (aka Source) and the Target
of the Annotation. The model assumes that
the core entities (Annotation, Content and
Target) are independent Web resources that are
URI-addressable. This approach simplifies
implementation and decouples it from the repository.
An essential aspect of an annotation is the
(implicit or explicit) expression of an “annotates”
relationship between the Content and the
Target. The model allows for Content and Target
of any media type, and the Annotation, Content,
and Target can all have different authorship.
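As a rough illustration of this structure, the sketch below writes a single annotation as subject-predicate-object triples using only the Python standard library. The placeholder namespace, the property names `hasContent` and `hasTarget`, the choice of Dublin Core terms for author and date, and every example.org URI are assumptions made for the example; the text above does not define a vocabulary.

```python
from datetime import datetime, timezone

OAC = "http://example.org/oac-sketch#"  # placeholder, not the real namespace
DCTERMS = "http://purl.org/dc/terms/"   # Dublin Core terms namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def describe_annotation(anno_uri, content_uri, target_uri, author_uri, created):
    """Return triples linking an Annotation to its Content and Target."""
    return [
        (anno_uri, RDF_TYPE, OAC + "Annotation"),
        (anno_uri, OAC + "hasContent", content_uri),
        (anno_uri, OAC + "hasTarget", target_uri),
        (anno_uri, DCTERMS + "creator", author_uri),
        (anno_uri, DCTERMS + "created", created.isoformat()),
    ]


def to_ntriples(triples):
    """Simplified N-Triples-style output: objects starting with 'http'
    are written as URIs, anything else as a plain literal."""
    lines = []
    for s, p, o in triples:
        obj = f"<{o}>" if o.startswith("http") else f'"{o}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


triples = describe_annotation(
    "http://example.org/anno/1",
    "http://example.org/notes/1.txt",
    "http://example.org/img/map.jpg",
    "http://example.org/people/alice",
    datetime(2010, 7, 7, 12, 0, tzinfo=timezone.utc),
)
print(to_ntriples(triples))
```

The Annotation, Content and Target each appear only as URIs, so the annotation description can be stored separately from the repository that holds the Target.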
In situations where the annotation Content or
Target is a segment or fragment of a resource
(e.g., region of an image), we will draw on the
work of the W3C Media Fragments Working
Group to specify the fragment address. Figure
1 illustrates the alpha version of the OAC data
model.
Fig. 1. The Alpha OAC Data Model
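For a concrete sense of the fragment addressing mentioned above, the sketch below builds URIs using the spatial (`xywh`) and temporal (`t`) dimensions of the W3C Media Fragments URI syntax. The helper names and the example.org URIs are hypothetical, and the OAC model may ultimately express segments differently.

```python
from urllib.parse import urldefrag


def image_region(uri, x, y, width, height):
    """Address a rectangular region of an image, in pixels."""
    base, _ = urldefrag(uri)  # drop any fragment already present
    return f"{base}#xywh={x},{y},{width},{height}"


def time_range(uri, start, end):
    """Address a time interval of an audio or video resource, in seconds."""
    base, _ = urldefrag(uri)
    return f"{base}#t={start},{end}"


region = image_region("http://example.org/img/map.jpg", 160, 120, 320, 240)
clip = time_range("http://example.org/video/talk.ogv", 10, 20)
print(region)
print(clip)
```

A fragment URI of this kind can be used wherever a whole-resource URI would serve as Content or Target, which keeps segment addressing out of the annotation record itself.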

