Access versus Ownership: Navigating the Tension between Mass Digitization of Archival Materials and Intellectual Property Rights

poster / demo / art installation
  1. Maggie Dickson

    University of North Carolina at Chapel Hill

  2. Amy Johnson

    University of North Carolina at Chapel Hill

  3. Natasha Smith

    University of North Carolina at Chapel Hill

  4. Lynn Holdzkom

    University of North Carolina at Chapel Hill

  5. Stephanie Adamson-Williams

    University of North Carolina at Chapel Hill


The advent of digital technologies has given archivists
the opportunity to provide their patrons with unprecedented,
and increasingly expected, online access to collections.
Some archival repositories are now exploring the mass
digitization and Web presentation of their collections,
and some, such as the Archives of American Art Collections
Online and the Library of Congress’s American Memory Project,
are already engaged in it. Mass digitization allows patrons
to do from their homes or offices, at any time of day, much
of the research for which they would traditionally have had
to travel great distances and work within limited hours
of operation.1
In 2007 the Carolina Digital Library and Archives
(CDLA) and the Southern Historical Collection (SHC)
began work on a pilot project to explore the most effective
methods of mass digitization and gather data on which a
full-scale program could be founded. A two-year, $300,000
grant from the Watson-Brown Foundation of Thomson,
Ga., funded the project. The correspondence series of the
Thomas E. Watson Papers, housed in the SHC, was chosen
as the subject of the pilot project. Thomas E. Watson,
of Thomson, Ga., was a prominent Populist politician,
author, and lawyer, and his correspondence series consists
of 8 linear feet of letters and related material written
by Watson and his family, friends, and political and business
colleagues. The date range of the correspondence
series is 1873-1986, with the bulk of the letters dating
between the 1880s and 1920s.
One of the challenges facing this project is balancing
comprehensive online access against respect for the
intellectual property rights of any copyright holders who
may be represented in the collection. Unpublished manuscript
materials, such as those found in the Thomas E. Watson Papers,
are protected by copyright for the life of the author plus
70 years. For us, that means that any letters written by
a correspondent who died prior to 1939 (2009 being the
projected publication date for our digital collection) are
in the public domain under the general copyright term.
However, any letters written by a correspondent who died
after 1939 are potentially still in copyright.
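The cutoff year above follows from simple arithmetic. As a minimal sketch in Python, assuming the life-plus-70 term and the rule that copyright terms run through the end of the calendar year in which they expire (17 U.S.C. § 305), the check might look like the following. The function name and the three status labels are hypothetical, chosen for illustration only, and are not legal conclusions.

```python
from typing import Optional

TERM_YEARS = 70  # life of the author plus 70 years


def copyright_status(death_year: Optional[int], publication_year: int) -> str:
    """Classify an unpublished letter by its author's year of death.

    Hypothetical helper: the labels are illustrative, not legal advice.
    """
    if death_year is None:
        # No death date located, so the term cannot be computed.
        return "unknown"
    # Assumption: the term runs through December 31 of death_year + 70,
    # so the work is free to use from January 1 of the following year.
    first_free_year = death_year + TERM_YEARS + 1
    if publication_year >= first_free_year:
        return "public domain"
    return "potentially in copyright"


for year in (1922, 1938, 1939, 1968, None):
    print(year, copyright_status(year, publication_year=2009))
```

Under these assumptions, and with a 2009 publication date, an author who died in 1938 falls on the public-domain side and one who died in 1939 does not, which matches the "prior to 1939" wording above.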
So, if we were not to claim any exemptions to copyright
statutes, and we wanted to publish the entire correspondence
series on the Web under a strict interpretation of
copyright law, we would need to identify all of the authors
of the letters in the series, determine the date of
their deaths, locate their descendants if their death dates
were after 1939, contact those descendants, and request
and then obtain permission to use their deceased family
members’ letters.
Even though we suspected this effort had little chance of
success, we decided to attempt intensive copyright research
on the materials in the series. Our project is not
only about digitizing the Thomas E. Watson Papers, but
also serves as a pilot for a much larger effort aimed at
digitizing the entire SHC, and we wanted to investigate
this aspect of digitizing archival materials thoroughly.
Prior to beginning copyright research, we had
already gathered basic metadata (correspondent and recipient
names, places from which letters were written,
and dates, for example) from all of the 8400+ documents.
It took us approximately 91 hours to go through
the 8 linear feet of materials in the correspondence series
to compile this data. From the information we gathered,
we were able to condense and regularize the correspondent
list down to approximately 3300 names.
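As a rough worked example of the throughput these figures imply, the following Python sketch divides the reported totals. It treats "8400+" as exactly 8400 documents, which is an assumption, so the results are estimates only.

```python
HOURS_SPENT = 91     # reported time to gather basic metadata
DOCUMENTS = 8400     # assumption: "8400+" rounded down to 8400
LINEAR_FEET = 8      # extent of the correspondence series

docs_per_hour = DOCUMENTS / HOURS_SPENT
seconds_per_doc = HOURS_SPENT * 3600 / DOCUMENTS
hours_per_linear_foot = HOURS_SPENT / LINEAR_FEET

print(f"{docs_per_hour:.0f} documents per hour")
print(f"{seconds_per_doc:.0f} seconds per document")
print(f"{hours_per_linear_foot:.1f} hours per linear foot")
```

On these numbers, basic metadata capture ran at roughly 92 documents per hour, or about 39 seconds per document.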
Using a variety of sources, including Wikipedia, the
Social Security Death Index, and print
references, we attempted to positively identify the 3300
names. What resulted was a list of 3280 confirmed and
questionable identifications, and 24 unknowns that were
simply impossible to identify. We were able to locate
birth or death dates for 1709 of the identified correspondents,
while for 1571 no dates were available via the
consulted sources. Of the correspondents for whom we
located dates, 1101 died after 1939, while 608 died during
or before 1939.
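The counts in the paragraph above can be checked against one another and turned into proportions. The Python sketch below is illustrative only: the variable names and the `share` helper are hypothetical, and the figures are copied directly from the text.

```python
# Figures reported in the paragraph above.
identified = 3280           # confirmed and questionable identifications
unknown = 24                # names that could not be identified
with_dates = 1709           # identified correspondents with birth or death dates
without_dates = 1571        # identified correspondents with no dates found
died_after_1939 = 1101
died_1939_or_before = 608

# Consistency checks: each pair of subgroups should sum to its parent group.
assert with_dates + without_dates == identified
assert died_after_1939 + died_1939_or_before == with_dates

total_names = identified + unknown  # close to the "approximately 3300" names


def share(part: int, whole: int) -> str:
    """Format part/whole as a percentage with one decimal place."""
    return f"{100 * part / whole:.1f}%"


print("names on the regularized list:", total_names)
print("identified names with dates:", share(with_dates, identified))
print("dated names that died after 1939:", share(died_after_1939, with_dates))
print("all names known to have died after 1939:", share(died_after_1939, total_names))
```

On these figures, about a third of all names are known to have died after 1939, and nearly half of the identified names have no dates at all, so their copyright status cannot be determined from death dates.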
Of the positively identified individuals who died after
1939, we found that just over 50 had dedicated manuscript
collections (or materials in the manuscript collections
of other individuals) deposited in repositories.
In these cases, we contacted the repositories, asking for
their latest acquisition information, in the hope that it
might lead us to descendants of the correspondents from
whom we could request permission to digitize their relatives’
materials. In most cases, no information was available,
and when it was, it tended to be outdated, often
well over 20 years old. We were able to obtain current,
dependable contact information for the descendants of
only two of the more prominent correspondents – Upton
Sinclair and Hamlin Garland, both of whom are well-known
writers with established literary estates.
The fact that it was so difficult to obtain contact information
for the descendants of people who are prominent
enough to have dedicated manuscript collections indicates
that locating the descendants of the bulk of the correspondents
would be daunting and in most cases impossible.
Extrapolating from our experience with the Watson correspondence,
we believe that an effort on the scale anticipated
in digitizing the entire SHC would be stymied by
trying to do in-depth exploration of copyright status and
attempting to obtain permission to digitize unpublished
archival materials that are under copyright. If we hope to
make mass digitization an integrated part of processing
archival materials, it is simply untenable for us to consider
doing this type of research to determine copyright status
and obtain permission.
This does not mean that we should avoid digitizing materials
which may still be in copyright, but it does mean
that we need to find a different way to approach copyright
law to accommodate our needs.
Relying on Fair Use
While copyright law was intended to protect creative expression,
it was at the same time not intended to be an
impediment to further creative expression. Because of
that, there are limitations to exclusive rights and remedies
in sections 107 to 122 of the Copyright Act that
allow for the use of copyrighted materials under some
circumstances. As the current copyright law was written in 1976, however, its authors did not anticipate the ways
in which digital technologies would change the potential
uses of copyrighted materials, and we must interpret
these limitations and remedies to determine which might
best apply to the mass digitization of archival materials.
A close examination of the possible limitations and remedies
available to us in the law as it stands indicated that
the most reasonable option for us is to use the fair use
provision. Section 107 of the Copyright Act – Limitations
on Exclusive Rights: Fair Use, states that ‘use by
reproduction in copies or phonorecords or by any other
means specified by that section, for purposes such as
criticism, comment, news reporting, teaching, scholarship,
or research, is not an infringement of copyright.’2
The Supreme Court, in its ruling in Stewart v. Abend,
stated that: ‘fair use … permits
courts to avoid rigid application of the copyright statute
when, on occasion, it would stifle the very creativity
which that law is designed to foster.’3
Weighing our project against the four fair use factors4
and taking into account the existing case law (of which
very little applies to archives and special collections) we
have developed an argument which we feel allows us
to legally publish our digitized manuscript collections
online. Unfortunately, the only way to know with certainty
that a use is considered a fair one is to have it
resolved in a federal court. The thought of such a court
battle constitutes a worst-case scenario for us, but given
the precedents already set by the courts, we are unlikely
to become involved in such a situation.
Given these circumstances, it is reasonable for us to accept
this risk and continue to serve our patrons in the most
effective ways possible. To maintain the level of service
that researchers increasingly expect, it is imperative
that archives and special collections forge ahead
with mass digitization without fear of recrimination.
Copyright Act of 1976, 17 U.S.C. § 107
Erway, R., and Schaffner, J. (2007). Shifting Gears: Gearing
Up to Get Into the Flow. Report produced by OCLC
Programs and Research. Published online at: http://www.
Stewart v. Abend, 495 U.S. 207, 236 (1990)
1 Ricky Erway and Jennifer Schaffner’s paper, “Shifting
Gears: Gearing Up to Get Into the Flow,” discusses the
changing expectations of archival user communities, as
well as the changing role of the archivist in the face of
developing digital technology.
2 Copyright Act of 1976, 17 U.S.C. § 107. Limitations on
exclusive rights: Fair use
3 Stewart v. Abend, 495 U.S. 207, 236 (1990)
4 The four factors are, in brief: 1. the purpose and character
of the use, including whether such use is of a commercial nature
or is for nonprofit educational purposes; 2. the nature of
the work itself [whether it is a factual or creative work];
3. the amount and substantiality of the portion used in
relation to the copyrighted work as a whole; 4. the effect
of the use upon the potential market for or value of the
copyrighted work.

Conference Info


ADHO - 2009

Hosted at University of Maryland, College Park

College Park, Maryland, United States

June 20, 2009 - June 25, 2009

176 works by 303 authors indexed

Series: ADHO (4)

Organizers: ADHO

  • Keywords: None
  • Language: English
  • Topics: None