Collecting Existing HTML Web Sites

  1. Robert Cordaro

    Libraries - University of Virginia


An area of future concern for universities and academic libraries will be
the long-term support of existing research stored and presented as HTML web
sites. HTML has recently proliferated as the final format of scholarly
research projects and theses. The long-term viability of such resources is
in question if they remain free-standing islands of information,
particularly if the originating researcher is no longer actively
maintaining the site. Changes in web server and browser software, problems
with server hardware, and policy changes at institutions may cause the
material to become inaccessible.

One alternative is to "collect" the HTML web pages and move them into a
library environment, that is, a centrally supported digital repository or
archive, possibly transforming the HTML into some other storage or access
format. Doing so should extend the useful life of scholarly work that might
otherwise be lost. We will discuss the pros and cons of such a process and
the problems that arise in the effort. We will look at deciding whether a
site is suitable for collection; methods of limiting the scope of the site;
problems in moving the contents and in developing software to transfer and
possibly transform the format; attempts to preserve the look and feel of
the presentation; and possible format options for storage. By way of
example, we will discuss the successes and failures encountered while
developing a software tool as part of the Supporting Digital Scholarship
project.
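
To make "limiting the scope of the site" concrete, one possible approach is to follow only those links that stay on the same host and under the same path as the site's root page. The sketch below is an illustration under that assumption, not the tool developed for the Supporting Digital Scholarship project; the names LinkExtractor, in_scope, and scoped_links, and the example.edu URLs in the usage note, are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href/src targets found in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Anchors, images, stylesheets, and frames all use href or src.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def in_scope(url, root):
    """True if url is on the same host as root and under root's path."""
    u, r = urlparse(url), urlparse(root)
    return (
        u.scheme in ("http", "https")
        and u.netloc == r.netloc
        and u.path.startswith(r.path)
    )


def scoped_links(page_url, html, root):
    """Return the absolute, de-duplicated, in-scope links on a page."""
    parser = LinkExtractor()
    parser.feed(html)
    kept = []
    for link in parser.links:
        # Resolve relative links and drop any #fragment.
        absolute, _fragment = urldefrag(urljoin(page_url, link))
        if in_scope(absolute, root) and absolute not in kept:
            kept.append(absolute)
    return kept
```

For instance, with a root of http://example.edu/thesis/, a relative link such as ch1.html#top on the index page would be kept as http://example.edu/thesis/ch1.html, while links to /other/x.html or to another host would be left out of the collection. A real collector would also need to decide what to do with out-of-scope links, for example leaving them as external references.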


Conference Info


Hosted at New York University

New York, NY, United States

July 13, 2001 - July 16, 2001

94 works by 167 authors indexed

Series: ACH/ICCH (21), ALLC/EADH (28), ACH/ALLC (13)

Organizers: ACH, ALLC