If You Build It Will They Come? The Lairah Study: Quantifying the Use Of Online Resources in the Arts and Humanities Through Statistical Analysis of User Log Data

paper
Authorship
  1. Claire Warwick

    School of Library, Archive and Information Studies - University of Sheffield

  2. Melissa Terras

    School of Library, Archive and Information Studies - University College London

  3. Paul Huntington

    School of Library, Archive and Information Studies - University College London

  4. Nikoleta Pappa

    School of Library, Archive and Information Studies - University College London

Work text

The LAIRAH (Log Analysis of Internet Resources in the Arts and Humanities) project aims to determine whether, how and why digital resources in the humanities are used, and what factors might make them usable and sustainable in future. Based at UCL and funded by the UK Arts and Humanities Research Council (AHRC) ICT Strategy scheme, LAIRAH is a year-long study which will analyse patterns of usage of online resources through real-time server log analysis, that is, analysis of the data internet servers collect automatically about individual users. This paper will discuss the findings of our research to date, and the techniques of log analysis as applied to major digital humanities projects. In doing so, we address concerns about the use, maintenance and future viability of digital humanities projects, and aim to identify the means by which an online project may become successful.
Context
Hundreds of projects have been funded to produce digital resources for the humanities. In the UK alone, over 300 of them have been funded by the AHRC since 1998. While some are well known, others have been relatively quickly forgotten, indicating financial and intellectual wastage. Little is known of the user-centred factors determining usage (Warwick, 1999). The aim of the LAIRAH study (http://www.ucl.ac.uk/slais/LAIRAH/) is to discover what influences the long-term sustainability and use of digital resources in the humanities through the analysis and evaluation of real-time use. We are utilising deep log analysis techniques to provide comprehensive, qualitative, and robust indicators of digital resource effectiveness. The results of this research should increase understanding of usage patterns of digital humanities resources, aid in the selection of projects for future funding, and enable us to develop evaluative measures for new projects.
Technical problems that can lead to non-use of digital projects are relatively well understood (Ross and Gow, 1999). However, evidence of actual use of projects is anecdotal; no systematic survey has been undertaken, and the characteristics of a project that might predispose it for sustained use have never been studied. For example, does the presence in an academic department of the resource creator, or of an enthusiast who promotes the use of digital resources, ensure continued use? Do projects in certain subject areas tend to be especially widely used? Are certain types of material, for example text or images, more popular? Is a project more likely to be used if it consulted with the user community during its design phase? An understanding of usage patterns through log data may also improve use and visibility of projects.
Project aims and objectives
This project is a collaboration between two research centres at UCL SLAIS: CIBER (The Centre for Information Behaviour and the Evaluation of Research) (http://www.ucl.ac.uk/ciber) and the newly created CIRCAh (The Cultural Informatics Research Centre for the Arts and Humanities) (http://www.ucl.ac.uk/slais/circah/). CIBER members are leading researchers in the use of deep log analysis techniques for the evaluation of online resources (Huntington et al., 2003). Their strong record in user site evaluation in the health, media and scholarly publishing sectors is now being applied to the arts and humanities. We believe that no one has undertaken log analysis work in this sector: the LAIRAH project therefore offers great opportunities for knowledge and technology transfer.
The LAIRAH project is analysing raw server transactions of online digital resources (which automatically record web site use) and will relate these to demographic user data to provide a comprehensive and robust picture of resource effectiveness. CIBER have developed robust key metrics and concepts such as site penetration (number of views made in a session), returnees (site loyalty), digital visibility (impact of positioning on usage), search success (search term use and the number of searches conducted) and micro-mining (mapping individual tracks through websites) in order to understand usage in the digital environment and relate this to outcomes and impacts (Nicholas et al., 2005). By applying this knowledge in the LAIRAH project, we are bringing quantitative and robust analysis techniques to digital resources in the Arts and Humanities.
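As an illustration only, two of the metrics named above, site penetration and returnees, could be computed from anonymised page-view records roughly as follows. This is a minimal sketch, not CIBER's method: the 30-minute inactivity cut-off used to delimit sessions, the function names and the sample records are all assumptions introduced for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a session ends after 30 minutes of inactivity. The paper does
# not say how sessions are delimited; this cut-off is illustrative only.
SESSION_GAP = timedelta(minutes=30)


def split_sessions(hits):
    """Group (user_tag, timestamp) page views into per-user sessions."""
    by_user = defaultdict(list)
    for user_tag, stamp in hits:
        by_user[user_tag].append(stamp)

    sessions = {}  # user_tag -> list of sessions, each a list of timestamps
    for user_tag, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for stamp in stamps[1:]:
            if stamp - user_sessions[-1][-1] > SESSION_GAP:
                user_sessions.append([stamp])    # gap too long: new session
            else:
                user_sessions[-1].append(stamp)  # same session continues
        sessions[user_tag] = user_sessions
    return sessions


def site_penetration(sessions):
    """Mean number of page views made in a session."""
    counts = [len(s) for user in sessions.values() for s in user]
    return sum(counts) / len(counts)


def returnee_share(sessions):
    """Fraction of users who came back for more than one session."""
    returning = sum(1 for user in sessions.values() if len(user) > 1)
    return returning / len(sessions)


# Hypothetical records: (anonymous user tag, time of page view).
hits = [
    ("u1", datetime(2005, 3, 1, 10, 0)),
    ("u1", datetime(2005, 3, 1, 10, 5)),
    ("u1", datetime(2005, 3, 2, 9, 0)),
    ("u2", datetime(2005, 3, 1, 11, 0)),
]
sessions = split_sessions(hits)
print(site_penetration(sessions))
print(returnee_share(sessions))
```

In this toy data the first user makes two visits and the second user one, so half of the users count as returnees. Any real analysis would need to settle how a user and a session are identified before such figures could be compared across resources.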
Methods
Phase 1: Log analysis
The first phase of the project is the deep log analysis. Transaction and search log files have been provided by the three online archives supported by AHRC: the AHDS Arts and Humanities Collection (http://www.ahds.ac.uk/), the Humbul Humanities Hub (http://www.humbul.ac.uk/) and the Artifact database for the creative and performing arts (http://www.artifact.ac.uk/). (Humbul and Artifact merged at the end of 2005.) This provides rich data for comparing metrics between subject and resource type. The search logs show which resources users are interested in and, in the case of the AHDS (which provides links through to the resources themselves), which ones users go on to actually visit. Because this project is of limited duration, we are not able to analyse logs from individual projects at this stage of funding, given the difficulty of accessing log data from projects which may have limited technical support.
We are analysing a minimum of a year’s worth of transaction log data (a record of web page use automatically collected by servers) from each resource. This data gives a relatively accurate picture of actual usage, is seamless, and is easily available. It provides information on the words searched on (search logs), the pages viewed (user logs), the web site that the user has come from (referrer logs), and basic, but anonymous, user identification tags, together with time and date stamps (Huntington et al., 2002). We have also designed short online questionnaires, covering user characteristics and perceived outcomes, which will be matched to actual search and usage patterns. We are also sharing the results of these and the log data analysis with another project funded under the AHRC ICT scheme (http://www.ahrbict.rdg.ac.uk/): the RePAH project (User Requirements Analysis for Portals in the Arts and Humanities, http://repah.dmu.ac.uk/), based at De Montfort and Sheffield universities. This project is studying the use of portal sites in the arts and humanities, and is testing new prototype portal designs, including applications such as personalisation functions used on commercial portals, to determine whether they are appropriate for humanities users.
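To make the log fields listed earlier in this section concrete (search terms, page viewed, referrer, anonymous user tag, time and date stamp), the following sketch extracts them from a single log line. It rests on assumptions not stated in the paper: that the server writes the widely used Combined Log Format, that the client address serves as the anonymous user tag, and that search terms travel in a URL parameter named `q`. The sample line is invented.

```python
import re
from datetime import datetime
from urllib.parse import urlparse, parse_qs

# Assumption: Combined Log Format. The paper does not say which format the
# AHDS, Humbul or Artifact servers actually used.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the fields of interest from one log line, or None if malformed."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    url = urlparse(match.group("url"))
    query = parse_qs(url.query)
    return {
        # The client address stands in for the anonymous user tag.
        "user_tag": match.group("host"),
        "time": datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z"),
        "page": url.path,
        # "q" is a hypothetical parameter name for the search box.
        "search_terms": query.get("q", [None])[0],
        "referrer": match.group("referrer"),
    }


# Invented sample line for illustration.
sample = (
    '192.0.2.1 - - [01/Mar/2005:10:00:00 +0000] '
    '"GET /search?q=medieval+manuscripts HTTP/1.1" 200 5120 '
    '"http://www.example.ac.uk/library/" "Mozilla/5.0"'
)
record = parse_line(sample)
print(record)
```

Records of this shape could then be matched against questionnaire responses or fed into session-level metrics, though real logs would first need cleaning, for example to exclude automated crawlers.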
As part of the initial phase of the project, we have also carried out a study to determine how humanities users find digital resources and portal sites when beginning their search from their university library or faculty web pages. This study is described in a separate poster proposal.
Phase 2: Case Studies
Through the above log analysis, we have identified ten projects that have high and low patterns of use across different subject areas and types of content, and these are being studied in depth. Project leaders and researchers are being interviewed about project development, aims, objectives, and their knowledge of subsequent usage. Each project is analysed according to its content, structure, and design, and whether it has undertaken any outreach or publicity. We are seeking to discover whether projects have undertaken user surveys, and if so how they responded to them, and whether they undertook any collaboration with similar projects. We are also asking about technical advice that the project received, whether from institutional support staff, from Humanities Computing Centres or from central bodies like the AHDS. All these measures are intended to determine whether projects that continue to be used share any characteristics. For example, does good technical advice predispose a project to be usable, or might contact with potential users prove as important? We shall also interview a small sample of users of each resource about why they find it useful for their work. This aspect of the project will be carried out in collaboration with another CIRCAh project, the UCIS project, which is studying the reaction of humanities users to digital projects.
We nevertheless recognise that it is also important to study projects that are neglected or underused. We are therefore running a workshop with the AHRC ICT Methods Network to study the possibility of the reuse of neglected resources. A small group of humanities users will be given an opportunity for hands-on investigation of a small number of resources, and there will be time for discussion of factors that might encourage or deter their future use. We will seek to find out whether the lack of use is simply because users had not heard of a given resource, or whether there are more fundamental problems of design or content that would make the resource unsuitable for academic work.
Findings
Collecting the log data has proved to be an unexpectedly difficult process. This in itself is noteworthy, since it indicates that levels of technical support even for large, government-supported portals could still be increased. The situation for individual projects is likely to be even more problematic, and suggests that the issue of long-term maintenance and support is one that institutions and funding councils must take more seriously.
We are currently beginning to analyse data and, by July, we will be in the final quarter of the project. We will therefore be able to report on the results of both the qualitative and quantitative aspects of the study. These should prove valuable to anyone at the conference who is currently running or planning to run a digital resource for the humanities. Like all other matters in the humanities, building a digital resource which is successful in terms of attracting and keeping users is not an exact science. We do not mean to limit the creativity of culture developers by suggesting the application of a rigid list of features to which all future projects must conform. Nevertheless, since resource creators spend such large amounts of precious time, effort and money on making their project a reality, they must surely be keen to see it used rather than forgotten. We aim to suggest factors which may predispose a resource to continued success in terms of users: a topic of interest to project designers, project funders and users alike.
Acknowledgement
The LAIRAH project is funded by the AHRC ICT Strategy scheme.
References
Huntington, P., Nicholas, D. and Williams, P. (2003) ‘Characterising and profiling health web users and site types: going beyond hits’. Aslib Proceedings 55 (5/6): 277-289.
Huntington, P., Nicholas, D., Williams, P. and Gunter, B. (2002) ‘Characterising the health information consumer: an examination of the health information sources used by digital television users’. Libri 52 (1): 16-27.
Nicholas, D., Huntington, P. and Watkinson, A. (2005) ‘Scholarly journal usage: the results of a deep log analysis’. Journal of Documentation 61 (2): 248-280.
Ross, S. and Gow, A. (1999) Digital Archaeology: Rescuing neglected and damaged data resources. HATII, University of Glasgow. Online: http://www.hatii.arts.gla.ac.uk/research/BrLibrary/rosgowrt.pdf
Warwick, C. (1999) ‘English Literature, electronic text and computer analysis: an unlikely combination?’ Paper presented at the Association for Computers and the Humanities - Association for Literary and Linguistic Computing Conference, University of Virginia, June 9-13.


Conference Info

Complete

ACH/ALLC / ACH/ICCH / ADHO / ALLC/EADH - 2006

Hosted at Université Paris-Sorbonne, Paris IV (Paris-Sorbonne University)

Paris, France

July 5, 2006 - July 9, 2006

151 works by 245 authors indexed

The effort to establish ADHO began in Tuebingen, at the ALLC/ACH conference in 2002: a Steering Committee was appointed at the ALLC/ACH meeting in 2004, in Gothenburg, Sweden. At the 2005 meeting in Victoria, the executive committees of the ACH and ALLC approved the governance and conference protocols and nominated their first representatives to the ‘official’ ADHO Steering Committee and various ADHO standing committees. The 2006 conference was the first Digital Humanities conference.

Conference website: http://www.allc-ach2006.colloques.paris-sorbonne.fr/

Series: ACH/ICCH (26), ACH/ALLC (18), ALLC/EADH (33), ADHO (1)

Organizers: ACH, ADHO, ALLC

Tags
  • Keywords: None
  • Language: English
  • Topics: None