The State of Authorship Attribution Studies: (1) The History and the Scope; (2) The Problems -- Towards Credibility and Validity.

Joe Rudman; David I. Holmes; Fiona Tweedie; R. Harald Baayen

Authorship

1. Joe Rudman
Carnegie Mellon University

2. David I. Holmes
University of the West of England

3. Fiona Tweedie
University of Glasgow, Department of Statistics - University of Glasgow

4. R. Harald Baayen
Max Planck Institute for Psycholinguistics - University of Nijmegen

Original URL

https://web.archive.org/web/20010306031414/http://www.cs.queensu.ca/achallc97/papers/s004.html

Work text

Joe Rudman
Carnegie Mellon University
rudman@cmphys.phys.cmu.edu
David I. Holmes
University of the West of England
david.holmes@csm.uwe.ac.uk
Fiona J. Tweedie
University of Glasgow, United Kingdom
fiona@stats.gla.ac.uk
R. Harald Baayen
Max Planck Institute for Psycholinguistics
baayen@mpi.nl
Keywords: authorship attribution, stylistics, statistics

Session Abstract
There are many serious problems with the science of authorship attribution studies. This session proposes to look at the history of the field, identify many of its major problems, and offer some solutions that will go a long way towards giving the field credibility and validity.
Willard McCarty's recent posting on "Humanist" (Vol. 10, No. 137), "Communication and Memory," points out one of these problems: "...scholarship in the field is significantly inhibited, I would argue, by the low degree to which previous work in humanities computing and current work in related fields is known and recognized."

A major indication of problems in a field is the absence of consensus on correct methodology or technique. Every area of authorship attribution studies has this problem: research, experimental set-up, linguistic methods, statistical methods, and so on.

It seems that for every paper announcing an authorship attribution method that "works" or a variation of one of these methods, there is a counter paper pointing out crucial flaws:

Donald McNeil points out that scientists disagree as to Zipf's law;
Christian Delcourt raises objections against current practice in co-occurrence analysis;
Portnoy and Petersen show errors in Radday and Wickmann's use of the correlation coefficient, chi-squared test, and t-test;
Hilton and Holmes showed problems in Morton's QSUM techniques;
Smith raised many objections against Morton's early methods;
There is Merriam vs Smith;
There is Foster vs Elliott and Valenza.
This widespread disagreement has not only kept authorship attribution studies out of most United States court proceedings, but it also threatens to undermine even the legitimate studies in the court of public and professional opinion.
The time has come to sit back, review, digest, and then present a theoretical framework to guide future authorship attribution studies.

The first paper, by David Holmes, will give the necessary history, scope, and present direction of authorship attribution studies with particular emphasis on recent trends.

The second paper, by Harald Baayen and Fiona Tweedie, will focus on one problem: the use of so-called constants in authorship attribution questions.

The third paper, by Joseph Rudman, will point out some of the problems that are keeping authorship attribution studies from being universally accepted and will offer suggestions on how these problems can be overcome.

Full text license: This text is republished here with permission from the original rights holder.

Conference Info

ACH/ALLC / ACH/ICCH / ALLC/EADH - 1997

Hosted at Queen's University

Kingston, Ontario, Canada

June 3, 1997 - June 7, 1997

76 works by 119 authors indexed

Conference website: https://web.archive.org/web/20010105065100/http://www.cs.queensu.ca/achallc97/

Series: ACH/ALLC (9), ACH/ICCH (17), ALLC/EADH (24)

Organizers: ACH, ALLC
