Testing Authorship in the Personal Writing of Joseph Smith Using NSC Classification

Matthew Jockers

    Stanford University

Overview

In a co-authored paper published in Literary and Linguistic
Computing (2008), my co-authors and I employed
both delta and Nearest Shrunken Centroid (NSC)
classification in an authorship analysis of the Book of
Mormon. Our results suggested that several men involved
in the early formation of the Mormon church
were likely contributors to the Book of Mormon. For reasons
detailed below, we were unable to include Mormon
prophet Joseph Smith in our authorship tests. The work
presented here attempts to develop a sizeable Smith corpus
by using a small set of documents in his own handwriting
as a training model for evaluating other documents
attributed to Smith but written in the handwriting
of one of Smith’s 24 different scribes.
For the aforementioned article, we compiled a corpus of
source material from five candidate authors who were
involved in the early LDS church. We had hoped to include
Joseph Smith as a candidate, but in the course of
our research determined that there was not enough reliable
writing by Smith to constitute an ample sample for
testing of his signal in the Book of Mormon. As Smith
biographer Dean Jessee makes clear in the introduction
to Personal Writings of Joseph Smith, Smith’s speeches,
letters, and even journal entries were frequently written
by scribes or written in tandem with one or more of his
collaborators. In another article that appears in the pages
of the “Joseph Smith Papers” online archive (n.d.), Jessee
writes, “only a tiny proportion of Joseph Smith’s
papers were penned by Smith himself.” In many of the
documents Jessee has collected, we see the handwriting
of Smith interwoven with the handwriting of his scribes,
sometimes side by side in the exact same letter, journal
entry, or document.
Mormon history informs us that Smith frequently used
scribes and that he dictated his thoughts to them. Indeed,
the entire Book of Mormon is said to be a verbatim transcript
of Smith’s dictation. With regard to documenting
his visions, thoughts, and experiences, Smith’s “philosophy,”
writes Jessee, “was that ‘a prophet cannot be
his own scribe.’” That said, on some occasions Smith
did put pen to paper, sometimes alone and sometimes in
tandem with others. Though Jessee has “attributed” the
spirit and content of all of these documents to Smith, the
manuscripts show clear physical evidence of other hands
at work; thus, the question remains as to whether these
scribes were “authoring” or merely “transcribing.”
These manuscripts, though not reliable for use as samples
of determined authorship in our prior research, do
provide fertile ground for another sort of closely related
stylistic inquiry and allow an opportunity to investigate
the question of whether Smith’s various scribes may
have contributed more than simple transcription. For
this new research I utilize the models of known authorship
we developed in our prior work in order to analyze
the personal writings attributed to Smith (but written by
his scribes).
The goal of this work is to assess the role (if any) that the
scribes may have had in shaping the linguistic and stylistic
construction of these documents. For example, if
sections written in the hand of Sidney Rigdon are classified
as being similar to the Rigdon signal in our existing
model, such a result would suggest that the role Rigdon
played in the dictation process was perhaps more than
mere scribe. Alternatively, if the material not in the hand
of Smith is classified together with material that is in his
hand, then this would be evidence favoring attribution to
Smith and Smith alone.
At the time of this proposal, preliminary results indicate
that at least 20 of the 109 documents penned by Smith’s
scribes are stylistically close to those written in Smith’s
own hand. Furthermore, early results also suggest no
apparent stylistic connection between those scribes for
whom we have known writings and the documents that
they wrote in the role of scribe to Smith. Together, these
findings appear to support the historical Mormon church
perspective of common authorship for both the papers in
Smith’s hand and those in the hands of his many scribes.
Further tests, to be completed before presentation of this
research, are necessary to confirm the veracity of these
preliminary results. Should further study confirm the
preliminary data, then we would have some justification
for attributing a sizeable number of these scribe-written
texts to Smith. Providing additional authentication of
the Smith corpus in this manner would be of great value
to future studies of the Book of Mormon.
I begin by segmenting the personal documents attributed
to Smith according to differences in handwriting.
For documents not in Smith’s handwriting, I label them
based on Jessee’s identification of the scribe who took the
dictation. From the material in Smith’s own hand, I
expand our current classification model to include a new
“Smith” class (the current model includes signals for
Oliver Cowdery, Sidney Rigdon, and Parley Pratt, who
were among Smith’s known scribes).
Through cross-validation (and testing with various tuning
parameters) I determine the most effective number of
features for a new model. In our prior work, NSC had a
cross-validation error rate of just 8.8% when using 110
features; the model accurately classified known samples
91.2% of the time. I anticipate similar results for this
proposed research.
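
The tuning step described above can be illustrated with a minimal sketch. It is a simplified nearest shrunken centroid classifier in plain Python, not the software used in the prior study. It assumes equal class priors, omits the small positive constant that full implementations add to the pooled standard deviation, and uses leave-one-out cross-validation for brevity. The feature rows, the class labels "A" and "B", and the candidate thresholds are invented for illustration only. Larger thresholds shrink more features to the overall centroid, which is how the threshold controls the number of features the model uses.

```python
import math

def soft_threshold(value, delta):
    """Move value toward zero by delta; anything within delta of zero becomes 0."""
    return math.copysign(max(abs(value) - delta, 0.0), value)

def fit_nsc(rows, labels, delta):
    """Return (shrunken class centroids, pooled per-feature standard deviations)."""
    n, p = len(rows), len(rows[0])
    classes = sorted(set(labels))
    overall = [sum(r[j] for r in rows) / n for j in range(p)]
    members = {c: [r for r, lab in zip(rows, labels) if lab == c] for c in classes}
    means = {c: [sum(r[j] for r in m) / len(m) for j in range(p)]
             for c, m in members.items()}
    # Pooled within-class standard deviation for each feature
    # (a tiny floor avoids division by zero for constant features).
    pooled = []
    for j in range(p):
        ss = sum((r[j] - means[lab][j]) ** 2 for r, lab in zip(rows, labels))
        pooled.append(math.sqrt(ss / (n - len(classes))) or 1e-9)
    centroids = {}
    for c in classes:
        m_k = math.sqrt(1 / len(members[c]) - 1 / n)
        # Standardize each class-vs-overall difference, shrink it, map it back.
        centroids[c] = [
            overall[j] + m_k * pooled[j] * soft_threshold(
                (means[c][j] - overall[j]) / (m_k * pooled[j]), delta)
            for j in range(p)
        ]
    return centroids, pooled

def n_active(centroids):
    """Count features whose shrunken centroids still differ between classes."""
    p = len(next(iter(centroids.values())))
    return sum(1 for j in range(p) if len({c[j] for c in centroids.values()}) > 1)

def predict(model, row):
    """Assign row to the class with the nearest shrunken centroid (equal priors)."""
    centroids, pooled = model
    def score(c):
        return sum(((x - m) / s) ** 2 for x, m, s in zip(row, centroids[c], pooled))
    return min(centroids, key=score)

def loo_error(rows, labels, delta):
    """Leave-one-out cross-validation error rate for one threshold value."""
    errors = 0
    for i in range(len(rows)):
        model = fit_nsc(rows[:i] + rows[i + 1:], labels[:i] + labels[i + 1:], delta)
        errors += predict(model, rows[i]) != labels[i]
    return errors / len(rows)

# Invented feature values (e.g. relative frequencies), NOT data from the study.
rows = [
    [5.0, 1.0, 3.0], [5.2, 1.1, 2.9], [4.8, 0.9, 3.1], [5.1, 1.0, 3.0],
    [1.0, 5.0, 3.0], [1.1, 5.2, 3.1], [0.9, 4.8, 2.9], [1.0, 5.1, 3.0],
]
labels = ["A"] * 4 + ["B"] * 4

# Hypothetical candidate thresholds; compare error rates across them.
errors_by_delta = {d: loo_error(rows, labels, d) for d in (0.0, 0.5, 1.0, 2.0)}
```

In practice one would typically pick, among the thresholds with the lowest cross-validated error, the one that leaves the fewest active features. The reported figures relate in the same way as the function's output: an error rate of 8.8% corresponds to 91.2% of known samples classified correctly.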
Using the new model, the corpus of personal writings
attributed to Smith, but not in his hand, will be classified
and the results ranked based on the probabilistic output
that NSC provides. The results will provide further evidence
as to the consistency of the linguistic signal across
the corpus and provide a foundation for further research.
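
The ranking step can be sketched as follows. NSC produces, for each document, a discriminant score per candidate class (smaller means closer to that class's shrunken centroid), and these scores can be converted to class probabilities. The sketch below assumes equal class priors, and the document identifiers and score values are invented placeholders, not results from this research.

```python
import math

def class_probabilities(scores):
    """Convert discriminant scores (smaller = closer) to probabilities.

    Assumes equal class priors; each class gets weight exp(-score / 2),
    normalized so the weights sum to 1.
    """
    best = min(scores.values())  # subtract the minimum for numerical stability
    weights = {c: math.exp(-(s - best) / 2) for c, s in scores.items()}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}

def rank_by_class(doc_scores, target):
    """Return (document, probability) pairs sorted by descending probability
    of belonging to the target class."""
    probs = {doc: class_probabilities(s)[target] for doc, s in doc_scores.items()}
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)

# Invented scores for three hypothetical scribe-written documents.
doc_scores = {
    "doc_a": {"Smith": 2.0, "Rigdon": 9.0, "Cowdery": 11.0},
    "doc_b": {"Smith": 8.0, "Rigdon": 3.0, "Cowdery": 10.0},
    "doc_c": {"Smith": 5.0, "Rigdon": 5.5, "Cowdery": 12.0},
}

ranking = rank_by_class(doc_scores, "Smith")
for doc, prob in ranking:
    print(f"{doc}: P(Smith) = {prob:.3f}")
```

Ranking by probability rather than by hard class assignment keeps borderline documents visible, such as the hypothetical doc_c, whose two best scores are nearly tied.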
