Is Variance of Function Words a Reliable Discriminator of Single and Multiple Author Corpora of Latin Prose? An Empirical Critique of Meissner's Studies of the Historia Augusta

The problem of the Historia Augusta has been debated for many decades. This collection of biographies of Roman emperors covers the period AD 117-285 and is attributed in the mss. to six different authors. In 1889, however, Dessau, having studied the nomenclature and style of the works in the HA, proposed that it was written by a single author.

Almost a century later, Marriott offered support for Dessau's theory through stylometric analysis. His first study compared the average number of words per 'sentence' in the Historia Augusta with that of other fourth-century texts such as the Codex Theodosianus. His second study made a similar comparison based on the choice of word-type (part of speech) at the beginning and end of sentences. Both of Marriott's studies showed similarities among the biographies of the HA but differences from the control texts. Based on these findings he concluded that the collection was written by one person, as Dessau had proposed.

Recently, Frischer et al. published critiques of Marriott's work, extending the study to include control texts that belong to the same genres as the Historia Augusta, namely biography and history. Two results are worth noting: (1) The average numbers of words per 'sentence' in the HA and the control texts were fairly close. Marriott's averages, which included generically unrelated control texts, had a broader range. (2) Tests by word-type also showed stylistic similarities between the HA and such authors as Livy, Tacitus, and Suetonius. On the other hand, Marriott's two control texts for his second study turned out to be eccentric. These results call into question the validity of Dessau's theory of single authorship as well as Marriott's methods of stylometric investigation.

Thus, to make further progress in assessing Dessau's thesis, a test is needed which is sensitive enough to detect stylistic features specific to the author and not merely the tradition in which he and his colleagues wrote. One potentially powerful tool is function word analysis, on which Meissner published a paper in 1992.

In his study, Meissner compared the frequencies of certain function words in the Historia Augusta with those found in Suetonius' De Vita Caesarum. He specifically focused on those words which appeared most frequently throughout the texts. These function words are the following seven: ad, cum, est, et, in, non, and ut. Meissner then ran statistical tests on the results to address two concerns. He first asked whether the fluctuations of the frequencies in the HA could be interpreted as random or as following a pattern. To answer this question, Meissner ran chi-square tests. The values calculated, by Meissner's reasoning, suggest that the text is homogeneous and from one source.

Meissner's second concern was quantifying the direct relationship between the frequencies found in the HA and those in Suetonius. The former should fluctuate in a similar way to the latter, if the Historia Augusta was indeed written by a single author. To address this issue, Meissner studied their variances and ran F-tests, which are based on the ratio of the variances of two samples. The two samples, in our case, are the texts of the HA and Suetonius. If they are from the same population, their variances should be similar and F should be close to 1. For our purposes, two texts count as coming from the same population if both are drawn from single-authored corpora or both from multiple-authored corpora. If F is much greater than 1, however, then no substantial likeness between the two texts has been detected. So this F-test provides a numerical standard by which we can consider whether two texts come from the same source.
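The variance ratio described above can be sketched in a few lines of Python. This is only a minimal illustration, not Meissner's actual procedure: the function name f_ratio, the convention of placing the larger variance in the numerator, and the toy frequencies are all our own assumptions, and the numbers are invented rather than taken from any of the texts.

```python
from statistics import variance


def f_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) for two samples.

    Assumption: the larger sample variance goes in the numerator, so F >= 1.
    A zero variance in the denominator would raise ZeroDivisionError.
    """
    var_a = variance(sample_a)  # unbiased sample variance (n - 1 denominator)
    var_b = variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical relative frequencies (percent of running words) of one
# function word in each section of two corpora. Invented values.
corpus_x = [2.0, 2.2, 1.8, 2.0]
corpus_y = [1.0, 3.0, 2.0, 4.0, 0.0]

f_value, df_num, df_den = f_ratio(corpus_x, corpus_y)
print(f_value, df_num, df_den)
```

To turn F into a significance statement, the value would be compared against the F distribution with the two degrees of freedom returned here, using a statistical table or library.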

In Meissner's study, the variances for each of the function words in the Historia Augusta were larger than Suetonius', exhibiting a larger fluctuation and dispersion. Similarly, the F-values were quite large, allowing us to reject a similar source for the two texts. Thus, Meissner concluded the HA is not single-authored, as the De Vita Caesarum is.

Meissner's use of function word analysis seems sensitive and subtle enough to detect differences between the HA and Suetonius; and all the statistics appear to point to a multi-authored Historia Augusta. However, the study is not empirical enough, since it used only Suetonius as a basis for comparison. Some data were gathered on Nepos, but were not fully exploited.

The new study presented here strives to offer more empirical evidence by investigating the frequencies of the function words in the works of the historians Livy and Tacitus. We begin by noting the frequencies of each of the seven function words per text. From each of the four extant decades of Livy's Ab Urbe Condita, five books were taken. Similarly, for our sample from Tacitus, in addition to his three non-historical pieces, we take five books from the Annales and five from the Historiae.
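The counting step can be sketched as follows. The tokenization rule (lowercase alphabetic strings, with no special handling of enclitics or of homographs such as cum as preposition and conjunction) and the name function_word_profile are our assumptions for illustration; the procedure actually used to tally the words is not specified here. The sample sentence is simply a familiar line of Latin prose, not part of the corpora studied.

```python
import re
from collections import Counter

# The seven function words examined by Meissner.
FUNCTION_WORDS = ("ad", "cum", "est", "et", "in", "non", "ut")


def function_word_profile(text):
    """Map each function word to (raw count, percent of all tokens)."""
    tokens = re.findall(r"[a-z]+", text.lower())  # assumed tokenization
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    total = len(tokens)
    return {
        word: (counts[word], 100.0 * counts[word] / total if total else 0.0)
        for word in FUNCTION_WORDS
    }


sample = "Gallia est omnis divisa in partes tres"
profile = function_word_profile(sample)
for word, (count, percent) in profile.items():
    print(word, count, round(percent, 2))
```

Applied to each book in turn, a function of this kind would yield one row of counts per book, which is the form of data the later tests require.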

Next we test for randomness, as Meissner did, by conducting chi-square tests on the data, along with revised tests on the HA and Suetonius. We also examine their p-values. The p-value for Suetonius is .118, while that for the HA is extremely low, which leads us to reject the null hypothesis of independence and seems to confirm Meissner's studies. On the other hand, the values for Livy and Tacitus do not confirm Meissner: both p-values are .000. These are values we do not expect, since each corpus was written by a single author.
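For readers unfamiliar with the test, a chi-square statistic for a table of counts can be computed as in the sketch below. We assume, for illustration only, a table with one row per section of text and one column per function word; the layout of the tables actually tested is not described here, and the counts shown are invented.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts.

    Assumes every row and column total is positive, so that no expected
    count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented counts: two sections (rows) by two function words (columns).
toy_table = [[10, 20], [20, 10]]
stat, dof = chi_square_statistic(toy_table)
print(stat, dof)
```

The p-value is then the upper tail probability of the chi-square distribution with dof degrees of freedom at the computed statistic, which can be read from a table or obtained from a statistics library.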

In Meissner's study, the relationship of the frequencies between texts was also quantified by calculating their variances and running the F-test. In order to better study the relationship between single-authored corpora and the HA, multi-authored corpora were artificially created in the present study. By including these newly created works in the F-tests, we can judge how reliable a discriminator the variance of function words really is. A number of results do not correspond to our expectations. The p-values from chi-square tests on Livy and Tacitus both come out to be .000, which, according to Meissner's reasoning, might suggest that some of the texts in their corpora were not written by these two authors. One explanation may be that the values reflect a change in style over time, a possibility that becomes plausible when we examine their frequencies by percent.
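One simple way to build an artificial multi-authored corpus is to pool sections from several known authors, as in the sketch below. This is a hypothetical construction offered only to make the idea concrete: the method actually used to assemble the composite corpora is not described above, and the name pool_sections and the per-section frequencies are invented.

```python
from statistics import variance


def pool_sections(*corpora):
    """Concatenate per-section frequencies of several authors into one list."""
    pooled = []
    for sections in corpora:
        pooled.extend(sections)
    return pooled


# Invented relative frequencies (percent) of one function word per section.
author_a = [1.0, 1.2, 0.8]
author_b = [3.0, 3.2, 2.8]

composite = pool_sections(author_a, author_b)
print(variance(author_a), variance(author_b), variance(composite))
```

In this toy case the two authors are each internally consistent but differ from one another, so the pooled corpus shows a much larger variance than either author alone. That is the pattern the variance test relies on; whether real corpora behave this way is exactly what the F-tests are meant to check.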

The percentages also indicate that the frequencies remain fairly constant throughout Tacitus' Historiae and Annales. However, we detect some fluctuation patterns in the books of Livy for certain function words. These fluctuations may be a cause of the unexpected chi-square values found. Whatever the case may be, the problem with the chi-square test is that it is too sensitive. By registering differences within a homogeneous text, it makes distinctions in more ways than we had intended. For this reason, the chi-square test can be misleading and should not have been included in Meissner's study.

Although the F-test does not make these kinds of fine distinctions, we encounter difficulty with the values, which are generated at a 29% rate of error. One problem is that the null hypothesis is ambiguous, stating that two samples come from the same population. The same population, for Meissner's purposes, is single-authored or multi-authored corpora. However, it can also undeniably be interpreted as one author vs. another, such as Livy vs. Tacitus. So, in fact we have two possible hypotheses operating here.

Another possibility for re-interpretation is to look at specific function words that may reflect what we expect, rather than at the mean F-ratios. When we reexamine the values, we find that those for est and ut are excellent examples. The F-ratios are lower in known single vs. known single author corpora as well as in the HA vs. another known multiple author work. Similarly, the ratios are higher in known single vs. known multi-authored texts. It may be that est and ut serve as "magic keywords" in the analysis of function words. Because of the structure of the Latin language, est and ut are syntactically multi-functional. The remaining five words, however, are much less so. Thus, it is not surprising that est and ut are better discriminators in tests of variance.

In conclusion, Meissner's study of variance of function words is flawed but promising. Some issues in the theory behind it still remain unresolved, such as the effects of composition over a long period of time. Yet, it would still be worthwhile to continue exploring the reliability of variance of function words. For instance, the emergence of est and ut as powerful tools needs further investigation. In addition, there is the possibility of other 'multi-functional words' which may help us detect single and multiple authorship. Given these avenues for further analysis, variance of function words may still prove itself a powerful discriminator.

