Disambiguating perroquet in the roman: Modernizing Firthian principles with computational tools

  1. G. Aileen Clark

    Université d'Ottawa (University of Ottawa)

Work text
The idea of disambiguation is not a new one. However, computational tools of analysis have enabled modern researchers to resurrect once-disregarded approaches, such as collocational analysis. In the early 1950s, the British semanticist J.R. Firth proposed a theory of collocational study. In his work Speech, he suggested that by analyzing the distribution surrounding a given ambiguous word, one could predict the environment that would connote one meaning rather than another. Until recently, this type of analysis seemed out of reach, because compiling an exhaustive list of a word's occurrences was nearly impossible. Thanks to computer databases such as ARTFL, we can now retrieve a word's occurrences and analyze their distribution through collocation, as Firth intended.
This paper shows how we can modernize our approach to linguistic disambiguation by using Web databases such as ARTFL to collect and sort data. It presents conclusions derived from a collocational analysis of perroquet, followed by a commentary on the significance of these findings as they apply to the study of exotic terminology within the history of the novel.
The term perroquet first appeared in French literature during the seventeenth century. Used to colour and beautify literary language, the term belongs first to exotic vocabulary. On closer analysis, one also finds perroquet used in the literal sense, to mean "the bird" itself. This semantic duality justifies further study of the term, both as it appears in seventeenth-century literature and as it manifests itself in modern and contemporary literature. The first point of interest lies in predicting which of the two meanings is connoted in a given context. By studying the contextual environment that surrounds an ambiguous term like perroquet, the modern reader can bridge temporal gaps and grasp the meaning of a text fragment, which in turn fosters a better understanding of the work as a whole.
This study's main objective is to analyze the term perroquet within its distribution in order to disambiguate its meaning. A query of ARTFL across all types of documents returned 771 occurrences. Given the nature of a collocational study, which relies on syntactic and semantic distribution, certain genres would risk skewing the results. This is the case with poetry, which arbitrarily chooses the distributional context that surrounds a given word. It was therefore appropriate to restrict the corpus to occurrences that appear in the roman. This restricted query of the ARTFL database found 482 occurrences of the word perroquet, all of which formed the corpus for study.
The second part of the research classifies the occurrences so as to group those with similar semantic or syntactic features. This part of the research involves creating a semantic classifier to significantly reduce the time spent on classifying the occurrences. For perroquet, semantic similarities within a 20-word distribution are identified. This allows prediction of the types of context that connote one function of perroquet rather than the other. This work builds on an earlier study of the term bienséance in seventeenth-century France, and offers another example of a study that successfully uses the collocational approach through syntactic distributional similarities.
The first main classification identifies all the literal uses of perroquet. Perroquet in collocation with one or more verbs carrying the semes (+perroquet, +action) connotes the literal meaning of "the bird." The same method of classification shows that exotic uses of perroquet display a predominant type of collocational relationship: the term perroquet is preceded by a comparison connector such as comme or ainsi que.
Thus, by using the context, one can predict the meaning of the word perroquet. Computers in collocational text analysis foster a method of semantic disambiguation that establishes contextual patterns. These in turn determine which meaning of perroquet is produced. By allowing the researcher to remain within the text, to read and analyze using nothing but the text itself, this method of disambiguating terms is a useful tool for linguistic analysis.
Having looked at the purely linguistic aspect of this study, one is led to question the practicality of this type of analysis. What is gained from knowing when perroquet is used in the exotic meaning instead of the literal meaning? The data acquired from the linguistic study of perroquet further the literary analysis of the term as it appears within a certain genre (i.e., the roman). From a literary studies point of view, they provide a basis from which to analyze the uses of exotic vocabulary in the novel, both modern and classical. Preliminary analysis suggests that the exotic uses of the term perroquet decrease over time in the novel. This may indicate that, as society opened its borders and expanded its horizons, literature incorporated once-exotic terminology as part of everyday vocabulary.
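The two collocational rules described above (a comparison connector before perroquet for the exotic use, a bird-action verb in the surrounding distribution for the literal use) can be illustrated with a minimal Python sketch. It is not the classifier used in the study. The verb list, the label names, the four-word look-back for the connector, the reading of the "20-word distribution" as ten words on either side, and the choice to test the connector rule first are all assumptions made for illustration.

```python
import re

# Hypothetical cue lists for illustration; the study's actual lists are not given.
COMPARISON_CONNECTORS = ("comme", "ainsi que")
BIRD_ACTION_VERBS = {"vole", "volait", "crie", "criait", "siffle", "sifflait"}
TARGET_FORMS = {"perroquet", "perroquets"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def classify(left, right):
    """Label one occurrence from the words to its left and right."""
    # Exotic cue: a comparison connector within the four words before the target.
    near = " " + " ".join(left[-4:]) + " "
    if any(f" {connector} " in near for connector in COMPARISON_CONNECTORS):
        return "exotic"
    # Literal cue: a bird-action verb anywhere in the window.
    if any(word in BIRD_ACTION_VERBS for word in left + right):
        return "literal"
    # Neither rule fires: leave the occurrence for manual reading.
    return "unclassified"


def classify_text(text, half_window=10):
    """Return one label per occurrence of the target word in the text."""
    tokens = tokenize(text)
    labels = []
    for i, token in enumerate(tokens):
        if token in TARGET_FORMS:
            # Assumed window: ten words before and ten words after the target.
            left = tokens[max(0, i - half_window):i]
            right = tokens[i + 1:i + 1 + half_window]
            labels.append(classify(left, right))
    return labels


# Invented example sentences, not drawn from the ARTFL corpus.
print(classify_text("Elle répétait sa leçon comme un perroquet."))    # ['exotic']
print(classify_text("Le perroquet vert volait au-dessus du fleuve."))  # ['literal']
print(classify_text("Il acheta un perroquet au marché."))              # ['unclassified']
```

Occurrences that match neither rule are returned as "unclassified" rather than forced into a category, which keeps the researcher in the text for the cases the patterns do not cover.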
