It is well documented that men and women use informal language, such as conversation and correspondence, in rather different ways, reflecting a wide variety of cultural forces and practices.1 Until recently, however, there have been relatively few attempts to examine gender differences in more formal, published writing.2 Last year I presented to this conference some preliminary results from a computer-assisted analysis of two moderately sized samples of literary works by French male and female authors, published primarily between the 18th and 20th centuries.3

Very clear distinctions were found between male and female literary writing, most notably in the selection of words used in the two samples, as reflected in marked differences in the rates of use of both function words (pronouns and possessives) and less frequent content words. Women's writing of the period may be generally characterized by a more personal, interactive, and involved style, and is considerably more descriptive of internal and emotional states than male writing of the same period and genres. The lexical collocations -- a rough measure of the meanings of words or sets of words -- of a small set of selected words did not vary significantly, suggesting that the meanings of these words were not significantly altered by the gender of the writer, in spite of the different rates at which they were used. Finally, comparing the representation of women by male and female authors, by looking at the rates of "possession" and other frequent adjectives, did not reveal the significant differences by gender of author that might have been expected.

By implication, I suggested that the selection of words, reflecting the selection and treatment of topics, is not an unconscious effort. Women's writings of this period established a space in which women could write in ways identifiably distinct from male authors, but within prevailing linguistic norms or, more specifically, using words with the same senses and associations as male authors. I drew this implication from a relatively small number of comparisons. The range of function and content words that have a strong gender bias is considerable. It is most neatly summed up by the example of body and soul. The female writers use âme 1.62 times as frequently as male writers, but use corps -- which signifies both the physical body and a "political body" -- only half as often as male writers. Global comparisons and comparisons controlled for genre and time period all reveal gender preferences for different vocabularies. This is all the more notable given that the words preferred by male authors are more likely to be used at high rates by female authors than the reverse.
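The kind of rate comparison behind a figure such as "1.62 times as frequently" can be sketched as follows. This is a minimal illustration only, not the procedure actually used in the study: the tokenizer, the normalization per 10,000 words, and the function names are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a crude, assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def relative_rate(tokens, word, per=10_000):
    """Occurrences of `word` per `per` tokens of the sample."""
    if not tokens:
        raise ValueError("empty sample")
    return per * Counter(tokens)[word] / len(tokens)


def rate_ratio(tokens_a, tokens_b, word):
    """How many times as frequently sample A uses `word` as sample B does."""
    rate_a = relative_rate(tokens_a, word)
    rate_b = relative_rate(tokens_b, word)
    if rate_b == 0:
        # Word absent from sample B: the ratio is unbounded (or undefined if absent from both).
        return math.inf if rate_a > 0 else math.nan
    return rate_a / rate_b


if __name__ == "__main__":
    # Invented toy samples, for illustration only; they are not data from the study.
    female = tokenize("L'âme et le corps, l'âme et la joie")
    male = tokenize("Le corps et l'âme, le corps du roi")
    for word in ("âme", "corps"):
        print(word, rate_ratio(female, male, word))
```

Normalizing by sample size before taking the ratio matters because the male and female samples need not contain the same number of words.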

Using an enhanced sample4 of works by French women, this paper examines in a more systematic manner male and female use of words that are strongly correlated with one gender or the other. This is based on the identification of classes of terms with strong gender preference, such as the female preference for terms describing emotive, internal or subjective states, which include affection, calme, chagrin, confiance, courage, douceur, douleur, désir, envie, estime, espère, espoir, espérance, joie, imagination, larmes, passion, pense, peur, plaisir, plaire, respect, rêve, savoir (and verb forms), sensible, sentiment.*, sentir, solitude, souffrir, souvenir, sérieux, tendresse, trist, tristesse, vain, âme, and émotion.

Comparisons of the collocations of these sets of terms in the samples of male and female writing will be the first test to see if the gender of the author modifies the use of gender-selected vocabulary. It is not clear, however, that lexical collocation is sufficiently fine-grained to detect gender differences, since it is designed to isolate more general patterns of meaning.
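A collocation comparison of this kind might be sketched as follows. The window of five words on either side, the cutoff of ten collocates, and the function names are assumptions made for illustration; the study's own collocation procedure is not specified here and may differ.

```python
from collections import Counter


def collocates(tokens, node, span=5):
    """Count the words found within `span` tokens on either side of each occurrence of `node`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts


def shared_top_collocates(tokens_a, tokens_b, node, span=5, n=10):
    """Words that are among the `n` most frequent collocates of `node` in both samples."""
    top_a = {word for word, _ in collocates(tokens_a, node, span).most_common(n)}
    top_b = {word for word, _ in collocates(tokens_b, node, span).most_common(n)}
    return top_a & top_b
```

A large overlap between the top collocates in the two samples would be consistent with a word keeping the same associations across gender of author, although raw co-occurrence counts of this sort are only a rough proxy for meaning.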

Indeed, evaluation of individual agency in a text or communicative act may be a necessary corrective to the rather mechanistic model proposed by statistical approaches such as word counts and lexical collocation. The second part of this paper will isolate sentences containing a set of gender-biased vocabulary, in works by male and female authors, to examine in more detail the discursive structures in which these words are used. While the concordance and textbase are powerful tools for finding and extracting passages with particular sets of terms, a second-order formalism would appear to be required in order to provide a more systematic analytical model. This part of the study will rely on elements of the theory of Functional Grammar proposed by Simon Dik6 to identify a set of relationships between the selected terms and the semantic, syntactic and pragmatic functions within the clause and/or sentence in which they appear. The features to be identified and tagged, largely without computer intervention, include the position of the term in the clause (first words being more important); the syntactic function of the term; in the case of so-called relational nouns, the identity (gender) of the (implied) referent: author/reader/third party; and whether the term appears as the Topic or Focus of the clause, binding it to larger rhetorical structures.

In the following sentence from Sand's Indiana containing the word courage,

Cette lettre, Raymon n'eut pas le courage de la lire jusqu'au bout.

we see that the referent and subject is male (Raymon) and that courage is a direct object; it does not appear in a clause-initial position, and it is neither the Topic nor the Focus of the sentence. Identification of a small set of functional features in a select set of gender-biased words will shed considerable light on the degree to which male and female authors used these words in similar or different ways.
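Hand-assigned tags of this kind could be recorded and tallied along the following lines. The record layout, field names, and value labels are hypothetical, chosen here only to mirror the features listed above; the one sample record encodes the analysis of the sentence from Indiana as just described.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TermAnnotation:
    """One manually tagged occurrence of a gender-biased term (hypothetical schema)."""
    term: str
    author_gender: str                        # "F" or "M"
    clause_initial: bool                      # does the term open its clause?
    syntactic_function: str                   # e.g. "subject", "direct object"
    referent_gender: Optional[str] = None     # for relational nouns: "F", "M", or None
    referent_role: Optional[str] = None       # "author", "reader", "third party", or None
    pragmatic_function: Optional[str] = None  # "Topic", "Focus", or None


# The occurrence of "courage" in the sentence from Sand's Indiana, as analysed in the text.
SAND_EXAMPLE = TermAnnotation(
    term="courage",
    author_gender="F",
    clause_initial=False,
    syntactic_function="direct object",
    referent_gender="M",       # Raymon
    pragmatic_function=None,   # neither Topic nor Focus
)


def tally(annotations, feature):
    """Count the values of one feature, broken down by gender of author."""
    return Counter((a.author_gender, getattr(a, feature)) for a in annotations)
```

Calling `tally(records, "syntactic_function")` on a full set of records would give, for each gender of author, how often the selected terms serve as subject, direct object, and so on.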

Use of formalisms such as functional grammars to provide additional interpretive or analytical guidance on results extracted from large datasets by more traditional full text searches may be another way to help "summarize" or digest the very large number of "hits" that can be generated by the current generation of full text search and analysis engines. Automatic filtering and counting of result sets by even the simplest of criteria, such as the position of a word in a clause, may be a fruitful way to bridge the gap between computation as quantification and computation as a method for handling more abstract levels of meaning and linguistic function.
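A filter on clause position could be approximated as in the sketch below. It treats punctuation-delimited segments as a crude stand-in for clauses, which is an assumption made here for illustration and not a syntactic analysis; the function names and the `first_n` threshold are likewise hypothetical.

```python
import re

# Punctuation used as a rough proxy for clause boundaries (an assumption).
CLAUSE_BREAK = re.compile(r"[,;:.!?]")


def clause_positions(sentence, term):
    """Return the 0-based word position of each occurrence of `term` within its segment."""
    positions = []
    for segment in CLAUSE_BREAK.split(sentence):
        words = re.findall(r"\w+", segment.lower())
        positions.extend(i for i, word in enumerate(words) if word == term)
    return positions


def is_clause_initial(sentence, term, first_n=2):
    """True if `term` is among the first `first_n` words of some segment."""
    return any(position < first_n for position in clause_positions(sentence, term))


if __name__ == "__main__":
    sand = "Cette lettre, Raymon n'eut pas le courage de la lire jusqu'au bout."
    print(clause_positions(sand, "courage"), is_clause_initial(sand, "courage"))
```

Setting `first_n` to 2 rather than 1 allows for a preceding article, so that a segment opening with "Le courage" would still count as clause-initial.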

Differing patterns of word use between male and female writers in France over the past several centuries suggest that women writers were consciously making a public space for their voices by selecting and treating themes in markedly distinct ways. It is less certain whether they modified the meanings of the words they used or simply used the words in the same way as male writers. This is an important distinction, as it bears upon the degree to which language is gendered, reflecting prevailing power relations, and upon the mechanisms by which gender marking in language is both preserved and overturned.

