Binomials and the Computer: a Study in Corpus-Based Phraseology

paper
Authorship
  1. Ourania Hatzidaki

    University of Birmingham


This paper presents the results of a large-scale corpus-based study (Hatzidaki 1999) of an important feature of English phraseology, namely binomial pairs (e.g. chalk and cheese, up and down, prim and proper, through and through, Laurel and Hardy). The purpose of the research was two-fold: firstly, to conduct, on the basis of a large and varied corpus of English textual data, a thorough structural and functional analysis of a much-studied and yet not fully explored phraseological phenomenon; and secondly, to examine the hypothesis that the use of systematically collected samples of authentic language data results in more accurate and comprehensive descriptions of the form and function of linguistic phenomena than does sole reliance on introspection. The analysis yielded an extensive and rigorous taxonomy of the various structural variations of binomials, as well as significant new information on their function in the communicative process.

Binomials, namely sequences of "two or more words or phrases belonging to the same grammatical category, having some semantic relationship and joined by some syntactic device such as 'and' or 'or'" (Bhatia 1994:143), have long been objects of interest for idiomatologists and stylisticians. Existing studies of the phenomenon have mainly focussed on three areas. The first is its marked occurrence in the works of certain literary authors such as Chaucer, Lydgate, Shakespeare, Swift and Shaw (see, respectively, Héraucourt 1939 and Potter 1972; Tilgner 1936; Nash 1958 and Gerritsen 1958; Milic 1967; and Ohmann 1962), as well as in English legal texts. The second is the semantic and syntactic characteristics and idiosyncrasies of the various paired forms, especially the semantic relationship between the linked members of a binomial (synonymy, able and talented; antonymy, boys and girls; complementarity, bow and arrow; Malkiel 1959) and the notion of irreversibility, i.e. the tendency of binomials to occur in only one sequence, as in here and there and not *there and here, together with the possible causes of this phenomenon (e.g. 'proximal before distal'; Cooper and Ross 1975). The third is the incidence of binomials in languages other than English (e.g. Fix 1985 for German; Abraham 1950 for French and Italian; Malkiel 1959 for Russian, Portuguese, Spanish, Ancient Greek and Latin; Gold 1991 for Yiddish; Koch 1983 for Arabic; Szpyra 1983 for Polish).
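
The notion of irreversibility can be made concrete as a simple proportion: the share of a pair's occurrences that appear in one particular order. The sketch below is illustrative only. The function name order_preference and the counts are hypothetical and are not taken from any of the studies cited here.

```python
from collections import Counter

def order_preference(pair_counts, a, b):
    """Share of occurrences in the order a-b among all occurrences of the pair.

    Returns None when the pair is unattested in either order.
    """
    forward = pair_counts[(a, b)]    # Counter yields 0 for unseen keys
    backward = pair_counts[(b, a)]
    total = forward + backward
    if total == 0:
        return None
    return forward / total

# Invented counts, for illustration only (not corpus results).
pair_counts = Counter({("here", "there"): 9, ("there", "here"): 1})
score = order_preference(pair_counts, "here", "there")
```

A score close to 1 would indicate a strongly fixed order, and a score near 0.5 a freely reversible pair. Where any cut-off for "irreversible" should lie is left open here.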

As opposed to literary studies, where binomials are treated as a flexible and interesting stylistic device which serves as a powerful means of expressing the authors' ideology and worldview, most studies of the occurrence of this feature in general language implicitly or explicitly regard binomials as a small and probably finite set of structurally and semantically idiosyncratic forms. Moreover, although many studies of the formal characteristics of binomials are available, there exists no comprehensive account of the full structural variability of the binomial pairs used by the average speaker, no detailed information on the distribution of the different patterns, and no organized taxonomy of forms. Finally, with the notable exception of studies of binomials as a distinctive feature of the language of the law which fulfils the requirements of legal draftsmanship for precision, clarity, unambiguity and all-inclusiveness (Mellinkoff 1963, Gustafsson 1984, Bhatia 1994), minimal attention has been given to the functions of binomials in non-literary language.

Crucially, with very few exceptions (notably Gustafsson 1975), previous treatises on binomials have been intuition-based. A glance at a general corpus, however, instantly reveals a number of new and interesting facts concerning this feature. Firstly, numerous paired forms emerge which appear to have been modelled on an abstract dualistic structure of the A + link + B type. Very few of these represent familiar, idiomatic locutions such as the oft-quoted rough and ready and out and out; the majority of the couplets appearing in corpus data constitute novel sequences such as calm and united, gently and effectively, inflation and unemployment, etc., whose formation seems to be governed by the specific lexicogrammatical, discoursal and pragmatic rules pertaining to the production of the texts in which they are encountered. Secondly, although couplets are extremely varied in their structural details, they all seem to fall into a set of identifiable lexicogrammatical patterns. And thirdly, the occurrence of the various dualistic patterns in textual sources with different situational characteristics shows substantial distributional fluctuations.

The above facts indicate that, in order to effectively account for the phenomenon of binomial pairing as it is observed in a corpus of texts, a new and more flexible data-driven framework needs to be devised. In the light of the data used in the present research, rather than a list of structurally and semantically peculiar couplets, binomials are analyzed as an abstract mechanism which speakers have at their disposal for the generation of a very wide range of paired types that serve a variety of important communicative purposes. As a theoretical model for the identification and extraction of binomials from the corpus and the classification of their various lexicogrammatical variants into a set of categories, we exploit the notion of phraseological frame or formal idiom, as posited and developed by Moon (1998:154f) and Fillmore, Kay & O'Connor (1988:505f). This, in very broad terms, represents an abstract structural formula which, as Fillmore et al. put it, 'serves as host' (ibid.:506) to institutionalized expressions as well as novel, spontaneously created forms.

Binomials emerge as a major frame which can be represented by means of the general formula A link B. Our data analysis, which results in the construction of a detailed and comprehensive data-driven taxonomy of binomial patterns, involves, firstly, the identification and extraction of the various binomial forms from our corpus of textual data; secondly, the devising of a prototypical system of abstract representations to which each extant pair is assigned on the basis of its lexicogrammatical attributes; thirdly, the detailed recording of any interesting lexicosemantic preferences displayed by the patterns (for instance their semantic prosodies; Louw 1993 and Sinclair 1996); and, finally, the calculation of the frequency of occurrence of each pattern in the corpus.
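
The extraction and counting steps described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study. It assumes single-word members, considers only the links 'and' and 'or', applies no check that A and B share a grammatical category, and ignores punctuation, so it would over-generate on real text. The names LINKS, extract_binomials and pattern_of, and the two coarse pattern labels, are hypothetical.

```python
import re
from collections import Counter

# Assumption: only these two links are considered in this sketch.
LINKS = {"and", "or"}

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"[a-z']+", text.lower())

def extract_binomials(text):
    """Return (A, link, B) triples for every single-word 'A link B' sequence."""
    tokens = tokenize(text)
    triples = []
    for i in range(1, len(tokens) - 1):
        if tokens[i] in LINKS:
            a, b = tokens[i - 1], tokens[i + 1]
            # Skip degenerate sequences such as "and or".
            if a not in LINKS and b not in LINKS:
                triples.append((a, tokens[i], b))
    return triples

def pattern_of(triple):
    """Assign a coarse, hypothetical pattern label: repeated member vs. distinct members."""
    a, link, b = triple
    return f"A {link} A" if a == b else f"A {link} B"

sample = "They waited for ages and ages, moving up and down, here and there."
triples = extract_binomials(sample)
pattern_counts = Counter(pattern_of(t) for t in triples)
```

On the sample sentence this yields three candidate pairs, one with a repeated member (ages and ages) and two with distinct members. A realistic taxonomy would need part-of-speech information and multi-word members, both of which are beyond this sketch.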

We also discuss in detail the important but rarely addressed issue of the function of binomials in the communicative process. Specifically, we examine the incidence of the various binomial patterns in each of the six subcorpora comprising our corpus (a set of written publications in book form, both fiction and non-fiction; a broadcasting medium; a semi-specialized periodical publication; two daily newspapers, a broadsheet and a tabloid; and a set of spontaneous and semi-spontaneous spoken texts), and seek explanations for the very substantial distributional perturbations. The main purpose of this exercise is to establish the nature and extent of the correlation between the form and structure of binomial patterns on the one side, and the extralinguistic and situational factors pertaining to each subcorpus on the other, and, thus, to determine the precise functions served by each binomial pattern in communication.
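
One common way to compare the incidence of a pattern across subcorpora of different sizes is to normalize raw counts to a rate per million tokens. The sketch below illustrates that calculation only. The subcorpus names, token totals and counts are invented for the example and do not report results from the study, and the helper names per_million and normalized_table are hypothetical.

```python
def per_million(count, corpus_size):
    """Normalize a raw count to occurrences per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return count * 1_000_000 / corpus_size

def normalized_table(subcorpora):
    """Map each subcorpus to its per-million rate for every pattern."""
    return {
        name: {
            pattern: per_million(n, data["tokens"])
            for pattern, n in data["counts"].items()
        }
        for name, data in subcorpora.items()
    }

# Invented figures, for illustration only (not corpus results).
subcorpora = {
    "broadsheet": {"tokens": 2_000_000, "counts": {"A and B": 5000, "A and A": 100}},
    "spoken": {"tokens": 500_000, "counts": {"A and B": 1000, "A and A": 400}},
}
table = normalized_table(subcorpora)
```

With these invented figures the repeated-member pattern has a much higher normalized rate in the spoken subcorpus than in the broadsheet one, even though the raw counts differ less dramatically. Contrasts of this kind are what a distributional comparison would aim to surface.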

Our data strongly suggest that binomials constitute a phraseological device which makes a highly significant contribution to the communicative process. Our analysis demonstrates that, depending on their structure as well as the type of text in which they are encountered, binomials serve a wide range of communicative functions. For instance, it is shown that the abundant use of informationally dense binomials (e.g. government and parliament, political and monetary, commercial and investment banks) on the part of journalists serves most effectively the institutional requirements of the mass media for factuality, informativeness, precision, conciseness and stylistic uniformity (Crystal & Davy 1969, Tuchman 1978, van Dijk 1988, and elsewhere), whilst simultaneously disguising the highly fragmented process of production of news texts (Bell 1991).

On the other hand, the frequent employment of repetitive, vague or informationally sparse pairs in conversation (ages and ages, here and there, try and get) reflects the efforts of conversationalists in the face of the exigencies of real-time communication. In the context of unplanned talk, binomials act as a lexicalized and, therefore, elegant and well-integrated temporal space which speakers create automatically and with the minimum of cognitive effort whilst coping with delays in the formulation of thought and argument. Binomials in extemporaneous conversation act as a crucial discourse-cohesive device, which helps keep speech 'glued together' (Johnstone 1987), whilst minimizing the effect of fragmentation (Chafe 1982) created by phenomena such as false starts, random repetition (Norrick 1987), etc. At the same time, binomials may be used by speakers as a means of expressing emphasis and emotional involvement and of creating rhetorical presence (e.g. faster and faster, ringing and ringing).

On the whole, the corpus-based structural and situational analysis of binomials not only offers new and significant information on a well-known linguistic phenomenon but also provides substantial empirical support for the hypothesis that phraseology plays a major part in the accomplishment of the communicative goals of speakers or writers (for a review of relevant studies, see Hatzidaki 1999).

Bibliography

Abraham, R.D. (1950). Fixed Order of Coordinates. Modern Language Journal 34, 276-287.
Bell, A. (1991). The Language of News Media. Blackwell, Oxford.
Bhatia, V. (1994). Cognitive structuring in legislative provisions. In J. Gibbons (ed) Language and the Law, Longman, London.
Chafe, W.L. (1982). Integration and Involvement in Speaking, Writing, and Oral Literature. In D. Tannen (ed) Spoken and Written Language. Ablex, New Jersey.
Cooper, W.E. and Ross, J. R. (1975). World Order. In R. E. Grossman, J. L. San and T. J. Vance (eds) Papers from the Parasession on Functionalism. Chicago Linguistic Society, Chicago.
Crystal, D. and Davy, D. (1969). Investigating English Style, Longmans, London.
Fillmore, C.J., Kay, P. and O'Connor, M.C. (1988). Regularity and Idiomaticity in Grammatical Constructions: The Case of Let Alone. Language 64(3), 501-538.
Fix, U. (1985). Wortpaare im heutigen Deutsch. Sprachpflege 34(8), 112-113.
Gerritsen, J. (1958). More Paired Words in Othello. English Studies 39, 212-214.
Gold, D.L. (1991). Reversible Binomials in Afrikaans, English, Esperanto, French, German, Hebrew, Italian, Judesmo, Latin, Lithuanian, Polish, Portuguese, Rumanian, Spanish and Yiddish. Orbis 36, 104-118.
Gustafsson, M. (1975). Binomial Expressions in Present-day English. Turun Yliopisto, Turku.
Gustafsson, M. (1984). The syntactic features of binomial expressions in legal English. Text 4(1-3), 123-141.
Hatzidaki, O. (1999). Part and Parcel: A Linguistic Analysis of Binomials and its Application to the Internal Characterization of Corpora, Ph.D. Thesis, University of Birmingham.
Héraucourt, W. (1939). Die Wertwelt Chaucers, Carl Winters, Heidelberg.
Johnstone, B. (1987). An Introduction. Text 7(3), 205-214.
Koch, B. J. (1983). Arabic Lexical Couplets and the Evolution of Synonymy. General Linguistics 23(1), 51-61.
Louw, B. (1993). Irony in the Text or Insincerity in the Writer - The Diagnostic Potential of Semantic Prosodies. In M. Baker, G. Francis and E. Tognini-Bonelli (eds) Text and Technology, John Benjamins, Amsterdam.
Malkiel, Y. (1959). Studies in Irreversible Binomials. Lingua 8, 113-160.
Mellinkoff, D. (1963). The Language of the Law. Little, Brown & Co, Boston.
Milic, L. T. (1967). A Quantitative Approach to the Style of Jonathan Swift. Mouton & Co, The Hague.
Moon, R. (1998). Fixed Expressions and Idioms in English. Clarendon Press, Oxford.
Nash, W. (1958). Paired Words in Othello: Shakespeare's Use of a Stylistic Device. English Studies 39, 212-214.
Norrick, N. R. (1980). Semantic Relations and Motivation in Idioms. In E. Weigand and G. Tschauder (eds) Perspektive: Textintern Vol. 1, Niemeyer, Tübingen.
Ohmann, R. M. (1962). Shaw: The Style and the Man. Wesleyan University Press, Middletown.
Potter, S. (1972). Chaucer's Untransposable Binomials. In E. Ohmann, V. Vaananen and A. Kurvinen (eds) Studies Presented to Tauno F. Mustanoja on the occasion of his sixtieth birthday, Modern Language Society, Helsinki.
Sinclair, J. (1996). The Search for Units of Meaning. Textus 9, 75-106.
Tilgner, E. (1936). Die Aureate Terms als Stilelement bei Lydgate. Germanische Studien 182.
Tuchman, G. (1978). Making News. The Free Press, New York.
van Dijk, T. A. (1988). News Analysis. Lawrence Erlbaum Associates, New Jersey.


Conference Info

ACH/ALLC / ACH/ICCH / ALLC/EADH - 2000

Hosted at University of Glasgow

Glasgow, Scotland, United Kingdom

July 21, 2000 - July 25, 2000

Conference website: https://web.archive.org/web/20190421230852/https://www.arts.gla.ac.uk/allcach2k/

Series: ALLC/EADH (27), ACH/ICCH (20), ACH/ALLC (12)

Organizers: ACH, ALLC

Tags
  • Language: English