Standards for lexicons and corpora -- Areas, interaction between lexicon and corpus, current state of EAGLES

  1. Nicoletta Calzolari

    Istituto di Linguistica Computazionale (ILC) (Institute for Computational Linguistics) - Consiglio Nazionale delle Ricerche (CNR)

The interrelationships between lexicon and corpus have a complex structure. Any account of them has to start from the assumption that the two views are interdependent, and this interdependence has to be taken into account in any lexical or corpus analysis or application.

This was also the approach taken within the LRE EAGLES project (see Calzolari, McNaught, EAGLES Editors' Introduction, 1996) in developing standards for both Morphosyntax and Syntax. Awareness of the interdependence between lexical specifications on the one hand, and corpus tagsets and syntactic annotations on the other, guided the formulation of the proposed standards and recommendations in both the Corpus and the Lexicon Work Groups of EAGLES. Corpus tagging and annotation was considered the first obvious application of a Computational Lexicon, and attention was therefore given to defining compatible sets of attributes and values.
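The idea of compatible attribute-value sets can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the attribute names, the values, the sample entry and the function names are hypothetical and are not taken from the EAGLES recommendations. The sketch only shows what it could mean for a lexicon and a corpus tagset to draw on one shared inventory, so that a corpus tag can be checked against a lexical entry.

```python
# Hypothetical shared inventory of attributes and values.
# These names are illustrative only, not the EAGLES specification.
SHARED_ATTRIBUTES = {
    "pos": {"noun", "verb", "adjective"},
    "number": {"singular", "plural"},
    "gender": {"masculine", "feminine", "neuter"},
}


def unlicensed_features(features):
    """Return the (attribute, value) pairs not found in the shared inventory."""
    return [
        (attr, value)
        for attr, value in features.items()
        if attr not in SHARED_ATTRIBUTES or value not in SHARED_ATTRIBUTES[attr]
    ]


def tag_matches_entry(tag, entry):
    """Check a corpus tag against a lexical entry.

    The tag is compatible if every attribute it specifies either has the
    same value in the entry or is left unspecified by the entry.
    """
    return all(entry.get(attr, value) == value for attr, value in tag.items())


# Hypothetical lexical entry and corpus tags.
entry = {"pos": "noun", "number": "singular", "gender": "feminine"}
coarse_tag = {"pos": "noun"}
fine_tag = {"pos": "noun", "number": "plural"}

print(unlicensed_features(entry))            # entry uses only shared values
print(tag_matches_entry(coarse_tag, entry))  # coarse tag fits the entry
print(tag_matches_entry(fine_tag, entry))    # number value conflicts
```

Because both resources use the same attributes and values, a coarse corpus tag is simply a less specified version of the lexical description, which is one way to read the requirement that the two sets be compatible.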

The presentation will address problems of the interaction between the two types of resources, corpora and lexicons, in particular from the perspective of a standardization project.

