The Influence of Corpora on Lexicons: Corpora Use in the Creation of COMLEX Syntax and NOMLEX

By November 17, 2016,
Page 141-148
Author Catherine MacLeod, Ralph Grishman
Title The Influence of Corpora on Lexicons: Corpora Use in the Creation of COMLEX Syntax and NOMLEX
Abstract It is now generally accepted that a text corpus plays an important role in the production of hard-copy dictionaries. In this paper, we discuss the influence a corpus can have on the creation of lexical resources for computer use. In the creation of COMLEX Syntax and NOMLEX, two on-line lexicons produced by the authors at New York University, we used two different corpora, one composed of a small (one million words) balanced corpus (the Brown Corpus) plus a large amount of newspaper data and the other, a large balanced corpus (100 million words) of British English (the British National Corpus). We point out how the use of these two corpora affected the resulting lexicons in different ways and to differing degrees and we suggest what we feel would have been the ideal corpus for our purposes.
Session PART 3 - Corpora, Tools and NLP Dictionaries
BibTeX
@InProceedings{ELX00-017,
author = {Catherine MacLeod and Ralph Grishman},
title = {The Influence of Corpora on Lexicons: Corpora Use in the Creation of {COMLEX} Syntax and {NOMLEX}},
pages = {141--148},
booktitle = {Proceedings of the 9th EURALEX International Congress},
year = {2000},
month = aug,
date = {8-12},
address = {Stuttgart, Germany},
editor = {Ulrich Heid and Stefan Evert and Egbert Lehmann and Christian Rohrer},
publisher = {Institut für Maschinelle Sprachverarbeitung},
isbn = {3-00-006574-1},
}