Corpus as a Means for Study of Lexical Usage Changes

Michal Křen; Jaroslava Hlaváčová

By admyn | November 17, 2016 | Euralex 2008, Publications

Page	437-447
Author	Michal Křen, Jaroslava Hlaváčová
Title	Corpus as a Means for Study of Lexical Usage Changes
Abstract	The paper presents a corpus-based method for obtaining ranked wordlists that characterise changes in lexical usage. The method is evaluated on two representatively balanced 100-million-word corpora of contemporary written Czech that cover two consecutive time periods. Despite the similar overall design of the corpora, lexical frequencies first have to be normalised to make them comparable. Furthermore, dispersion information is used to reduce the number of domain-specific items, as their frequencies depend heavily on the inclusion of particular texts in the corpus. Finally, statistical significance measures are used to evaluate frequency differences between individual items in the two corpora. It is demonstrated that the method ranks the resulting wordlists appropriately; several limitations of the approach are also discussed. The influence of corpus composition cannot be completely eliminated, and comparability of the corpora is shown to play a key role. Therefore, although highly ranked items are often found to be related to changes in language usage, their relevance should be interpreted cautiously. Apart from several general-language words, genuine examples of lexical variation are found to be limited mostly to temporary topics of public discourse or items reflecting recent technological development, thus sketching an overall picture of lifestyle changes.
Session	1. Computational Lexicography and Lexicology
Keywords
BibTeX	@InProceedings{ELX08-028, author = {Michal Křen and Jaroslava Hlaváčová}, title = {Corpus as a Means for Study of Lexical Usage Changes}, pages = {437--447}, booktitle = {Proceedings of the 13th EURALEX International Congress}, year = {2008}, month = {jul}, date = {15-19}, address = {Barcelona, Spain}, editor = {Elisenda Bernal and Janet DeCesaris}, publisher = {Institut Universitari de Linguistica Aplicada, Universitat Pompeu Fabra}, isbn = {978-84-96742-67-3}, }
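
The abstract outlines a pipeline of frequency normalisation followed by significance testing of frequency differences. As a rough illustration only, the sketch below ranks items by the log-likelihood (G2) statistic, a measure commonly used for corpus comparison. The abstract does not name the measures actually used in the paper, so the choice of G2, the per-million normalisation, and all function names here are assumptions. The dispersion-based filtering mentioned in the abstract is not modelled.

```python
import math


def per_million(count, corpus_size):
    """Relative frequency of an item per million tokens."""
    return count * 1_000_000 / corpus_size


def log_likelihood(a, b, size_a, size_b):
    """G2 statistic for an item seen a times in corpus A and b times in corpus B.

    Assumption: this is one common significance measure for corpus
    comparison, not necessarily the one used in the paper.
    """
    total = a + b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    if a > 0:
        g2 += a * math.log(a / expected_a)
    if b > 0:
        g2 += b * math.log(b / expected_b)
    return 2 * g2


def rank_changes(counts_a, counts_b, size_a, size_b):
    """Return (item, G2, per-million difference) tuples, highest G2 first."""
    scored = []
    for item in set(counts_a) | set(counts_b):
        a = counts_a.get(item, 0)
        b = counts_b.get(item, 0)
        diff = per_million(b, size_b) - per_million(a, size_a)
        scored.append((item, log_likelihood(a, b, size_a, size_b), diff))
    return sorted(scored, key=lambda row: row[1], reverse=True)


# Toy counts, invented purely for illustration (not data from the paper).
older = {"x": 10, "y": 50}
newer = {"x": 10, "y": 5, "z": 20}
ranked = rank_changes(older, newer, 1000, 1000)
```

In this sketch the sign of the per-million difference shows the direction of change (positive means more frequent in the later corpus), while G2 orders items by how unlikely the difference is under equal usage.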