The Taming of the Polysemy: Automated Word Sense Frequency Estimation for Lexicographic Purposes

November 23, 2016
Pages: 249-256
Authors: Anastasiya Lopukhina, Konstantin Lopukhin, Boris Iomdin, Grigory Nosyrev
Title: The Taming of the Polysemy: Automated Word Sense Frequency Estimation for Lexicographic Purposes
Abstract: Although word sense frequency information is important both for the theoretical study of polysemy and for practical lexicography, sense frequency distribution remains a neglected area in linguistics, probably because sense frequency is not easy to estimate. In this paper we address the problem of automated word sense frequency estimation for Russian nouns. We developed and tested an automated system based on semantic context vectors, supplied with contexts and collocations from the Active Dictionary of Russian, a full-fledged production dictionary that reflects contemporary Russian. The study was performed on the RuTenTen11 web corpus. This approach reaches a frequency estimation error of 11% without any additional labelled data. For several words, we compared the automatically obtained sense frequencies with the sense ordering in different dictionaries. The method presented in this paper can be applied to any language with a sufficiently large corpus and a good dictionary that provides examples for each sense. The results may enrich language learning resources and help lexicographers order the senses of a word by frequency where needed.
Session: Lexicography and Language Technologies
Keywords: semantics; lexicography; word sense frequency; web corpora; polysemy; frequency; semantic vectors; word sense disambiguation; WSD
BibTeX
@InProceedings{ELX2016-024,
author={Anastasiya Lopukhina and Konstantin Lopukhin and Boris Iomdin and Grigory Nosyrev},
title={The Taming of the Polysemy: Automated Word Sense Frequency Estimation for Lexicographic Purposes},
pages={249--256},
booktitle={Proceedings of the 17th EURALEX International Congress},
year={2016},
month=sep,
date={6-10},
address={Tbilisi, Georgia},
editor={Tinatin Margalitadze and George Meladze},
publisher={Ivane Javakhishvili Tbilisi University Press},
isbn={978-9941-13-542-2},
}