The Taming of the Polysemy: Automated Word Sense Frequency Estimation for Lexicographic Purposes

November 23, 2016
Pages 249-256
Author Anastasiya Lopukhina, Konstantin Lopukhin, Boris Iomdin, Grigory Nosyrev
Title The Taming of the Polysemy: Automated Word Sense Frequency Estimation for Lexicographic Purposes
Abstract Although word sense frequency information is important for the theoretical study of polysemy and for practical lexicography, the problem of sense frequency distribution is a neglected area in linguistics, probably because sense frequency is not easy to estimate. In this paper we deal with the problem of automated word sense frequency estimation for Russian nouns. We developed and tested an automated system based on semantic context vectors, supplied with contexts and collocations from the Active Dictionary of Russian, a full-fledged production dictionary that reflects contemporary Russian. The study was performed on the RuTenTen11 web corpus. The system reaches a frequency estimation error of 11% without any additional labelled data. For several words, we compared the automatically obtained sense frequencies with the sense ordering in different dictionaries. The method presented in this paper can be applied to any language with a sufficiently large corpus and a good dictionary that provides examples for each sense. The results may enrich language learning resources and help lexicographers order the senses of a word by frequency where needed.
Session Lexicography and Language Technologies
Keywords semantics; lexicography; word sense frequency; web corpora; polysemy; frequency; semantic vectors; word sense disambiguation; WSD
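The general idea summarized in the abstract (represent each sense by the vectors of its dictionary examples, assign every corpus context to the closest sense, then count) can be illustrated with a minimal sketch. This is not the authors' system: the toy English word vectors, the sense labels, and the names `context_vector`, `cosine`, and `estimate_sense_frequencies` are all hypothetical, and a real setup would use vectors trained on a large corpus and examples taken from a dictionary.

```python
import math
from collections import Counter

# Hypothetical 2-d "word vectors" for illustration only.
# A real system would use embeddings trained on a large corpus.
TOY_VECTORS = {
    "river": (1.0, 0.0),
    "water": (0.9, 0.1),
    "shore": (0.8, 0.2),
    "money": (0.0, 1.0),
    "loan": (0.1, 0.9),
    "account": (0.2, 0.8),
}


def context_vector(words):
    """Average the vectors of the known words; None if no word is known."""
    known = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not known:
        return None
    dims = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dims))


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def estimate_sense_frequencies(sense_examples, corpus_contexts):
    """Assign each corpus context to its nearest sense and return the shares."""
    # One vector per sense, built from all words of its example contexts.
    sense_vectors = {}
    for sense, examples in sense_examples.items():
        vec = context_vector([w for example in examples for w in example])
        if vec is not None:
            sense_vectors[sense] = vec

    counts = Counter()
    for context in corpus_contexts:
        vec = context_vector(context)
        if vec is None:
            continue  # no known words, so the context cannot be classified
        best = max(sense_vectors, key=lambda s: cosine(vec, sense_vectors[s]))
        counts[best] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    return {sense: counts[sense] / total for sense in sense_vectors}


# Hypothetical dictionary examples for two senses of "bank".
SENSE_EXAMPLES = {
    "bank/river": [["river", "shore"]],
    "bank/finance": [["money", "loan"]],
}

# Hypothetical corpus contexts (words around the target word).
CORPUS_CONTEXTS = [
    ["water", "shore"],
    ["loan", "account"],
    ["money", "account"],
    ["money", "water", "loan"],
    ["unknown"],
]

frequencies = estimate_sense_frequencies(SENSE_EXAMPLES, CORPUS_CONTEXTS)
print(frequencies)
```

The sketch needs no labelled corpus data, only per-sense examples, which mirrors the requirement stated in the abstract that the dictionary provide examples for each sense. Contexts with no known words are skipped rather than guessed.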
BibTeX
@InProceedings{ELX2016-024,
author={Anastasiya Lopukhina and Konstantin Lopukhin and Boris Iomdin and Grigory Nosyrev},
title={The Taming of the Polysemy: Automated Word Sense Frequency Estimation for Lexicographic Purposes},
pages={249--256},
booktitle={Proceedings of the 17th EURALEX International Congress},
year={2016},
month=sep,
date={6-10},
address={Tbilisi, Georgia},
editor={Tinatin Margalitadze and George Meladze},
publisher={Ivane Javakhishvili Tbilisi University Press},
isbn={978-9941-13-542-2},
}