Building a Gold Standard for a Russian Collocations Database

Maria Khokhlova

By Iztok Kosem | August 29, 2018 | Euralex 2018, Publications

Page	863-869
Author	Maria Khokhlova
Title	Building a Gold Standard for a Russian Collocations Database
Abstract	In the last decade, linguists have become increasingly interested in corpus material, which allows for a fresh approach to phenomena that have already been extensively described in academic works. The co-occurrence phenomenon has a dual nature: on the one hand, its linguistic component and, on the other, its probabilistic (combinatorial) characteristics. The former has been described in numerous papers and explicitly defined in dictionaries, while the latter can be identified by a statistical approach. The present paper focuses on the process of building a gold standard that will include data from Russian dictionaries and corpora. The standard is being prepared for a Russian Collocations Database that already includes information on words’ collocability extracted from text corpora by means of statistical measures and linguistic filters. The gold standard will also be used to evaluate the extracted collocations and to mark them as “true” collocations with references to the dictionaries.
Session	Poster Presentations
Keywords	database, collocations, corpora, dictionaries, Russian language
BibTex	@InProceedings{ELX2018-073,
  author={Maria Khokhlova},
  title={Building a Gold Standard for a Russian Collocations Database},
  pages={863-869},
  booktitle={Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts},
  year={2018},
  month={jul},
  date={17-21},
  address={Ljubljana, Slovenia},
  editor={Jaka Čibej, Vojko Gorjanc, Iztok Kosem, Simon Krek},
  publisher={Ljubljana University Press, Faculty of Arts},
  isbn={978-961-06-0097-8},
}
