Unified Data Modelling for Presenting Lexical Data: The Case of EKILEX

Arvi Tavast; Margit Langemets; Jelena Kallas; Kristina Koppel

By Iztok Kosem | August 29, 2018 | Euralex 2018, Publications

Page	749-761
Author	Arvi Tavast, Margit Langemets, Jelena Kallas, Kristina Koppel
Title	Unified Data Modelling for Presenting Lexical Data: The Case of EKILEX
Abstract	The Institute of the Estonian Language is developing EKILEX, a new dictionary writing system for both semasiological dictionaries and onomasiological termbases. While the long-term vision is to have a single data source that provides consistent information about Estonian, the system also needs to cope with the multitude of existing datasets. In this paper, we present work in progress on modelling the data and importing an initial sample of legacy dictionaries. The data model is based on an m:n relation between words and meanings, both of which are unified across dictionaries, even though separate dictionaries still exist in the system. Only the mapping between word and meaning is dictionary-specific. Importing the dictionaries has revealed various data quality issues: ambiguities, underspecification, inconsistencies and conflicts. These need to be resolved if the long-term vision is to be achieved. We also outline the next steps: human- and machine-readable publishing, corpus connection and quantification (frequency, salience measures, etc.).
Session	VARIOUS TOPICS
Keywords	data modelling, dictionary portal, interoperability, linked data, Estonian
BibTex	@InProceedings{ELX2018-061,
  author={Arvi Tavast and Margit Langemets and Jelena Kallas and Kristina Koppel},
  title={Unified Data Modelling for Presenting Lexical Data: The Case of EKILEX},
  pages={749-761},
  booktitle={Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts},
  year={2018},
  month={jul},
  date={17-21},
  address={Ljubljana, Slovenia},
  editor={Jaka Čibej and Vojko Gorjanc and Iztok Kosem and Simon Krek},
  publisher={Ljubljana University Press, Faculty of Arts},
  isbn={978-961-06-0097-8},
}
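
The abstract describes an m:n relation between words and meanings that are shared across dictionaries, with only the word-to-meaning mapping being dictionary-specific. A minimal sketch of that idea follows. It is not the EKILEX schema: the class names (`Word`, `Meaning`, `Mapping`), the field names, the dictionary labels and the toy data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word, stored once and shared by all dictionaries (hypothetical fields)."""
    word_id: int
    form: str
    lang: str


@dataclass(frozen=True)
class Meaning:
    """A meaning, stored once and shared by all dictionaries (hypothetical fields)."""
    meaning_id: int
    gloss: str


@dataclass(frozen=True)
class Mapping:
    """The only dictionary-specific record: links one word to one meaning."""
    word_id: int
    meaning_id: int
    dictionary: str


# Illustrative toy data, not taken from any EKILEX dataset.
words = {
    1: Word(1, "keel", "est"),
    2: Word(2, "language", "eng"),
    3: Word(3, "tongue", "eng"),
}
meanings = {
    10: Meaning(10, "system of communication"),
    11: Meaning(11, "organ in the mouth"),
}
mappings = [
    Mapping(1, 10, "dict_a"),
    Mapping(1, 11, "dict_a"),
    Mapping(3, 11, "dict_a"),
    Mapping(1, 10, "termbase_b"),
    Mapping(2, 10, "termbase_b"),
]


def meanings_of(form, dictionary=None):
    """Semasiological lookup: from a word form to the glosses of its meanings."""
    word_ids = {w.word_id for w in words.values() if w.form == form}
    meaning_ids = sorted({
        m.meaning_id for m in mappings
        if m.word_id in word_ids
        and (dictionary is None or m.dictionary == dictionary)
    })
    return [meanings[i].gloss for i in meaning_ids]


def words_for(meaning_id, dictionary=None):
    """Onomasiological lookup: from a meaning to the word forms that express it."""
    word_ids = sorted({
        m.word_id for m in mappings
        if m.meaning_id == meaning_id
        and (dictionary is None or m.dictionary == dictionary)
    })
    return [words[i].form for i in word_ids]
```

Because words and meanings are stored once, the same records serve both lookup directions, and restricting a query to one dictionary only filters the mapping rows. For example, `meanings_of("keel", "termbase_b")` consults only the mappings contributed by the hypothetical `termbase_b`.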