Interlinking Slovene Language Datasets

Pages 73-80
Author Lenka Bajčetić, Thierry Declerck
Title Interlinking Slovene Language Datasets
Abstract We present the current state of implementation of our work on interlinking the language data and linguistic information contained in different types of Slovenian language resources. The resources we currently deal with are a lexical database (which also contains collocations and example sentences), a morphological lexicon, and the Slovene WordNet. We first transform the encoding of the original data into the OntoLex-Lemon model and map the different descriptors used in the original sources onto the LexInfo vocabulary. This harmonization step enables the interlinking of the various types of information included in the different resources, using relations defined in OntoLex-Lemon. As a result, we obtain a partial merging of the information that was originally distributed over different resources, which leads to a cross-enrichment of the original data sources. The final goal of the presented work is to publish the linked and merged Slovene linguistic datasets in the Linguistic Linked Open Data cloud.
Session Lexicography and Language Technologies
Keywords Slovenian Language Data; interlinking; OntoLex-Lemon; LexInfo
BibTeX
@inproceedings{ELX2020_2021-007,
address = {Alexandroupolis},
title = {Interlinking {Slovene} {Language} {Datasets}},
isbn = {978-618-85138-1-5},
url = {https://www.euralex.org/elx_proceedings/Euralex2020-2021/EURALEX2020-2021_Vol1-p073-080.pdf},
language = {en},
booktitle = {Lexicography for {Inclusion}: {Proceedings} of the 19th {EURALEX} {International} {Congress}, 7-9 {September} 2021, {Alexandroupolis}, {Vol}. 1},
publisher = {Democritus University of Thrace},
author = {Bajčetić, Lenka and Declerck, Thierry},
editor = {Gavriilidou, Zoe and Mitsiaki, Maria and Fliatouras, Asimakis},
year = {2020},
pages = {73--80},
}