Determining Differences of Granularity between Cross-Dictionary Linked Senses

Pages 109-118
Author Eirini Kouvara, Meritxell Gonzàlez, Julian Grosse, Roser Saurí
Title Determining Differences of Granularity between Cross-Dictionary Linked Senses
Abstract Linking dictionaries at the sense level is highly beneficial because it enables the mutual enhancement of the linked datasets or the derivation of new products from their combination. However, one of the greatest challenges in cross-dictionary sense linking is that linked senses, although referring to the same meaning, may differ in their semantic extent because dictionaries draw sense distinctions at different levels of granularity. Not every pair of linked senses is therefore qualitatively the same, and being able to identify and classify these differences is a crucial step towards the comprehensive exploitation of sense-linked datasets. In this paper, we present a system that automatically identifies the relation of sense links between a bilingual and a monolingual dictionary. Using sense granularity annotations by lexicographers as the gold standard, we trained a machine learning model to classify the relation between cross-dictionary linked senses into one of the following categories: perfect, where each sense fully covers the other; wider/narrower, where one sense fully encloses the other but not vice versa; and partial, where each sense only partially covers the other. Cross-validation shows that the model yields an overall accuracy of 86%, with a macro precision of 83% and a macro recall of 65% across the different classes. The model significantly outperforms a rule-based algorithm that serves as the baseline.
Session Lexicography and Language Technologies
Keywords sense granularity; word sense linking; word sense mapping; lexical resources; language data generation; multilingual data; data integration across languages
@inproceedings{kouvara_determining_2020,
address = {Alexandroupolis},
title = {Determining {Differences} of {Granularity} between {Cross}-{Dictionary} {Linked} {Senses}},
isbn = {978-618-85138-1-5},
language = {en},
booktitle = {Lexicography for {Inclusion}: {Proceedings} of the 19th {EURALEX} {International} {Congress}, 7-9 {September} 2021, {Alexandroupolis}, {Vol}. 1},
publisher = {Democritus University of Thrace},
author = {Kouvara, Eirini and Gonzàlez, Meritxell and Grosse, Julian and Saurí, Roser},
editor = {Gavriilidou, Zoe and Mitsiaki, Maria and Fliatouras, Asimakis},
year = {2020},
pages = {109--118},
}