Semantic Annotation of Verbs for the Tatar Corpus

November 23, 2016
Pages: 340-347
Authors: Alfiya Galieva, Olga Nevzorova
Title: Semantic Annotation of Verbs for the Tatar Corpus
Abstract: This paper discusses the problem of developing a metalanguage for linguistic applications and introduces a tag set for the semantic annotation of verbs in the Tatar National Corpus. At present, there are no generally accepted standards for corpus semantic annotation. In many cases an annotation scheme is built by individual researchers or teams for a particular research project, and the tag sets used in thesauri and electronic corpora differ in many respects. Using available semantic classifications of vocabulary for different languages and relying on data from Tatar lexicons, we created a model of the semantic system of Tatar verbs and divided the verbs (3,200 words) into semantic classes. We distinguish semantic tags of two types: constructional (categorial) tags, which are independent of the semantic classes of verbs, and semantic (thematic) tags, which determine the semantic classes of verbs. To delimit these classes we used both hierarchical and overlapping classifications, so the same verb may belong to more than one class. The approach is based on data from explanatory dictionaries of the Tatar language, bilingual Russian-Tatar dictionaries, and the semantic annotation system of the Russian National Corpus. The current version of our semantic annotation uses 3 categorial and 59 thematic tags.
Session: Lexicography and Corpus Linguistics
Keywords: Tatar verb; semantics; corpus; semantic annotation
BibTeX
@InProceedings{ELX2016-036,
author={Alfiya Galieva and Olga Nevzorova},
title={Semantic Annotation of Verbs for the Tatar Corpus},
pages={340--347},
booktitle={Proceedings of the 17th EURALEX International Congress},
year={2016},
month=sep,
date={6-10},
address={Tbilisi, Georgia},
editor={Tinatin Margalitadze and George Meladze},
publisher={Ivane Javakhishvili Tbilisi University Press},
isbn={978-9941-13-542-2},
}