Parallel corpora as a source of defining language-specific lexical items

November 23, 2016
Pages 394-401
Author Dmitri Sitchinava
Title Parallel corpora as a source of defining language-specific lexical items
Abstract The paper presents an attempt to propose an exact method for identifying the so-called “language-specific” lexicon, a controversial notion often reasonably questioned. An aligned bilingual parallel corpus is chosen as an instrument for finding “specificity”, and statistical entropy and other indices are used as markers of the dispersion of translation patterns (viz. stimuli). For example, a word can be deemed (maximally) language-specific if it occurs multiple times in a given bilingual corpus and is translated each time in a different way. A word is minimally (or simply not) language-specific if it is translated each time identically. Some problems related to the application of this method are discussed. These data can be explicitly used in bilingual dictionaries.
Session Lexicography and Corpus Linguistics
Keywords parallel corpora; language-specific lexicon; translation patterns; statistics
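The entropy criterion described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `translation_entropy` and `normalized_entropy`, the use of base-2 Shannon entropy, and the normalization by the logarithm of the occurrence count are all assumptions made here for illustration, and the sample translations are invented toy data. The input is assumed to be the list of translations observed for one source word in an aligned corpus.

```python
import math
from collections import Counter


def translation_entropy(translations):
    """Shannon entropy (in bits) of the observed translations of one word.

    `translations` holds one entry per occurrence of the source word,
    e.g. the aligned target-language equivalent (hypothetical input format).
    """
    counts = Counter(translations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one translation is required")
    # H = sum over distinct translations of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def normalized_entropy(translations):
    """Entropy scaled to [0, 1] by its maximum, log2(number of occurrences).

    This normalization is an assumption of the sketch, chosen so that
    1.0 means "translated differently every time".
    """
    n = len(translations)
    if n < 2:
        return 0.0  # a single occurrence carries no dispersion information
    return translation_entropy(translations) / math.log2(n)


# Invented toy data: each list is the set of translations of one source word.
always_same = ["t1", "t1", "t1", "t1"]        # minimally language-specific
always_different = ["t1", "t2", "t3", "t4"]   # maximally language-specific

print(translation_entropy(always_same))       # 0.0
print(translation_entropy(always_different))  # 2.0
print(normalized_entropy(always_different))   # 1.0
```

In this sketch a word translated identically every time scores zero, and a word translated differently on each of its four occurrences reaches the maximum of log2(4) = 2 bits, which mirrors the two extremes named in the abstract. Raw entropy grows with the number of occurrences, which is why a normalized variant is included as one possible way to compare words of different frequency.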
BibTeX
@InProceedings{ELX2016-042,
author={Dmitri Sitchinava},
title={Parallel corpora as a source of defining language-specific lexical items},
pages={394--401},
booktitle={Proceedings of the 17th EURALEX International Congress},
year={2016},
month=sep,
date={6-10},
address={Tbilisi, Georgia},
editor={Tinatin Margalitadze and George Meladze},
publisher={Ivane Javakhishvili Tbilisi University Press},
isbn={978-9941-13-542-2},
}