From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource

November 17, 2016
Pages 431-446
Author Basilio Calderone, Nabil Hathout, Franck Sajous
Title From GLÀFF to PsychoGLÀFF: a large psycholinguistics-oriented French lexical resource
Abstract In this paper, we present two French lexical resources, GLÀFF and PsychoGLÀFF. The former, automatically extracted from the collaborative online dictionary Wiktionary, is a large-scale, versatile lexicon that can be used in Natural Language Processing applications and linguistic studies. The latter, based on GLÀFF, is a lexicon specifically designed for psycholinguistic research. With more than 1.4 million entries, GLÀFF is of unprecedented size. It provides lemmas, main syntactic categories, inflectional features and phonemic transcriptions. PsychoGLÀFF contains additional information related to formal aspects of the lexicon and its distribution. It contains about 340,000 entries (120,000 lemmas) that are attested in corpora. We explain how the resources were created and compare them to other known resources in terms of coverage and quality. For PsychoGLÀFF, the comparison shows that it has an exceptionally large repertoire while offering comparable quality.
Session Lexicography and Corpus Linguistics
Keywords French lexicon; lexical resource for psycholinguistic studies; Wiktionary
BibTeX
@InProceedings{ELX2014-032,
author={Basilio Calderone and Nabil Hathout and Franck Sajous},
title={From {GLÀFF} to {PsychoGLÀFF}: a large psycholinguistics-oriented {French} lexical resource},
pages={431--446},
booktitle={Proceedings of the 16th EURALEX International Congress},
year={2014},
month={jul},
date={15-19},
address={Bolzano, Italy},
editor={Abel, Andrea and Vettori, Chiara and Ralli, Natascia},
publisher={EURAC research},
isbn={978-88-88906-97-3},
}