Abstract |
Morphosyntactic lexica are an essential resource for natural language processing. Many exist, and some are freely available for research. Yet many organizations still build their own lexica, even for languages with available resources. In this paper, we present techniques for producing lexica more efficiently. First, the format of the lexicon matters. We use a very simple format that associates each lemma with an inflection rule, avoiding dozens of entries for a single lemma. Second, the linguist must describe some basic elements: the tag list, the function words, and the inflection rules. Third, a dedicated guesser makes completing the lexicon easier. We describe two ways of adding entries to the lexicon using a guesser that assigns either a lemma and an inflection rule to a word, or an inflection rule to a lemma. |
BibTeX |
@InProceedings{ELX08-015, author = {Claude de Loupy and Sandra Gonçalves}, title = {Aide à la construction de lexiques morphosyntaxiques}, pages = {331--337}, booktitle = {Proceedings of the 13th EURALEX International Congress}, year = {2008}, month = {jul}, date = {15-19}, address = {Barcelona, Spain}, editor = {Elisenda Bernal and Janet DeCesaris}, publisher = {Institut Universitari de Linguistica Aplicada, Universitat Pompeu Fabra}, isbn = {978-84-96742-67-3}, } |