Multilingual Open Domain Key-word Extractor Proto-type

By November 17, 2016,
Pages 463-468
Author Alessandro Panunzi, Marco Fabbri, Massimo Moneglia
Title Multilingual Open Domain Key-word Extractor Proto-type
Abstract Automatic keyword extraction is now a mature language technology. It enables the annotation of large numbers of documents for content gathering, indexing, searching and, more generally, content identification. The reliability of results when processing documents in a multilingual environment, however, is still a challenge, particularly when documents are not limited to one specific semantic domain. The use of multi-term descriptors seems to be a good means of identifying content. According to our previous evaluations (Panunzi et al. 2006a, 2006b), the availability of multi-term keywords increases performance over mono-term keywords by a relative factor of 100%. The LABLITA tool presented in this demo now works in a multilingual environment as well. The demo calculates on the fly the number of mono-term and multiword keywords in parallel documents in English, Italian, German, French and Spanish, and will allow the audience to judge: a) the improvement brought by multiword keywords to the identification of content; and b) the comparability of the tool's performance across languages.
Session 1. Computational Lexicography and Lexicology
author = {Alessandro Panunzi and Marco Fabbri and Massimo Moneglia},
title = {Multilingual Open Domain Key-word Extractor Proto-type},
pages = {463--468},
booktitle = {Proceedings of the 13th EURALEX International Congress},
year = {2008},
month = jul,
date = {15-19},
address = {Barcelona, Spain},
editor = {Elisenda Bernal and Janet DeCesaris},
publisher = {Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra},
isbn = {978-84-96742-67-3},