Abstract |
Automatic keyword extraction is now a mature language technology. It enables the annotation of large numbers of documents for content gathering, indexing, searching and, more generally, content identification. The reliability of results when processing documents in a multilingual environment, however, is still a challenge, particularly when documents are not limited to one specific semantic domain. The use of multi-term descriptors seems to be a good means of identifying the content. According to our previous evaluations (Panunzi et al. 2006a, 2006b), the availability of multi-term keywords improves performance over mono-term keywords by a relative factor of 100%. The LABLITA tool presented in this demo now works in a multilingual environment as well. The demo calculates on the fly the number of mono-term and multiword keywords of parallel documents in English, Italian, German, French and Spanish, and will allow the audience to judge: a) the enhancement brought by multiword keywords for the identification of content; and b) the comparability of the performance obtained by the tool when processing different languages. |
BibTeX |
@InProceedings{ELX08-031, author = {Alessandro Panunzi and Marco Fabbri and Massimo Moneglia}, title = {Multilingual Open Domain Key-word Extractor Proto-type}, pages = {463--468}, booktitle = {Proceedings of the 13th EURALEX International Congress}, year = {2008}, month = {jul}, date = {15-19}, address = {Barcelona, Spain}, editor = {Elisenda Bernal and Janet DeCesaris}, publisher = {Institut Universitari de Linguistica Aplicada, Universitat Pompeu Fabra}, isbn = {978-84-96742-67-3}, } |