TESAURVAI: Extraction, Annotation and Term Organization Tool

November 17, 2016
Pages 941-946
Author Jesús Cardenosa, Carolina Gallardo Pérez, Ángeles Maldonado-Martínez, Jorge Vergara
Title TESAURVAI: Extraction, Annotation and Term Organization Tool
Abstract TESAURVAI is a tool for extracting, annotating and organizing terms from a collection of digital documents. Its main contribution is the unification of a term extractor and a thesaurus builder in a single tool. The term extractor identifies terms, words and phrases in the input digital texts, which are then transferred to the thesaurus builder. TESAURVAI follows the international standards for the construction and management of thesauri and provides the following facilities. On the one hand, it is a tool for creating a thesaurus from scratch: it supports the extraction, creation, editing and annotation of terms, and it offers a user-friendly interface for establishing relations between terms and for performing basic or advanced term searches. On the other hand, it is a tool for managing several thesauri and for importing and exporting existing thesauri from and to text or XML files. Finally, TESAURVAI can build alphabetical, hierarchical and permuted indexes to be printed or exported as reports. TESAURVAI has been developed in Java and requires an external database to store the user's thesauri. The tool is compatible with any database manager for which a Java Database Connectivity (JDBC) driver is available, such as MySQL or PostgreSQL. This tool has been developed within the framework of the PATRILEX (HUM2005-07260/FILO) project, sponsored by the Spanish Ministry of Education. Currently, TESAURVAI is in a provisional version. A new version of the tool, which will be accessible on the Internet, will be available in July 2008.
Session 5. Lexicography for Specialised Languages - Terminology and Terminography
BibTeX
@InProceedings{ELX08-089,
author = {Jesús Cardenosa and Carolina Gallardo Pérez and Ángeles Maldonado-Martínez and Jorge Vergara},
title = {TESAURVAI: Extraction, Annotation and Term Organization Tool},
pages = {941--946},
booktitle = {Proceedings of the 13th EURALEX International Congress},
year = {2008},
month = {jul},
date = {15-19},
address = {Barcelona, Spain},
editor = {Elisenda Bernal and Janet DeCesaris},
publisher = {Institut Universitari de Lingüística Aplicada, Universitat Pompeu Fabra},
isbn = {978-84-96742-67-3},
}