MedLex+: An Integrated Corpus-Lexicon Medical Workbench for Swedish

Pages 703-712
Authors Dimitrios Kokkinakis, Maria Toporowska Gronostaj
Abstract This paper reports on the development of MedLex+, a medical corpus-lexicon workbench for Swedish. The project, which is still under active development, has been running for some years at the Department of Swedish Language at Göteborg University. At present, the workbench incorporates: (1) an annotated collection of medical texts comprising 20 million tokens and 45,000 documents; (2) a number of language processing tools, including tools for collocation extraction, compound segmentation and thesaurus-based semantic annotation; and (3) a lexical database of medical terms containing 5,000 entries. MedLex+ is a multifunctional lexical resource whose structural design and content make it easy to query. The workbench is intended to support lexicographers compiling lexicons as well as lexicon users with varying degrees of familiarity with the medical domain. MedLex+ can also assist researchers working on lexical semantics or on natural language processing (NLP) applications that focus on medical language. The linguistically and semantically annotated medical texts, in combination with a set of smart queries, turn the corpora into a rich repository of semasiological and onomasiological knowledge about medical terms and their linguistic, lexical and pragmatic properties. These properties are recorded in the lexical database with a cognitive profile. The MedLex+ workbench appears to offer constructive help in many different lexical tasks.
Session 3. Reports on Lexicographical and Lexicological Projects
