Towards a corpus-based dictionary of German noun-verb collocations

Page: 301-312
Author: Ulrich Heid
Title: Towards a corpus-based dictionary of German noun-verb collocations
Abstract: We describe our attempts to automatically extract raw material for a dictionary of German noun-verb collocations from large corpora of newspaper text. Such a dictionary should be about collocations and it should include a description of their linguistic properties, rather than listing the mere lexical cooccurrence. Since most statistical collocation finding tools do not provide other than lexical cooccurrence information, we first use symbolic extraction tools, based on a regular grammar over part-of-speech tagged and lemmatized text, and we use statistical filters thereafter. We first list the types of information which should be contained in a collocational dictionary for Natural Language Processing, then sketch our extraction methods and finally discuss and illustrate our initial results.
Session: PART 3 - Lexical Combinatorics
Keywords: Collocations, text corpora, semi-automatic lexical acquisition.
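
The two-stage approach named in the abstract (symbolic extraction over part-of-speech tagged, lemmatized text, followed by a statistical filter) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration only: the coarse tag names, the pattern "a noun, at most two intervening tokens, then a verb", the use of pointwise mutual information as the filter, and the thresholds. The abstract does not specify the actual grammar or statistics used in the paper.

```python
import math
import re
from collections import Counter

# Assumed coarse tagset: every tag other than NOUN/VERB is mapped to "O".
TAG_CODES = {"NOUN": "N", "VERB": "V"}

# Hypothetical regular pattern over the tag string: a noun, then up to two
# tokens that are neither noun nor verb, then a verb.
NV_PATTERN = re.compile(r"N[^NV]{0,2}V")


def extract_pairs(sentence):
    """Symbolic step: return (noun lemma, verb lemma) candidate pairs.

    `sentence` is a list of (lemma, tag) tuples, one per token.
    """
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in sentence)
    pairs = []
    for match in NV_PATTERN.finditer(codes):
        noun = sentence[match.start()][0]    # first token of the match
        verb = sentence[match.end() - 1][0]  # last token of the match
        pairs.append((noun, verb))
    return pairs


def filter_by_pmi(pairs, min_count=2, min_pmi=0.0):
    """Statistical step: keep frequent pairs with PMI >= min_pmi.

    PMI is computed over the extracted candidate pairs only, which is a
    simplification chosen for this sketch.
    """
    pair_counts = Counter(pairs)
    noun_counts = Counter(noun for noun, _ in pairs)
    verb_counts = Counter(verb for _, verb in pairs)
    total = len(pairs)
    kept = {}
    for (noun, verb), count in pair_counts.items():
        if count < min_count:
            continue
        pmi = math.log2(count * total / (noun_counts[noun] * verb_counts[verb]))
        if pmi >= min_pmi:
            kept[(noun, verb)] = pmi
    return kept


if __name__ == "__main__":
    # Toy, hand-made input; not taken from the paper's corpora.
    corpus = (
        [[("Frage", "NOUN"), ("stellen", "VERB")]] * 3
        + [[("Frage", "NOUN"), ("haben", "VERB")]]
        + [[("Haus", "NOUN"), ("haben", "VERB")]] * 2
    )
    candidates = [pair for sent in corpus for pair in extract_pairs(sent)]
    for pair, score in sorted(filter_by_pmi(candidates).items()):
        print(pair, round(score, 3))
```

Running the symbolic step first means the statistical filter only scores pairs that already satisfy a syntactic pattern, which is the ordering the abstract describes.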
BibTeX
@InProceedings{ELX98_1-036,
author = {Ulrich Heid},
title = {Towards a corpus-based dictionary of German noun-verb collocations},
pages = {301--312},
booktitle = {Proceedings of the 8th EURALEX International Congress},
year = {1998},
month = aug,
date = {4-8},
address = {Liège, Belgium},
editor = {Thierry Fontenelle and Philippe Hiligsmann and Archibald Michiels and André Moulin and Siegfried Theissen},
publisher = {Euralex},
isbn = {2-87233-091-7},
}