Towards a corpus-based dictionary of German noun-verb collocations

By November 17, 2016,
Page 301-312
Author Ulrich Heid
Title Towards a corpus-based dictionary of German noun-verb collocations
Abstract We describe our attempts to automatically extract raw material for a dictionary of German noun-verb collocations from large corpora of newspaper text. Such a dictionary should describe the linguistic properties of collocations rather than merely list lexical cooccurrences. Since most statistical collocation-finding tools provide nothing beyond lexical cooccurrence information, we first use symbolic extraction tools, based on a regular grammar over part-of-speech tagged and lemmatized text, and apply statistical filters thereafter. We first list the types of information that a collocational dictionary for Natural Language Processing should contain, then sketch our extraction methods, and finally discuss and illustrate our initial results.
Session PART 3 - Lexical Combinatorics
Keywords Collocations, text corpora, semi-automatic lexical acquisition.
BibTeX
@InProceedings{ELX98_1-036,
author = {Ulrich Heid},
title = {Towards a corpus-based dictionary of German noun-verb collocations},
pages = {301--312},
booktitle = {Proceedings of the 8th EURALEX International Congress},
year = {1998},
month = aug,
date = {4-8},
address = {Liège, Belgium},
editor = {Thierry Fontenelle and Philippe Hiligsmann and Archibald Michiels and André Moulin and Siegfried Theissen},
publisher = {Euralex},
isbn = {2-87233-091-7},
}