Computational linguistic tools for semi-automatic corpus-based updating of dictionaries

Ulrich Heid; Wolfgang Worsch; Stefan Evert; Vincent Docherty; Matthias Wermke

By admyn | November 17, 2016 | Euralex 2000, Publications

Page	183-195
Author	Ulrich Heid, Wolfgang Worsch, Stefan Evert, Vincent Docherty, Matthias Wermke
Title	Computational linguistic tools for semi-automatic corpus-based updating of dictionaries
Abstract	We will demonstrate an interface which allows the lexicographer to view the results of an automatic comparison of lexicographic descriptions from existing German dictionaries with corpus data. The second part of the paper will discuss in detail the use made of the raw material in the recent update of Langenscheidt’s Großwörterbuch Deutsch – Englisch, Der kleine Muret-Sanders. The examples in the online demonstration come from work on entries for headwords with the initial letter “T” in Duden. Das große Wörterbuch der deutschen Sprache (8 vols.; Duden GWDS) and from the German part of Langenscheidts Handwörterbuch Deutsch-Englisch (HWB). Both have been compared with data extracted from large newspaper corpora. The interface uses a standard web browser to display lexical data. The demonstration will be a guided tour of the data collection from the lexicographic point of view. The first part of this paper provides the metalexicographic baseline, a short summary of the technology used to develop the data collection, and a few examples of the types of data made available. The second part deals with the practical lexicographic use of the data collection in the update of Der kleine Muret-Sanders.
Session	PART 4 - Corpus-based Dictionary Making
Keywords
BibTex	@InProceedings{ELX00-023, author = {Ulrich Heid and Wolfgang Worsch and Stefan Evert and Vincent Docherty and Matthias Wermke}, title = {Computational linguistic tools for semi-automatic corpus-based updating of dictionaries}, pages = {183-195}, booktitle = {Proceedings of the 9th EURALEX International Congress}, year = {2000}, month = {aug}, date = {8-12}, address = {Stuttgart, Germany}, editor = {Ulrich Heid and Stefan Evert and Egbert Lehmann and Christian Rohrer}, publisher = {Institut für Maschinelle Sprachverarbeitung}, isbn = {3-00-006574-1}, }
