The Project of Korpus 2000 Going Public

By November 17, 2016
Pages 291-299
Authors Mette Skovgaard Andersen, Helle Asmussen, Jørg Asmussen
Abstract Among experts, corpora have become widely accepted and appreciated as an indispensable resource for lexicographic and NLP purposes. Laymen (or non-experts), however, seem to know very little about publicly available corpora and the advantages of using them in conjunction with dictionaries and as a source of linguistic inspiration. Thus, in Denmark, the use of corpora has until now been limited to a small group of people with specific linguistic interests. The Society for Danish Language and Literature, DSL, has a long tradition of creating and using corpora for lexicographic purposes, for instance in the creation of The Danish Dictionary, which will be published in 2002-2003. The present paper discusses some aspects of a corpus project at DSL called Korpus 2000. The project aims to create a relatively balanced corpus of general text from the years 1998-2002, documenting Danish around the turn of the millennium. Korpus 2000 will be made publicly available on the internet, and one of the main purposes of the project is to increase laymen's awareness of the advantages of corpora. This paper focuses on designing a corpus, planning a corpus layout, and presenting the project with this target group of non-experts in mind.
Session Reports on Lexicographical and Lexicological Projects
