Working with the web as a source for dictionaries of informal vocabulary

By November 17, 2016,
Pages 357-363
Author Håkan Jansson
Title Working with the web as a source for dictionaries of informal vocabulary
Abstract Informal vocabulary, e.g. slang, jargon and other forms of expression particular to small or closed groups, is usually suppressed in writing that has passed through an editorial process, with at least one important exception: the dialogue in works of fiction. This type of vocabulary has therefore not been readily available for lexicon-making, at least until recent years. The constant stream of linguistic diversity on the Internet has now given us new possibilities to tap into the flow of colloquial and informal language.
The aim of this presentation is, first, to give a brief account of how the Internet could be ‘harvested’ to create corpora that include substantial amounts of informal language and, second, to show how these corpora (in this case Swedish and Icelandic) can be used to gather candidate headwords with informal markings such as coll., slang, and the like.
Session Computational Lexicography and Lexicology
BibTeX
@InProceedings{ELX10-023,
author = {H{\aa}kan Jansson},
title = {Working with the web as a source for dictionaries of informal vocabulary},
pages = {357--363},
booktitle = {Proceedings of the 14th EURALEX International Congress},
year = {2010},
month = jul,
date = {6-10},
address = {Leeuwarden/Ljouwert, The Netherlands},
editor = {Anne Dykstra and Tanneke Schoonheim},
publisher = {Fryske Akademy},
isbn = {978-90-6273-850-3},
}