By November 17, 2016,
Author: Karin Cavallin
Abstract: Many areas of linguistics that use corpora as their main data have benefited from research in natural language processing (NLP). Apart from a few recent studies such as Sagi et al. (2009) and Rohrdantz et al. (2011), and tools such as the Google Ngram Viewer (Michel et al. 2011), the field of semantic change seems to have received little attention in NLP. This paper describes some first steps in viewing semantic change in terms of distributional semantics, with a computational and linguistically motivated approach. A method is developed that uses parsing, lemmatization and part-of-speech information to describe semantic behavior and to track semantic change over time. In distributional semantics, meaning is characterized with respect to context. This idea derives from Firth (1957) and is formulated according to ‘the distributional hypothesis’ of Harris (1968). Whereas most approaches to statistical semantics use some kind of vector analysis based on n-grams, distribution is presented here as statistically ranked lists of verb-object constructions, that is, ‘lexical sets’. A lexical set is more focused than n-grams and can be seen as the essential minimal co-occurrence information for a given word, which facilitates manual analysis.
Keywords: lexical sets, semantic change, language technology
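
The idea of a lexical set as a ranked list of verb-object constructions, compared across time periods, can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the corpus has already been parsed and lemmatized into (period, verb lemma, object lemma) triples, and it ranks verbs by raw co-occurrence count as a stand-in for whatever statistical measure the paper uses. The toy data and the names `TRIPLES`, `lexical_sets` and `new_verbs` are invented for illustration.

```python
from collections import Counter, defaultdict

# Toy (period, verb lemma, object lemma) triples. Invented for illustration;
# they stand in for the output of a parser plus lemmatizer.
TRIPLES = [
    ("1900s", "catch", "mouse"),
    ("1900s", "catch", "mouse"),
    ("1900s", "kill", "mouse"),
    ("1900s", "catch", "fish"),
    ("2000s", "click", "mouse"),
    ("2000s", "click", "mouse"),
    ("2000s", "catch", "mouse"),
    ("2000s", "click", "button"),
]


def lexical_sets(triples, noun):
    """Return {period: [(verb, count), ...]} for verbs taking `noun` as object.

    Verbs are ranked by descending count, with ties broken alphabetically.
    Raw counts are an assumption made here for simplicity.
    """
    counts = defaultdict(Counter)
    for period, verb, obj in triples:
        if obj == noun:
            counts[period][verb] += 1
    return {
        period: sorted(c.items(), key=lambda kv: (-kv[1], kv[0]))
        for period, c in counts.items()
    }


def new_verbs(sets_by_period, earlier, later):
    """Verbs in the later lexical set that are absent from the earlier one."""
    seen = {verb for verb, _ in sets_by_period[earlier]}
    return [verb for verb, _ in sets_by_period[later] if verb not in seen]


sets_by_period = lexical_sets(TRIPLES, "mouse")
for period in sorted(sets_by_period):
    print(period, sets_by_period[period])
print("new in 2000s:", new_verbs(sets_by_period, "1900s", "2000s"))
```

Because each lexical set is a short ranked list rather than a high-dimensional vector, a verb that newly enters the list for a noun can be inspected by hand, which is the kind of manual analysis the abstract refers to.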
