A Quantitative Evaluation of Word Sketches

By November 17, 2016,
Pages 372-379
Author Adam Kilgarriff, Vojtěch Kovář, Simon Krek, Irena Srdanovic, Carole Tiberius
Title A Quantitative Evaluation of Word Sketches
Abstract A word sketch is an automatic corpus-derived summary of a word’s grammatical and collocational behaviour. Word sketches were first prepared in 1999 for the compilation of the Macmillan English Dictionary for Advanced Learners (Rundell 2002). They have since been integrated into the Sketch Engine corpus query tool (Kilgarriff et al. 2004), prepared for fifteen languages, and used on a large scale for lexicography by a number of publishers. We are frequently told how impressive they are and how little they miss - but we would like a more rigorous assessment.
We describe a formal, quantitative evaluation of word sketches, from a user perspective, for four languages (Dutch, English, Slovene, Japanese), with the critical question being ‘is the collocation suitable for inclusion in a published collocation dictionary’. For each language, we inspected twenty collocates for each of forty-two headwords. In each case two thirds or more of the collocations were of publishable quality.
Session Computational Lexicography and Lexicology
author = {Adam Kilgarriff and Vojtěch Kovář and Simon Krek and Irena Srdanovic and Carole Tiberius},
title = {A Quantitative Evaluation of Word Sketches},
pages = {372--379},
booktitle = {Proceedings of the 14th EURALEX International Congress},
year = {2010},
month = {jul},
date = {6-10},
address = {Leeuwarden/Ljouwert, The Netherlands},
editor = {Anne Dykstra and Tanneke Schoonheim},
publisher = {Fryske Akademy},
isbn = {978-90-6273-850-3},