A Quantitative Evaluation of Word Sketches

Author: Adam Kilgarriff, Vojtěch Kovář, Simon Krek, Irena Srdanovic, Carole Tiberius
Abstract: A word sketch is an automatic, corpus-derived summary of a word’s grammatical and collocational behaviour. Word sketches were first prepared in 1999 for the compilation of the Macmillan English Dictionary for Advanced Learners (Rundell 2002). They have since been integrated into the Sketch Engine corpus query tool (Kilgarriff et al. 2004), prepared for fifteen languages, and used on a large scale for lexicography by a number of publishers. We are frequently told how impressive they are and how little they miss, but we would like a more rigorous assessment.
We describe a formal, quantitative evaluation of word sketches from a user perspective for four languages (Dutch, English, Slovene, Japanese). The critical question was ‘is the collocation suitable for inclusion in a published collocation dictionary?’ For each language, we inspected twenty collocates for each of forty-two headwords (840 collocations per language). For all four languages, two thirds or more of the collocations were of publishable quality.
Session: Computational Lexicography and Lexicology
