Skema: A New Tool for Corpus-driven Lexicography

Page 523-528
Author Vit Baisa, Carole Tiberius, Elisabetta Ježek, Lut Colman, Costanza Marini, Emma Romani
Title Skema: A New Tool for Corpus-driven Lexicography
Abstract In this paper, we describe the development of Skema and its features. Skema [ˈskiːmə] is a new corpus pattern editor system which supports the manual annotation of concordance lines with user-defined labels (each concordance has its own set of labels) and the editing of the corresponding patterns in terms of slots, attributes, examples and other features following the lexicographic technique of Corpus Pattern Analysis. Skema is integrated into the web-based Sketch Engine and can be used by any user for annotating both preloaded and user corpora. Each annotation label is linked to the pattern structure (stored in JSON format) which can be easily customized to individual projects, a generic pattern structure (i.e. a list of user-defined attributes) being available by default. The paper illustrates the use of Skema in three specific projects, i.e. Woordcombinaties for Dutch verbs, Typed Predicate-Argument Structures for Italian Verbs (T-PAS) and its sister project for Croatian Verbs (CROATPAS).
Session Lexicography and Corpus Linguistics
Keywords corpus-driven lexicography, editor, pattern dictionary, Sketch Engine, corpus annotation, annotation schema
