A Description of Texts in a Corpus: ‘Virtual’ and ‘Real’ Corpora

By November 17, 2016,
Pages 390-402
Author Paul Holmes-Higgin, Khurshid Ahmad, Syed Sibte Raza Abidi
Title A Description of Texts in a Corpus: ‘Virtual’ and ‘Real’ Corpora
Abstract The extensive use of computer-based corpora for a range of language studies has led to a proliferation of ways in which texts within an individual corpus are organised. Typically, the organisation reflects the immediate needs of a group of well-motivated users, such as lexicographers or terminologists. As a result, subsequent generations of corpus users are forced to use a classification of texts based on categories they may not be familiar with, may not be comfortable with, or both. There is an urgent need for a facility in corpus management systems that allows users to apply their own classification system to categorise texts in a corpus. That is, users should be able to choose, for example, their own style, register, field, time-span and author attributes for generating word lists, concordances, contextual examples and so on. A component of a lexicography and terminology management system, System Quirk, is described that can support such a virtual organisation of texts within a corpus.
Session PART 3 - Lexicographical and lexicological projects
author = {Paul Holmes-Higgin and Khurshid Ahmad and Syed Sibte Raza Abidi},
title = {A Description of Texts in a Corpus: 'Virtual' and 'Real' Corpora},
pages = {390--402},
booktitle = {Proceedings of the 6th EURALEX International Congress},
year = {1994},
month = {aug-sep},
date = {30-3},
address = {Amsterdam, the Netherlands},
editor = {Willy Martin and Willem Meijs and Margreet Moerland and Elsemiek ten Pas and Piet van Sterkenburg and Piek Vossen},
publisher = {Euralex},
isbn = {90-900-7537-2},