A Description of Texts in a Corpus: ‘Virtual’ and ‘Real’ Corpora

Author: Paul Holmes-Higgin, Khurshid Ahmad, Syed Sibte Raza Abidi
Abstract: The extensive use of computer-based corpora for a range of language studies has led to a proliferation of ways in which the texts within an individual corpus are organised. Typically, the organisation reflects the immediate needs of a group of well-motivated users, such as lexicographers or terminologists. As a result, subsequent generations of corpus users are forced to use a classification of texts based on categories that they may not be familiar with, may not be comfortable with, or both. There is an urgent need for a facility in corpus management systems that allows users to apply their own classification system to the texts in a corpus. That is, users should be able to choose, for example, their own style, register, field, time-span and author attributes for generating word lists, concordances, contextual examples and so on. A component of a lexicography and terminology management system, System Quirk, is described that can support such a virtual organisation of texts within a corpus.
Session: PART 3 - Lexicographical and lexicological projects
