Extension of a Specialised Lexicon Using Specific Terminological Data

By November 17, 2016,
Author: Bruno Cartoni, Pierre Zweigenbaum
Title: Extension of a Specialised Lexicon Using Specific Terminological Data
Abstract: Status language planning has been one of the components of post-apartheid South Africa’s transformation project that has managed to attract widespread attention. In 1994 South Africa moved from its former official bilingual language policy to a new constitution that grants official status to 11 of the languages spoken in South Africa. However, 16 years down the line there is widespread disappointment with organized language planning and management by government-authorized agencies. The paper gives a brief analysis of terminology development in contemporary South Africa juxtaposed with a terminology development project at the micro level which, in Joshua Fishman’s words, was initiated from the perspective of ‘not leaving your language alone’.
The practice of translation is an age-old activity, but translation studies is a fairly 'new' academic discipline and hence its terminology is still in its infancy. Translation studies has been taught at South African higher education institutions for more than thirty years, but mainly through the medium of English and Afrikaans. The impetus for this project was therefore the identification of fresh needs for terminology development in this area, in order to facilitate the sustained development of specialized discourses in higher education. Terminology development is viewed as indispensable for creating and sustaining a dynamic environment for the use of South Africa’s official indigenous languages as a medium of instruction and ultimately for scientific progress.
The paper describes methods for acquiring lexical information to implement a ‘Unified Medical Lexicon for French’ (UMLF) that aims at being a reference resource for NLP in the medical domain. We address four issues of lexical acquisition in a specialised domain. First, to assess the ‘desired coverage’ of lexical information, we use a large collection of French terms as a reference resource for the medical domain sublanguage. The collection contains close to 300,000 terms organised around conceptual identifiers. Second, by looking through this large amount of terminological data, we highlight the different kinds of information that might be useful for typical terminological processing tasks, such as variant recognition. The terminological variation phenomena that are very frequent in these terms are of three kinds: graphemic, inflectional and derivational. Third, we propose a model for organising the lexical information. Most of this model is inspired by existing specialist lexicons, but special emphasis is put on derivational morphological information.
Finally, different kinds of acquisition methods are described for the two levels of linguistic description addressed here: inflectional and derivational morphological knowledge. These methods make it possible to acquire a substantial amount of lexical data. For inflectional knowledge, the full paradigm is recorded, to provide information about all the possible inflected forms of lexical units within terms. For derivational knowledge, specific derivation processes are targeted in order to handle particular term variations. The relevance of the gathered derivational information is also assessed.
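As an illustration of how the three kinds of variation named in the abstract (graphemic, inflectional, derivational) could be handled with such lexical information, here is a minimal sketch. It is not the UMLF model: the `LexicalEntry` class, its fields, the `fold`, `build_index` and `normalise` helpers, and the sample entries are all hypothetical and chosen only for the example.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class LexicalEntry:
    """Hypothetical lexical unit: full inflectional paradigm plus one derivational link."""
    lemma: str
    pos: str
    # Maps a feature label such as "f.pl" to the inflected form.
    paradigm: Dict[str, str] = field(default_factory=dict)
    # Lemma of the base word this unit is derived from, if any.
    derived_from: Optional[str] = None


def fold(form: str) -> str:
    """Graphemic normalisation: lower-case and strip diacritics."""
    decomposed = unicodedata.normalize("NFD", form.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_index(entries) -> Dict[str, str]:
    """Map every folded form to the folded base lemma used for matching."""
    index = {}
    for entry in entries:
        key = fold(entry.derived_from or entry.lemma)
        for form in [entry.lemma, *entry.paradigm.values()]:
            index[fold(form)] = key
    return index


def normalise(term: str, index: Dict[str, str]) -> Tuple[str, ...]:
    """Reduce each whitespace-separated token to its matching key (unknown tokens are only folded)."""
    return tuple(index.get(fold(tok), fold(tok)) for tok in term.split())


# Illustrative entries, not taken from the paper's data.
entries = [
    LexicalEntry("pression", "noun", {"sg": "pression", "pl": "pressions"}),
    LexicalEntry("artère", "noun", {"sg": "artère", "pl": "artères"}),
    LexicalEntry(
        "artériel", "adj",
        {"m.sg": "artériel", "f.sg": "artérielle",
         "m.pl": "artériels", "f.pl": "artérielles"},
        derived_from="artère",
    ),
]
index = build_index(entries)
```

With these assumed entries, `normalise("pression artérielle", index)` and `normalise("Pressions arterielles", index)` yield the same tuple, so a term and its graphemic or inflectional variant can be matched. Mapping the adjective to its base noun is only one possible way to exploit a derivational link; a real lexicon would record the type of derivation as well.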
Session: Lexicography for Specialised Languages – Terminology and Terminography
