French Cross-disciplinary Scientific Lexicon: Extraction and Linguistic Analysis

November 23, 2016
Pages 355-366
Author Sylvain Hatier, Magdalena Augustyn, Thi Thu Hoai Tran, Rui Yan, Agnès Tutin, Marie-Paule Jacques
Title French Cross-disciplinary Scientific Lexicon: Extraction and Linguistic Analysis
Abstract This paper presents the work we carried out to extract and structure a specialized lexicon based on a corpus of French scientific articles in the humanities and social sciences. The characteristics of the targeted lexicon may be summarized as follows: it is not domain-related, inasmuch as it is shared by various disciplines; it serves to express the specific operations, name the objects and present the results of research processes. In this view, the lexicon studied here is genre-related. We designed this cross-disciplinary scientific lexicon (CSL) as a resource for several purposes: it may serve natural language processing, e.g. as a stoplist for automatic term identification, as well as foreign language teaching. Indeed, students and scholars in the sciences need to become familiar with the rhetoric of the research article, and thus need to master these words. We present here the two-stage creation of this lexicon: first, it was semi-automatically extracted from a corpus of 500 research articles spanning ten disciplines; second, it was manually structured to reflect the semantics and rhetoric of science. This structure takes into account the lexico-syntactic properties of CSL nouns, adjectives, verbs and adverbs. The resource will be freely available for academic purposes.
Session Lexicography and Corpus Linguistics
Keywords corpus linguistics; academic writing; open lexical resources; natural language processing
BibTeX
@InProceedings{ELX2016-038,
author={Sylvain Hatier and Magdalena Augustyn and Thi Thu Hoai Tran and Rui Yan and Agnès Tutin and Marie-Paule Jacques},
title={French Cross-disciplinary Scientific Lexicon: Extraction and Linguistic Analysis},
pages={355--366},
booktitle={Proceedings of the 17th EURALEX International Congress},
year={2016},
month={sep},
date={6-10},
address={Tbilisi, Georgia},
editor={Tinatin Margalitadze and George Meladze},
publisher={Ivane Javakhishvili Tbilisi University Press},
isbn={978-9941-13-542-2},
}