Abstract |
We introduce COMO (Compositional Morphosyntactic Ontology), a classification of part-of-speech categories and their associated grammatical features, which aims to be valid across languages of very different typology. The work has been carried out within the context of the Oxford Global Languages programme, which has the goal of developing language knowledge for 100 languages, particularly those under-represented in the digital space. The project has two requirements: to be able to describe languages of different types while respecting their grammatical traditions, and to serve the two main use cases that define our typical work, namely the labelling of linguistic information in lexicographic products and the provision of support for language processing tools and corpus annotation processes. These requirements determined the conception and design of COMO, created as a reference model within a broader data architecture in order to address issues of syntactic and semantic interoperability. Our proposal builds on previous initiatives in the field that pursue the same goals, but incorporates different features in order to accommodate the requirements of the project. |
BibTeX |
@InProceedings{ELX2018-014, author={Roser Saurí and Ashleigh Alderslade and Richard Shapiro}, title={A Universal Classification of Lexical Categories and Grammatical Distinctions for Lexicographic and Processing Purposes}, pages={173--185}, booktitle={Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts}, year={2018}, month={jul}, date={17-21}, address={Ljubljana, Slovenia}, editor={Jaka Čibej and Vojko Gorjanc and Iztok Kosem and Simon Krek}, publisher={Ljubljana University Press, Faculty of Arts}, isbn={978-961-06-0097-8}, } |