Comparing Orthographies in Space and Time through Lexicographic Resources

Pages 159-172
Author Christian-Emil Smith Ore, Oddrun Grønvik
Title Comparing Orthographies in Space and Time through Lexicographic Resources
Abstract Many languages need a better factual basis to support computer-assisted analysis of language variation and diachronic change. The material collections for the scholarly dictionaries of Norway serve as a platform for exploring the development and variation of Bokmål, the Norwegian written standard derived from Danish and modified towards the Norwegian vernacular through orthographic reforms between 1901 and 2005. The development of modern Bokmål through usage should be analyzed by comparing corpora from different periods, each lemmatized according to the orthography current at the time. This requires building full form registers from time-bound orthographies. That work is under way through the digitization of orthographic dictionaries for Bokmål. The dictionaries are coordinated through the Dictionary Hotel, the electronic repository for retro-digitized dictionaries and dialect collections at the Norwegian Language Collections, Bergen. At the lexical item level, Bokmål and Nynorsk resources are coordinated through the Meta Dictionary, an electronic registry for the Norwegian lexicon. A common entry requires full identity in one headword form plus part of speech (POS). Preliminary results identify a core vocabulary for Bokmål of 6,900 lexical items, unchanged since 1938. More than 75,000 Meta Dictionary entries have a common identical form plus POS for Bokmål and Nynorsk. These numbers will increase once the Bokmål additions to the Meta Dictionary have been quality controlled.
Session DICTIONARY-MAKING PROCESS
Keywords dictionary, lexical item, full form register, computer assisted language analysis, corpus, lemmatizer, synchronic variation, diachronic change
BibTeX
@InProceedings{ELX2018-013,
author={Christian-Emil Smith Ore and Oddrun Grønvik},
title={Comparing Orthographies in Space and Time through Lexicographic Resources},
pages={159--172},
booktitle={Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts},
year={2018},
month={jul},
date={17-21},
address={Ljubljana, Slovenia},
editor={Jaka Čibej and Vojko Gorjanc and Iztok Kosem and Simon Krek},
publisher={Ljubljana University Press, Faculty of Arts},
isbn={978-961-06-0097-8}
}