Abstract |
A major project of the Institute for Dutch Lexicology is the Integrated Language Database of 8th-21st-Century Dutch (henceforth ILD). The ILD will consist of three components: a dictionary corpus component, a lexicon corpus component and a text corpus component. Its object is to facilitate synchronic and diachronic research on different aspects of the Dutch language and culture, and, in particular, to facilitate lexicological and lexicographical research. The three corpora will be interlinked and made accessible by means of a retrieval system. To guarantee optimal retrieval facilities, extensive encoding is required of the entries in both the dictionary and the lexicon corpus. In the text corpus, each text will have to be fully tagged with PoS and lemma. The texts will also need encoding of the text structure, the typography and some other textual elements. We will discuss the ILD encoding proposal for the text structure (3) and the typography (4) of the text corpus component. Our focus is on the guiding principles that have determined this proposal (2, 3.1 and 4.1). |
BibTeX |
@InProceedings{ELX02-074, author = {Katrien Depuydt and Tilly Dutilh-Ruitenberg}, title = {TEI-encoding for the Integrated Language Database of 8th-21st-Century Dutch}, pages = {683--688}, booktitle = {Proceedings of the 10th EURALEX International Congress}, year = {2002}, month = {aug}, date = {13-17}, address = {København, Denmark}, editor = {Anna Braasch and Claus Povlsen}, publisher = {Center for Sprogteknologi}, isbn = {87-90708-09-1}, } |