Towards the Automatic Generation of a Pattern-Based Dictionary of Spanish Verbs

Irene Renau; Rogelio Nazar; Daniel Mora Melanchthon

By Iztok Kosem | December 19, 2024 | Euralex 2024, Publications

Page	367-383
Author	Irene Renau, Rogelio Nazar, Daniel Mora Melanchthon
Title	Towards the Automatic Generation of a Pattern-Based Dictionary of Spanish Verbs
Abstract	Corpus Pattern Analysis, CPA (Hanks, 2004a; 2004b; 2013), is a technique for identifying the local semantic and syntactic context of a word and mapping it to the word's meanings. For verbs, a pattern consists essentially of the argument structure, with each argument labelled with a semantic type. CPA is used in several dictionary projects and allows systematic corpus analysis; however, it is extremely time-consuming. In this paper, we present a method for the automatic identification of Spanish verb patterns in corpora. We used a syntactic parser for dependency analysis (Stanza) and a named entity recognition (NER) tagger from the Flair NLP framework; for common nouns, we implemented a semantic tagger and a word sense disambiguation method, both created for the task. All resources were combined to extract CPA verb patterns. The method performs better than previous attempts and can contribute to more efficient pattern-based lexicography.
Session	Talk
Keywords	argument structure; Corpus Pattern Analysis; pattern-based lexicography; semantic tagging; word sense disambiguation
BibTex	@inproceedings{euralex_2024_paper_29,
  address = {Cavtat},
  title = {Towards the Automatic Generation of a Pattern-Based Dictionary of Spanish Verbs},
  isbn = {978-953-7967-77-2},
  shorttitle = {Euralex 2024},
  url = {},
  language = {eng},
  booktitle = {Lexicography and Semantics. Proceedings of the XXI EURALEX International Congress},
  publisher = {Institut za hrvatski jezik},
  author = {Renau, Irene and Nazar, Rogelio and Melanchthon, Daniel Mora},
  editor = {Despot, Kristina Š. and Ostroški Anić, Ana and Brač, Ivana},
  year = {2024},
  pages = {367-383}
}
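
To illustrate the idea summarised in the abstract, the sketch below shows how a CPA-style verb pattern might be assembled once a sentence has already been dependency-parsed and NER-tagged. It is not the authors' implementation: the `Token` class, the `SEMANTIC_TYPES` and `NER_TO_TYPE` tables, the fallback type `Entity`, and the function names are all hypothetical, and the token annotations are written by hand rather than produced by Stanza or Flair. Word sense disambiguation is omitted entirely. The double-bracket notation for semantic types follows common CPA practice.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Token:
    """One parsed token; fields mimic Universal Dependencies output (assumed)."""
    idx: int                    # 1-based position in the sentence
    lemma: str
    upos: str                   # universal part-of-speech tag
    head: int                   # idx of the syntactic head, 0 for the root
    deprel: str                 # dependency relation to the head
    ner: Optional[str] = None   # named entity label, if any


# Hypothetical toy resources standing in for a semantic tagger.
SEMANTIC_TYPES: Dict[str, str] = {"niño": "Human", "manzana": "Food"}
NER_TO_TYPE: Dict[str, str] = {"PER": "Human", "ORG": "Institution", "LOC": "Location"}
ARGUMENT_RELATIONS = ("nsubj", "obj", "iobj")


def semantic_type(tok: Token) -> str:
    """Prefer the NER label; otherwise look up the lemma; fall back to 'Entity'."""
    if tok.ner in NER_TO_TYPE:
        return NER_TO_TYPE[tok.ner]
    return SEMANTIC_TYPES.get(tok.lemma, "Entity")


def extract_pattern(tokens: List[Token], verb_lemma: str) -> Optional[str]:
    """Build a pattern string from the direct arguments of the given verb."""
    verb = next(
        (t for t in tokens if t.upos == "VERB" and t.lemma == verb_lemma), None
    )
    if verb is None:
        return None
    # Map each argument relation attached to the verb to a semantic type.
    slots: Dict[str, str] = {}
    for tok in tokens:
        if tok.head == verb.idx and tok.deprel in ARGUMENT_RELATIONS:
            slots[tok.deprel] = semantic_type(tok)
    parts: List[str] = []
    if "nsubj" in slots:
        parts.append(f"[[{slots['nsubj']}]]")
    parts.append(verb_lemma)
    for rel in ("obj", "iobj"):
        if rel in slots:
            parts.append(f"[[{slots[rel]}]]")
    return " ".join(parts)


# Hand-annotated parse of "El niño come una manzana".
SENTENCE = [
    Token(1, "el", "DET", 2, "det"),
    Token(2, "niño", "NOUN", 3, "nsubj"),
    Token(3, "comer", "VERB", 0, "root"),
    Token(4, "uno", "DET", 5, "det"),
    Token(5, "manzana", "NOUN", 3, "obj"),
]

print(extract_pattern(SENTENCE, "comer"))  # [[Human]] comer [[Food]]
```

In a real pipeline the token annotations would come from the parser and taggers, and patterns extracted from many concordance lines would presumably be grouped and counted per verb before being proposed as dictionary entries.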