Abstract |
With the rise of digital media in recent decades, many language-related discussions have found a home on various fora and social media such as Facebook, where users can join shared-interest groups to discuss language use, problems and resources. The posts in these groups are formulated by language users as a genuine response to a specific disruption in language use and offer an empirical starting point for studying language problems. We propose an automatic approach to extracting questions from language-related Facebook groups and describe the procedure in consecutive steps. We also address the issues of copyright, privacy and ethical constraints, and propose ways to overcome them. We demonstrate the extraction method on two Slovene language-related Facebook groups: Za vsaj približno pravilno rabo slovenščine and Društvo ljubiteljskih pravopisarjev in slovničarjev. Both groups allow users to discuss language-related problems and find answers to their questions within the community. Our first extraction from these groups yielded approximately 1,900 posts (written by approximately 500 users) and 13,000 comments (posted by more than 900 users), providing ample material that can be analyzed to reveal the users’ most frequent language problems. |
BibTeX |
@InProceedings{ELX2018-005, author={Jaka Čibej and Špela Arhar Holdt}, title={Researching Dictionary Needs of Language Users Through Social Media: A Semi-Automatic Approach}, pages={67--76}, booktitle={Proceedings of the XVIII EURALEX International Congress: Lexicography in Global Contexts}, year={2018}, month={jul}, date={17-21}, address={Ljubljana, Slovenia}, editor={Jaka Čibej and Vojko Gorjanc and Iztok Kosem and Simon Krek}, publisher={Ljubljana University Press, Faculty of Arts}, isbn={978-961-06-0097-8}, } |