Token Level Code-Switching Detection Using Wikipedia as a Lexical Resource



Authors: Daniel Claeser, Dennis Felske, Samantha Kent
Publication date: 2017
DOI: 10.1007/978-3-319-73706-5_16

Token Level Code-Switching Detection Using Wikipedia as a Lexical Resource is a scientific work related to Wikipedia quality, published in 2017 and written by Daniel Claeser, Dennis Felske and Samantha Kent.

Overview

The authors present a novel lexicon-based classification approach for detecting code-switching on Twitter at the token level. The main aim is to develop a simple lexical look-up classifier based on word frequency information retrieved from Wikipedia. The classifier is evaluated on three language pairs: Spanish-English, Dutch-English, and German-Turkish. The results indicate that the figures for Spanish-English are competitive with current state-of-the-art classifiers, even though the approach is simple and relies solely on word frequency information.
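
The general idea of a frequency-based lexical look-up can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`relative_frequencies`, `classify_token`, `tag_tweet`), the toy corpora standing in for Wikipedia text, the whitespace tokenization, and the tie-breaking and fallback rules are all assumptions made for illustration. Each token is assigned to the language whose word list gives it the highest relative frequency.

```python
from collections import Counter


def relative_frequencies(tokens):
    """Build a word -> relative frequency table from a list of tokens."""
    counts = Counter(t.lower() for t in tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def classify_token(token, lexicons, default="unknown"):
    """Return the language whose lexicon gives the token the highest
    relative frequency, or `default` if no lexicon contains it.

    Assumption: exact ties between languages go to whichever language
    comes first in `lexicons`; a real system would need a better rule.
    """
    word = token.lower()
    scores = {lang: lex.get(word, 0.0) for lang, lex in lexicons.items()}
    best_lang = max(scores, key=scores.get)
    if scores[best_lang] == 0.0:
        return default
    return best_lang


def tag_tweet(text, lexicons):
    """Label every whitespace-separated token with a language tag."""
    return [(tok, classify_token(tok, lexicons)) for tok in text.split()]


# Toy corpora standing in for Wikipedia text (hypothetical data).
toy_corpora = {
    "en": "the cat is on the table and the dog is here".split(),
    "es": "el gato está en la mesa y el perro está aquí".split(),
}
lexicons = {lang: relative_frequencies(toks) for lang, toks in toy_corpora.items()}

tagged = tag_tweet("el gato is on la mesa", lexicons)
for token, lang in tagged:
    print(token, lang)
```

In this sketch a change of label between adjacent tokens marks a candidate code-switch point. Tokens absent from every lexicon are labelled `unknown` rather than guessed.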