Wikipedia Vandal Early Detection: from User Behavior to User Embedding
Wikipedia Vandal Early Detection: from User Behavior to User Embedding is a scientific work related to Wikipedia quality, published in 2017 and written by Shuhan Yuan, Panpan Zheng, Xintao Wu and Yang Xiang.
Wikipedia is the largest online encyclopedia that allows anyone to edit articles. In this paper, the authors propose using deep learning to detect vandals based on their edit history. In particular, they develop a multi-source long short-term memory network (M-LSTM) that models user behavior from several aspects of each edit, including whether the edit was reverted, the title of the edited page, and the page's categories. With M-LSTM, each user can be encoded into a low-dimensional real-valued vector, called a user embedding. Because M-LSTM is a sequential model, it updates the user embedding each time the user commits a new edit. The authors can therefore predict dynamically whether a user is benign or a vandal, based on the up-to-date user embedding. Furthermore, the user embeddings are crucial for discovering collaborative vandals. Code and data related to this chapter are available at: https://bitbucket.org/bookcold/vandal_detection.
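The sequential update described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the class `TinyLSTMCell`, the helper names, the vector sizes, the random untrained weights, and the classifier weights are all hypothetical, and each edit is assumed to be represented by concatenating a reverted flag with small made-up vectors for the page title and category. Comparing two users by cosine similarity is likewise only one assumed way of using embeddings to look for collaborating vandals.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTMCell:
    """A single LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.W = {k: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for k in "ifog"}
        self.b = {k: [0.0] * hidden_size for k in "ifog"}

    def step(self, x, h, c):
        z = x + h  # concatenate the edit features with the previous hidden state

        def gate(name, act):
            return [act(sum(w * v for w, v in zip(row, z)) + b)
                    for row, b in zip(self.W[name], self.b[name])]

        i, f, o = gate("i", sigmoid), gate("f", sigmoid), gate("o", sigmoid)
        cand = gate("g", math.tanh)
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, cand)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


def embed_user(edits, cell):
    """Return the user embedding (hidden state) after each successive edit."""
    h = [0.0] * cell.hidden_size
    c = [0.0] * cell.hidden_size
    history = []
    for edit in edits:
        # Assumed multi-source input: reverted flag + title vector + category vector.
        x = [float(edit["reverted"])] + edit["title_vec"] + edit["category_vec"]
        h, c = cell.step(x, h, c)
        history.append(h)
    return history


def vandal_probability(embedding, weights, bias=0.0):
    """Logistic classifier on top of the current user embedding."""
    return sigmoid(sum(w * e for w, e in zip(weights, embedding)) + bias)


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical edit histories for two users.
edits = [
    {"reverted": 0, "title_vec": [0.2, -0.1], "category_vec": [0.5, 0.0]},
    {"reverted": 1, "title_vec": [0.9, 0.3], "category_vec": [-0.4, 0.1]},
]
other_edits = [
    {"reverted": 1, "title_vec": [0.8, 0.2], "category_vec": [-0.3, 0.2]},
]

cell = TinyLSTMCell(input_size=5, hidden_size=4)
embeddings = embed_user(edits, cell)
other_embeddings = embed_user(other_edits, cell)
clf_weights = [0.1, -0.2, 0.3, 0.05]  # hypothetical, untrained

# A prediction is available after every edit, not only at the end of the history.
probabilities = [vandal_probability(e, clf_weights) for e in embeddings]
for step, p in enumerate(probabilities, start=1):
    print(f"after edit {step}: P(vandal) = {p:.3f}")

similarity = cosine(embeddings[-1], other_embeddings[-1])
print(f"cosine similarity between the two users: {similarity:.3f}")
```

Because the weights are untrained, the printed probabilities carry no meaning; the point is only that the embedding, and hence the prediction, is refreshed after every edit, which is what makes early detection possible.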