Cross-Language Learning from Bots and Users to Detect Vandalism on Wikipedia



Authors: Khoi-Nguyen Tran, Peter Christen
Publication date: 2015
DOI: 10.1109/TKDE.2014.2339844

Cross-Language Learning from Bots and Users to Detect Vandalism on Wikipedia is a scientific work related to Wikipedia quality, published in 2015 and written by Khoi-Nguyen Tran and Peter Christen.

Overview

Vandalism, the malicious modification of articles, is a serious problem for open-access encyclopedias such as Wikipedia. Counter-vandalism bots are changing the way Wikipedia identifies and bans vandals, but their contributions are often neither considered nor discussed. In this paper, the authors propose novel text features that capture the invariants of vandalism across five languages, in order to learn from and compare the contributions of bots and users in identifying vandalism. The features are computationally efficient, highlight the contributions of bots and users, and generalize across languages. The authors evaluate the proposed features through classification performance on revisions from five Wikipedia language editions, totaling over 500 million revisions of over nine million articles. For comparison, they also evaluate these features on the small PAN Wikipedia vandalism data sets used in previous research, which contain approximately 62,000 revisions. The results show differences in feature performance between the PAN data sets and the full Wikipedia data set. With appropriate text features, vandalism bots can be effective across different languages while learning from only one language. The authors' ultimate aim is to build the next generation of vandalism detection bots, based on machine learning approaches that work effectively across many languages.
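
To make the idea of language-independent text features more concrete, the following is a minimal Python sketch. It does not reproduce the feature set from the paper: the function name `text_features` and the four features (size change, uppercase ratio, digit ratio, longest repeated-character run) are assumptions chosen for illustration because they can be computed without any language-specific dictionary or tokenizer.

```python
from itertools import groupby


def text_features(old_text: str, new_text: str) -> dict:
    """Compute a few language-independent features of one edit.

    Hypothetical illustrative features, NOT the feature set from the paper.
    """
    letters = [c for c in new_text if c.isalpha()]
    n_chars = len(new_text)
    return {
        # Change in text size, in characters.
        "size_delta": len(new_text) - len(old_text),
        # Share of letters that are uppercase.
        "upper_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
        # Share of all characters that are digits.
        "digit_ratio": sum(c.isdigit() for c in new_text) / n_chars if n_chars else 0.0,
        # Length of the longest run of one repeated character, e.g. "!!!!!".
        "longest_run": max((len(list(g)) for _, g in groupby(new_text)), default=0),
    }


if __name__ == "__main__":
    # The same function is applied unchanged to edits in two languages.
    print(text_features("The cat sat.", "THE CAT SUCKS!!!!!"))
    print(text_features("Die Katze schläft.", "DIE KATZE IST DOOF!!!!!"))
```

Because none of these features depend on vocabulary, a classifier trained on feature vectors from one language edition could, in principle, be applied to another, which is the kind of cross-language transfer the overview describes.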

Embed

Wikipedia Quality

Tran, Khoi-Nguyen; Christen, Peter. (2015). "[[Cross-Language Learning from Bots and Users to Detect Vandalism on Wikipedia]]". DOI: 10.1109/TKDE.2014.2339844.

English Wikipedia

{{cite journal |last1=Tran |first1=Khoi-Nguyen |last2=Christen |first2=Peter |title=Cross-Language Learning from Bots and Users to Detect Vandalism on Wikipedia |date=2015 |doi=10.1109/TKDE.2014.2339844 |url=https://wikipediaquality.com/wiki/Cross-Language_Learning_from_Bots_and_Users_to_Detect_Vandalism_on_Wikipedia}}

HTML

Tran, Khoi-Nguyen; Christen, Peter. (2015). &quot;<a href="https://wikipediaquality.com/wiki/Cross-Language_Learning_from_Bots_and_Users_to_Detect_Vandalism_on_Wikipedia">Cross-Language Learning from Bots and Users to Detect Vandalism on Wikipedia</a>&quot;. DOI: 10.1109/TKDE.2014.2339844.