Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata

From Wikipedia Quality


Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata
Authors: Andrew G. West, Sampath Kannan, Insup Lee
Publication date: 2010
DOI: 10.1145/1752046.1752050

Detecting Wikipedia Vandalism via Spatio-Temporal Analysis of Revision Metadata is a scientific work related to Wikipedia quality, published in 2010 and written by Andrew G. West, Sampath Kannan and Insup Lee.

Overview

Blatantly unproductive edits undermine the quality of the collaboratively edited encyclopedia, Wikipedia. They not only disseminate dishonest and offensive content, but also force editors to waste time undoing such acts of vandalism. Language processing has been applied to combat these malicious edits, but, as with email spam, such filters are evadable and computationally complex. Meanwhile, recent research has shown spatial and temporal features to be effective in mitigating email spam, while being lightweight and robust. In this paper, the authors leverage the spatio-temporal properties of revision metadata to detect vandalism on Wikipedia. An administrative form of reversion called rollback enables the tagging of malicious edits, which are contrasted with non-offending edits in numerous dimensions. Crucially, none of these features requires inspection of the article or revision text. Ultimately, a classifier is produced which flags vandalism at a performance comparable to the natural-language efforts the authors intend to complement (85% accuracy at 50% recall). The classifier is scalable (processing 100+ edits a second) and has been used to locate over 5,000 manually confirmed incidents of vandalism outside the labeled set.
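To make the idea of metadata-only features more concrete, the following is a minimal sketch in Python. It is not the authors' implementation: the `Revision` record, its field names, and the particular features (hour of day, weekday, editor anonymity, comment length, time since the previous edit to the same article) are assumptions chosen for illustration. The only points taken from the overview above are that rollback supplies the vandalism label and that no article or revision text is inspected.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Revision:
    """Hypothetical metadata record; all field names are illustrative."""
    rev_id: int
    article: str
    editor: str
    is_anonymous: bool
    timestamp: datetime
    comment: str
    rolled_back: bool  # True if an administrative rollback later undid this edit


def metadata_features(rev: Revision, prev_on_article: Optional[Revision]) -> dict:
    """Compute illustrative features from revision metadata only.

    The article and revision text are never read; only the timestamp,
    editor status, and edit comment are used.
    """
    gap = None
    if prev_on_article is not None:
        # Temporal feature: seconds elapsed since the previous edit to the article.
        gap = (rev.timestamp - prev_on_article.timestamp).total_seconds()
    return {
        "hour_utc": rev.timestamp.hour,
        "weekday": rev.timestamp.weekday(),  # Monday is 0
        "is_anonymous": rev.is_anonymous,
        "comment_length": len(rev.comment),
        "seconds_since_prev_edit": gap,
    }


def label(rev: Revision) -> int:
    """Use rollback as the vandalism tag: 1 if rolled back, else 0."""
    return 1 if rev.rolled_back else 0


# Two made-up revisions of the same article, five minutes apart.
earlier = Revision(1, "Example", "Alice", False,
                   datetime(2010, 1, 4, 9, 0, tzinfo=timezone.utc),
                   "fix typo", False)
later = Revision(2, "Example", "192.0.2.1", True,
                 datetime(2010, 1, 4, 9, 5, tzinfo=timezone.utc),
                 "", True)

features = metadata_features(later, earlier)
```

Feature dictionaries and labels of this kind could then be passed to any standard classifier. Which features and which learning method the paper actually uses is not stated in the overview above.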