Language of Vandalism: Improving Wikipedia Vandalism Detection via Stylometric Analysis



Authors
Manoj Harpalani
Michael Hart
Sandesh Singh
Rob Johnson
Yejin Choi
Publication date
2011

Language of Vandalism: Improving Wikipedia Vandalism Detection via Stylometric Analysis is a scientific work related to Wikipedia quality, published in 2011 and written by Manoj Harpalani, Michael Hart, Sandesh Singh, Rob Johnson and Yejin Choi.

Overview

Community-based knowledge forums, such as Wikipedia, are susceptible to vandalism, i.e., ill-intentioned contributions that are detrimental to the quality of collective intelligence. Most previous work relies on shallow lexico-syntactic patterns and metadata to detect vandalism in Wikipedia automatically. In this paper, the authors explore more linguistically motivated approaches to vandalism detection. In particular, they hypothesize that textual vandalism constitutes a unique genre in which a group of people share similar linguistic behavior. Experimental results suggest that (1) statistical models provide evidence of unique language styles in vandalism, and that (2) deep syntactic patterns based on probabilistic context-free grammars (PCFG) discriminate vandalism more effectively than shallow lexico-syntactic patterns based on n-grams.
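
To make the n-gram baseline mentioned above more concrete, here is a minimal sketch of classifying an edit by comparing two word-bigram language models, one trained on vandalism and one on regular edits. It is not the authors' implementation: the class `BigramLM`, the function `classify`, the add-one smoothing, and the toy training sentences are all assumptions made for illustration. The PCFG-based approach is not shown, since it would require a syntactic parser.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return word bigrams with sentence boundary markers added."""
    padded = ["<s>"] + tokens + ["</s>"]
    return list(zip(padded, padded[1:]))


class BigramLM:
    """Hypothetical word-bigram language model with add-one smoothing."""

    def __init__(self, texts):
        self.bigram_counts = Counter()
        self.context_counts = Counter()
        self.vocab = {"</s>"}
        for text in texts:
            tokens = text.lower().split()
            self.vocab.update(tokens)
            for prev, cur in bigrams(tokens):
                self.bigram_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def log_prob(self, text, vocab_size):
        """Log-probability of text; vocab_size sets the smoothing mass."""
        total = 0.0
        for prev, cur in bigrams(text.lower().split()):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total


def classify(text, vandal_lm, regular_lm):
    """Label text by whichever model assigns it the higher likelihood."""
    # Shared vocabulary size, plus one slot for unseen words.
    vocab_size = len(vandal_lm.vocab | regular_lm.vocab) + 1
    vandal_score = vandal_lm.log_prob(text, vocab_size)
    regular_score = regular_lm.log_prob(text, vocab_size)
    return "vandalism" if vandal_score > regular_score else "regular"


# Toy training data, invented for illustration only (not from the paper).
vandal_edits = [
    "this page is so stupid lol",
    "you all are so stupid",
    "lol lol this is dumb",
]
regular_edits = [
    "the city was founded in 1850",
    "the river flows through the city",
    "the population was recorded in the census",
]

vandal_lm = BigramLM(vandal_edits)
regular_lm = BigramLM(regular_edits)

print(classify("this is so stupid", vandal_lm, regular_lm))
print(classify("the city was founded in the census", vandal_lm, regular_lm))
```

A model of this kind only sees adjacent word pairs, which is the sense in which the abstract calls n-gram patterns shallow; a PCFG-based model would instead score the syntactic structure of each sentence.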

Embed

Wikipedia Quality

Harpalani, Manoj; Hart, Michael; Singh, Sandesh; Johnson, Rob; Choi, Yejin. (2011). "[[Language of Vandalism: Improving Wikipedia Vandalism Detection via Stylometric Analysis]]". Association for Computational Linguistics.

English Wikipedia

{{cite journal |last1=Harpalani |first1=Manoj |last2=Hart |first2=Michael |last3=Singh |first3=Sandesh |last4=Johnson |first4=Rob |last5=Choi |first5=Yejin |title=Language of Vandalism: Improving Wikipedia Vandalism Detection via Stylometric Analysis |date=2011 |url=https://wikipediaquality.com/wiki/Language_of_Vandalism:_Improving_Wikipedia_Vandalism_Detection_via_Stylometric_Analysis |journal=Association for Computational Linguistics}}

HTML

Harpalani, Manoj; Hart, Michael; Singh, Sandesh; Johnson, Rob; Choi, Yejin. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Language_of_Vandalism:_Improving_Wikipedia_Vandalism_Detection_via_Stylometric_Analysis">Language of Vandalism: Improving Wikipedia Vandalism Detection via Stylometric Analysis</a>&quot;. Association for Computational Linguistics.