Learning to Detect Vandalism in Social Content Systems: a Study on Wikipedia

Authors: Sara Javanmardi, David W. McDonald, Rich Caruana, Sholeh Forouzan, Cristina Videira Lopes
Publication date: 2013
DOI: 10.1007/978-94-007-6359-3_11

Learning to Detect Vandalism in Social Content Systems: a Study on Wikipedia is a scientific work related to Wikipedia quality, published in 2013 and written by Sara Javanmardi, David W. McDonald, Rich Caruana, Sholeh Forouzan and Cristina Videira Lopes.

Overview

A challenge facing user-generated content systems is vandalism, i.e. edits that damage content quality. The high visibility of social networks and the ease of access to them make them popular targets for vandals. Detecting and removing vandalism is critical for these systems. Because vandalism can take many forms, many different kinds of features are potentially useful for detecting it. The complex nature of vandalism and the large number of potential features make vandalism detection difficult and time-consuming for human editors. Machine learning techniques hold promise for developing accurate, tunable, and maintainable models that can be incorporated into vandalism detection tools. The authors describe a method for training vandalism-detection classifiers that are more accurate on the PAN 2010 corpus than those previously developed. Because of the high turnaround in social network systems, it is important for vandalism detection tools to run in real time. To this end, the authors use feature selection to find the minimal set of features consistent with high accuracy. In addition, because some features are more costly to compute than others, the authors use cost-sensitive feature selection to reduce the total computational cost of executing models. Beyond the features previously used for spam detection, the authors introduce new features based on user action histories. These user history features contribute significantly to classifier performance. The approach is general and can easily be applied to other user-generated content systems.
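
The idea of cost-sensitive feature selection mentioned above can be illustrated with a minimal sketch. This is not the authors' procedure, which the overview does not spell out. It is one common greedy variant that adds, at each step, the feature with the best accuracy gain per unit of computation cost. The feature names, costs, gains, and the `toy_score` function are all invented for illustration; in practice the score would come from cross-validating a classifier on a labeled corpus.

```python
from typing import Callable, Dict, FrozenSet, List


def cost_sensitive_forward_selection(
    features: List[str],
    costs: Dict[str, float],
    score: Callable[[FrozenSet[str]], float],
    min_gain_per_cost: float = 0.01,
) -> List[str]:
    """Greedily add the feature with the highest score gain per unit cost.

    Stops when no remaining feature improves the score by more than
    `min_gain_per_cost` for each unit of cost it adds.
    """
    selected: List[str] = []
    remaining = list(features)
    current = score(frozenset())
    while remaining:
        best_name, best_ratio, best_score = None, min_gain_per_cost, current
        for name in remaining:
            candidate = score(frozenset(selected + [name]))
            ratio = (candidate - current) / costs[name]
            if ratio > best_ratio:
                best_name, best_ratio, best_score = name, ratio, candidate
        if best_name is None:
            break  # nothing left is worth its cost
        selected.append(best_name)
        remaining.remove(best_name)
        current = best_score
    return selected


# Hypothetical features with made-up costs (arbitrary time units).
COSTS = {
    "user_revert_ratio": 1.0,      # user-history feature, cheap lookup
    "profanity_count": 1.0,        # text feature, cheap
    "text_compressibility": 5.0,   # text feature, expensive
}

# Made-up, additive accuracy gains standing in for a real evaluation.
TOY_GAINS = {
    "user_revert_ratio": 0.20,
    "profanity_count": 0.10,
    "text_compressibility": 0.04,
}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for cross-validated accuracy of a model on `subset`."""
    return 0.5 + sum(TOY_GAINS[name] for name in subset)


selected = cost_sensitive_forward_selection(list(COSTS), COSTS, toy_score)
print(selected)
```

With these invented numbers, the expensive feature is left out because its small gain does not justify its cost, which is the trade-off a real-time detection tool has to make.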

Embed

Wikipedia Quality

Javanmardi, Sara; McDonald, David W.; Caruana, Rich; Forouzan, Sholeh; Lopes, Cristina Videira. (2013). "[[Learning to Detect Vandalism in Social Content Systems: a Study on Wikipedia]]". Springer Netherlands. DOI: 10.1007/978-94-007-6359-3_11.

English Wikipedia

{{cite journal |last1=Javanmardi |first1=Sara |last2=McDonald |first2=David W. |last3=Caruana |first3=Rich |last4=Forouzan |first4=Sholeh |last5=Lopes |first5=Cristina Videira |title=Learning to Detect Vandalism in Social Content Systems: a Study on Wikipedia |date=2013 |doi=10.1007/978-94-007-6359-3_11 |url=https://wikipediaquality.com/wiki/Learning_to_Detect_Vandalism_in_Social_Content_Systems:_a_Study_on_Wikipedia |journal=Springer Netherlands}}

HTML

Javanmardi, Sara; McDonald, David W.; Caruana, Rich; Forouzan, Sholeh; Lopes, Cristina Videira. (2013). &quot;<a href="https://wikipediaquality.com/wiki/Learning_to_Detect_Vandalism_in_Social_Content_Systems:_a_Study_on_Wikipedia">Learning to Detect Vandalism in Social Content Systems: a Study on Wikipedia</a>&quot;. Springer Netherlands. DOI: 10.1007/978-94-007-6359-3_11.