Difference between revisions of "Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011"
(wikilinks) |
(+ categories) |
||
(2 intermediate revisions by 2 users not shown) | |||
Line 1: | Line 1: | ||
+ | {{Infobox work | ||
+ | | title = Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011 | ||
+ | | date = 2011 | ||
+ | | authors = [[Cristian-Alexandru Dragusanu]]<br />[[Marina Cufliuc]]<br />[[Adrian Iftene]] | ||
+ | | link = http://ceur-ws.org/Vol-1177/CLEF2011wn-PAN-DragusanuEt2011.pdf | ||
+ | }} | ||
'''Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011''' is a scientific work related to [[Wikipedia quality]], published in 2011 and written by [[Cristian-Alexandru Dragusanu]], [[Marina Cufliuc]] and [[Adrian Iftene]].
== Overview ==
Identifying vandalism on Wikipedia is a complex problem that is currently handled mostly by hand, by volunteers. This paper presents the main components of a system built by the authors' group to automatically identify vandalized [[Wikipedia]] articles. The core of the system is a machine learning component that uses [[features]] grouped into three classes: Metadata, Text and Language. In addition to the features used in previous approaches, the authors consider four new features related to vulgar, biased, sexual and miscellaneous bad words. The system obtained an area under the precision-recall curve (PR-AUC) of 0.42464 and an area under the ROC curve (ROC-AUC) of 0.82963.
+ | |||
+ | == Embed == | ||
+ | === Wikipedia Quality === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | Dragusanu, Cristian-Alexandru; Cufliuc, Marina; Iftene, Adrian. (2011). "[[Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011]]". | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | === English Wikipedia === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | {{cite journal |last1=Dragusanu |first1=Cristian-Alexandru |last2=Cufliuc |first2=Marina |last3=Iftene |first3=Adrian |title=Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011 |date=2011 |url=https://wikipediaquality.com/wiki/Detecting_Wikipedia_Vandalism_Using_Machine_Learning_-_Notebook_for_Pan_at_Clef_2011}} | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | === HTML === | ||
+ | <code> | ||
+ | <nowiki> | ||
+ | Dragusanu, Cristian-Alexandru; Cufliuc, Marina; Iftene, Adrian. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Detecting_Wikipedia_Vandalism_Using_Machine_Learning_-_Notebook_for_Pan_at_Clef_2011">Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011</a>&quot;. | ||
+ | </nowiki> | ||
+ | </code> | ||
+ | |||
+ | |||
+ | |||
+ | [[Category:Scientific works]] |
Latest revision as of 13:07, 2 November 2020
Authors | Cristian-Alexandru Dragusanu, Marina Cufliuc, Adrian Iftene |
---|---|
Publication date | 2011 |
Links | Original: http://ceur-ws.org/Vol-1177/CLEF2011wn-PAN-DragusanuEt2011.pdf |
Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011 is a scientific work related to Wikipedia quality, published in 2011 and written by Cristian-Alexandru Dragusanu, Marina Cufliuc and Adrian Iftene.
Overview
Identifying vandalism on Wikipedia is a complex problem that is currently handled mostly by hand, by volunteers. This paper presents the main components of a system built by the authors' group to automatically identify vandalized Wikipedia articles. The core of the system is a machine learning component that uses features grouped into three classes: Metadata, Text and Language. In addition to the features used in previous approaches, the authors consider four new features related to vulgar, biased, sexual and miscellaneous bad words. The system obtained an area under the precision-recall curve (PR-AUC) of 0.42464 and an area under the ROC curve (ROC-AUC) of 0.82963.
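To make the idea of bad-word features concrete, the following is a minimal Python sketch of how one score per category (vulgar, biased, sexual, miscellaneous) might be computed from the text inserted by an edit. It is an illustration under stated assumptions, not the authors' implementation: the word lists, the function name `bad_word_features` and the choice of "fraction of tokens" as the score are all hypothetical, since this page does not give the paper's lexicons or exact feature definitions.

```python
import re
from collections import Counter

# Hypothetical placeholder lexicons, one per category named in the overview.
# The word lists actually used in the paper are not reproduced on this page.
BAD_WORD_LEXICONS = {
    "vulgar": {"crap", "damn"},
    "biased": {"best", "worst", "stupid"},
    "sexual": {"sex", "porn"},
    "misc": {"lol", "haha"},
}

# Simple tokenizer: runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def bad_word_features(inserted_text):
    """Return, for each category, the fraction of tokens found in its lexicon.

    Using a fraction rather than a raw count is an assumption made here so
    that long and short edits are comparable.
    """
    tokens = TOKEN_RE.findall(inserted_text.lower())
    total = len(tokens)
    counts = Counter()
    for token in tokens:
        for category, lexicon in BAD_WORD_LEXICONS.items():
            if token in lexicon:
                counts[category] += 1
    return {
        category: (counts[category] / total if total else 0.0)
        for category in BAD_WORD_LEXICONS
    }


if __name__ == "__main__":
    print(bad_word_features("This is the WORST article, lol lol"))
```

In a full system, these four values would presumably be combined with the Metadata, Text and Language features into one feature vector per edit and passed to a classifier.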
Embed
Wikipedia Quality
Dragusanu, Cristian-Alexandru; Cufliuc, Marina; Iftene, Adrian. (2011). "[[Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011]]".
English Wikipedia
{{cite journal |last1=Dragusanu |first1=Cristian-Alexandru |last2=Cufliuc |first2=Marina |last3=Iftene |first3=Adrian |title=Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011 |date=2011 |url=https://wikipediaquality.com/wiki/Detecting_Wikipedia_Vandalism_Using_Machine_Learning_-_Notebook_for_Pan_at_Clef_2011}}
HTML
Dragusanu, Cristian-Alexandru; Cufliuc, Marina; Iftene, Adrian. (2011). "<a href="https://wikipediaquality.com/wiki/Detecting_Wikipedia_Vandalism_Using_Machine_Learning_-_Notebook_for_Pan_at_Clef_2011">Detecting Wikipedia Vandalism Using Machine Learning - Notebook for Pan at Clef 2011</a>".