Difference between revisions of "A Classifier to Determine Which Wikipedia Biographies Will Be Accepted"

From Wikipedia Quality
(Wikilinks)
(Infobox)
Line 1: Line 1:
+ {{Infobox work
+ | title = A Classifier to Determine Which Wikipedia Biographies Will Be Accepted
+ | date = 2015
+ | authors = [[Lior Rokach]]
+ | doi = 10.1002/asi.23199
+ | link = http://onlinelibrary.wiley.com/doi/10.1002/asi.23199/abstract
+ }}
 
'''A Classifier to Determine Which Wikipedia Biographies Will Be Accepted''' is a scientific work related to [[Wikipedia quality]], published in 2015 and written by [[Lior Rokach]].
  
 
== Overview ==
 
 
Wikipedia, like other encyclopedias, includes biographies of notable people. However, because it is written jointly by many contributors, it is constantly subject to attempts to add biographies of non-notable people. Over time, [[Wikipedia]] has developed inclusion criteria for notable people (e.g., receiving a significant award), against which newly contributed biographies are evaluated. In this paper, the authors present and analyze a set of simple [[indicators]] that can be used to predict which articles will eventually be accepted. These indicators do not refer to the content itself, but to meta-content [[features]] (such as the number of [[categories]] that the biography is associated with) and to author-based features (such as whether the contributor is a first-time author). By training a classifier on these features, the authors reached high predictive performance (area under the receiver operating characteristic [ROC] curve [AUC] of 0.97) even though they ignored the actual biography text.

Revision as of 07:48, 21 September 2020


A Classifier to Determine Which Wikipedia Biographies Will Be Accepted
Authors: Lior Rokach
Publication date: 2015
DOI: 10.1002/asi.23199
Links: Original (http://onlinelibrary.wiley.com/doi/10.1002/asi.23199/abstract)

A Classifier to Determine Which Wikipedia Biographies Will Be Accepted is a scientific work related to Wikipedia quality, published in 2015 and written by Lior Rokach.

Overview

Wikipedia, like other encyclopedias, includes biographies of notable people. However, because it is written jointly by many contributors, it is constantly subject to attempts to add biographies of non-notable people. Over time, Wikipedia has developed inclusion criteria for notable people (e.g., receiving a significant award), against which newly contributed biographies are evaluated. In this paper, the authors present and analyze a set of simple indicators that can be used to predict which articles will eventually be accepted. These indicators do not refer to the content itself, but to meta-content features (such as the number of categories that the biography is associated with) and to author-based features (such as whether the contributor is a first-time author). By training a classifier on these features, the authors reached high predictive performance (area under the receiver operating characteristic [ROC] curve [AUC] of 0.97) even though they ignored the actual biography text.
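
To make the AUC figure above more concrete, the following is a minimal sketch of how such a score can be computed for a scorer that looks only at meta-content and author-based features. It is not the paper's classifier: the two features are the ones named in the overview, but the weights in `toy_score`, the rows in `toy_rows`, and the assumed direction of each feature's effect are invented purely for illustration. The AUC is computed as the fraction of (accepted, rejected) pairs in which the accepted biography receives the higher score, with ties counted as half.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def toy_score(num_categories: int, first_time_author: bool) -> float:
    """Hypothetical hand-picked weights; NOT taken from the paper."""
    return 0.5 * num_categories - 2.0 * (1.0 if first_time_author else 0.0)


# Made-up rows: (num_categories, first_time_author, accepted)
toy_rows = [
    (6, False, 1),
    (4, False, 1),
    (3, True, 1),
    (2, False, 0),
    (1, True, 0),
    (0, True, 0),
]

labels = [row[2] for row in toy_rows]
scores = [toy_score(row[0], row[1]) for row in toy_rows]
auc = roc_auc(labels, scores)
print(f"toy AUC = {auc:.3f}")
```

On this invented data, 8 of the 9 (accepted, rejected) pairs are ranked correctly, so the toy AUC is 8/9. An AUC of 0.97, as reported in the paper, would mean that roughly 97% of such pairs are ordered correctly by the classifier's score.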