Comparative Analysis of Classification Models for Quality Assessment of Wikipedia Articles
Comparative Analysis of Classification Models for Quality Assessment of Wikipedia Articles is a scientific work about Wikipedia quality, published in 2017 and written by Włodzimierz Lewoniewski, Krzysztof Węcel and Witold Abramowicz.
In this paper, the authors compare the suitability of various classification models (including CART, random forest, boosting trees, C4.5, C5.0, SVM and neural networks) for automatically assessing the quality of articles in seven language editions of Wikipedia (Belarusian, German, English, French, Polish, Russian and Ukrainian). They used models available in STATISTICA, WEKA and RStudio. For the classification task, they used over 80 different article features, developed from an analysis of the state of the art and the authors' own experience. They also carried out a comparative analysis of the significance of the parameters that affect article quality in each language.
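The general idea of comparing classification models on article features can be sketched in a few lines of Python. This is an illustration only, not the authors' code, tools or data: the two features (reference count and text length), the synthetic articles, the two toy models and the 5-fold cross-validation setup are all assumptions made for the example, and far simpler than the models named in the paper.

```python
import random

def make_articles(n, seed=0):
    """Synthetic stand-in data: (features, label) pairs; label 1 = high quality."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Hypothetical features: reference count and text length in kB.
        refs = rng.gauss(30 if label else 10, 8)
        length = rng.gauss(40 if label else 15, 12)
        data.append(((refs, length), label))
    return data

class MajorityBaseline:
    """Always predicts the most frequent label seen during training."""
    def fit(self, rows):
        ones = sum(label for _, label in rows)
        self.guess = 1 if 2 * ones >= len(rows) else 0
        return self

    def predict(self, features):
        return self.guess

class DecisionStump:
    """One-level decision tree: the best single-feature threshold.

    Assumes that larger feature values indicate higher quality.
    """
    def fit(self, rows):
        best = (-1, 0, 0.0)  # (correct count, feature index, threshold)
        for j in range(len(rows[0][0])):
            for features, _ in rows:
                t = features[j]
                correct = sum((f[j] > t) == bool(y) for f, y in rows)
                best = max(best, (correct, j, t))
        _, self.feature, self.threshold = best
        return self

    def predict(self, features):
        return 1 if features[self.feature] > self.threshold else 0

def cross_val_accuracy(model_factory, rows, k=5):
    """Mean accuracy over k folds (fold i holds out every k-th row from i)."""
    scores = []
    for i in range(k):
        test = rows[i::k]
        train = [r for idx, r in enumerate(rows) if idx % k != i]
        model = model_factory().fit(train)
        hits = sum(model.predict(f) == y for f, y in test)
        scores.append(hits / len(test))
    return sum(scores) / k

articles = make_articles(200)
results = {
    "majority baseline": cross_val_accuracy(MajorityBaseline, articles),
    "decision stump": cross_val_accuracy(DecisionStump, articles),
}
# Rank the models from most to least accurate.
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.3f}")
```

Every model is scored on the same held-out folds, so the accuracies are directly comparable. A study like the one described would replace the toy models with real classifiers and the synthetic rows with features extracted from Wikipedia articles.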