Relative Quality and Popularity Evaluation of Multilingual Wikipedia Articles

Włodzimierz Lewoniewski
Krzysztof Węcel
Witold Abramowicz
Publication date
Original Preprint

Relative Quality and Popularity Evaluation of Multilingual Wikipedia Articles is a scientific work about Wikipedia quality, published in 2017 and written by Włodzimierz Lewoniewski, Krzysztof Węcel and Witold Abramowicz.


Although Wikipedia is often criticized for its poor quality, it remains one of the most popular knowledge bases in the world. Articles in this free encyclopedia can be created and edited independently in about 300 language versions. Our research has shown that for language-sensitive topics, the quality of information can be relatively better in the relevant language versions. In most cases, however, it is difficult for Wikipedia readers to determine the language affiliation of the described subject. Additionally, each language edition of Wikipedia can have its own rules for manually assessing content quality. Grading schemes also differ between language versions: some use a 6–8 grade system to assess articles, while others are limited to 2–3 grades. This makes automatic quality comparison of articles across languages a challenging task, particularly given the large number of unassessed articles; in some Wikipedia language editions, over 99% of articles have no quality grade. The paper presents the results of a relative quality and popularity assessment of over 28 million articles in 44 selected language versions. A comparative analysis of the quality and popularity of articles on popular topics was also conducted. Additionally, the correlation between the quality and popularity of Wikipedia articles on selected topics in various languages was investigated. The proposed method makes it possible to find articles with better-quality information that can be used to automatically enrich other language editions of Wikipedia.