Difference between revisions of "Robust Clustering of Languages Across Wikipedia Growth"

From Wikipedia Quality
(Overview - Robust Clustering of Languages Across Wikipedia Growth)
 
(Wikilinks)
Line 1: Line 1:
'''Robust Clustering of Languages Across Wikipedia Growth''' - scientific work related to Wikipedia quality published in 2017, written by Kristina Ban, Matjaz Perc and Zoran Levnajic.
+
'''Robust Clustering of Languages Across Wikipedia Growth''' - scientific work related to [[Wikipedia quality]] published in 2017, written by [[Kristina Ban]], [[Matjaz Perc]] and [[Zoran Levnajic]].
  
 
== Overview ==
 
== Overview ==
Wikipedia is the largest existing knowledge repository that is growing on a genuine crowdsourcing support. While the English Wikipedia is the most extensive and the most researched one with over 5 million articles, comparatively little is known about the behaviour and growth of the remaining 283 smaller Wikipedias, the smallest of which, Afar, has only one article. Here, authors use a subset of these data, consisting of 14 962 different articles, each of which exists in 26 different languages, from Arabic to Ukrainian. Authors study the growth of Wikipedias in these languages over a time span of 15 years. Authors show that, while an average article follows a random path from one language to another, there exist six well-defined clusters of Wikipedias that share common growth patterns. The make-up of these clusters is remarkably robust against the method used for their determination, as authors verify via four different clustering methods. Interestingly, the identified Wikipedia clusters have little correlation with language families and groups. Rather, the growth of Wikipedia across different languages is governed by different factors, ranging from similarities in culture to information literacy.
+
Wikipedia is the largest existing knowledge repository that is growing on a genuine crowdsourcing support. While the [[English Wikipedia]] is the most extensive and the most researched one with over 5 million articles, comparatively little is known about the behaviour and growth of the remaining 283 smaller [[Wikipedia]]s, the smallest of which, Afar, has only one article. Here, authors use a subset of these data, consisting of 14 962 different articles, each of which exists in 26 [[different language]]s, from Arabic to Ukrainian. Authors study the growth of Wikipedias in these languages over a time span of 15 years. Authors show that, while an average article follows a random path from one language to another, there exist six well-defined clusters of Wikipedias that share common growth patterns. The make-up of these clusters is remarkably robust against the method used for their determination, as authors verify via four different clustering methods. Interestingly, the identified Wikipedia clusters have little correlation with language families and groups. Rather, the growth of Wikipedia across different languages is governed by different factors, ranging from similarities in culture to information literacy.

Revision as of 08:43, 20 May 2020

Robust Clustering of Languages Across Wikipedia Growth is a scientific work related to Wikipedia quality, published in 2017 and written by Kristina Ban, Matjaz Perc and Zoran Levnajic.

Overview

Wikipedia is the largest existing knowledge repository built on genuine crowdsourcing. While the English Wikipedia, with over 5 million articles, is the most extensive and most researched edition, comparatively little is known about the behaviour and growth of the remaining 283 smaller Wikipedias, the smallest of which, Afar, has only one article. The authors use a subset of these data consisting of 14 962 articles, each of which exists in 26 different languages, from Arabic to Ukrainian, and study the growth of the Wikipedias in these languages over a time span of 15 years. They show that, while an average article follows a random path from one language to another, there exist six well-defined clusters of Wikipedias that share common growth patterns. The make-up of these clusters is remarkably robust against the method used to determine them, as the authors verify with four different clustering methods. Interestingly, the identified clusters have little correlation with language families and groups. Rather, the growth of Wikipedia across different languages is governed by different factors, ranging from similarities in culture to information literacy.
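
The robustness claim above amounts to saying that different clustering methods assign the languages to essentially the same groups. The summary does not state which agreement measure the authors used, so the following is only a minimal sketch of one common way to quantify such agreement, the Rand index (the fraction of item pairs that two partitions treat the same way). The function name `rand_index` and the toy label lists are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Return the fraction of item pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two items
    in the same cluster, or both put them in different clusters.
    Cluster label names themselves do not matter, only the grouping.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical cluster assignments for six items (for example, six
# language editions) produced by three imagined clustering methods.
# These numbers are illustrative only, not results from the paper.
method_1 = [0, 1, 1, 1, 2, 2]
method_2 = [5, 3, 3, 3, 7, 7]  # same grouping as method_1, renamed labels
method_3 = [0, 1, 1, 2, 2, 2]  # one item moved to a different cluster

print(rand_index(method_1, method_2))  # identical groupings give 1.0
print(rand_index(method_1, method_3))  # partial agreement, below 1.0
```

A value near 1 across every pair of methods would correspond to the kind of robustness described above. For real analyses a chance-corrected variant such as the adjusted Rand index is usually preferred, since the plain Rand index tends to be high even for unrelated partitions.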