Difference between revisions of "Wikipedia Quality"

(Header)
(21 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 
<languages />
 
<languages />
 +
<translate>
 +
<!--T:1-->
 
<div id="mp-topbanner" style="clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000; white-space:nowrap;">
 
<div id="mp-topbanner" style="clear:both; position:relative; box-sizing:border-box; width:100%; margin:1.2em 0 6px; min-width:47em; border:1px solid #ddd; background-color:#f9f9f9; color:#000; white-space:nowrap;">
 
<div style="margin:0.4em; text-align:center;">
 
<div style="margin:0.4em; text-align:center;">
 
<div style="font-size:170%; padding:.1em; text-align:center;">Welcome to [[Wikipedia Quality]],</div>
 
<div style="font-size:170%; padding:.1em; text-align:center;">Welcome to [[Wikipedia Quality]],</div>
<div style="font-size:100%; text-align:center;">portal about concepts, researches and services related to quality assessment of the Multilingual [[Wikipedia]].</div>
+
<div style="font-size:100%; text-align:center;">portal about concepts, researches and services related to quality assessment of the Multilingual [[Wikipedia]].</div></div>
 
 
</div>
 
 
<div style="position:absolute; right:1em; top:10%; width:38%; min-width:25em; font-size:95%;">
 
<div style="position:absolute; right:1em; top:10%; width:38%; min-width:25em; font-size:95%;">
 
<div id="articlecount" style="font-size:100%; text-align:right;">Articles count: [[Special:Statistics|{{NUMBEROFARTICLES}}]]</div>
 
<div id="articlecount" style="font-size:100%; text-align:right;">Articles count: [[Special:Statistics|{{NUMBEROFARTICLES}}]]</div>
 
</div>
 
</div>
 
</div>
 
</div>
 +
{| role="presentation" id="mp-upper" style="width: 100%; margin-top:4px; border-spacing: 0px;"
 +
| id="mp-left" class="MainPageBG" style="width:55%; border:1px solid #cef2e0; padding:0; background:#f5fffa; vertical-align:top; color:#000;" |
 +
<h2 id="mp-dyk-h2" style="clear:both; margin:0.5em; background:#cef2e0; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #a3bfb1; color:#000; padding:0.2em 0.4em;">Overview</h2>
 +
<div id="mp-dyk" style="padding:0.1em 0.6em 0.5em;">
 +
Despite the fact that [[Wikipedia]] is often criticized for its poor quality, it is still one of the most popular knowledge bases in the world. This online encyclopedia currently ranks 5th among the [[Popular websites|most visited sites]] (after [[Google]], [[Youtube]], [[Facebook]] and [[Baidu]]). Its articles are created and edited in over 300 different languages. Wikipedia currently contains more than 50 million articles on various topics across all [[List of Wikipedias|languages]].
 +
Every day the number of articles in Wikipedia is growing. They can be created and edited even by anonymous users. Authors do not need to formally demonstrate their skills, education and experience in certain areas. Wikipedia does not have a central editorial team or a group of reviewers who could comprehensively check all new and existing texts. For these and other reasons, people often [[Criticism of Wikipedia|criticize]] the concept of Wikipedia, in particular pointing out the poor quality of information.
 +
</div>
 +
| style="border:1px solid transparent;" |
 +
| id="mp-right" class="MainPageBG" style="width:45%; border:1px solid #cedff2; padding:0; background:#f5faff; vertical-align:top;"|
 +
<h2 id="mp-itn-h2" style="margin:0.5em; background:#cedff2; font-family:inherit; font-size:120%; font-weight:bold; border:1px solid #a3b0bf; color:#000; padding:0.2em 0.4em;">Selected works</h2>
 +
<div id="mp-itn" style="padding:0.1em 0.6em;">{{#time:F j, Y}}</div>
 +
{{:Wikipedia_Quality/List}}
 +
|}
  
 +
<!--T:2-->
 
[[File:Map-scientists.png|300px|thumb|right|[[List of countries|Number of scientists]] in each country who conduct research on Wikipedia quality]]
<translate>
 
<!--T:1-->
 
Despite the fact that </translate>[[Wikipedia]] is often criticized for its poor quality, it is still one of the most popular knowledge bases in the world. This online encyclopedia currently ranks 5th among the [[Popular websites|most visited sites]] (after [[Google]], [[Youtube]], [[Facebook]] and [[Baidu]]). Its articles are created and edited in about 300 different languages. Wikipedia currently contains more than 48 million articles on various topics across all [[List of Wikipedias|languages]].
 
  
Every day the number of articles in Wikipedia is growing. They can be created and edited even by anonymous users. Authors do not need to formally demonstrate their skills, education and experience in certain areas. Wikipedia does not have a central editorial team or a group of reviewers who could comprehensively check all new and existing texts. For these and other reasons, people often [[Criticism of Wikipedia|criticize]] the concept of Wikipedia, in particular pointing out the poor quality of information.
 
  
 +
<!--T:5-->
 
Despite this, you can sometimes find valuable information in Wikipedia, depending on the language version and subject. Practically every language version has a system of awards for the best articles. However, the number of these articles is relatively small (less than one percent). Some language versions also have other quality grades. Still, the overwhelming majority of articles are unevaluated (in some languages more than 99%).
  
== Automatic quality assessment of Wikipedia articles ==
+
== Automatic quality assessment of Wikipedia articles == <!--T:6-->
Because many Wikipedia articles have no [[quality grade|quality grades]], readers must assess their content manually. Automatic quality assessment of Wikipedia articles is a well-known and broad research area: researchers from over [[List of countries|50 countries]] have published works related to the quality of Wikipedia. Most of these works focus on the most developed language version of Wikipedia, English, which already contains more than 5.5 million articles.
+
Because many Wikipedia articles have no [[quality grade|quality grades]], readers must assess their content manually. Automatic quality assessment of Wikipedia articles is a well-known and broad research area: researchers from over [[List of countries|50 countries]] have published works related to the quality of Wikipedia. Most of these works focus on the most developed language version of Wikipedia, English, which already contains more than 5.7 million articles.
  
[[File:Logosoc.jpg|200px|thumb|right|Wikipedia Quality]]
+
<!--T:7-->
 
Since its foundation, and with the growing popularity of Wikipedia, more and more scientific publications on this subject have been published. One of the first studies showed that measuring the volume of content can help determine the degree of “maturity” of a Wikipedia article. Works in this direction show that, in general, higher-quality articles are long, use many references, are edited by hundreds of authors and have thousands of edits.
  
 +
[[File:Dimensions.png|350px|thumb|right|[[Quality dimensions]] of the traditional [[encyclopedia|encyclopedias]], [[Wikipedia]], [[Web 2.0]]<ref name="art100" />]] There are different measures related to such [[quality dimensions]] as credibility, completeness, objectivity, readability, relevance, style and timeliness.<ref name="art100" />
 +
 +
<!--T:8-->
 
'''How do they come to such conclusions?''' Simply put, by comparing good and bad articles.
  
 +
<!--T:9-->
 
As mentioned earlier, almost every language version of Wikipedia has a system for assessing the quality of articles. The best articles are marked in a special way: they receive a special “badge”. In Russian Wikipedia such articles are called “Featured Articles” (FA). There is another “badge” for articles that are slightly below the best ones, “Good articles” (GA). In some language versions, there are further grades for “weaker” articles. For example, English Wikipedia also has A-class, B-class, C-class, Start and Stub. Russian Wikipedia, on the other hand, has the following additional grades: Solid, Full, Developed, In development, Stub.
  
 +
<!--T:10-->
 
Even the example of the English and Russian versions shows that grading scales differ and depend on the language. Moreover, not all language versions of Wikipedia have such a developed system of quality assessment. For example, German Wikipedia, which contains more than 2 million articles, uses only two grades, the equivalents of FA and GA. Therefore, assessments in scientific papers are often combined into two groups:<ref name="art1" /><ref name="art2" /><ref name="art3" /><ref name="art4" /><ref name="art20" /><ref name="art5" /><ref name="art6" /><ref name="art7" />
  
 +
<!--T:11-->
 
* “Complete” – FA and GA grades,
* “Incomplete” – all other grades.
  
 +
<!--T:12-->
 
Let’s call this method “binary” (1 – Complete articles, 0 – Incomplete articles). This separation naturally “blurs” the boundaries between individual classes, but it allows you to build and compare quality models for different language versions of Wikipedia.
 
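The binary grouping described above can be sketched in a few lines of Python. This is only an illustration: the grade names are the English Wikipedia grades listed earlier, and the function name <code>binary_label</code> is a hypothetical helper, not part of any cited tool.

```python
# Grades treated as "Complete" in the binary method described in the text.
COMPLETE_GRADES = {"FA", "GA"}


def binary_label(grade: str) -> int:
    """Return 1 for Complete (FA, GA) and 0 for every other grade."""
    return 1 if grade.strip().upper() in COMPLETE_GRADES else 0


# English Wikipedia grades mentioned in the text.
grades = ["FA", "GA", "A-class", "B-class", "C-class", "Start", "Stub"]
labels = {grade: binary_label(grade) for grade in grades}
print(labels)
```

Because every language version maps its own grades onto the same two classes, labels produced this way can be compared across languages, at the cost of losing the distinctions between the lower grades.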
  
== Data Mining ==  
+
== Data Mining == <!--T:13-->
To build such models, you can use various algorithms, in particular from [[Data mining|Data Mining]]. One of the most commonly used algorithms is [[Random forest|Random Forest]]<ref name="art1" /><ref name="art2" /><ref name="art3" /><ref name="art4" /><ref name="art5" /><ref name="art6" /><ref name="art7" /><ref name="art20" />. Some studies<ref name="art4" /> compare it with other algorithms (CART, SMO, Multilayer Perceptron, LMT, C4.5, C5.0 and others). Random Forest can build models even from variables that correlate with each other. Additionally, it can show which variables are more important for determining the quality of articles. If we need other information about the importance of variables, we can use other algorithms, including logistic regression.<ref name="art13" />
+
To build such models, you can use various algorithms, in particular from [[Data mining|Data Mining]]. One of the most commonly used algorithms is [[Random forest|Random Forest]]<ref name="art1" /><ref name="art2" /><ref name="art3" /><ref name="art4" /><ref name="art5" /><ref name="art6" /><ref name="art7" /><ref name="art20" />. Some studies<ref name="art4" /> compare it with other algorithms ([[CART]], SMO, Multilayer Perceptron, LMT, C4.5, C5.0 and others). Random Forest can build models even from variables that correlate with each other. Additionally, it can show which variables are more important for determining the quality of articles. If we need other information about the importance of variables, we can use other algorithms, including logistic regression.<ref name="art13" />
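To give a feeling for how an ensemble of trees turns article metrics into a quality estimate, here is a minimal sketch. It is not the setup used in the cited studies: it uses bagged one-level decision trees (stumps), a heavily simplified stand-in for Random Forest, and both the training data and the function names are made up for illustration.

```python
import random


def train_stump(rows, labels):
    """Find the (feature, threshold) split with the fewest training errors."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            # The stump predicts Complete (1) when the value reaches the threshold.
            errors = sum((row[feature] >= threshold) != bool(label)
                         for row, label in zip(rows, labels))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold)
    return best[1], best[2]


def train_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(len(rows)) for _ in rows]
        forest.append(train_stump([rows[i] for i in sample],
                                  [labels[i] for i in sample]))
    return forest


def probability_complete(forest, row):
    """Share of stumps that vote Complete for the given article."""
    votes = sum(row[feature] >= threshold for feature, threshold in forest)
    return votes / len(forest)


# Made-up articles: (text length, number of references, number of images).
ROWS = [(52000, 140, 12), (38000, 95, 8), (61000, 210, 15), (45000, 120, 9),
        (3000, 2, 0), (8000, 10, 1), (1500, 0, 0), (12000, 15, 2)]
LABELS = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = Complete, 0 = Incomplete

forest = train_forest(ROWS, LABELS)
print(probability_complete(forest, (70000, 250, 20)))
print(probability_complete(forest, (1000, 0, 0)))
```

A real Random Forest grows deeper trees and also samples a random subset of variables at each split, which is what lets it cope with correlated variables and estimate their importance.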
  
 +
<!--T:14-->
 
The results show that there are differences between article quality models in different language versions of Wikipedia.<ref name="art1" /><ref name="art2" /><ref name="art3" /><ref name="art4" /> For example, in one language version the number of references (sources) may be among the most important parameters, while in another the number of images and the length of the text matter more.
  
In this case, quality is modeled as the probability that an article belongs to one of two groups, Complete or Incomplete. The conclusion is based on an analysis of various parameters (metrics): the length of the text<ref>Blumenstock, J.E.: Automatically Assessing the Quality of Wikipedia Articles. Tech. rep. (2008)</ref><ref>Conti, R., Marzini, E., Spognardi, A., Matteucci, I., Mori, P., Petrocchi, M.: [[Maturity Assessment of Wikipedia Medical Articles]]. In: Computer-Based Medical Systems (CBMS), 2014 IEEE 27th International Symposium on. pp. 281-286. IEEE (2014)</ref><ref>Yaari, E., Baruchson-Arbib, S., Bar-Ilan, J.: [[Information Quality Assessment of Community Generated Content: A User Study of Wikipedia]]. Journal of Information Science 37(5), 487-498 (2011)</ref><ref>Dang, Q.V., Ignat, C.L.: [[Measuring Quality of Collaboratively Edited Documents: The Case of Wikipedia]]. In: Collaboration and Internet Computing (CIC), 2016 IEEE 2nd International Conference on. pp. 266-275. IEEE (2016)</ref><ref>Shen, A., Qi, J., Baldwin, T.: A hybrid model for quality assessment of wikipedia articles. In: Proceedings of the Australasian Language Technology Association Workshop 2017. pp. 43-52 (2017)</ref><ref>Zhang, S., Hu, Z., Zhang, C., Yu, K.: History-based article quality assessment on wikipedia. In: Big Data and Smart Computing (BigComp), 2018 IEEE International Conference on. pp. 1-8. IEEE (2018)</ref>, the number of references<ref name="art50" /><ref name="art59" /><ref>Dalip, D.H., Gonçalves, M.A., Cristo, M., Calado, P.: [[Automatic Quality Assessment of Content Created Collaboratively by Web Communities: A Case Study of Wikipedia]]. In: Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries. pp. 295-304 (2009).</ref><ref>di Sciascio, C., Strohmaier, D., Errecalde, M., Veas, E.: Wikilyzer: interactive information quality assessment in wikipedia. In: Proceedings of the 22nd International Conference on Intelligent User Interfaces. pp. 377-388. ACM (2017)</ref>, images<ref>Wu, K., Zhu, Q., Zhao, Y., Zheng, H.: [[Mining the Factors Affecting the Quality of Wikipedia Articles]]. In: Information Science and Management Engineering (ISME), 2010 International Conference of. vol. 1, pp. 343-346. IEEE (2010)</ref><ref>Liu, J., Ram, S.: [[Using Big Data and Network Analysis to Understand Wikipedia Article Quality]]. Data & Knowledge Engineering (2018)</ref>, sections<ref>Blumenstock, J.E.: [[Size Matters: Word Count as a Measure of Quality on Wikipedia]]. In: WWW. pp. 1095-1096 (2008).</ref><ref>Lerner, J., Lomi, A.: [[Knowledge Categorization Affects Popularity and Quality of Wikipedia Articles]]. PloS one 13(1), e0190674 (2018)</ref>, links to the article, the number of facts<ref name="art6" /><ref>Lex, Elisabeth, Michael Voelske, Marcelo Errecalde, Edgardo Ferretti, Leticia Cagnina, Christopher Horn, Benno Stein, and Michael Granitzer. [[Measuring the Quality of Web Content Using Factual Information]]. In Proceedings of the 2nd joint WICOW/AIRWeb workshop on web quality, pp. 7-10. ACM, 2012.</ref>, visits, the number of edits and many others. There are also a number of linguistic parameters,<ref name="art5" /><ref name="art7" /> which depend on the language considered. In total, more than 300 parameters are currently used in studies, depending on the language version of Wikipedia and the complexity of the quality model. Some parameters, such as references (sources), can be evaluated further<ref name="art14" />: we can not only count them, but also assess how well known and reliable the sources used in the Wikipedia article are.
+
<!--T:15-->
 +
In this case, quality is modeled as the probability that an article belongs to one of two groups, Complete or Incomplete. The conclusion is based on an analysis of various parameters (metrics): the length of the text<ref name="art61" /><ref name="art62" /><ref name="art63" /><ref name="art64" /><ref name="art65" /><ref name="art66" />, the number of references<ref name="art50" /><ref name="art59" /><ref name="art67" /><ref name="art68" />, images<ref name="art69" /><ref name="art70" />, sections<ref name="art71" /><ref name="art72" />, links to the article, the number of facts<ref name="art6" /><ref name="art73" />, visits, the number of edits and many others. There are also a number of linguistic parameters,<ref name="art5" /><ref name="art7" /> which depend on the language considered. Measures that count links from external sources, such as Reddit, Facebook, Youtube, Twitter, Linkedin, VKontakte and other social services, can also be taken into account.<ref name="art31" /><ref name="art74" /> Additionally, we can take into account the reputation of users who edit Wikipedia articles.<ref name="art81" /><ref name="art82" /> To determine the experience of Wikipedia editors, special online tools such as [[WikiTop]] can be useful.
 +
 
 +
<!--T:16-->
 +
In total, more than 300 parameters (or measures) are currently used in studies, depending on the language version of Wikipedia and the complexity of the quality model. Some parameters, such as references (sources), can be evaluated further<ref name="art14" />: we can not only count them, but also assess how well known and reliable the sources used in the Wikipedia article are. Some measures can be obtained from expert opinions, which are available from different sources, for example the [[WikiBest]] service.
  
== Where to get these parameters? ==
+
== Where to get these parameters? == <!--T:17-->
 
There are several sources: a [[Wikimedia Downloads|backup copy of Wikipedia]], the [[Wikipedia API|API service]], [[Wikimedia Toolforge|special tools]] and others.<ref name="art12" />
  
 +
<!--T:18-->
 
To get some parameters, you just need to send a request (query) to the appropriate API; for other parameters (especially linguistic ones) you need to use special libraries and parsers. A considerable part of the time, however, is spent writing your own tools (we’ll talk about this in separate articles).
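As a rough illustration of such a request, the sketch below builds a query URL for the MediaWiki Action API (<code>action=query</code> with <code>prop=info</code>, which reports the page length in bytes) and reads the length from a JSON response. The helper names and the sample response are hypothetical; the sample is hand-written to show the expected shape and is not real data, and no network request is made here.

```python
import json
from urllib.parse import urlencode


def build_info_query(title, lang="en"):
    """Build an Action API URL asking for basic page information."""
    params = {"action": "query", "prop": "info", "titles": title, "format": "json"}
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def page_lengths(response_text):
    """Map page title to length in bytes, skipping pages without a length."""
    pages = json.loads(response_text)["query"]["pages"]
    return {page["title"]: page["length"]
            for page in pages.values() if "length" in page}


# Hand-written sample in the shape of an API response (values are made up).
SAMPLE_RESPONSE = (
    '{"query": {"pages": {'
    '"1": {"pageid": 1, "ns": 0, "title": "Example", "length": 12345}, '
    '"-1": {"ns": 0, "title": "No such page", "missing": ""}}}}'
)

print(build_info_query("Albert Einstein"))
print(page_lengths(SAMPLE_RESPONSE))
```

The same pattern extends to other simple counts; linguistic parameters still require parsing the article text itself.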
  
== Are there ways of assessing the quality of Wikipedia articles other than binary? ==
+
== Are there ways of assessing the quality of Wikipedia articles other than binary? == <!--T:19-->
 
Yes. Recent studies<ref name="art8" /><ref name="art9" /> propose a method for rating articles on a continuous scale from 0 to 100. Thus, an article can receive, for example, a score of 54.21. This method has been tested in 55 language versions. The results are available on the [[WikiRank]] service, which allows you to evaluate and compare the quality and popularity of Wikipedia articles in different languages. The method is, of course, not ideal, but it works for locally known topics.<ref name="art9" />
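One simple way to obtain a score on such a scale is to compare each metric of an article with a reference value, cap the ratio at 1 and average the results. The sketch below only illustrates this general idea; it is not the formula from the cited studies or from WikiRank, and the metric names and reference values are assumptions chosen for the example.

```python
def quality_score(metrics, reference_values):
    """Average of capped metric/reference ratios, scaled to 0..100."""
    parts = []
    for name, reference in reference_values.items():
        value = metrics.get(name, 0)
        # A metric that reaches its reference value contributes the maximum.
        parts.append(min(value / reference, 1.0))
    return round(100 * sum(parts) / len(parts), 2)


# Hypothetical reference values, e.g. typical values of the best articles.
REFERENCE = {"text_length": 40000, "references": 100, "images": 10}

article = {"text_length": 20000, "references": 30, "images": 3}
print(quality_score(article, REFERENCE))
```

Because the reference values can be set separately for each language version, scores computed this way stay comparable between languages even when the typical article size differs.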
  
== Are there ways of assessing the quality of a part of a Wikipedia article? ==
+
== Are there ways of assessing the quality of a part of a Wikipedia article? == <!--T:20-->
 
Of course. For example, one of the important elements of an article is the so-called “[[infobox]]”. This is a separate frame (table), often located at the top right of the article, that shows the most important facts about the subject. There is no need to look for this information in the text; you can just look at this table. Separate studies are devoted to evaluating the quality of these infoboxes.<ref name="art2" /><ref name="art11" /> There are also projects, such as [[Infoboxes.net]], which allow you to automatically compare infoboxes in different language versions.
  
== Why do we need all this? ==
+
== Why do we need all this? == <!--T:21-->
 
Wikipedia is used often, but the quality of its information is not always checked. The proposed methods can simplify this task: if an article is bad, the reader, knowing this, will be more careful in using its material for decision making. The user can also see in which language the topic of interest is described better. Most importantly, modern techniques allow you to transfer information between different language versions. This means that you can automatically enrich the weak versions of Wikipedia with high-quality data from other language versions.<ref name="art10" /> This will also improve the quality of other semantic databases for which Wikipedia is the main source of information, first of all [[DBpedia]], [[Wikidata]], [[YAGO2]] and others.
  
== References ==  
+
== References == <!--T:22-->
 +
</translate>
 
<references>
 
<references>
 
<ref name="art1">Lewoniewski, W., Węcel, K., Abramowicz, W. (2016). [[Quality and Importance of Wikipedia Articles in Different Languages]]. In International Conference on Information and Software Technologies (pp. 613-624). Springer International Publishing.</ref>
 
 
<ref name="art2">Węcel, K., Lewoniewski, W. (2015). [[Modelling the Quality of Attributes in Wikipedia Infoboxes]]. In International Conference on Business Information Systems (pp. 308-320). Springer International Publishing.</ref>
 
<ref name="art3">Lewoniewski, W., Węcel, K., Abramowicz, W. (2015). Analiza porównawcza modeli jakości informacji w narodowych wersjach Wikipedii. Prace Naukowe/Uniwersytet Ekonomiczny w Katowicach, 133-154.</ref>
+
<ref name="art3">Lewoniewski, W., Węcel, K., Abramowicz, W. (2015). [[Comparative Analysis of Information Quality Models in the National Versions of Wikipedia]]. Prace Naukowe/Uniwersytet Ekonomiczny w Katowicach, pp. 133-154.</ref>
 
<ref name="art4">Lewoniewski, W., Węcel, K., Abramowicz, W. (2017), [[Comparative analysis of classification models for quality assessment of Wikipedia articles]], Matematyka i informatyka na usługach ekonomii, Wydawnictwo UEP Poznań, ISBN 9788374179386</ref>
 
 
<ref name="art5">Khairova, N., Lewoniewski, W., Węcel, K. (2017). [[Estimating the Quality of Articles in Russian Wikipedia Using the Logical-Linguistic Model of Fact Extraction]]. In International Conference on Business Information Systems (pp. 28-40). Springer, Cham.</ref>
 
<ref name="art6">Lewoniewski, W., Khairova, N., Węcel, K., Stratiienko, N., &amp; Abramowicz, W. (2017). [[Using Morphological and Semantic Features for the Quality Assessment of Russian Wikipedia]]. In International Conference on Information and Software Technologies (pp. 550-560). Springer, Cham. DOI: 10.1007/978-3-319-67642-5_46</ref>
+
<ref name="art6">Lewoniewski, W., Khairova, N., Węcel, K., Stratiienko, N., Abramowicz, W. (2017). [[Using Morphological and Semantic Features for the Quality Assessment of Russian Wikipedia]]. In International Conference on Information and Software Technologies (pp. 550-560). Springer, Cham. DOI: 10.1007/978-3-319-67642-5_46</ref>
 
<ref name="art7">Lewoniewski, W., Wecel, K., Abramowicz, W. (2017). Determining Quality of Articles in Polish Wikipedia Based on Linguistic Features.</ref>
 
 
<ref name="art8">Lewoniewski, W., Węcel, K., Abramowicz, W. (2017). [[Relative Quality and Popularity Evaluation of Multilingual Wikipedia Articles]]. In Informatics (Vol. 4, No. 4, p. 43). Multidisciplinary Digital Publishing Institute. DOI: 10.3390/informatics4040043</ref>
 
Line 71: Line 96:
 
<ref name="art10">Lewoniewski, W. (2017). [[Enrichment of Information in Multilingual Wikipedia Based on Quality Analysis]]. In International Conference on Business Information Systems (pp. 216-227). Springer, Cham. DOI: 10.1007/978-3-319-69023-0_19</ref>
 
 
<ref name="art11">Lewoniewski, W. (2017). [[Completeness and Reliability of Wikipedia Infoboxes in Various Languages]]. In International Conference on Business Information Systems (pp. 295-305). Springer, Cham. DOI: 10.1007/978-3-319-69023-0_25</ref>
 
<ref name="art12">Lewoniewski, W., Węcel, K., (2017), Cechy artykułów oraz metody ich ekstrakcji na potrzeby oceny jakości informacji w Wikipedii. Studia Oeconomica Posnaniensia 12/2017. DOI: 10.18559/SOEP.2017.12.7</ref>
+
<ref name="art12">Lewoniewski, W., Węcel, K. (2017). [[Features of Wikipedia Articles and Their Extraction Methods for Automatic Information Quality Assessment]]. Studia Oeconomica Posnaniensia 12/2017. DOI: 10.18559/SOEP.2017.12.7</ref>
<ref name="art13">Lamek, A., Lewoniewski, W. (2017), Zastosowanie regresji logistycznej w ocenie jakości informacji na przykładzie Wikipedii. Studia Oeconomica Posnaniensia 12/2017. DOI: 10.18559/SOEP.2017.12.3</ref>
+
<ref name="art13">Lamek, A., Lewoniewski, W. (2017). [[Application Logistic Regression in Assessing the Quality of Information – Wikipedia Articles Case]]. Studia Oeconomica Posnaniensia 12/2017. DOI: 10.18559/SOEP.2017.12.3</ref>
 
<ref name="art14">Lewoniewski, W., Węcel, K., Abramowicz, W., (2017), [[Analysis of References Across Wikipedia Languages]]. Information and Software Technologies. ICIST 2017. DOI: 10.1007/978-3-319-67642-5_47</ref>
 
 
<ref name="art20">Warncke-Wang, Morten, Dan Cosley, and John Riedl. [[Tell Me More: An Actionable Quality Model for Wikipedia]]. Proceedings of the 9th International Symposium on Open Collaboration. ACM, 2013.</ref>
 
 +
<ref name="art31">Lewoniewski, W., Härting, R. C., Wecel, K., Reichstein, C., Abramowicz, W. (2018). [[Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources]]. In International Conference on Information and Software Technologies (pp. 139-152). Springer, Cham</ref>
 
<ref name="art50">Warncke-Wang, M., Ayukaev, V. R., Hecht, B., & Terveen, L. G. (2015). [[The Success and Failure of Quality Improvement Projects in Peer Production Communities]]. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 743-756). ACM.</ref>
 
 
<ref name="art59">Soonthornphisaj, N., & Paengporn, P. (2017). [[Thai Wikipedia Article Quality Filtering Algorithm]]. In Proceedings of the International MultiConference of Engineers and Computer Scientists (Vol. 1).</ref>
 
 +
<ref name="art61">Blumenstock, J. E. (2008). [[Automatically Assessing the Quality of Wikipedia Articles]]. Tech. rep.</ref>
 +
<ref name="art62">Conti, R., Marzini, E., Spognardi, A., Matteucci, I., Mori, P., Petrocchi, M. (2014). [[Maturity Assessment of Wikipedia Medical Articles]]. In: Computer-Based Medical Systems (CBMS), 2014 IEEE 27th International Symposium on. pp. 281-286. IEEE</ref>
 +
<ref name="art63">Yaari, E., Baruchson-Arbib, S., Bar-Ilan, J. (2011). [[Information Quality Assessment of Community Generated Content: A User Study of Wikipedia]]. Journal of Information Science 37(5), 487-498</ref>
 +
<ref name="art64">Dang, Q.V., Ignat, C.L. (2016). [[Measuring Quality of Collaboratively Edited Documents: The Case of Wikipedia]]. In: Collaboration and Internet Computing (CIC), 2016 IEEE 2nd International Conference on. pp. 266-275. IEEE</ref>
 +
<ref name="art65">Shen, A., Qi, J., Baldwin, T.. (2017) [[A Hybrid Model for Quality Assessment of Wikipedia Wrticles]]. In: Proceedings of the Australasian Language Technology Association Workshop 2017. pp. 43-52</ref>
 +
<ref name="art66">Zhang, S., Hu, Z., Zhang, C., Yu, K. (2018). [[History-Based Article Quality Assessment on Wikipedia]]. In: Big Data and Smart Computing (BigComp), 2018 IEEE International Conference on. pp. 1-8. IEEE</ref>
 +
<ref name="art67">Dalip, D.H., Gonçalves, M.A., Cristo, M., Calado, P. (2009). [[Automatic Quality Assessment of Content Created Collaboratively by Web Communities: A Case Study of Wikipedia]]. In: Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries. pp. 295-304</ref>
 +
<ref name="art68">di Sciascio, C., Strohmaier, D., Errecalde, M., Veas, E. (2017). [[Wikilyzer: Interactive Information Quality Assessment in Wikipedia]]. In: Proceedings of the 22nd International Conference on Intelligent User Interfaces. pp. 377-388. ACM</ref>
 +
<ref name="art69">Wu, K., Zhu, Q., Zhao, Y., Zheng, H. (2010). [[Mining the Factors Affecting the Quality of Wikipedia Articles]]. In: Information Science and Management Engineering (ISME), 2010 International Conference of. vol. 1, pp. 343-346. IEEE</ref>
 +
<ref name="art70">Liu, J., Ram, S. (2018). [[Using Big Data and Network Analysis to Understand Wikipedia Article Quality]]. Data & Knowledge Engineering</ref>
 +
<ref name="art71">Blumenstock, J.E. (2008). [[Size Matters: Word Count as a Measure of Quality on Wikipedia‎]]. In: WWW. pp. 1095-1096</ref>
 +
<ref name="art72">Lerner, J., Lomi, A. (2018). [[Knowledge Categorization Affects Popularity and Quality of Wikipedia Articles‎]]. PloS one 13(1), e0190674 </ref>
 +
<ref name="art73">Lex, E., Voelske, M., Errecalde, M., Ferretti, E., Cagnina, L., Horn, C., Stein, B., Granitzer, M. (2012) [[Measuring the Quality of Web Content Using Factual Information‎]]. In Proceedings of the 2nd joint WICOW/AIRWeb workshop on web quality, pp. 7-10. ACM</ref>
 +
<ref name="art74">Moyer, D., Carson, S. L., Dye, T. K., Carson, R. T., Goldbaum, D. (2015). [[Determining the Influence of Reddit Posts on Wikipedia Pageviews]]. In Proceedings of the Ninth International AAAI Conference on Web and Social Media.</ref>
 +
<ref name="art81">Wu, G., Harrigan, M., Cunningham, P. (2011). [[Characterizing Wikipedia Pages Using Edit Network Motif Profiles]]. In Proceedings of the 3rd International Workshop on Search and Mining User-generated Contents, Glasgow, UK.</ref>
 +
<ref name="art82">Suzuki, Y., Nakamura, S. (2016). [[Assessing the Quality of Wikipedia Editors Through Crowdsourcing]]. In Proceedings of the 25th International Conference Companion on World Wide Web, Montreal, QC, Canada; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2016; pp. 1001–1006.</ref>
 +
<ref name="art100">Lewoniewski, W. (2019). [[Measures for Quality Assessment of Articles and Infoboxes in Multilingual Wikipedia]]. Lecture Notes in Business Information Processing, vol 339. Springer, Cham  (pp. 619-633)</ref>
 
</references>
 
</references>
  
  
 
[[Category:Main]]
 
 +
__NOTOC____NOEDITSECTION__

Revision as of 09:01, 11 June 2019

Welcome to Wikipedia Quality,
a portal about concepts, research and services related to quality assessment of multilingual Wikipedia.
Articles count: 6,470
Number of scientists in each country who conduct research on Wikipedia quality


Despite this, Wikipedia sometimes contains valuable information, depending on the language version and subject. Practically every language version has a system of awards for the best articles. However, the number of such articles is relatively small (less than one percent). Some language versions also have other quality grades. Nevertheless, the overwhelming majority of articles are unevaluated (in some languages, more than 99%).

Automatic quality assessment of Wikipedia articles

Because many Wikipedia articles have no quality grade, each reader has to analyze their content manually. Automatic quality assessment of Wikipedia articles is a well-known and broad area of research: researchers from over 50 countries have published works related to the quality of Wikipedia. Most of these works describe the most developed language version of Wikipedia, English, which already contains more than 5.7 million articles.

Since Wikipedia’s foundation, and with its growing popularity, more and more scientific publications on this subject have been published. One of the first studies showed that measuring the volume of content can help determine the degree of “maturity” of a Wikipedia article. Work in this direction shows that, in general, higher-quality articles are long, use many references, are edited by hundreds of authors and have thousands of edits.

There are different measures related to such quality dimensions as credibility, completeness, objectivity, readability, relevance, style and timeliness.[1]

How do researchers come to such conclusions? Simply put, by comparing good and bad articles.

As mentioned earlier, almost every language version of Wikipedia has a system for assessing the quality of articles. The best articles are marked in a special way: they receive a special “badge”. In Russian Wikipedia such articles are called “Featured Articles” (FA). There is another “badge” for articles that are slightly below the best ones: “Good Articles” (GA). Some language versions have further grades for “weaker” articles. For example, English Wikipedia also has A-class, B-class, C-class, Start and Stub. Russian Wikipedia, on the other hand, has the following additional grades: Solid, Full, Developed, In development and Stub.

Even from the example of the English and Russian versions, we can conclude that the standards for the grading scale differ and depend on the language. Moreover, not all language versions of Wikipedia have such a developed system of article quality assessment. For example, German Wikipedia, which contains more than 2 million articles, uses only two grades, the equivalents of FA and GA. Therefore, assessments in scientific papers are often combined into two groups:[2][3][4][5][6][7][8][9]

  • “Complete” – FA and GA grades,
  • “Incomplete” – all other grades.

Let’s call this method “binary” (1 – Complete articles, 0 – Incomplete articles). This separation naturally “blurs” the boundaries between individual classes, but it allows you to build and compare quality models for different language versions of Wikipedia.
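The binary grouping can be sketched in a few lines of Python. This is only an illustration: the helper name `to_binary` and the grade strings are assumptions, and each language version uses its own grade names.

```python
# Grades treated as "Complete"; every other grade counts as "Incomplete".
# The abbreviations follow the text above (FA = Featured Article, GA = Good Article).
COMPLETE_GRADES = {"FA", "GA"}


def to_binary(grade: str) -> int:
    """Return 1 for Complete articles (FA, GA) and 0 for all other grades."""
    return 1 if grade.strip().upper() in COMPLETE_GRADES else 0


# Example with the English Wikipedia grades listed earlier.
grades = ["FA", "GA", "A-class", "B-class", "C-class", "Start", "Stub"]
labels = {g: to_binary(g) for g in grades}
print(labels)
```

Note that under this grouping even an A-class article is labeled 0, which is exactly the “blurring” of class boundaries described above.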

Data Mining

To build such models, you can use various algorithms, in particular from data mining. One of the most commonly used algorithms is Random Forest.[2][3][4][5][7][8][9][6] There are even studies[5] that compare it with other algorithms (CART, SMO, Multilayer Perceptron, LMT, C4.5, C5.0 and others). Random Forest can build models even from variables that correlate with each other. Additionally, this algorithm can show which variables are more important for determining the quality of articles. If we need other information about the importance of variables, we can use other algorithms, including logistic regression.[10]

The results show that article quality models differ between language versions of Wikipedia.[2][3][4][5] For example, while in one language version the number of references (sources) is among the most important parameters, in another the number of images and the length of the text matter more.

In this case, quality is modeled as the probability that an article belongs to one of two groups, Complete or Incomplete. The conclusion is based on an analysis of various parameters (metrics): the length of the text[11][12][13][14][15][16], the number of references[17][18][19][20], images[21][22], sections[23][24], links to the article, the number of facts[8][25], visits, the number of edits and many others. There are also a number of linguistic parameters,[7][9] which depend on the language considered. Measures that show the number of links from external sources, such as Reddit, Facebook, YouTube, Twitter, LinkedIn, VKontakte and other social services, can also be taken into account.[26][27] Additionally, we can take into account the reputation of users who edit Wikipedia articles.[28][29] To determine the experience of Wikipedia editors, special online tools such as WikiTop can be useful.
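To make the idea of “quality as a probability” concrete, here is a minimal sketch of a logistic model over a few of the parameters named above. The function name, the feature names, the weights and the bias are all hypothetical values chosen for illustration; they are not taken from any of the cited studies, where such coefficients would be learned from graded articles.

```python
import math


def complete_probability(features, weights, bias):
    """Logistic model: probability that an article belongs to the Complete group."""
    score = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical weights, chosen only for illustration (not taken from any study).
weights = {"text_length_kb": 0.05, "references": 0.04, "images": 0.10, "sections": 0.08}
bias = -6.0

short_article = {"text_length_kb": 3, "references": 2, "images": 0, "sections": 2}
long_article = {"text_length_kb": 60, "references": 80, "images": 10, "sections": 15}

p_short = complete_probability(short_article, weights, bias)
p_long = complete_probability(long_article, weights, bias)
print(round(p_short, 3), round(p_long, 3))
```

With these made-up weights the short article gets a probability close to 0 and the long, well-referenced one a probability close to 1; a threshold such as 0.5 then turns the probability into the binary label.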

Currently, more than 300 parameters (or measures) in total are used in studies, depending on the language version of Wikipedia and the complexity of the quality model. Some parameters, such as references (sources), can be evaluated further:[30] we can not only count them but also assess how well known and reliable the sources used in the Wikipedia article are. Some measures can be obtained from expert opinions, which can come from different sources, for example the WikiBest service.

Where to get these parameters?

There are several sources: a backup copy of Wikipedia, an API service, special tools and others.[31]

To get some parameters, you only need to send a request (query) to the appropriate API; for other parameters (especially linguistic ones), you need special libraries and parsers. A considerable part of the time, however, is spent writing your own tools (we’ll talk about this in separate articles).
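As an illustration of the “send a request to the API” case, the sketch below builds a query to the MediaWiki Action API asking for basic page information and reads the page length from the reply. The helper names and the User-Agent string are assumptions made for this example, and the reply layout (a `pages` list with a `length` field in bytes, as returned with `formatversion=2`) should be checked against the API documentation before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_info_url(title: str, lang: str = "en") -> str:
    """Build a MediaWiki Action API URL asking for basic page information."""
    params = {
        "action": "query",
        "prop": "info",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return f"https://{lang}.wikipedia.org/w/api.php?" + urlencode(params)


def page_length(response: dict) -> int:
    """Extract the page length in bytes from a parsed API response."""
    return response["query"]["pages"][0]["length"]


def fetch_page_length(title: str, lang: str = "en") -> int:
    """Send the request (network access required) and return the page length."""
    request = Request(
        build_info_url(title, lang),
        headers={"User-Agent": "quality-sketch/0.1"},  # hypothetical client name
    )
    with urlopen(request) as reply:
        return page_length(json.load(reply))
```

Calling `fetch_page_length("Wikipedia", "pl")` would query the Polish language version; changing `lang` is enough to collect the same measure across language versions.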

Are there ways of assessing the quality of Wikipedia articles other than binary?

Yes. Recent studies[32][33] propose a method for rating articles on a continuous scale from 0 to 100. Thus, an article can receive, for example, a score of 54.21. This method has been tested in 55 language versions. The results are available on the WikiRank service, which lets you evaluate and compare the quality and popularity of Wikipedia articles in different languages. The method is, of course, not ideal, but it works for locally known topics.[33]

Are there ways of assessing the quality of part of a Wikipedia article?
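One hypothetical way to map several measures onto a continuous 0 to 100 scale is sketched below: each measure is divided by a threshold, capped at 1, and the normalized values are averaged. The measure names and thresholds are invented for illustration, and the sketch does not reproduce the formula of the cited studies or of WikiRank.

```python
def synthetic_score(measures, thresholds):
    """Average of measures normalized by thresholds, each capped at 1, scaled to 0-100."""
    normalized = [min(measures[name] / thresholds[name], 1.0) for name in thresholds]
    return 100.0 * sum(normalized) / len(normalized)


# Hypothetical thresholds, e.g. typical values for the best articles in one language.
thresholds = {"text_length": 50000, "references": 100, "images": 10, "sections": 20}
article = {"text_length": 25000, "references": 40, "images": 10, "sections": 5}

score = synthetic_score(article, thresholds)
print(round(score, 2))
```

Capping each normalized measure at 1 keeps a single extreme value, such as a very long text, from dominating the score; choosing thresholds per language version is one way to make scores comparable between languages.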

Of course. For example, one of the important elements of an article is the so-called “infobox”. This is a separate frame (table), often located at the top right of the article, that shows the most important facts about the subject. There is no need to look for this information in the text; you can just look at this table. Separate studies are devoted to evaluating the quality of these infoboxes.[3][34] There are also projects, such as Infoboxes.net, that let you automatically compare infoboxes in different language versions.

Why do we need all this?

Wikipedia is used often, but the quality of its information is not always checked. The proposed methods can simplify this task: if an article is bad, the reader, knowing this, will be more careful about using its material for decision making. The user can also see in which language the topic of interest is described better. Most importantly, modern techniques make it possible to transfer information between language versions. This means that you can automatically enrich the weaker versions of Wikipedia with high-quality data from other language versions.[35] This will also improve the quality of other semantic databases for which Wikipedia is the main source of information, first of all DBpedia, Wikidata, YAGO2 and others.

References

  1. Lewoniewski, W. (2019). Measures for Quality Assessment of Articles and Infoboxes in Multilingual Wikipedia. Lecture Notes in Business Information Processing, vol 339. Springer, Cham (pp. 619-633)
  2. Lewoniewski, W., Węcel, K., Abramowicz, W. (2016). Quality and Importance of Wikipedia Articles in Different Languages. In International Conference on Information and Software Technologies (pp. 613-624). Springer International Publishing.
  3. Węcel, K., Lewoniewski, W. (2015). Modelling the Quality of Attributes in Wikipedia Infoboxes. In International Conference on Business Information Systems (pp. 308-320). Springer International Publishing.
  4. Lewoniewski, W., Węcel, K., Abramowicz, W. (2015). Comparative Analysis of Information Quality Models in the National Versions of Wikipedia. Prace Naukowe/Uniwersytet Ekonomiczny w Katowicach, pp. 133-154.
  5. Lewoniewski, W., Węcel, K., Abramowicz, W. (2017), Comparative analysis of classification models for quality assessment of Wikipedia articles, Matematyka i informatyka na usługach ekonomii, Wydawnictwo UEP Poznań, ISBN 9788374179386
  6. Warncke-Wang, Morten, Dan Cosley, and John Riedl. Tell Me More: An Actionable Quality Model for Wikipedia. Proceedings of the 9th International Symposium on Open Collaboration. ACM, 2013.
  7. Khairova, N., Lewoniewski, W., Węcel, K. (2017). Estimating the Quality of Articles in Russian Wikipedia Using the Logical-Linguistic Model of Fact Extraction. In International Conference on Business Information Systems (pp. 28-40). Springer, Cham.
  8. Lewoniewski, W., Khairova, N., Węcel, K., Stratiienko, N., Abramowicz, W. (2017). Using Morphological and Semantic Features for the Quality Assessment of Russian Wikipedia. In International Conference on Information and Software Technologies (pp. 550-560). Springer, Cham. DOI: 10.1007/978-3-319-67642-5_46
  9. Lewoniewski, W., Wecel, K., Abramowicz, W. (2017). Determining Quality of Articles in Polish Wikipedia Based on Linguistic Features.
  10. Lamek, A., Lewoniewski, W. (2017). Application Logistic Regression in Assessing the Quality of Information – Wikipedia Articles Case. Studia Oeconomica Posnaniensia 12/2017. DOI: 10.18559/SOEP.2017.12.3
  11. Blumenstock, J. E. (2008). Automatically Assessing the Quality of Wikipedia Articles. Tech. rep.
  12. Conti, R., Marzini, E., Spognardi, A., Matteucci, I., Mori, P., Petrocchi, M. (2014). Maturity Assessment of Wikipedia Medical Articles. In: Computer-Based Medical Systems (CBMS), 2014 IEEE 27th International Symposium on. pp. 281-286. IEEE
  13. Yaari, E., Baruchson-Arbib, S., Bar-Ilan, J. (2011). Information Quality Assessment of Community Generated Content: A User Study of Wikipedia. Journal of Information Science 37(5), 487-498
  14. Dang, Q.V., Ignat, C.L. (2016). Measuring Quality of Collaboratively Edited Documents: The Case of Wikipedia. In: Collaboration and Internet Computing (CIC), 2016 IEEE 2nd International Conference on. pp. 266-275. IEEE
  15. Shen, A., Qi, J., Baldwin, T. (2017). A Hybrid Model for Quality Assessment of Wikipedia Articles. In: Proceedings of the Australasian Language Technology Association Workshop 2017. pp. 43-52
  16. Zhang, S., Hu, Z., Zhang, C., Yu, K. (2018). History-Based Article Quality Assessment on Wikipedia. In: Big Data and Smart Computing (BigComp), 2018 IEEE International Conference on. pp. 1-8. IEEE
  17. Warncke-Wang, M., Ayukaev, V. R., Hecht, B., & Terveen, L. G. (2015). The Success and Failure of Quality Improvement Projects in Peer Production Communities. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing (pp. 743-756). ACM.
  18. Soonthornphisaj, N., & Paengporn, P. (2017). Thai Wikipedia Article Quality Filtering Algorithm. In Proceedings of the International MultiConference of Engineers and Computer Scientists (Vol. 1).
  19. Dalip, D.H., Gonçalves, M.A., Cristo, M., Calado, P. (2009). Automatic Quality Assessment of Content Created Collaboratively by Web Communities: A Case Study of Wikipedia. In: Proceedings of the 9th ACM/IEEE-CS Joint Conference on Digital Libraries. pp. 295-304
  20. di Sciascio, C., Strohmaier, D., Errecalde, M., Veas, E. (2017). Wikilyzer: Interactive Information Quality Assessment in Wikipedia. In: Proceedings of the 22nd International Conference on Intelligent User Interfaces. pp. 377-388. ACM
  21. Wu, K., Zhu, Q., Zhao, Y., Zheng, H. (2010). Mining the Factors Affecting the Quality of Wikipedia Articles. In: Information Science and Management Engineering (ISME), 2010 International Conference of. vol. 1, pp. 343-346. IEEE
  22. Liu, J., Ram, S. (2018). Using Big Data and Network Analysis to Understand Wikipedia Article Quality. Data & Knowledge Engineering
  23. Blumenstock, J.E. (2008). Size Matters: Word Count as a Measure of Quality on Wikipedia. In: WWW. pp. 1095-1096
  24. Lerner, J., Lomi, A. (2018). Knowledge Categorization Affects Popularity and Quality of Wikipedia Articles. PloS one 13(1), e0190674
  25. Lex, E., Voelske, M., Errecalde, M., Ferretti, E., Cagnina, L., Horn, C., Stein, B., Granitzer, M. (2012). Measuring the Quality of Web Content Using Factual Information. In Proceedings of the 2nd joint WICOW/AIRWeb workshop on web quality, pp. 7-10. ACM
  26. Lewoniewski, W., Härting, R. C., Wecel, K., Reichstein, C., Abramowicz, W. (2018). Application of SEO Metrics to Determine the Quality of Wikipedia Articles and Their Sources. In International Conference on Information and Software Technologies (pp. 139-152). Springer, Cham
  27. Moyer, D., Carson, S. L., Dye, T. K., Carson, R. T., Goldbaum, D. (2015). Determining the Influence of Reddit Posts on Wikipedia Pageviews. In Proceedings of the Ninth International AAAI Conference on Web and Social Media.
  28. Wu, G., Harrigan, M., Cunningham, P. (2011). Characterizing Wikipedia Pages Using Edit Network Motif Profiles. In Proceedings of the 3rd International Workshop on Search and Mining User-generated Contents, Glasgow, UK.
  29. Suzuki, Y., Nakamura, S. (2016). Assessing the Quality of Wikipedia Editors Through Crowdsourcing. In Proceedings of the 25th International Conference Companion on World Wide Web, Montreal, QC, Canada; International World Wide Web Conferences Steering Committee: Geneva, Switzerland, 2016; pp. 1001–1006.
  30. Lewoniewski, W., Węcel, K., Abramowicz, W., (2017), Analysis of References Across Wikipedia Languages. Information and Software Technologies. ICIST 2017. DOI: 10.1007/978-3-319-67642-5_47
  31. Lewoniewski, W., Węcel, K. (2017). Features of Wikipedia Articles and Their Extraction Methods for Automatic Information Quality Assessment. Studia Oeconomica Posnaniensia 12/2017. DOI: 10.18559/SOEP.2017.12.7
  32. Lewoniewski, W., Węcel, K., Abramowicz, W. (2017). Relative Quality and Popularity Evaluation of Multilingual Wikipedia Articles. In Informatics (Vol. 4, No. 4, p. 43). Multidisciplinary Digital Publishing Institute. DOI: 10.3390/informatics4040043
  33. Lewoniewski, W., Węcel, K. (2017). Relative Quality Assessment of Wikipedia Articles in Different Languages Using Synthetic Measure. In International Conference on Business Information Systems (pp. 282-292). Springer, Cham. DOI: 10.1007/978-3-319-69023-0_24
  34. Lewoniewski, W. (2017). Completeness and Reliability of Wikipedia Infoboxes in Various Languages. In International Conference on Business Information Systems (pp. 295-305). Springer, Cham. DOI: 10.1007/978-3-319-69023-0_25
  35. Lewoniewski, W. (2017). Enrichment of Information in Multilingual Wikipedia Based on Quality Analysis. In International Conference on Business Information Systems (pp. 216-227). Springer, Cham. DOI: 10.1007/978-3-319-69023-0_19