Ranking Wikipedia Article's Data Quality by Learning Dimension Distributions
{{Infobox work
| title = Ranking Wikipedia Article's Data Quality by Learning Dimension Distributions
| date = 2014
| authors = [[Jingyu Han]]<br />[[Kejia Chen]]
| doi = 10.1504/IJIQ.2014.064056
| link = https://www.inderscienceonline.com/doi/abs/10.1504/IJIQ.2014.064056
}}
'''Ranking Wikipedia Article's Data Quality by Learning Dimension Distributions''' - scientific work related to [[Wikipedia quality]] published in 2014, written by [[Jingyu Han]] and [[Kejia Chen]].
== Overview ==
As the largest free user-generated knowledge repository, the [[data quality]] of [[Wikipedia]] has attracted great attention in recent years, and automatic assessment of an article's data quality is a pressing concern. The authors observe that every Wikipedia quality class exhibits a specific characteristic along first-class quality dimensions, including accuracy, [[completeness]], consistency and minimality. They propose extracting quality dimension values from an article's content and editing history using a dynamic Bayesian network (DBN) and [[information extraction]] techniques. They then employ multivariate Gaussian distributions to model the quality dimension distribution of each quality class, and combine multiple trained classifiers to predict an article's quality class, which distinguishes the different quality classes effectively and robustly. Experiments demonstrate that the approach performs well.
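The per-class modeling step described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: it fits a diagonal-covariance Gaussian per quality class (the paper uses full multivariate Gaussians and an ensemble of classifiers), and the four-dimensional training vectors (accuracy, completeness, consistency, minimality) and class names are invented for the example.

```python
import math

# Hypothetical training data: per-class lists of quality-dimension vectors
# (accuracy, completeness, consistency, minimality), scaled to [0, 1].
# The numbers are illustrative only, not taken from the paper.
TRAINING = {
    "FA":    [(0.95, 0.92, 0.90, 0.88), (0.93, 0.90, 0.91, 0.85), (0.96, 0.94, 0.89, 0.90)],
    "Start": [(0.55, 0.40, 0.60, 0.50), (0.50, 0.35, 0.58, 0.45), (0.60, 0.42, 0.62, 0.55)],
}

def fit_gaussian(vectors):
    """Per-dimension mean and variance (diagonal-covariance simplification)."""
    n, d = len(vectors), len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(d)]
    variances = [
        max(sum((v[i] - means[i]) ** 2 for v in vectors) / n, 1e-6)  # floor avoids zero variance
        for i in range(d)
    ]
    return means, variances

def log_likelihood(x, means, variances):
    """Log-density of x under an axis-aligned Gaussian, summed over dimensions."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
        for xi, mu, var in zip(x, means, variances)
    )

# One fitted Gaussian per quality class.
MODELS = {cls: fit_gaussian(vecs) for cls, vecs in TRAINING.items()}

def predict(x):
    """Assign the class whose fitted distribution gives x the highest likelihood."""
    return max(MODELS, key=lambda cls: log_likelihood(x, *MODELS[cls]))
```

A dimension vector close to the featured-article profile, e.g. `predict((0.94, 0.91, 0.90, 0.87))`, would then be assigned to the `"FA"` class, since its likelihood under that class's Gaussian dominates.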