Difference between revisions of "Automatic Detection of Outdated Information in Wikipedia Infoboxes"

From Wikipedia Quality
(Adding wikilinks)
(+ infobox)
{{Infobox work
| title = Automatic Detection of Outdated Information in Wikipedia Infoboxes
| date = 2013
| authors = [[Thong Tran]]<br />[[Tru H. Cao]]
| link = http://www.rcs.cic.ipn.mx/2013_70/Automatic%20Detection%20of%20Outdated%20Information%20in%20Wikipedia%20Infoboxes.html
}}
 
'''Automatic Detection of Outdated Information in Wikipedia Infoboxes''' is a scientific work related to [[Wikipedia quality]], published in 2013 and written by [[Thong Tran]] and [[Tru H. Cao]].
 
  
 
== Overview ==
 
 
An infobox of a [[Wikipedia]] article generally contains key facts from the article, organized as attribute-value pairs. Infoboxes not only allow readers to rapidly gather the most important information about some aspects of the articles in which they appear, but also provide a source for many knowledge bases derived from Wikipedia. However, not all infobox attribute values are updated frequently and accurately. In this paper, the authors propose a method to automatically detect outdated attribute values in Wikipedia [[infoboxes]] by using facts extracted from the general Web. The authors' method uses a pattern-based fact extraction approach, in which the extraction patterns are learned automatically from a number of available seeds in related Wikipedia infoboxes. The authors tested and evaluated the system on a set of 100 well-established companies in the NASDAQ-100 index, focusing on their employee numbers as given by the num_employees attribute in their Wikipedia article infoboxes. The achieved accuracy is 77%, and the test results also reveal that 82% of the companies do not have their latest employee numbers in their Wikipedia article infoboxes.
 

Revision as of 08:34, 23 October 2020


Automatic Detection of Outdated Information in Wikipedia Infoboxes
Authors: Thong Tran, Tru H. Cao
Publication date: 2013
Links: Original

Automatic Detection of Outdated Information in Wikipedia Infoboxes is a scientific work related to Wikipedia quality, published in 2013 and written by Thong Tran and Tru H. Cao.

Overview

An infobox of a Wikipedia article generally contains key facts from the article, organized as attribute-value pairs. Infoboxes not only allow readers to rapidly gather the most important information about some aspects of the articles in which they appear, but also provide a source for many knowledge bases derived from Wikipedia. However, not all infobox attribute values are updated frequently and accurately. In this paper, the authors propose a method to automatically detect outdated attribute values in Wikipedia infoboxes by using facts extracted from the general Web. The authors' method uses a pattern-based fact extraction approach, in which the extraction patterns are learned automatically from a number of available seeds in related Wikipedia infoboxes. The authors tested and evaluated the system on a set of 100 well-established companies in the NASDAQ-100 index, focusing on their employee numbers as given by the num_employees attribute in their Wikipedia article infoboxes. The achieved accuracy is 77%, and the test results also reveal that 82% of the companies do not have their latest employee numbers in their Wikipedia article infoboxes.
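
The general idea of seed-based pattern learning can be illustrated with a small Python sketch. This is not the authors' implementation: the company names, sentences, function names, the "text between entity and value plus the following word" pattern shape, and the majority-vote rule for picking the Web value are all assumptions made for this example, since the summary above does not specify how the paper represents patterns or decides which extracted value is the latest.

```python
import re
from collections import Counter

# Hypothetical seed facts (company, num_employees as written in text)
# and hypothetical sentences; none of this data comes from the paper.
SEEDS = [("Acme Corp", "12,000"), ("Globex", "3,500")]
SEED_SENTENCES = [
    "Acme Corp has 12,000 employees worldwide.",
    "Globex has 3,500 employees in 12 countries.",
    "Acme Corp was founded in 1990.",
]
WEB_SENTENCES = [
    "Initech has 8,200 employees as of 2012.",
    "According to its report, Initech has 8,200 employees.",
    "Initech has 7,900 employees, a blog post claimed.",
]


def learn_patterns(seeds, sentences, min_support=2):
    """Learn (between, following) contexts shared by at least min_support seed matches."""
    contexts = Counter()
    for entity, value in seeds:
        for s in sentences:
            e, v = s.find(entity), s.find(value)
            # Keep only sentences where the value appears after the entity.
            if e == -1 or v == -1 or v < e + len(entity):
                continue
            between = s[e + len(entity):v]
            following = re.match(r"\s*(\w+)", s[v + len(value):])
            if following:
                contexts[(between, following.group(1))] += 1
    return [ctx for ctx, n in contexts.items() if n >= min_support]


def extract_values(entity, patterns, sentences):
    """Apply the learned contexts to sentences about another entity."""
    found = []
    for between, following in patterns:
        regex = re.compile(
            re.escape(entity) + re.escape(between)
            + r"(\d[\d,]*)\s+" + re.escape(following)
        )
        for s in sentences:
            for m in regex.finditer(s):
                found.append(int(m.group(1).replace(",", "")))
    return found


def is_outdated(infobox_value, extracted):
    """Assumed rule: the most frequent Web value is taken as the current one."""
    if not extracted:
        return None  # no evidence either way
    web_value, _ = Counter(extracted).most_common(1)[0]
    return web_value != infobox_value


patterns = learn_patterns(SEEDS, SEED_SENTENCES)
values = extract_values("Initech", patterns, WEB_SENTENCES)
print(patterns, values, is_outdated(7500, values))
```

With this toy data, the only context supported by both seeds is " has " followed by "employees", and an infobox value of 7500 for the hypothetical company would be flagged as outdated because the Web sentences most often report 8200. A real system would need a more robust pattern representation and some way to weigh the recency and reliability of sources.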