Difference between revisions of "Automatic Detection of Outdated Information in Wikipedia Infoboxes"
(Adding wikilinks) |
(+ infobox) |
||
Line 1: | Line 1: | ||
+ | {{Infobox work | ||
+ | | title = Automatic Detection of Outdated Information in Wikipedia Infoboxes | ||
+ | | date = 2013 | ||
+ | | authors = [[Thong Tran]]<br />[[Tru H. Cao]] | ||
+ | | link = http://www.rcs.cic.ipn.mx/2013_70/Automatic%20Detection%20of%20Outdated%20Information%20in%20Wikipedia%20Infoboxes.html | ||
+ | }} | ||
Revision as of 08:34, 23 October 2020
Authors | Thong Tran, Tru H. Cao |
---|---|
Publication date | 2013 |
Links | Original |
Automatic Detection of Outdated Information in Wikipedia Infoboxes is a scientific work on Wikipedia quality, published in 2013 and written by Thong Tran and Tru H. Cao.
Overview
An infobox of a Wikipedia article generally contains key facts from the article, organized as attribute-value pairs. Infoboxes not only allow readers to rapidly gather the most important information about some aspects of the articles in which they appear, but also provide a source for many knowledge bases derived from Wikipedia. However, not all infobox attribute values are updated frequently and accurately. In this paper, the authors propose a method to automatically detect outdated attribute values in Wikipedia infoboxes by using facts extracted from the general Web. The method uses a pattern-based fact extraction approach, in which the extraction patterns are learned automatically from a number of available seeds in related Wikipedia infoboxes. The authors tested and evaluated the system on a set of 100 well-established companies in the NASDAQ-100 index, checking their employee numbers as given by the num_employees attribute in their Wikipedia article infoboxes. The achieved accuracy is 77%, and the test results also reveal that 82% of the companies do not have their latest number of employees in their Wikipedia article infoboxes.
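The general idea of seed-based pattern learning can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`learn_patterns`, `extract_values`, `looks_outdated`), the very simple pattern shape (the text between the company name and the number, plus the next word), and the majority vote used to decide staleness are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical number shape: digits with optional thousands separators.
NUMBER = r"(\d[\d,]*)"


def learn_patterns(seeds, sentences):
    """Learn (middle, suffix) contexts from sentences that mention a seed pair.

    seeds: iterable of (entity, value) strings, e.g. taken from infoboxes
    that are assumed to be up to date.
    """
    patterns = set()
    for entity, value in seeds:
        for sentence in sentences:
            start = sentence.find(entity)
            if start == -1:
                continue
            after_entity = start + len(entity)
            value_at = sentence.find(value, after_entity)
            if value_at == -1:
                continue
            # Text between the entity and the value, e.g. " employs ".
            middle = sentence[after_entity:value_at]
            # First word after the value, e.g. "people".
            tail = sentence[value_at + len(value):].split()
            suffix = tail[0].strip(".,;") if tail else ""
            patterns.add((middle, suffix))
    return patterns


def extract_values(entity, patterns, sentences):
    """Apply the learned contexts to sentences about another entity."""
    values = []
    for middle, suffix in patterns:
        regex = re.compile(
            re.escape(entity) + re.escape(middle) + NUMBER + r"\s*" + re.escape(suffix)
        )
        for sentence in sentences:
            match = regex.search(sentence)
            if match:
                values.append(int(match.group(1).replace(",", "")))
    return values


def looks_outdated(infobox_value, extracted):
    """Assumption: flag the infobox when the most frequent Web value differs."""
    if not extracted:
        return False
    most_common, _ = Counter(extracted).most_common(1)[0]
    return most_common != infobox_value


# Made-up example data, not taken from the paper.
seeds = [("Acme Corp", "12,000")]
seed_sentences = ["Acme Corp employs 12,000 people worldwide."]
web_sentences = [
    "In 2013, Globex Inc employs 8,500 people across three sites.",
    "Globex Inc was founded in 1990.",
]

patterns = learn_patterns(seeds, seed_sentences)
found = extract_values("Globex Inc", patterns, web_sentences)
flag = looks_outdated(7000, found)
```

A real system would need looser patterns, several seeds per attribute, and some way to date the extracted facts. The sketch only shows how contexts learned from known attribute values can be reused to pull candidate values for other articles.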