Difference between revisions of "Topic Indexing with Wikipedia"

From Wikipedia Quality
(Overview: Topic Indexing with Wikipedia)
 
(Int.links)
Line 1: Line 1:
'''Topic Indexing with Wikipedia''' - scientific work related to Wikipedia quality published in 2008, written by Olena Medelyan, Ian H. Witten and David N. Milne.
+
'''Topic Indexing with Wikipedia''' - scientific work related to [[Wikipedia quality]] published in 2008, written by [[Olena Medelyan]], [[Ian H. Witten]] and [[David N. Milne]].
  
 
== Overview ==
 
== Overview ==
Wikipedia article names can be utilized as a controlled vocabulary for identifying the main topics in a document. Wikipedia’s 2M articles cover the terminology of nearly any document collection, which permits controlled indexing in the absence of manually created vocabularies. Authors combine state-of-the-art strategies for automatic controlled indexing with Wikipedia’s unique property—a richly hyperlinked encyclopedia. Authors evaluate the scheme by comparing automatically assigned topics with those chosen manually by human indexers. Analysis of indexing consistency shows that algorithm outperforms some human subjects.
+
Wikipedia article names can be utilized as a controlled vocabulary for identifying the main topics in a document. [[Wikipedia]]’s 2M articles cover the terminology of nearly any document collection, which permits controlled indexing in the absence of manually created vocabularies. Authors combine state-of-the-art strategies for automatic controlled indexing with Wikipedia’s unique property—a richly hyperlinked encyclopedia. Authors evaluate the scheme by comparing automatically assigned topics with those chosen manually by human indexers. Analysis of indexing consistency shows that algorithm outperforms some human subjects.

Revision as of 09:44, 11 August 2019

Topic Indexing with Wikipedia is a scientific work related to Wikipedia quality, published in 2008 and written by Olena Medelyan, Ian H. Witten and David N. Milne.

Overview

Wikipedia article names can serve as a controlled vocabulary for identifying the main topics in a document. Wikipedia’s 2M articles cover the terminology of nearly any document collection, which permits controlled indexing in the absence of manually created vocabularies. The authors combine state-of-the-art strategies for automatic controlled indexing with Wikipedia’s unique property of being a richly hyperlinked encyclopedia. They evaluate the scheme by comparing automatically assigned topics with those chosen manually by human indexers. Analysis of indexing consistency shows that the algorithm outperforms some human subjects.
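
The two ideas in the overview, matching document text against article titles and scoring agreement between sets of assigned topics, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the tiny title list and the simple n-gram matching are hypothetical, and the overview does not say which consistency measure the authors used. The sketch uses Rolling's measure, 2C / (A + B), one common choice for inter-indexer consistency. It also omits the hyperlink-based features that the authors combine with title matching.

```python
import re


def candidate_topics(text, titles, max_words=3):
    """Return the titles whose lowercased form occurs as a word n-gram in text.

    `titles` stands in for a controlled vocabulary of article names
    (a hypothetical toy list here, not real Wikipedia data).
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    lookup = {title.lower(): title for title in titles}
    found = set()
    for n in range(1, max_words + 1):
        for i in range(len(words) - n + 1):
            gram = " ".join(words[i:i + n])
            if gram in lookup:
                found.add(lookup[gram])
    return found


def rolling_consistency(topics_a, topics_b):
    """Rolling's consistency: 2C / (A + B), where C is the number of shared topics."""
    a, b = set(topics_a), set(topics_b)
    if not a and not b:
        return 1.0  # two empty topic sets agree trivially
    return 2 * len(a & b) / (len(a) + len(b))


titles = ["Information retrieval", "Controlled vocabulary", "Wikipedia"]
text = "Wikipedia can serve as a controlled vocabulary for information retrieval."

automatic = candidate_topics(text, titles)
manual = {"Wikipedia", "Information retrieval", "Library science"}
score = rolling_consistency(automatic, manual)
```

In this toy run the automatic set contains all three titles and shares two topics with the manual set, so the score is 2 * 2 / (3 + 3). A real system would also need to rank candidates and resolve ambiguous titles, which this sketch does not attempt.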