Extracting Corpus Specific Knowledge Bases from Wikipedia



Authors: David N. Milne, Ian H. Witten, David M. Nichols
Publication date: 2007

Extracting Corpus Specific Knowledge Bases from Wikipedia is a scientific work related to Wikipedia quality, published in 2007 and written by David N. Milne, Ian H. Witten and David M. Nichols.

Overview

Thesauri are useful knowledge structures for assisting information retrieval. Yet their production is labor-intensive, and few domains have comprehensive thesauri that cover domain-specific concepts and contemporary usage. One approach, which has been attempted without much success for decades, is to seek statistical natural language processing algorithms that work on free text. Instead, the authors propose to replace costly professional indexers with thousands of dedicated amateur volunteers, namely those who are producing Wikipedia. This vast, open encyclopedia represents a rich tapestry of topics and semantics and a huge investment of human effort and judgment. The authors show how this can be directly exploited to provide WikiSauri: manually defined yet inexpensive thesaurus structures that are specifically tailored to expose the topics, terminology and semantics of individual document collections. The authors also offer concrete evidence of the effectiveness of WikiSauri for assisting information retrieval.
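The overview does not describe the extraction procedure itself. As a purely illustrative sketch of what "tailoring a thesaurus to a document collection" could look like, the Python below filters a small general thesaurus down to the concepts that actually occur in a corpus. Everything in it is an assumption made for illustration: the `Concept` structure, the `build_corpus_thesaurus` function, the whole-word matching rule, and the toy data are hypothetical and are not taken from the paper.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical thesaurus entry: preferred label, synonyms, related concepts."""
    label: str
    synonyms: set = field(default_factory=set)
    related: set = field(default_factory=set)


def occurs(term, text):
    """Return True if `term` appears in `text` as a whole word or phrase."""
    return re.search(r"\b" + re.escape(term.lower()) + r"\b", text) is not None


def build_corpus_thesaurus(concepts, documents):
    """Keep only concepts whose label or a synonym occurs in the documents.

    Assumption: simple case-insensitive phrase matching stands in for
    whatever matching the real system performs.
    """
    corpus_text = " ".join(documents).lower()
    kept_labels = {
        c.label
        for c in concepts
        if any(occurs(t, corpus_text) for t in {c.label, *c.synonyms})
    }
    # Copy the surviving concepts, dropping relations that point
    # to concepts absent from the corpus.
    return {
        c.label: Concept(c.label, set(c.synonyms), c.related & kept_labels)
        for c in concepts
        if c.label in kept_labels
    }
```

Given a toy thesaurus containing "Information retrieval", "Thesaurus", and "Volcano", and a corpus that mentions only the first two, the result would keep those two concepts and discard both "Volcano" and any relation pointing to it.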

Embed

Wikipedia Quality

Milne, David N.; Witten, Ian H.; Nichols, David M. (2007). "[[Extracting Corpus Specific Knowledge Bases from Wikipedia]]". University of Waikato, Department of Computer Science.

English Wikipedia

{{cite journal |last1=Milne |first1=David N. |last2=Witten |first2=Ian H. |last3=Nichols |first3=David M. |title=Extracting Corpus Specific Knowledge Bases from Wikipedia |date=2007 |url=https://wikipediaquality.com/wiki/Extracting_Corpus_Specific_Knowledge_Bases_from_Wikipedia |journal=University of Waikato, Department of Computer Science}}

HTML

Milne, David N.; Witten, Ian H.; Nichols, David M. (2007). &quot;<a href="https://wikipediaquality.com/wiki/Extracting_Corpus_Specific_Knowledge_Bases_from_Wikipedia">Extracting Corpus Specific Knowledge Bases from Wikipedia</a>&quot;. University of Waikato, Department of Computer Science.