Harvesting Domain-Specific Terms Using Wikipedia

From Wikipedia Quality
Revision as of 09:46, 13 June 2019 by Elena (talk | contribs) (Basic information on Harvesting Domain-Specific Terms Using Wikipedia)

Harvesting Domain-Specific Terms Using Wikipedia is a scientific work related to Wikipedia quality, published in 2011 and written by Su Nam Kim, Lawrence Cavedon and Timothy Baldwin.

Overview

The authors present a simple but effective method of automatically extracting domain-specific terms using Wikipedia as training data (i.e. self-supervised learning). Their first goal is to show, using human judgments, that Wikipedia categories are domain-specific and thus can replace manually annotated terms. Second, they show that identifying such terms using harvested Wikipedia categories and entities as seeds is reliable when compared to the use of dictionary terms. The technique facilitates the construction of large semantic resources in multiple domains without requiring manually annotated training data.
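
The overview does not describe how seeds are turned into domain labels, so the following is only a toy illustration of the general idea of seed-based term harvesting, not the authors' algorithm. It assumes that seed terms for each domain have already been collected (here they are hard-coded and hypothetical, standing in for entities harvested from Wikipedia categories) and assigns a candidate term to the domain whose seeds it co-occurs with most often. The names SEEDS, DOCS, domain_scores and assign_domain are invented for this sketch.

```python
from collections import Counter

# Hypothetical seed terms per domain; in the setting described above these
# would be harvested from Wikipedia categories rather than written by hand.
SEEDS = {
    "medicine": {"aspirin", "antibiotic", "vaccine"},
    "computing": {"compiler", "kernel", "database"},
}

# Tiny made-up corpus used only to demonstrate the co-occurrence count.
DOCS = [
    "the vaccine and the antibiotic were tested against placebo",
    "aspirin was compared with placebo in the trial",
    "the compiler and the kernel share a scheduler",
    "the database uses a scheduler for background jobs",
]


def domain_scores(candidate, docs, seeds):
    """Count, per domain, the documents in which the candidate
    appears together with at least one seed term of that domain."""
    scores = Counter()
    for doc in docs:
        tokens = set(doc.split())
        if candidate not in tokens:
            continue
        for domain, seed_terms in seeds.items():
            if tokens & seed_terms:
                scores[domain] += 1
    return scores


def assign_domain(candidate, docs, seeds):
    """Return the highest-scoring domain, or None if the candidate
    never co-occurs with any seed term."""
    scores = domain_scores(candidate, docs, seeds)
    if not scores:
        return None
    return scores.most_common(1)[0][0]


if __name__ == "__main__":
    for term in ("placebo", "scheduler", "banana"):
        print(term, assign_domain(term, DOCS, SEEDS))
```

A real system would need proper tokenization, a much larger corpus and a more robust score than raw co-occurrence counts; the sketch only shows how seed terms can stand in for manually annotated training data.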