Difference between revisions of "Harvesting Domain-Specific Terms Using Wikipedia"

From Wikipedia Quality
(Basic information on Harvesting Domain-Specific Terms Using Wikipedia)
 
(Links)
Line 1: Line 1:
'''Harvesting Domain-Specific Terms Using Wikipedia''' is a scientific work related to Wikipedia quality, published in 2011 and written by Su Nam Kim, Lawrence Cavedon and Timothy Baldwin.
+
'''Harvesting Domain-Specific Terms Using Wikipedia''' is a scientific work related to [[Wikipedia quality]], published in 2011 and written by [[Su Nam Kim]], [[Lawrence Cavedon]] and [[Timothy Baldwin]].
  
 
== Overview ==
 
== Overview ==
The authors present a simple but effective method of automatically extracting domain-specific terms using Wikipedia as training data (i.e., self-supervised learning). Their first goal is to show, using human judgments, that Wikipedia categories are domain-specific and thus can replace manually annotated terms. Second, they show that identifying such terms using harvested Wikipedia categories and entities as seeds is reliable when compared with the use of dictionary terms. The technique facilitates the construction of large semantic resources in multiple domains without requiring manually annotated training data.
+
The authors present a simple but effective method of automatically extracting domain-specific terms using [[Wikipedia]] as training data (i.e., self-supervised learning). Their first goal is to show, using human judgments, that [[Wikipedia categories]] are domain-specific and thus can replace manually annotated terms. Second, they show that identifying such terms using harvested Wikipedia [[categories]] and entities as seeds is reliable when compared with the use of dictionary terms. The technique facilitates the construction of large semantic resources in multiple domains without requiring manually annotated training data.

Revision as of 08:00, 4 July 2019

Harvesting Domain-Specific Terms Using Wikipedia is a scientific work related to Wikipedia quality, published in 2011 and written by Su Nam Kim, Lawrence Cavedon and Timothy Baldwin.

Overview

The authors present a simple but effective method of automatically extracting domain-specific terms using Wikipedia as training data (i.e., self-supervised learning). Their first goal is to show, using human judgments, that Wikipedia categories are domain-specific and thus can replace manually annotated terms. Second, they show that identifying such terms using harvested Wikipedia categories and entities as seeds is reliable when compared with the use of dictionary terms. The technique facilitates the construction of large semantic resources in multiple domains without requiring manually annotated training data.
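
The overview does not describe the algorithm in detail, so the following is only a toy sketch of the general idea of seed-based term identification, not the authors' actual method. It assumes that a small set of seed terms per domain has already been harvested (here hard-coded in the hypothetical `SEEDS` dictionary, standing in for Wikipedia categories and entities) and assigns a candidate term to the domain whose seeds appear in the most similar sentence contexts. The corpus, the seed lists, and the overlap score are all illustrative assumptions.

```python
from collections import Counter

# Hypothetical seed terms per domain, standing in for terms harvested from
# Wikipedia categories and entities (illustrative values, not from the paper).
SEEDS = {
    "medicine": {"aspirin", "insulin"},
    "computing": {"compiler", "kernel"},
}

# Tiny illustrative corpus (an assumption; any tokenized text would do).
CORPUS = [
    "the patient received aspirin as a daily dose",
    "insulin dose was adjusted for the patient",
    "the compiler translates source code",
    "the kernel schedules code on the processor",
    "ibuprofen dose was given to the patient",
    "the linker combines compiled source code",
]


def context_profile(terms, corpus):
    """Count words that share a sentence with any of the given terms."""
    profile = Counter()
    for sentence in corpus:
        words = sentence.split()
        if any(term in words for term in terms):
            profile.update(w for w in words if w not in terms)
    return profile


def overlap(a, b):
    """Sum of the co-occurrence counts shared by two context profiles."""
    return sum(min(a[w], b[w]) for w in a if w in b)


def classify(term, seeds, corpus):
    """Return the best-matching domain and the per-domain scores."""
    term_profile = context_profile({term}, corpus)
    scores = {
        domain: overlap(term_profile, context_profile(seed_terms, corpus))
        for domain, seed_terms in seeds.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    for candidate in ("ibuprofen", "linker"):
        print(candidate, classify(candidate, SEEDS, CORPUS))
```

No manually annotated training examples are used in this sketch: the only supervision comes from the seed lists, which is the sense in which such an approach can be called self-supervised. A realistic system would need a much larger corpus, stop-word handling, and a proper classifier.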