Difference between revisions of "Information Extraction from Wikipedia Using Pattern Learning"

From Wikipedia Quality

Revision as of 18:16, 19 October 2019

"Information Extraction from Wikipedia Using Pattern Learning" is a scientific work related to Wikipedia quality, published in 2010 and written by Márton Miháltz.

Overview

In this paper, the author presents solutions for extracting structured information from massive free-text resources such as Wikipedia, with the aim of populating semantic databases that serve upcoming Semantic Web technologies. Two approaches are demonstrated: a verb frame-based approach that uses deep natural language processing with extraction patterns developed by human knowledge experts, and machine learning methods that use shallow linguistic processing. The author also proposes a method for learning verb frame-based extraction patterns automatically from labeled data, and shows that labeled training data can be produced with only minimal human effort by utilizing existing semantic resources and the special characteristics of Wikipedia. Custom solutions for named entity recognition are also possible in this scenario. The paper evaluates and compares the different approaches on several relations.
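
The idea of producing labeled training data from existing semantic resources can be illustrated with a minimal sketch. This is not the method from the paper, whose details are not given here. It only shows one common way such labeling might work: a sentence is treated as a positive example for a relation if it mentions both arguments of a fact already known from a semantic resource. The `Fact` type, the `label_sentences` function, the naive sentence splitter, and the sample article text are all hypothetical and introduced for illustration only.

```python
import re
from typing import List, NamedTuple, Tuple


class Fact(NamedTuple):
    """A known (subject, relation, object) triple; hypothetical structure."""
    subject: str
    relation: str
    obj: str


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def label_sentences(text: str, facts: List[Fact]) -> List[Tuple[str, Fact]]:
    """Pair each sentence with every known fact whose subject and object
    both appear in it (case-insensitive substring match)."""
    labeled = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for fact in facts:
            if fact.subject.lower() in lowered and fact.obj.lower() in lowered:
                labeled.append((sentence, fact))
    return labeled


# Illustrative data only; not taken from the paper.
ARTICLE = (
    "Ada Lovelace was born in London. "
    "She worked with Charles Babbage. "
    "London is a large city."
)
FACTS = [Fact("Ada Lovelace", "birthPlace", "London")]

examples = label_sentences(ARTICLE, FACTS)
for sentence, fact in examples:
    print(f"{fact.relation}: {sentence}")
```

Sentences labeled this way could then serve as input to a pattern learner or a classifier. Plain substring matching is a deliberate simplification: it misses pronouns and name variants, which is one reason custom named entity recognition is relevant in this setting.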