Difference between revisions of "Classifying Articles in English and German Wikipedia"

From Wikipedia Quality
(Links)
(Adding infobox)
Line 1: Line 1:
+ {{Infobox work
+ | title = Classifying Articles in English and German Wikipedia
+ | date = 2009
+ | authors = [[Nicky Ringland]]<br />[[Joel Nothman]]<br />[[Tara Murphy]]<br />[[James R. Curran]]
+ | link = http://www.aclweb.org/anthology/U/U09/U09-1004.pdf
+ }}
 
'''Classifying Articles in English and German Wikipedia''' is a scientific work related to [[Wikipedia quality]], published in 2009 and written by [[Nicky Ringland]], [[Joel Nothman]], [[Tara Murphy]] and [[James R. Curran]].
 
 
== Overview ==
 
Named Entity (NE) information is critical for Information Extraction (IE) tasks. However, the cost of manually annotating sufficient training data, especially for [[multiple languages]], is prohibitive, so automated methods for developing resources are crucial. The authors investigate the automatic generation of NE-annotated data in German from [[Wikipedia]]. By incorporating structural [[features]] of Wikipedia, the authors develop a German corpus that classifies Wikipedia articles into NE [[categories]] to within 1% F-score of the state-of-the-art process in English.
 
Named Entity (NE) information is critical for Information Extraction (IE) tasks. However, the cost of manually annotating sufficient data for training purposes, especially for [[multiple languages]], is prohibitive, meaning automated methods for developing resources are crucial. Authors investigate the automatic generation of NE annotated data in German from [[Wikipedia]]. By incorporating structural [[features]] of Wikipedia, authors can develop a German corpus which accurately classifies Wikipedia articles into NE [[categories]] to within 1% F -score of the state-of-the-art process in English.

Revision as of 10:18, 4 August 2019


Classifying Articles in English and German Wikipedia
Authors: Nicky Ringland, Joel Nothman, Tara Murphy, James R. Curran
Publication date: 2009
Links: Original

Classifying Articles in English and German Wikipedia is a scientific work related to Wikipedia quality, published in 2009 and written by Nicky Ringland, Joel Nothman, Tara Murphy and James R. Curran.

Overview

Named Entity (NE) information is critical for Information Extraction (IE) tasks. However, the cost of manually annotating sufficient training data, especially for multiple languages, is prohibitive, so automated methods for developing resources are crucial. The authors investigate the automatic generation of NE-annotated data in German from Wikipedia. By incorporating structural features of Wikipedia, the authors develop a German corpus that classifies Wikipedia articles into NE categories to within 1% F-score of the state-of-the-art process in English.
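
The F-score used for the comparison above is the harmonic mean of precision and recall. The following is a minimal sketch of how a per-category F-score could be computed for article classification. The function name `category_f_score`, the labels `PER`, `LOC` and `ORG`, and the toy data are assumptions made for illustration; they are not taken from the paper and do not reproduce its evaluation.

```python
def category_f_score(gold, predicted, category):
    """Return the F-score (harmonic mean of precision and recall)
    for one category, given parallel lists of gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    # Articles correctly assigned to the category.
    tp = sum(1 for g, p in pairs if g == category and p == category)
    # Articles wrongly assigned to the category.
    fp = sum(1 for g, p in pairs if g != category and p == category)
    # Articles of the category that were assigned elsewhere.
    fn = sum(1 for g, p in pairs if g == category and p != category)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: one label per article.
gold = ["PER", "LOC", "ORG", "PER", "LOC"]
predicted = ["PER", "LOC", "PER", "PER", "ORG"]

for label in ("PER", "LOC", "ORG"):
    print(label, round(category_f_score(gold, predicted, label), 3))
```

Reporting the score per category, as in this sketch, makes it easier to see which NE categories account for a gap between two languages; a single aggregate figure would hide that.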