Extending a Multilingual Lexical Resource by Bootstrapping Named Entity Classification Using Wikipedia's Category System


Extending a Multilingual Lexical Resource by Bootstrapping Named Entity Classification Using Wikipedia's Category System is a scientific work related to Wikipedia quality, published in 2011.

Overview

Named Entity Recognition and Classification (NERC) is a well-studied NLP task. It is typically approached with machine learning algorithms that rely on training data, which is usually expensive to create. Because of these costs, NERC training data is lacking for many languages. An approach to creating a multilingual named entity (NE) corpus was presented in Wentland et al. (2008). The resulting resource, called HeiNER, covers a large number of NEs but does not include their types. The authors present a bootstrapping approach based on Wikipedia’s category system to classify the NEs contained in HeiNER. The approach classifies more than two million named entities, which improves the resource’s quality.
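
The general idea of bootstrapping entity types from categories can be illustrated with a small sketch. This is not the authors' algorithm, which the summary above does not describe in detail. It is a generic label-propagation loop under stated assumptions: a few categories are typed by hand as seeds, an entity takes the majority type of its typed categories, and an untyped category inherits a type once enough of its typed members agree. The function name bootstrap_types, the min_agreement threshold, the type labels PER and LOC, and the toy data are all hypothetical.

```python
from collections import Counter


def bootstrap_types(entity_categories, seed_category_types,
                    min_agreement=0.8, max_rounds=10):
    """Propagate NE types between entities and categories (illustrative only).

    entity_categories: dict mapping entity name -> list of category names.
    seed_category_types: dict mapping category name -> type label (the seeds).
    min_agreement: assumed threshold; share of typed members that must agree
        before an untyped category receives a type.
    """
    category_types = dict(seed_category_types)
    entity_types = {}

    for _ in range(max_rounds):
        changed = False

        # Step 1: label each untyped entity by majority vote of its typed categories.
        for entity, categories in entity_categories.items():
            if entity in entity_types:
                continue
            votes = Counter(category_types[c] for c in categories
                            if c in category_types)
            if votes:
                entity_types[entity] = votes.most_common(1)[0][0]
                changed = True

        # Step 2: collect the types of the typed members of every category.
        member_types = {}
        for entity, categories in entity_categories.items():
            if entity in entity_types:
                for c in categories:
                    member_types.setdefault(c, []).append(entity_types[entity])

        # Step 3: type a new category when its typed members mostly agree.
        for category, types in member_types.items():
            if category in category_types:
                continue
            label, count = Counter(types).most_common(1)[0]
            if count / len(types) >= min_agreement:
                category_types[category] = label
                changed = True

        if not changed:
            break

    return entity_types, category_types


# Toy data (hypothetical, not taken from HeiNER).
toy_entities = {
    "Angela Merkel": ["German politicians", "Living people"],
    "Udo Lindenberg": ["Living people"],
    "Berlin": ["Cities in Germany", "Capitals in Europe"],
    "Paris": ["Capitals in Europe"],
}
toy_seeds = {"German politicians": "PER", "Cities in Germany": "LOC"}

entity_types, category_types = bootstrap_types(toy_entities, toy_seeds)
for name in sorted(entity_types):
    print(name, entity_types[name])
```

In this toy run, the two seed categories are enough to type entities that share no seed category themselves, because "Living people" and "Capitals in Europe" acquire types in the first round. A real system would need a stricter threshold and a minimum member count, since broad categories can mix entity types.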