Building an Indonesian Named Entity Recognizer Using Wikipedia and Dbpedia


Building an Indonesian Named Entity Recognizer Using Wikipedia and Dbpedia is a scientific work related to Wikipedia quality, published in 2014 and written by Andry Luthfi, Bayu Distiawan and Ruli Manurung.

Overview

This paper describes the development of an Indonesian named entity recognition (NER) system using online data such as Wikipedia and DBPedia. The system is based on the Stanford NER system [8] and uses training documents constructed automatically from Wikipedia. Each entity in the Wikipedia documents, i.e. a word or phrase that has a hyperlink, is tagged according to information obtained from DBPedia. In this first version, the authors are interested in only three entity types: Person, Place, and Organization. The system is evaluated both with cross-validation and against a manually annotated gold standard. Under cross-validation, the Indonesian NER obtained precision and recall values above 90%, whereas the evaluation against the gold standard shows high precision but very low recall.
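The automatic construction of training data described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `DBPEDIA_TYPES` table is a hypothetical stand-in for a real DBPedia type lookup, the label names and the `label_tokens` and `to_tsv` helpers are assumptions made for illustration, and whitespace tokenization is a simplification. The sketch labels the anchor text of each wiki link with the entity type of its target article and marks everything else as a non-entity.

```python
import re

# Hypothetical stand-in for a DBPedia type lookup (article title -> entity type).
# A real system would query DBPedia; these entries are illustrative only.
DBPEDIA_TYPES = {
    "Joko Widodo": "PERSON",
    "Jakarta": "PLACE",
    "Universitas Indonesia": "ORGANIZATION",
}

# Matches wiki links of the form [[Target]] or [[Target|anchor text]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def label_tokens(wikitext, types=DBPEDIA_TYPES):
    """Return (token, label) pairs; tokens outside typed links get 'O'."""
    pairs = []
    pos = 0
    for match in LINK.finditer(wikitext):
        # Plain text before the link is labelled as non-entity.
        pairs += [(tok, "O") for tok in wikitext[pos:match.start()].split()]
        target = match.group(1).strip()
        anchor = (match.group(2) or match.group(1)).strip()
        # Links whose target has no known type fall back to 'O'.
        label = types.get(target, "O")
        pairs += [(tok, label) for tok in anchor.split()]
        pos = match.end()
    # Remaining text after the last link.
    pairs += [(tok, "O") for tok in wikitext[pos:].split()]
    return pairs


def to_tsv(pairs):
    """Serialize pairs as one 'token<TAB>label' line per token."""
    return "\n".join(f"{tok}\t{label}" for tok, label in pairs)


if __name__ == "__main__":
    sentence = "[[Joko Widodo]] lahir di [[Surakarta]] dan bekerja di [[Jakarta|ibu kota]] ."
    print(to_tsv(label_tokens(sentence)))
```

The two-column, one-token-per-line output is modelled on the tab-separated layout commonly used for training sequence taggers such as Stanford NER. Links whose target has no known type are left as non-entities in this sketch, so gaps in the type lookup show up as missed entities in the generated training data.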