Difference between revisions of "Building an Indonesian Named Entity Recognizer Using Wikipedia and Dbpedia"


Revision as of 07:20, 16 January 2020

Building an Indonesian Named Entity Recognizer Using Wikipedia and Dbpedia - scientific work related to Wikipedia quality published in 2014, written by Andry Luthfi, Bayu Distiawan and Ruli Manurung.

Overview

This paper describes the development of an Indonesian named entity recognition (NER) system using online data such as Wikipedia and DBpedia. The system is based on the Stanford NER system [8] and uses training documents constructed automatically from Wikipedia. Each entity in the Wikipedia documents, i.e. a word or phrase that has a hyperlink, is tagged according to information obtained from DBpedia. In this first version, the authors consider only three entity types: Person, Place, and Organization. The system is evaluated both with cross-validation and against a manually annotated gold standard. Under cross-validation, the Indonesian NER obtains precision and recall values above 90%, whereas evaluation against the gold standard shows high precision but very low recall.
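
The automatic tagging step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the lookup table DBPEDIA_TYPES, the function names, the whitespace tokenization, and the sample sentence are all assumptions made for illustration, and a real pipeline would derive the entity classes from DBpedia type data rather than from a hand-written dictionary. The output uses a two-column token/label layout of the kind commonly used to train CRF taggers such as Stanford NER.

```python
import re

# Hypothetical stand-in for a DBpedia type lookup: article title -> entity class.
# A real pipeline would obtain these classes from DBpedia, not a hard-coded dict.
DBPEDIA_TYPES = {
    "Joko Widodo": "PERSON",
    "Jakarta": "PLACE",
    "Universitas Indonesia": "ORGANIZATION",
}

# Matches wikitext links of the form [[Target]] or [[Target|anchor text]].
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def tag_wikitext(text, types):
    """Return (token, label) pairs; tokens in a linked span get the label
    of the link target, everything else gets the outside label "O"."""
    tagged = []
    pos = 0
    for match in LINK_RE.finditer(text):
        # Plain text before the link lies outside any entity.
        tagged.extend((tok, "O") for tok in text[pos:match.start()].split())
        target = match.group(1).strip()
        anchor = match.group(2) or match.group(1)
        # Links whose target has no known class stay untagged.
        label = types.get(target, "O")
        tagged.extend((tok, label) for tok in anchor.split())
        pos = match.end()
    tagged.extend((tok, "O") for tok in text[pos:].split())
    return tagged


def to_tsv(tagged):
    """Serialize as one token per line, tab-separated from its label."""
    return "\n".join(f"{tok}\t{label}" for tok, label in tagged)


# Made-up example sentence in wikitext form.
sentence = "[[Joko Widodo|Jokowi]] lahir di [[Surakarta]] dan bekerja di [[Jakarta]] ."
print(to_tsv(tag_wikitext(sentence, DBPEDIA_TYPES)))
```

In this sketch the link to Surakarta is left untagged because its target is missing from the lookup table, and any mention that is not hyperlinked is never tagged at all. Gaps of this kind in automatically built training data are one plausible contributor to low recall on manually annotated text, although the summary above does not state the cause.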