Building an Indonesian Named Entity Recognizer Using Wikipedia and Dbpedia
Building an Indonesian Named Entity Recognizer Using Wikipedia and Dbpedia is a scientific work related to Wikipedia quality, published in 2014 and written by Andry Luthfi, Bayu Distiawan and Ruli Manurung.
This paper describes the development of an Indonesian NER system using online data such as Wikipedia and DBPedia. The system is based on the Stanford NER system and uses training documents constructed automatically from Wikipedia. Each entity in the Wikipedia documents, i.e. each word or phrase that has a hyperlink, is tagged according to information obtained from DBPedia. In this first version, the authors are interested in only three entity types: Person, Place, and Organization. The system is evaluated both with cross-validation and against a manually annotated gold standard. Under cross-validation, the Indonesian NER obtains precision and recall values above 90%, whereas on the gold standard it achieves high precision but very low recall.
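The automatic construction of training data described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes wikitext-style links (`[[Target]]` or `[[Target|surface text]]`), simple whitespace tokenization, and a small hypothetical dictionary, `DBPEDIA_TYPES`, standing in for a real DBPedia type lookup. The function names `annotate` and `to_tsv` are likewise illustrative. The output is the two-column, tab-separated token and label layout that Stanford NER accepts for training, with `O` marking tokens outside any entity.

```python
import re

# Hypothetical stand-in for a DBPedia lookup: article title -> entity class.
# A real system would query DBPedia for the type of each linked article.
DBPEDIA_TYPES = {
    "Joko Widodo": "Person",
    "Jakarta": "Place",
    "Universitas Indonesia": "Organization",
}

# Matches [[Target]] and [[Target|surface text]].
LINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")


def annotate(wikitext, types=DBPEDIA_TYPES):
    """Return (token, label) pairs for one sentence of wikitext.

    Tokens inside a link receive the class of the link target if the
    lookup knows it; all other tokens are labelled "O".
    """
    pairs = []
    pos = 0
    for match in LINK.finditer(wikitext):
        # Plain text before the link is outside any entity.
        pairs += [(tok, "O") for tok in wikitext[pos:match.start()].split()]
        target = match.group(1).strip()
        surface = (match.group(2) or match.group(1)).strip()
        label = types.get(target, "O")  # unknown targets stay untagged
        pairs += [(tok, label) for tok in surface.split()]
        pos = match.end()
    # Trailing plain text after the last link.
    pairs += [(tok, "O") for tok in wikitext[pos:].split()]
    return pairs


def to_tsv(pairs):
    """Serialize pairs as one 'token<TAB>label' line per token."""
    return "\n".join(f"{tok}\t{label}" for tok, label in pairs)


if __name__ == "__main__":
    sentence = "[[Joko Widodo]] bekerja di [[Jakarta|ibu kota]] ."
    print(to_tsv(annotate(sentence)))
```

In this sketch, a linked phrase whose target is missing from the lookup is left as `O`. Entities that are never hyperlinked are also left as `O`, which is one plausible way automatically built training data could contribute to low recall on a manually annotated gold standard.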