Difference between revisions of "Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms"

From Wikipedia Quality
(New study: Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms)
 
(Adding wikilinks)
Line 1: Line 1:
'''Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms''' - scientific work related to Wikipedia quality published in 2012, written by Arash Joorabchi and Abdulhussain E. Mahdi.
+
'''Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms''' - scientific work related to [[Wikipedia quality]] published in 2012, written by [[Arash Joorabchi]] and [[Abdulhussain E. Mahdi]].
  
 
== Overview ==
 
== Overview ==
Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods.
+
Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes [[Wikipedia]] as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods.

Revision as of 07:28, 29 August 2019

Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms is a scientific work related to Wikipedia quality, published in 2012 and written by Arash Joorabchi and Abdulhussain E. Mahdi.

Overview

Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which uses Wikipedia as a thesaurus for selecting candidate phrases from the documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of the method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods.
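The two stages named in the overview (thesaurus-based candidate selection, then a genetic algorithm that learns a ranking model) can be illustrated with a toy sketch. Nothing below is taken from the paper: the tiny `THESAURUS` set stands in for Wikipedia article titles, the three features (frequency, earliness of first occurrence, phrase length), the linear scoring model, the fitness measure, and all genetic algorithm settings are assumptions chosen only to keep the example small.

```python
import random
import re

# Hypothetical stand-in for Wikipedia article titles used as a thesaurus.
THESAURUS = {"genetic algorithm", "machine learning", "metadata", "wikipedia", "keyphrase"}

def select_candidates(text, max_len=3):
    """Map each n-gram that matches a thesaurus title to an assumed feature list:
    [occurrence count, earliness of first occurrence, length in words]."""
    words = re.findall(r"[a-z]+", text.lower())
    found = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in THESAURUS:
                feats = found.setdefault(phrase, [0.0, 1.0 - i / len(words), float(n)])
                feats[0] += 1.0  # count this occurrence
    return found

def rank(features, weights, k=2):
    """Return the k phrases with the highest weighted feature sum (ties broken by name)."""
    def score(p):
        return sum(w * f for w, f in zip(weights, features[p]))
    return sorted(features, key=lambda p: (-score(p), p))[:k]

def fitness(weights, training_set, k=2):
    """Mean fraction of the top-k phrases that the human annotator also chose."""
    total = 0.0
    for text, gold in training_set:
        top = rank(select_candidates(text), weights, k)
        total += len(set(top) & gold) / k
    return total / len(training_set)

def evolve(training_set, generations=30, pop_size=20, seed=0):
    """Evolve a 3-element weight vector with truncation selection,
    one-point crossover, and Gaussian mutation of a single gene."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, training_set), reverse=True)
        parents = pop[:pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, 3)
            child = a[:cut] + b[cut:]
            child[rng.randrange(3)] += rng.gauss(0, 0.3)
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, training_set))

# Made-up training documents paired with made-up human keyphrases.
TRAINING = [
    ("wikipedia is used as a thesaurus while a genetic algorithm ranks each keyphrase",
     {"wikipedia", "genetic algorithm"}),
    ("metadata for papers machine learning builds metadata and more metadata with wikipedia",
     {"metadata", "machine learning"}),
]

if __name__ == "__main__":
    best = evolve(TRAINING)
    print("learned weights:", best)
    print("training fitness:", fitness(best, TRAINING))
```

Because the better half of each generation is carried over unchanged, the best training fitness never decreases from one generation to the next. A real system would need a full Wikipedia title index, richer features, and held-out evaluation data.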