Difference between revisions of "Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms"

{{Infobox work
| title = Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms
| date = 2012
| authors = [[Arash Joorabchi]]<br />[[Abdulhussain E. Mahdi]]
| doi = 10.1007/978-3-642-33876-2_6
| link = https://dl.acm.org/citation.cfm?id=2413951
}}
 
'''Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms''' is a scientific work related to [[Wikipedia quality]], published in 2012 and written by [[Arash Joorabchi]] and [[Abdulhussain E. Mahdi]].
 
  
 
== Overview ==
 
 
Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, only a minority of scientific documents are manually annotated with keyphrases. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which uses [[Wikipedia]] as a thesaurus for selecting candidates from the documents' content and applies genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. The reported experimental results show that the method's performance, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and exceeds that of rival supervised methods.
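
The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny <code>WIKI_TITLES</code> set stands in for a real Wikipedia title index, the two candidate features (occurrence count and relative first position) are assumed for illustration, and the fitness function uses a Rolling-style consistency measure, 2C / (A + B), which may differ from the measure used in the paper. The selection, crossover, and mutation operators are likewise generic choices.

```python
import random
import re

# Hypothetical stand-in for an index of Wikipedia article titles (not real data).
WIKI_TITLES = {"genetic algorithm", "machine learning", "metadata", "thesaurus"}

def select_candidates(text, titles=WIKI_TITLES, max_len=3):
    """Map each word n-gram that matches a title to (count, first_position_score)."""
    words = re.findall(r"[a-z]+", text.lower())
    found = {}
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            gram = " ".join(words[i:i + n])
            if gram in titles:
                # Assumed features: occurrence count and how early the phrase first appears.
                count, first = found.get(gram, (0, 1.0 - i / len(words)))
                found[gram] = (count + 1, first)
    return found

def consistency(a, b):
    """Rolling-style consistency 2C / (A + B) between two keyphrase sets."""
    a, b = set(a), set(b)
    return 2 * len(a & b) / (len(a) + len(b)) if (a or b) else 1.0

def top_k(features, weights, k):
    """Rank candidates by a weighted feature sum and keep the best k."""
    def score(phrase):
        return sum(w * f for w, f in zip(weights, features[phrase]))
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(features, key=lambda p: (-score(p), p))[:k]

def fitness(weights, train, k):
    """Mean consistency between predicted and human keyphrases over the training set."""
    return sum(consistency(top_k(f, weights, k), gold) for f, gold in train) / len(train)

def evolve(train, k=2, pop_size=20, generations=30, seed=0):
    """Evolve a weight vector with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, train, k), reverse=True)
        parents = pop[:pop_size // 2]                         # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [g + rng.gauss(0, 0.1) for g in child]    # Gaussian mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, train, k))

if __name__ == "__main__":
    doc = "Genetic algorithm methods use machine learning. A genetic algorithm evolves."
    feats = select_candidates(doc)
    train = [(feats, {"genetic algorithm"})]  # toy human annotation (assumption)
    weights = evolve(train, k=1)
    print(top_k(feats, weights, 1), fitness(weights, train, 1))
```

In this sketch the learned weights only rank candidates. A filtering step, such as a score threshold, would be a further assumption and is left out.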
 

Revision as of 08:34, 10 October 2019

