Difference between revisions of "Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms"
(New study: Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms)
(Adding wikilinks)
Revision as of 07:28, 29 August 2019
Automatic Subject Metadata Generation for Scientific Documents Using Wikipedia and Genetic Algorithms is a scientific work related to Wikipedia quality, published in 2012 and written by Arash Joorabchi and Abdulhussain E. Mahdi.
Overview
Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents. However, scientific documents that are manually annotated with keyphrases are in the minority. This paper describes a machine learning-based automatic keyphrase annotation method for scientific documents, which utilizes Wikipedia as a thesaurus for candidate selection from documents' content and deploys genetic algorithms to learn a model for ranking and filtering the most probable keyphrases. Reported experimental results show that the performance of the method, evaluated in terms of inter-consistency with human annotators, is on a par with that achieved by humans and outperforms rival supervised methods.
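The two stages named in the overview (thesaurus-based candidate selection, then a genetic algorithm that learns a ranking model) can be illustrated with a toy sketch. This is not the authors' implementation: the `TITLES` set is a hypothetical stand-in for Wikipedia article titles, the two features (relative frequency and earliness of first occurrence), the linear scoring function, and the overlap-based fitness are all assumptions chosen only to keep the example small.

```python
import random

# Hypothetical stand-in for Wikipedia article titles used as a thesaurus.
TITLES = {"genetic algorithm", "machine learning", "metadata", "wikipedia"}

def candidates(text, titles=TITLES, max_len=3):
    """Map each n-gram that matches a title to (relative frequency, earliness)."""
    words = [w.strip(".,;:()").lower() for w in text.split()]
    seen = {}  # phrase -> [count, index of first occurrence]
    for n in range(1, max_len + 1):
        for i in range(len(words) - n + 1):
            phrase = " ".join(words[i:i + n])
            if phrase in titles:
                entry = seen.setdefault(phrase, [0, i])
                entry[0] += 1
    total = len(words)
    return {p: (c / total, 1 - first / total) for p, (c, first) in seen.items()}

def rank(feats, weights, k):
    """Score candidates by a weighted feature sum and keep the top k."""
    def score(p):
        return sum(w * f for w, f in zip(weights, feats[p]))
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(feats, key=lambda p: (-score(p), p))[:k]

def fitness(weights, docs, k):
    """Mean share of human-assigned keyphrases recovered in the top k."""
    total = 0.0
    for feats, gold in docs:
        total += len(set(rank(feats, weights, k)) & gold) / max(len(gold), 1)
    return total / len(docs)

def evolve(docs, k=2, pop_size=20, generations=30, seed=0):
    """Assumed GA setup: elitism, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, docs, k), reverse=True)
        elite = pop[: pop_size // 2]          # best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [g + rng.gauss(0, 0.1) for g in child]    # mutation
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: fitness(w, docs, k))

if __name__ == "__main__":
    text = "Genetic algorithm methods use Wikipedia. A genetic algorithm learns metadata."
    feats = candidates(text)
    docs = [(feats, {"genetic algorithm", "wikipedia"})]  # made-up gold labels
    best = evolve(docs)
    print(rank(feats, best, 2), fitness(best, docs, 2))
```

In this sketch the fitness function plays the role of the consistency measure against human annotators; a real system would need a far richer feature set and a proper training corpus.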