Difference between revisions of "Automatic Keyphrase Annotation of Scientific Documents Using Wikipedia and Genetic Algorithms"

Revision as of 15:12, 22 March 2021

Automatic Keyphrase Annotation of Scientific Documents Using Wikipedia and Genetic Algorithms
Authors	Arash Joorabchi, Abdulhussain E. Mahdi
Publication date	2013
DOI	10.1177/0165551512472138
Links	Original

Automatic Keyphrase Annotation of Scientific Documents Using Wikipedia and Genetic Algorithms is a scientific work related to Wikipedia quality, published in 2013 and written by Arash Joorabchi and Abdulhussain E. Mahdi.

Overview

Topical annotation of documents with keyphrases is an established method for revealing the subject of scientific and research documents to both human readers and information retrieval systems. The article describes a machine learning-based keyphrase annotation method for scientific documents that uses Wikipedia as a thesaurus for selecting candidate phrases from a document's content. The authors devise a set of 20 statistical, positional, and semantic features that capture the properties of the candidates most likely to be keyphrases (those with the highest keyphraseness probability). They first introduce a simple unsupervised method for ranking and filtering the most probable keyphrases, then evolve it into a novel supervised method using genetic algorithms. Both methods are evaluated on a third-party dataset of research papers. The reported experimental results show that the performance of the proposed methods, measured as consistency with human annotators, is on a par with that achieved by humans and exceeds that of rival supervised and unsupervised methods.
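
The supervised step described above can be illustrated with a minimal sketch in which a genetic algorithm searches for feature weights that make a weighted-sum ranking agree with human-assigned keyphrases. It is not the authors' implementation. Everything specific in it is an assumption made for illustration: the three toy features (the paper uses 20), the feature values, the selection, crossover, and mutation operators, and the use of Rolling's measure 2C / (A + B) as the consistency score, since the summary above does not name the measure used.

```python
import random


def rank(candidates, weights, top_k=2):
    """Score each candidate as a weighted sum of its features; return the top_k phrases."""
    ordered = sorted(
        candidates.items(),
        key=lambda item: sum(w * f for w, f in zip(weights, item[1])),
        reverse=True,
    )
    return [phrase for phrase, _ in ordered[:top_k]]


def consistency(predicted, gold):
    """Rolling's measure 2C / (A + B); assumed here, not taken from the paper."""
    a, b = set(predicted), set(gold)
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def fitness(weights, documents, top_k=2):
    """Mean consistency between ranked output and human keyphrases."""
    scores = [consistency(rank(c, weights, top_k), gold) for c, gold in documents]
    return sum(scores) / len(scores)


def mutate(genes, rng, rate):
    """Add Gaussian noise to some genes, clamping each weight to [0, 1]."""
    return [
        min(1.0, max(0.0, g + rng.gauss(0, 0.1))) if rng.random() < rate else g
        for g in genes
    ]


def evolve(documents, n_features, pop_size=30, generations=40,
           mutation_rate=0.2, seed=0):
    """Hypothetical GA: truncation selection, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, documents), reverse=True)
        parents = population[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(mother, father)]
            children.append(mutate(child, rng, mutation_rate))
        population = parents + children
    return max(population, key=lambda w: fitness(w, documents))


# Toy training data. Hypothetical features per candidate phrase:
# [frequency, early-position score, Wikipedia-based keyphraseness].
documents = [
    ({"genetic algorithm": [0.6, 0.8, 0.9], "keyphrase": [0.7, 0.9, 0.8],
      "paper": [1.0, 0.1, 0.0], "result": [0.9, 0.2, 0.1]},
     ["genetic algorithm", "keyphrase"]),
    ({"information retrieval": [0.5, 0.7, 0.9], "thesaurus": [0.4, 0.6, 0.8],
      "method": [1.0, 0.3, 0.1], "section": [0.8, 0.1, 0.0]},
     ["information retrieval", "thesaurus"]),
]

best_weights = evolve(documents, n_features=3)
print("learned weights:", [round(w, 2) for w in best_weights])
print("training consistency:", fitness(best_weights, documents))
```

Because the fitter half of each generation is carried over unchanged, the best training consistency never decreases from one generation to the next. A real system would measure consistency on held-out documents rather than on the training set.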

@@ Line 1: / Line 1: @@
+{{Infobox work
+| title = Automatic Keyphrase Annotation of Scientific Documents Using Wikipedia and Genetic Algorithms
+| date = 2013
+| authors = [[Arash Joorabchi]]<br />[[Abdulhussain E. Mahdi]]
+| doi = 10.1177/0165551512472138
+| link = http://dl.acm.org/citation.cfm?id=2493909.2493911
+}}
 '''Automatic Keyphrase Annotation of Scientific Documents Using Wikipedia and Genetic Algorithms''' is a scientific work related to [[Wikipedia quality]], published in 2013 and written by [[Arash Joorabchi]] and [[Abdulhussain E. Mahdi]].
 == Overview ==
 Topical annotation of documents with keyphrases is an established method for revealing the subject of scientific and research documents to both human readers and [[information retrieval]] systems. The article describes a machine learning-based keyphrase annotation method for scientific documents that uses [[Wikipedia]] as a thesaurus for selecting candidate phrases from a document's content. The authors devise a set of 20 statistical, positional, and semantic [[features]] that capture the properties of the candidates most likely to be keyphrases (those with the highest keyphraseness probability). They first introduce a simple unsupervised method for ranking and filtering the most probable keyphrases, then evolve it into a novel supervised method using genetic algorithms. Both methods are evaluated on a third-party dataset of research papers. The reported experimental results show that the performance of the proposed methods, measured as consistency with human annotators, is on a par with that achieved by humans and exceeds that of rival supervised and unsupervised methods.
