Playscript Classification and Automatic Wikipedia Play Articles Generation


Playscript Classification and Automatic Wikipedia Play Articles Generation is a scientific work related to Wikipedia quality, published in 2014 and written by Siddhartha Banerjee, Cornelia Caragea and Prasenjit Mitra.

Overview

In this work, the authors aim to create Wikipedia pages on plays automatically by extracting relevant information from various web sources. Their approach involves building an efficient classifier that identifies which web documents are play scripts. From the documents correctly classified as play scripts, the authors extract play-related information and use it to retrieve additional information from other sources on the web. This information is aggregated, and human-readable Wikipedia pages are created using a bot. The experimental results show that classifiers trained on a combination of designed features and "bag-of-words" (bow) features outperform classifiers trained using only bow features. The approach further shows that good-quality, human-readable pages can be created using a bot. Such an automatic page generation process can eventually lead to a more complete Wikipedia.
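
The idea of combining designed features with bow features can be illustrated with a minimal Python sketch. The designed features below (the share of lines that open with an upper-case speaker cue, and the share of lines that start with ACT or SCENE) are hypothetical examples chosen for illustration; the summary above does not list the features the authors actually used, and the function names are likewise assumptions.

```python
import re
from collections import Counter

# Hypothetical cue patterns, chosen for illustration only; they are not
# taken from the paper.
SPEAKER_CUE = re.compile(r"^[A-Z][A-Z .']{1,30}[:.]")          # e.g. "HAMLET:"
STRUCTURE_LINE = re.compile(r"^\s*(ACT|SCENE)\b", re.IGNORECASE)


def bow_features(text):
    """Bag-of-words: lower-cased token counts, prefixed with 'bow='."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {"bow=" + tok: float(n) for tok, n in Counter(tokens).items()}


def designed_features(text):
    """Hand-designed layout features computed over non-empty lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    total = len(lines) or 1  # avoid division by zero on empty input
    cues = sum(bool(SPEAKER_CUE.match(ln)) for ln in lines)
    structure = sum(bool(STRUCTURE_LINE.match(ln)) for ln in lines)
    return {
        "speaker_cue_ratio": cues / total,
        "structure_line_ratio": structure / total,
    }


def combined_features(text):
    """Merge both feature sets into one sparse feature dictionary."""
    feats = bow_features(text)
    feats.update(designed_features(text))  # 'bow=' prefix prevents key clashes
    return feats


if __name__ == "__main__":
    sample = "ACT I\nHAMLET: To be, or not to be.\nHORATIO: My lord?\n"
    for name, value in sorted(combined_features(sample).items()):
        print(name, value)
```

A dictionary of this shape could be passed to any classifier that accepts sparse named features, for example after vectorizing it with scikit-learn's DictVectorizer. Training one model on `bow_features` alone and another on `combined_features` would mirror the comparison described above, although the classifier and evaluation setup would then be further assumptions.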