Tagtheweb: Using Wikipedia Categories to Automatically Categorize Resources on the Web

From Wikipedia Quality
Revision as of 12:57, 24 November 2019 by Hazel (talk | contribs) (+ wikilinks)

Tagtheweb: Using Wikipedia Categories to Automatically Categorize Resources on the Web is a scientific work related to Wikipedia quality, published in 2018 and written by Jerry Fernandes Medeiros, Bernardo Pereira Nunes, Sean W. M. Siqueira and Luiz André P. Paes Leme.

Overview

Identifying the topics associated with a set of documents is a common task that can improve many applications involving documents on the Web, such as search, retrieval, recommendation, and clustering. To address this problem, the paper introduces a tool called TagTheWeb, proposed as a generic classification method that relies on the knowledge expressed by the taxonomic structure of Wikipedia. The method generates a fingerprint for a resource from the semantic relations between nodes of the Wikipedia Category Graph. TagTheWeb can be used through a web interface or as an API to classify any text-based resource.
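
The idea of a fingerprint over a category graph can be illustrated with a small sketch. This is not the paper's algorithm, which is not detailed in this overview. It assumes a hypothetical toy graph (`PARENTS`), a hypothetical set of top-level categories (`TOP_LEVEL`), and an assumed weighting of 1/(1 + distance); all function names are illustrative.

```python
from collections import deque

# Hypothetical toy category graph: child category -> parent categories.
# A real system would read these links from the Wikipedia Category Graph.
PARENTS = {
    "Machine learning": ["Artificial intelligence"],
    "Artificial intelligence": ["Computer science"],
    "Computer science": ["Technology", "Science"],
    "Search engines": ["Information retrieval"],
    "Information retrieval": ["Computer science"],
}

# Assumed set of top-level categories that make up the fingerprint axes.
TOP_LEVEL = {"Technology", "Science"}


def distances_to_top(category):
    """Breadth-first search upward; returns {top_level_category: hop count}."""
    seen = {category}
    queue = deque([(category, 0)])
    found = {}
    while queue:
        node, depth = queue.popleft()
        if node in TOP_LEVEL:
            # BFS visits nodes in order of depth, so the first hit is the shortest.
            found.setdefault(node, depth)
            continue
        for parent in PARENTS.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append((parent, depth + 1))
    return found


def fingerprint(categories):
    """Sum 1/(1 + distance) per top-level category, then normalize to sum to 1."""
    scores = {}
    for category in categories:
        for top, dist in distances_to_top(category).items():
            scores[top] = scores.get(top, 0.0) + 1.0 / (1 + dist)
    total = sum(scores.values())
    if not total:
        return {}
    return {top: score / total for top, score in scores.items()}


if __name__ == "__main__":
    # Categories that some upstream step is assumed to have linked to a text.
    print(fingerprint(["Machine learning", "Search engines"]))
```

In this sketch, categories closer to a top-level category contribute more to that axis, so the resulting vector summarizes where a resource sits in the taxonomy. How the real tool extracts categories from text and weights graph paths is left to the paper.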