Cross-Domain Text Classification Using Wikipedia

Authors: Pu Wang, Carlotta Domeniconi, Jian Hu
Publication date: 2008
Links: Original Preprint

Cross-Domain Text Classification Using Wikipedia is a scientific work related to Wikipedia quality, published in 2008 and written by Pu Wang, Carlotta Domeniconi and Jian Hu.

Overview

Traditional approaches to document classification require labeled data in order to construct reliable and accurate classifiers. Unfortunately, labeled data are seldom available and often too expensive to obtain, especially for large domains and fast-evolving scenarios. Given a learning task for which training data are not available, abundant labeled data may exist for a different but related domain. One would like to use the related labeled data as auxiliary information to accomplish the classification task in the target domain. Recently, the paradigm of transfer learning has been introduced to enable effective learning strategies when auxiliary data obey a different probability distribution. A co-clustering based classification algorithm has been previously proposed to tackle cross-domain text classification. In this work, the authors extend the idea underlying this approach by making the latent semantic relationship between the two domains explicit. This is achieved with the use of Wikipedia. As a result, the pathway along which labels propagate between the two domains captures not only common words but also semantic concepts based on the content of the documents. The authors empirically demonstrate the efficacy of the semantic-based approach to cross-domain classification using a variety of real data.
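The idea that semantic concepts can connect two domains that share few words can be illustrated with a minimal Python sketch. It is not the authors' co-clustering algorithm. It only shows how adding concept features to a bag-of-words representation gives documents from different domains something in common. The `TOY_CONCEPTS` map, the `enrich` and `cosine` helpers, and the two example documents are hypothetical and stand in for a Wikipedia-derived concept lookup.

```python
from collections import Counter
from math import sqrt

# Hypothetical word-to-concept map standing in for a Wikipedia-derived lookup.
TOY_CONCEPTS = {
    "striker": "Concept:Sport",
    "goalkeeper": "Concept:Sport",
    "pitcher": "Concept:Sport",
    "inning": "Concept:Sport",
}


def enrich(tokens, concepts=TOY_CONCEPTS):
    """Return word counts plus one count per token that maps to a concept."""
    features = Counter(tokens)
    for token in tokens:
        concept = concepts.get(token)
        if concept is not None:
            features[concept] += 1
    return features


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Assumed example: a labeled auxiliary-domain document and an unlabeled
# target-domain document that have no words in common.
source_doc = ["striker", "beats", "goalkeeper"]
target_doc = ["pitcher", "dominates", "inning"]

plain_similarity = cosine(Counter(source_doc), Counter(target_doc))
enriched_similarity = cosine(enrich(source_doc), enrich(target_doc))

print(f"words only:       {plain_similarity:.3f}")
print(f"words + concepts: {enriched_similarity:.3f}")
```

With words alone the two documents are orthogonal, so no label information could flow between them. Once both are mapped to the shared concept, their similarity becomes positive, which is the kind of link a label-propagation step could exploit.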

Embed

Wikipedia Quality

Wang, Pu; Domeniconi, Carlotta; Hu, Jian. (2008). "[[Cross-Domain Text Classification Using Wikipedia]]".

English Wikipedia

{{cite journal |last1=Wang |first1=Pu |last2=Domeniconi |first2=Carlotta |last3=Hu |first3=Jian |title=Cross-Domain Text Classification Using Wikipedia |date=2008 |url=https://wikipediaquality.com/wiki/Cross-Domain_Text_Classification_Using_Wikipedia}}

HTML

Wang, Pu; Domeniconi, Carlotta; Hu, Jian. (2008). &quot;<a href="https://wikipediaquality.com/wiki/Cross-Domain_Text_Classification_Using_Wikipedia">Cross-Domain Text Classification Using Wikipedia</a>&quot;.