Difference between revisions of "Using Wikipedia Knowledge to Improve Text Classification"

Latest revision as of 00:36, 9 February 2021

Using Wikipedia Knowledge to Improve Text Classification
Authors	Pu Wang, Jian Hu, Hua-Jun Zeng, Zheng Chen
Publication date	2009
DOI	10.1007/s10115-008-0152-4
Links	Original

Using Wikipedia Knowledge to Improve Text Classification is a scientific work related to Wikipedia quality, published in 2009 and written by Pu Wang, Jian Hu, Hua-Jun Zeng and Zheng Chen.

Overview

Text classification has been widely used to help users discover useful information on the Internet. However, traditional classification methods are based on the “Bag of Words” (BOW) representation, which accounts only for term frequency in the documents and ignores important semantic relationships between key terms. To overcome this problem, previous work attempted to enrich the text representation by means of manual intervention or automatic document expansion. The improvement achieved is unfortunately very limited, due to the poor coverage of the dictionary and to the ineffectiveness of term expansion. In this paper, the authors automatically construct a thesaurus of concepts from Wikipedia. They then introduce a unified framework to expand the BOW representation with semantic relations (synonymy, hyponymy, and associative relations), and demonstrate its efficacy in enhancing previous approaches for text classification. Experimental results on several data sets show that the proposed approach, integrated with the thesaurus built from Wikipedia, can achieve significant improvements with respect to the baseline algorithm.
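
The idea of expanding a BOW vector with related concepts can be illustrated with a minimal sketch. This is not the authors' implementation: the toy thesaurus, the relation weights, and the names `bag_of_words` and `expand_bow` are all assumptions made for illustration, and the abstract does not say how the three relation types are weighted or combined.

```python
from collections import Counter

# Hypothetical toy thesaurus. The paper derives its thesaurus from Wikipedia;
# these entries are illustrative assumptions only.
THESAURUS = {
    "puma": {
        "synonymy": ["cougar"],      # alternative names for the same concept
        "hyponymy": ["felidae"],     # broader concept the term belongs to
        "associative": ["jaguar"],   # related but distinct concept
    },
}

# Assumed weights: closer relations contribute more to the expanded vector.
RELATION_WEIGHTS = {"synonymy": 1.0, "hyponymy": 0.5, "associative": 0.25}


def bag_of_words(text):
    """Plain BOW: lowercase, split on whitespace, count term frequency."""
    return Counter(text.lower().split())


def expand_bow(bow, thesaurus=THESAURUS, weights=RELATION_WEIGHTS):
    """Return a new term->weight dict with related concepts added."""
    expanded = {term: float(count) for term, count in bow.items()}
    for term, count in bow.items():
        relations = thesaurus.get(term, {})
        for relation, related_terms in relations.items():
            for related in related_terms:
                # Each related concept inherits a discounted share of the
                # original term's frequency.
                expanded[related] = (
                    expanded.get(related, 0.0) + weights[relation] * count
                )
    return expanded


features = expand_bow(bag_of_words("puma sighted near the river"))
```

With this sketch, a document mentioning only "puma" also receives non-zero weight for "cougar", "felidae" and "jaguar", so it can overlap with documents that use those terms instead. A plain BOW vector would treat the four terms as unrelated.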

Embed

Wikipedia Quality

Wang, Pu; Hu, Jian; Zeng, Hua-Jun; Chen, Zheng. (2009). "[[Using Wikipedia Knowledge to Improve Text Classification]]". Springer-Verlag. DOI: 10.1007/s10115-008-0152-4.

English Wikipedia

{{cite journal |last1=Wang |first1=Pu |last2=Hu |first2=Jian |last3=Zeng |first3=Hua-Jun |last4=Chen |first4=Zheng |title=Using Wikipedia Knowledge to Improve Text Classification |date=2009 |doi=10.1007/s10115-008-0152-4 |url=https://wikipediaquality.com/wiki/Using_Wikipedia_Knowledge_to_Improve_Text_Classification |journal=Springer-Verlag}}

HTML

Wang, Pu; Hu, Jian; Zeng, Hua-Jun; Chen, Zheng. (2009). "<a href="https://wikipediaquality.com/wiki/Using_Wikipedia_Knowledge_to_Improve_Text_Classification">Using Wikipedia Knowledge to Improve Text Classification</a>". Springer-Verlag. DOI: 10.1007/s10115-008-0152-4.

@@ Line 1: / Line 1: @@
-'''Using Wikipedia Knowledge to Improve Text Classification''' - scientific work related to Wikipedia quality published in 2009, written by Pu Wang, Jian Hu, Hua-Jun Zeng and Zheng Chen.
+{{Infobox work
+| title = Using Wikipedia Knowledge to Improve Text Classification
+| date = 2009
+| authors = [[Pu Wang]]<br />[[Jian Hu]]<br />[[Hua-Jun Zeng]]<br />[[Zheng Chen]]
+| doi = 10.1007/s10115-008-0152-4
+| link = https://link.springer.com/content/pdf/10.1007%2Fs10115-008-0152-4.pdf?origin=publication_detail
+}}
+'''Using Wikipedia Knowledge to Improve Text Classification''' - scientific work related to [[Wikipedia quality]] published in 2009, written by [[Pu Wang]], [[Jian Hu]], [[Hua-Jun Zeng]] and [[Zheng Chen]].
 == Overview ==
-Text classification has been widely used to assist users with the discovery of useful information from the Internet. However, traditional classification methods are based on the “Bag of Words” (BOW) representation, which only accounts for term frequency in the documents, and ignores important semantic relationships between key terms. To overcome this problem, previous work attempted to enrich text representation by means of manual intervention or automatic document expansion. The achieved improvement is unfortunately very limited, due to the poor coverage capability of the dictionary, and to the ineffectiveness of term expansion. In this paper, authors automatically construct a thesaurus of concepts from Wikipedia. Authors then introduce a unified framework to expand the BOW representation with semantic relations (synonymy, hyponymy, and associative relations), and demonstrate its efficacy in enhancing previous approaches for text classification. Experimental results on several data sets show that the proposed approach, integrated with the thesaurus built from Wikipedia, can achieve significant improvements with respect to the baseline algorithm.
+Text classification has been widely used to assist users with the discovery of useful information from the Internet. However, traditional classification methods are based on the “Bag of Words” (BOW) representation, which only accounts for term frequency in the documents, and ignores important semantic relationships between key terms. To overcome this problem, previous work attempted to enrich text representation by means of manual intervention or automatic document expansion. The achieved improvement is unfortunately very limited, due to the poor coverage capability of the dictionary, and to the ineffectiveness of term expansion. In this paper, authors automatically construct a thesaurus of concepts from [[Wikipedia]]. Authors then introduce a unified framework to expand the BOW representation with semantic relations (synonymy, hyponymy, and associative relations), and demonstrate its efficacy in enhancing previous approaches for text classification. Experimental results on several data sets show that the proposed approach, integrated with the thesaurus built from Wikipedia, can achieve significant improvements with respect to the baseline algorithm.
+== Embed ==
+=== Wikipedia Quality ===
+<code>
+<nowiki>
+Wang, Pu; Hu, Jian; Zeng, Hua-Jun; Chen, Zheng. (2009). "[[Using Wikipedia Knowledge to Improve Text Classification]]". Springer-Verlag. DOI: 10.1007/s10115-008-0152-4.
+</nowiki>
+</code>
+=== English Wikipedia ===
+<code>
+<nowiki>
+{{cite journal |last1=Wang |first1=Pu |last2=Hu |first2=Jian |last3=Zeng |first3=Hua-Jun |last4=Chen |first4=Zheng |title=Using Wikipedia Knowledge to Improve Text Classification |date=2009 |doi=10.1007/s10115-008-0152-4 |url=https://wikipediaquality.com/wiki/Using_Wikipedia_Knowledge_to_Improve_Text_Classification |journal=Springer-Verlag}}
+</nowiki>
+</code>
+=== HTML ===
+<code>
+<nowiki>
+Wang, Pu; Hu, Jian; Zeng, Hua-Jun; Chen, Zheng. (2009). &amp;quot;<a href="https://wikipediaquality.com/wiki/Using_Wikipedia_Knowledge_to_Improve_Text_Classification">Using Wikipedia Knowledge to Improve Text Classification</a>&amp;quot;. Springer-Verlag. DOI: 10.1007/s10115-008-0152-4.
+</nowiki>
+</code>
+[[Category:Scientific works]]
