Difference between revisions of "Clustering Tweets Usingwikipedia Concepts"

Revision as of 18:27, 21 June 2019

Clustering Tweets Usingwikipedia Concepts
Authors	Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau, Fang Zheng
Publication date	2014
Links	Original

Clustering Tweets Usingwikipedia Concepts is a scientific work related to Wikipedia quality, published in 2014 and written by Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau and Fang Zheng.

Overview

Tweet clustering faces two notable challenges. First, data sparsity is severe because no tweet can be longer than 140 characters. Second, synonymy and polysemy are common because users tend to express the same meaning in many different ways. Inspired by recent research indicating that Wikipedia is promising for text representation, the authors represent tweets as vectors of Wikipedia concepts. They address polysemy with a Bayesian model and synonymy by exploiting Wikipedia redirects. To further alleviate data sparsity, they also make use of three types of Wikipedia out-links. Evaluation on a Twitter dataset shows that the concept model outperforms the traditional vector space model (VSM) in tweet clustering.
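
The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the redirect table, sense priors, context likelihoods and out-link table are hypothetical toy data, the naive-Bayes-style scoring is only one possible reading of "a Bayesian model", and a single generic out-link table stands in for the three out-link types, which the summary does not name.

```python
import math
from collections import Counter

# Hypothetical toy resources; a real system would derive them from a Wikipedia dump.
REDIRECTS = {"nyc": "New York City", "newyorkcity": "New York City"}
# Candidate concepts for an ambiguous term, with assumed priors P(concept).
SENSES = {"apple": {"Apple Inc.": 0.6, "Apple (fruit)": 0.4}}
# Assumed likelihoods P(context word | concept).
CONTEXT = {
    "Apple Inc.": {"iphone": 0.3, "pie": 0.01},
    "Apple (fruit)": {"iphone": 0.01, "pie": 0.3},
}
# One generic out-link table (assumption; the paper uses three link types).
OUT_LINKS = {"Apple Inc.": ["IPhone"], "New York City": ["Manhattan"]}


def disambiguate(term, context_words, smoothing=1e-3):
    """Pick the concept maximizing log P(c) + sum of log P(w | c)."""
    candidates = SENSES[term]

    def score(concept):
        total = math.log(candidates[concept])
        for word in context_words:
            total += math.log(CONTEXT[concept].get(word, smoothing))
        return total

    return max(candidates, key=score)


def concept_vector(tokens, link_weight=0.5):
    """Map tweet tokens to a weighted bag of Wikipedia concepts."""
    vec = Counter()
    for i, tok in enumerate(tokens):
        if tok in REDIRECTS:
            # Synonymy: redirect titles collapse onto one canonical concept.
            concept = REDIRECTS[tok]
        elif tok in SENSES:
            # Polysemy: choose a sense using the other tokens as context.
            concept = disambiguate(tok, tokens[:i] + tokens[i + 1:])
        else:
            continue  # token maps to no known concept in the toy data
        vec[concept] += 1.0
        # Sparsity: add linked concepts with a smaller weight.
        for linked in OUT_LINKS.get(concept, []):
            vec[linked] += link_weight
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse concept vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


tech = concept_vector(["apple", "iphone"])
food = concept_vector(["apple", "pie"])
city_a = concept_vector(["nyc"])
city_b = concept_vector(["newyorkcity"])
print(cosine(tech, food), cosine(city_a, city_b))
```

In this toy setting the two "apple" tweets share a surface word but land on different concepts, while the two city tweets share no surface word but land on the same concept. A bag-of-words VSM would score both pairs the other way round. The resulting similarities could then be passed to any standard clustering algorithm.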

@@ Line 1: / Line 1: @@
+{{Infobox work
+| title = Clustering Tweets Usingwikipedia Concepts
+| date = 2014
+| authors = [[Guoyu Tang]]<br />[[Yunqing Xia]]<br />[[Weizhi Wang]]<br />[[Raymond Lau]]<br />[[Fang Zheng]]
+| link = http://www.lrec-conf.org/proceedings/lrec2014/pdf/83_Paper.pdf
+}}
 '''Clustering Tweets Usingwikipedia Concepts''' is a scientific work related to [[Wikipedia quality]], published in 2014 and written by [[Guoyu Tang]], [[Yunqing Xia]], [[Weizhi Wang]], [[Raymond Lau]] and [[Fang Zheng]].
 == Overview ==
 Tweet clustering faces two notable challenges. First, data sparsity is severe because no tweet can be longer than 140 characters. Second, synonymy and polysemy are common because users tend to express the same meaning in many different ways. Inspired by recent research indicating that [[Wikipedia]] is promising for text representation, the authors represent tweets as vectors of Wikipedia concepts. They address polysemy with a Bayesian model and synonymy by exploiting Wikipedia redirects. To further alleviate data sparsity, they also make use of three types of Wikipedia out-links. Evaluation on a Twitter dataset shows that the concept model outperforms the traditional vector space model (VSM) in tweet clustering.
