Difference between revisions of "Clustering Tweets Usingwikipedia Concepts"
(wikilinks) |
(Infobox work) |
+ {{Infobox work
+ | title = Clustering Tweets Usingwikipedia Concepts
+ | date = 2014
+ | authors = [[Guoyu Tang]]<br />[[Yunqing Xia]]<br />[[Weizhi Wang]]<br />[[Raymond Lau]]<br />[[Fang Zheng]]
+ | link = http://www.lrec-conf.org/proceedings/lrec2014/pdf/83_Paper.pdf
+ }}
Revision as of 18:27, 21 June 2019
Authors | Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau, Fang Zheng |
---|---|
Publication date | 2014 |
Links | Original: http://www.lrec-conf.org/proceedings/lrec2014/pdf/83_Paper.pdf |
Clustering Tweets Using Wikipedia Concepts is a scientific work related to Wikipedia quality, published in 2014 and written by Guoyu Tang, Yunqing Xia, Weizhi Wang, Raymond Lau and Fang Zheng.
Overview
Tweet clustering faces two notable challenges. First, the data are sparse, since no tweet can be longer than 140 characters. Second, synonymy and polysemy are common, because users express the same meaning in many different ways. Motivated by recent research indicating that Wikipedia is a promising resource for representing text, the authors use Wikipedia concepts to represent tweets as concept vectors. They address polysemy with a Bayesian model and synonymy by exploiting Wikipedia redirects. To further alleviate the sparse data problem, they also make use of three types of out-links in Wikipedia. Evaluation on a Twitter dataset shows that the concept model outperforms the traditional vector space model (VSM) in tweet clustering.
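The general idea of a concept vector can be illustrated with a minimal Python sketch. It is not the authors' implementation: the redirect table, the candidate concepts, the priors and likelihoods, and the smoothing constant are all hypothetical toy values, and the out-link expansion is left out. Redirects map synonymous surface forms to one concept, a naive Bayes-style score picks a sense for an ambiguous term, and tweets are compared by cosine similarity over their concept counts.

```python
import math
from collections import Counter

# Hypothetical toy data for illustration only (not from the paper).
# Redirect table: surface form -> canonical concept (handles synonymy).
REDIRECTS = {
    "nyc": "New York City",
    "bigapple": "New York City",
}

# Ambiguous terms: candidate concept -> (prior, context-word likelihoods).
CANDIDATES = {
    "apple": {
        "Apple Inc.": (0.6, {"iphone": 0.3, "mac": 0.2}),
        "Apple (fruit)": (0.4, {"pie": 0.3, "juice": 0.2}),
    },
}

SMOOTHING = 0.01  # assumed likelihood for context words unseen with a concept


def disambiguate(term, context):
    """Return the concept maximising log P(concept) + sum log P(word | concept)."""
    best, best_score = None, -math.inf
    for concept, (prior, likelihoods) in CANDIDATES[term].items():
        score = math.log(prior)
        for word in context:
            score += math.log(likelihoods.get(word, SMOOTHING))
        if score > best_score:
            best, best_score = concept, score
    return best


def concept_vector(tweet):
    """Map a tweet to a bag of concepts; unknown tokens are ignored."""
    tokens = tweet.lower().split()
    vector = Counter()
    for i, token in enumerate(tokens):
        if token in REDIRECTS:
            vector[REDIRECTS[token]] += 1
        elif token in CANDIDATES:
            context = tokens[:i] + tokens[i + 1:]
            vector[disambiguate(token, context)] += 1
    return vector


def cosine(a, b):
    """Cosine similarity between two concept vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

In this sketch, "love nyc" and "bigapple rocks" share no words, so a plain term-based comparison would treat them as unrelated, while their concept vectors coincide. Any standard clustering algorithm could then be run over such similarities.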