Difference between revisions of "Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings"

From Wikipedia Quality
[[Category:Scientific works]]

Latest revision as of 14:58, 22 December 2019


Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings
Authors: Luchen Tan, Haotian Zhang, Charles L. A. Clarke, Mark D. Smucker
Publication date: 2015
DOI: 10.3115/v1/P15-2108
Links: Original

Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings is a scientific work related to Wikipedia quality, published in 2015 and written by Luchen Tan, Haotian Zhang, Charles L. A. Clarke and Mark D. Smucker.

Overview

Compared with carefully edited prose, the language of social media is informal in the extreme. Applying NLP techniques in this context may require a better understanding of word usage within social media. In this paper, the authors compute a word embedding for a corpus of tweets and compare it to a word embedding for Wikipedia. After learning a transformation from one vector space to the other, and adjusting similarity values according to term frequency, the authors identify words whose usage differs greatly between the two corpora. For any given word, the set of words closest to it in a particular embedding characterizes that word's usage within the corresponding corpus.
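
The idea that a word's nearest neighbors characterize its usage can be illustrated with a minimal sketch. It is not the authors' method: it skips both the learned transformation between vector spaces and the term-frequency adjustment, and uses a simple neighbor-set overlap as a stand-in measure. The function names (cosine, nearest_neighbors, neighbor_overlap) and the two-dimensional toy vectors are hypothetical and invented purely for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(embedding, word, k=2):
    """Return the k words most similar to `word` within one embedding."""
    target = embedding[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in embedding.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


def neighbor_overlap(emb_a, emb_b, word, k=2):
    """Jaccard overlap of the word's neighbor sets in two embeddings.

    Illustrative stand-in only: a low value suggests the word is used
    differently in the two corpora.
    """
    a = set(nearest_neighbors(emb_a, word, k))
    b = set(nearest_neighbors(emb_b, word, k))
    return len(a & b) / len(a | b)


# Invented toy vectors (assumption): "tweet" sits near bird-related
# words in the Wikipedia-like space and near message-related words
# in the Twitter-like space.
wiki = {
    "bird": [1.0, 0.0],
    "sparrow": [0.9, 0.1],
    "animal": [0.8, 0.3],
    "post": [0.0, 1.0],
    "message": [0.1, 0.9],
    "tweet": [0.95, 0.05],
}
twitter = dict(wiki, tweet=[0.05, 0.95])

if __name__ == "__main__":
    print(nearest_neighbors(wiki, "tweet"))
    print(nearest_neighbors(twitter, "tweet"))
    print(neighbor_overlap(wiki, twitter, "tweet"))
```

In this toy setup the two neighbor lists for "tweet" share no words, which is the kind of divergence the paper looks for. With real embeddings the two spaces are trained separately, so their coordinates are not directly comparable. That is why the paper first learns a transformation from one space to the other.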

Embed

Wikipedia Quality

Tan, Luchen; Zhang, Haotian; Clarke, Charles L. A.; Smucker, Mark D. (2015). "[[Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings]]". DOI: 10.3115/v1/P15-2108.

English Wikipedia

{{cite journal |last1=Tan |first1=Luchen |last2=Zhang |first2=Haotian |last3=Clarke |first3=Charles L. A. |last4=Smucker |first4=Mark D. |title=Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings |date=2015 |doi=10.3115/v1/P15-2108 |url=https://wikipediaquality.com/wiki/Lexical_Comparison_Between_Wikipedia_and_Twitter_Corpora_by_Using_Word_Embeddings}}

HTML

Tan, Luchen; Zhang, Haotian; Clarke, Charles L. A.; Smucker, Mark D. (2015). &quot;<a href="https://wikipediaquality.com/wiki/Lexical_Comparison_Between_Wikipedia_and_Twitter_Corpora_by_Using_Word_Embeddings">Lexical Comparison Between Wikipedia and Twitter Corpora by Using Word Embeddings</a>&quot;. DOI: 10.3115/v1/P15-2108.