Difference between revisions of "Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links"

(Infobox work)
(Embed for English Wikipedia, HTML)

Revision as of 08:36, 18 May 2020


Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links
Authors: Dante Degl’Innocenti, Dario De Nart, Muhammad Helmy, Carlo Tasso
Publication date: 2018
DOI: 10.1007/978-3-319-67056-0_27
Links: Original

Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links is a scientific work related to Wikipedia quality, published in 2018 and written by Dante Degl’Innocenti, Dario De Nart, Muhammad Helmy and Carlo Tasso.

Overview

In this chapter, the authors present a fast, accurate, and elegant metric for assessing semantic relatedness among entities in a hypertextual corpus by building a novel, language-independent Vector Space Model. The technique is based on the Jaccard similarity coefficient, approximated with the MinHash technique to generate a constant-size vector fingerprint for each entity in the corpus. This strategy allows pairwise semantic relatedness to be evaluated in constant time, regardless of how many entities the data contains or how dense the internal link structure is. Because semantic relatedness is a subtle and somewhat subjective matter, the authors evaluated the approach by running user tests on a crowdsourcing platform. To strengthen the evaluation, the authors considered two collaboratively built corpora, the English Wikipedia and the Italian Wikipedia, which differ significantly in size, topology, and user base. The evaluation suggests that the proposed technique generates satisfactory results, outperforming commercial baseline systems regardless of the data employed and the cultural differences among the test users.
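
The idea of approximating the Jaccard coefficient with fixed-size MinHash fingerprints can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fingerprint length (NUM_HASHES), the salted BLAKE2b hash family, and the choice to represent each entity by a set of linked page titles are all assumptions made for illustration.

```python
import hashlib

NUM_HASHES = 128  # assumed fingerprint length; the chapter's value is not given here


def _hash(seed: int, item: str) -> int:
    """Deterministic 64-bit hash of `item` under the seed-th hash function.

    Salting BLAKE2b with the seed is one assumed way to get a hash family.
    """
    digest = hashlib.blake2b(
        item.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(links, num_hashes=NUM_HASHES):
    """Constant-size fingerprint: the minimum hash of the set per hash function."""
    links = set(links)
    if not links:
        raise ValueError("an entity needs at least one link")
    return [min(_hash(seed, link) for link in links) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching positions, an estimate of the Jaccard coefficient.

    The cost depends only on the fingerprint length, not on the set sizes.
    """
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard coefficient |A ∩ B| / |A ∪ B|, for comparison."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical link sets for two entities (toy data, not from the chapter).
    rome = {"Italy", "Colosseum", "Tiber", "Roman Empire", "Lazio"}
    milan = {"Italy", "Lombardy", "Duomo di Milano", "Roman Empire"}
    sig_rome = minhash_signature(rome)
    sig_milan = minhash_signature(milan)
    print("exact:", exact_jaccard(rome, milan))
    print("estimated:", estimated_jaccard(sig_rome, sig_milan))
```

Each fingerprint is computed once per entity. After that, a pairwise comparison touches only NUM_HASHES integers, which is why the comparison cost stays constant however many links each entity has.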

Embed

Wikipedia Quality

Degl’Innocenti, Dante; De Nart, Dario; Helmy, Muhammad; Tasso, Carlo. (2018). "[[Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links]]". Springer, Cham. DOI: 10.1007/978-3-319-67056-0_27.

English Wikipedia

{{cite journal |last1=Degl’Innocenti |first1=Dante |last2=De Nart |first2=Dario |last3=Helmy |first3=Muhammad |last4=Tasso |first4=Carlo |title=Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links |date=2018 |doi=10.1007/978-3-319-67056-0_27 |url=https://wikipediaquality.com/wiki/Fast,_Accurate,_Multilingual_Semantic_Relatedness_Measurement_Using_Wikipedia_Links |journal=Springer, Cham}}

HTML

Degl’Innocenti, Dante; De Nart, Dario; Helmy, Muhammad; Tasso, Carlo. (2018). &quot;<a href="https://wikipediaquality.com/wiki/Fast,_Accurate,_Multilingual_Semantic_Relatedness_Measurement_Using_Wikipedia_Links">Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links</a>&quot;. Springer, Cham. DOI: 10.1007/978-3-319-67056-0_27.