Difference between revisions of "Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links"
Revision as of 08:36, 18 May 2020
Authors | Dante Degl’Innocenti, Dario De Nart, Muhammad Helmy, Carlo Tasso |
---|---|
Publication date | 2018 |
DOI | 10.1007/978-3-319-67056-0_27 |
Links | Original |
Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links is a scientific work related to Wikipedia quality, published in 2018 and written by Dante Degl’Innocenti, Dario De Nart, Muhammad Helmy and Carlo Tasso.
Overview
In this chapter, the authors present a fast, accurate, and elegant metric for assessing semantic relatedness among the entities of a hypertextual corpus by building a novel, language-independent Vector Space Model. The technique is based on the Jaccard similarity coefficient, approximated with the MinHash technique to generate a constant-size vector fingerprint for each entity in the corpus. This strategy allows pairwise semantic relatedness to be evaluated in constant time, no matter how many entities the data contains or how dense its internal link structure is. Because semantic relatedness is a subtle and somewhat subjective matter, the authors evaluated the approach by running user tests on a crowdsourcing platform. To strengthen the evaluation, they considered two collaboratively built corpora, the English Wikipedia and the Italian Wikipedia, which differ significantly in size, topology, and user base. The evaluation suggests that the proposed technique generates satisfactory results, outperforming commercial baseline systems regardless of the data employed and the cultural differences among the test users.
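The idea of approximating the Jaccard coefficient with constant-size MinHash fingerprints can be illustrated with a minimal Python sketch. This is not the authors' implementation: the fingerprint length of 128, the use of salted BLAKE2b as the hash family, the function names, and the two example link sets are all assumptions chosen for illustration.

```python
from __future__ import annotations

import hashlib

NUM_HASHES = 128  # fingerprint length; an illustrative choice, not taken from the chapter


def _hash(link: str, seed: int) -> int:
    """Deterministic 64-bit hash of a link target, salted by the seed."""
    digest = hashlib.blake2b(
        link.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(links: set[str], num_hashes: int = NUM_HASHES) -> list[int]:
    """Constant-size fingerprint: the minimum hash value under each seeded hash."""
    if not links:
        raise ValueError("an entity needs at least one link")
    return [min(_hash(link, seed) for link in links) for seed in range(num_hashes)]


def exact_jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard coefficient |A ∩ B| / |A ∪ B|; cost grows with the set sizes."""
    return len(a & b) / len(a | b)


def estimated_relatedness(sig_a: list[int], sig_b: list[int]) -> float:
    """Fraction of matching fingerprint positions, an estimate of the Jaccard
    coefficient. The cost depends only on the fingerprint length."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


# Hypothetical link sets for two entities (not real Wikipedia data).
rome = {"Italy", "Tiber", "Colosseum", "Roman Empire", "Vatican City"}
milan = {"Italy", "Lombardy", "Duomo di Milano", "Roman Empire", "Serie A"}

sig_rome = minhash_signature(rome)
sig_milan = minhash_signature(milan)

exact = exact_jaccard(rome, milan)                      # 2 shared links out of 8 distinct
estimate = estimated_relatedness(sig_rome, sig_milan)   # approximates the value above
```

Fingerprints are computed once per entity. After that, comparing two entities touches only the two fixed-length vectors, which is why the comparison cost does not depend on how many links either entity has. A longer fingerprint reduces the estimation error at the price of more storage per entity.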
Embed
Wikipedia Quality
Degl’Innocenti, Dante; De Nart, Dario; Helmy, Muhammad; Tasso, Carlo. (2018). "[[Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links]]". Springer, Cham. DOI: 10.1007/978-3-319-67056-0_27.
English Wikipedia
{{cite journal |last1=Degl’Innocenti |first1=Dante |last2=De Nart |first2=Dario |last3=Helmy |first3=Muhammad |last4=Tasso |first4=Carlo |title=Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links |date=2018 |doi=10.1007/978-3-319-67056-0_27 |url=https://wikipediaquality.com/wiki/Fast,_Accurate,_Multilingual_Semantic_Relatedness_Measurement_Using_Wikipedia_Links |journal=Springer, Cham}}
HTML
Degl’Innocenti, Dante; De Nart, Dario; Helmy, Muhammad; Tasso, Carlo. (2018). "<a href="https://wikipediaquality.com/wiki/Fast,_Accurate,_Multilingual_Semantic_Relatedness_Measurement_Using_Wikipedia_Links">Fast, Accurate, Multilingual Semantic Relatedness Measurement Using Wikipedia Links</a>". Springer, Cham. DOI: 10.1007/978-3-319-67056-0_27.