Difference between revisions of "Link Analysis of Wikipedia Documents Using Mapreduce"
(+ Infobox work)
(Embed for English Wikipedia, HTML)
Revision as of 08:58, 9 June 2020
Authors | Vasa Hardik, Vasudevan Anirudh, Palanisamy Balaji |
---|---|
Publication date | 2015 |
DOI | 10.1109/IRI.2015.92 |
Links | Original |
Link Analysis of Wikipedia Documents Using Mapreduce is a scientific work related to Wikipedia quality, published in 2015 and written by Vasa Hardik, Vasudevan Anirudh and Palanisamy Balaji.
Overview
Wikipedia, a collaborative and user-driven encyclopedia, is considered to be the largest content thesaurus on the web, and it has expanded into a massive database housing a huge amount of information. In this paper, the authors present the design and implementation of a MapReduce-based Wikipedia link analysis system that provides a hierarchical examination of document connectivity in Wikipedia and captures the semantic relationships between articles. The authors' system consists of a Wikipedia crawler, a MapReduce-based distributed parser and the link analysis techniques. The results produced by the study are then mapped to web Key Performance Indicators (KPIs) for link-structure interpretation. The authors find that Wikipedia has a remarkable capability as a corpus for content correlation with respect to connectivity among articles. Link analysis and semantic structuring of Wikipedia not only provides an ergonomic report of the tier-based link hierarchy of Wikipedia articles but also reflects the general understanding of the semantic relationships between them. The results of the analysis are aimed at providing valuable insights for evaluating the accuracy and content scalability of Wikipedia through its link structure.
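The overview does not describe the authors' parser or metrics in detail, so the following is only a minimal sketch of how a MapReduce-style link count over Wikipedia articles might look. It runs in a single process rather than on a cluster, and the names `map_links`, `shuffle`, `reduce_inlinks` and `inlink_counts`, the regular expression, and the sample pages are all illustrative assumptions, not taken from the paper.

```python
import re
from collections import defaultdict

# Assumed pattern for [[Target]], [[Target|label]] and [[Target#Section|label]].
# Only the target title is captured; section anchors and labels are ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?(?:\|[^\[\]]*)?\]\]")


def map_links(title, wikitext):
    """Map step: emit one (target, source) pair per distinct outgoing link."""
    targets = {m.group(1).strip() for m in WIKILINK.finditer(wikitext)}
    for target in sorted(targets):
        if target and target != title:  # skip empty targets and self-links
            yield target, title


def shuffle(pairs):
    """Group mapped values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_inlinks(target, sources):
    """Reduce step: count the distinct articles that link to `target`."""
    return target, len(set(sources))


def inlink_counts(pages):
    """Run map, shuffle and reduce over a {title: wikitext} mapping."""
    pairs = (pair for title, text in pages.items() for pair in map_links(title, text))
    return dict(reduce_inlinks(t, s) for t, s in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical three-article corpus used only for illustration.
    pages = {
        "Wikipedia": "See [[MapReduce]] and [[Apache Hadoop|Hadoop]].",
        "Apache Hadoop": "Implements [[MapReduce#History|the model]]; used on [[Wikipedia]].",
        "MapReduce": "Popularised by [[Apache Hadoop]].",
    }
    for article, count in sorted(inlink_counts(pages).items()):
        print(article, count)
```

In-link counts are only one simple connectivity measure. A tiered hierarchy like the one mentioned above would need a further pass over the same (target, source) pairs, which this sketch does not attempt.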
Embed
Wikipedia Quality
Hardik, Vasa; Anirudh, Vasudevan; Balaji, Palanisamy. (2015). "[[Link Analysis of Wikipedia Documents Using Mapreduce]]". DOI: 10.1109/IRI.2015.92.
English Wikipedia
{{cite journal |last1=Hardik |first1=Vasa |last2=Anirudh |first2=Vasudevan |last3=Balaji |first3=Palanisamy |title=Link Analysis of Wikipedia Documents Using Mapreduce |date=2015 |doi=10.1109/IRI.2015.92 |url=https://wikipediaquality.com/wiki/Link_Analysis_of_Wikipedia_Documents_Using_Mapreduce}}
HTML
Hardik, Vasa; Anirudh, Vasudevan; Balaji, Palanisamy. (2015). "<a href="https://wikipediaquality.com/wiki/Link_Analysis_of_Wikipedia_Documents_Using_Mapreduce">Link Analysis of Wikipedia Documents Using Mapreduce</a>". DOI: 10.1109/IRI.2015.92.