Multilingual Schema Matching for Wikipedia Infoboxes



Multilingual Schema Matching for Wikipedia Infoboxes
Authors: Thanh Hoang Nguyen, Viviane Pereira Moreira, Huong Nguyen, Hoa Nguyen, Juliana Freire
Publication date: 2011
DOI: 10.14778/2078324.2078329
Links: Original Preprint

Multilingual Schema Matching for Wikipedia Infoboxes is a scientific work related to Wikipedia quality, published in 2011 and written by Thanh Hoang Nguyen, Viviane Pereira Moreira, Huong Nguyen, Hoa Nguyen and Juliana Freire.

Overview

Recent research has taken advantage of Wikipedia's multilingualism as a resource for cross-language information retrieval and machine translation, and has proposed techniques for enriching its cross-language structure. The availability of documents in multiple languages also opens up new opportunities for querying structured Wikipedia content and, in particular, for enabling answers that straddle different languages. As a step towards supporting such queries, the authors propose a method for identifying mappings between attributes of infoboxes that come from pages in different languages. The approach finds mappings in a completely automated fashion. Because it does not require training data, it is scalable: not only can it be used to find mappings between many language pairs, but it is also effective for languages that are under-represented and lack sufficient training samples. Another important benefit of the approach is that it does not depend on syntactic similarity between attribute names, and thus it can be applied to language pairs that have distinct morphologies. The authors performed an extensive experimental evaluation using a corpus consisting of pages in Portuguese, Vietnamese, and English. The results show that the approach not only obtains high precision and recall but also outperforms state-of-the-art techniques. The authors also present a case study which demonstrates that the derived multilingual mappings lead to substantial improvements in answer quality and coverage for structured queries over Wikipedia content.
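
To illustrate what matching attributes without relying on the similarity of their names could look like, the following Python sketch compares attributes by the values they hold across pairs of infoboxes describing the same entity. It is a minimal illustration under stated assumptions, not the authors' algorithm: the function names, the `translate` dictionary (imagined here as derived from cross-language links), the Jaccard score and the `threshold` parameter are all hypothetical choices made for this example, and the sample data is invented.

```python
from collections import defaultdict


def value_overlap(values_a, values_b):
    """Jaccard overlap between two collections of attribute values."""
    a, b = set(values_a), set(values_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_attributes(infoboxes_src, infoboxes_tgt, translate, threshold=0.5):
    """Map each source attribute to the target attribute with the most similar values.

    Assumptions (hypothetical, for illustration only):
    - infoboxes_src[i] and infoboxes_tgt[i] describe the same entity.
    - translate maps source-language values to target-language values.
    Attribute names are never compared, only the values they hold.
    """
    src_vals = defaultdict(set)
    tgt_vals = defaultdict(set)
    for i, (src, tgt) in enumerate(zip(infoboxes_src, infoboxes_tgt)):
        # Tag each value with the entity index so that only values
        # belonging to the same entity can count as a match.
        for attr, val in src.items():
            src_vals[attr].add((i, translate.get(val, val)))
        for attr, val in tgt.items():
            tgt_vals[attr].add((i, val))

    mappings = {}
    for a, va in src_vals.items():
        best_attr, best_score = None, 0.0
        for b, vb in tgt_vals.items():
            score = value_overlap(va, vb)
            if score > best_score:
                best_attr, best_score = b, score
        if best_attr is not None and best_score >= threshold:
            mappings[a] = (best_attr, best_score)
    return mappings


# Invented sample data: two film infoboxes in Portuguese and English.
pt_infoboxes = [
    {"direção": "Bernardo Bertolucci", "país": "Itália"},
    {"direção": "Fernando Meirelles", "país": "Brasil"},
]
en_infoboxes = [
    {"directed by": "Bernardo Bertolucci", "country": "Italy"},
    {"directed by": "Fernando Meirelles", "country": "Brazil"},
]
translate = {"Itália": "Italy", "Brasil": "Brazil"}

mappings = match_attributes(pt_infoboxes, en_infoboxes, translate)
print(mappings)
```

In this toy example "direção" is paired with "directed by" and "país" with "country" even though the names share no surface similarity, which is the property the overview highlights. A realistic matcher would need to handle noisy, multi-valued and missing attribute values, which this sketch ignores.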

Embed

Wikipedia Quality

Nguyen, Thanh Hoang; Moreira, Viviane Pereira; Nguyen, Huong; Nguyen, Hoa; Freire, Juliana. (2011). "[[Multilingual Schema Matching for Wikipedia Infoboxes]]". VLDB Endowment. DOI: 10.14778/2078324.2078329.

English Wikipedia

{{cite journal |last1=Nguyen |first1=Thanh Hoang |last2=Moreira |first2=Viviane Pereira |last3=Nguyen |first3=Huong |last4=Nguyen |first4=Hoa |last5=Freire |first5=Juliana |title=Multilingual Schema Matching for Wikipedia Infoboxes |date=2011 |doi=10.14778/2078324.2078329 |url=https://wikipediaquality.com/wiki/Multilingual_Schema_Matching_for_Wikipedia_Infoboxes |journal=VLDB Endowment}}

HTML

Nguyen, Thanh Hoang; Moreira, Viviane Pereira; Nguyen, Huong; Nguyen, Hoa; Freire, Juliana. (2011). &quot;<a href="https://wikipediaquality.com/wiki/Multilingual_Schema_Matching_for_Wikipedia_Infoboxes">Multilingual Schema Matching for Wikipedia Infoboxes</a>&quot;. VLDB Endowment. DOI: 10.14778/2078324.2078329.