Difference between revisions of "Multilingual Schema Matching for Wikipedia Infoboxes"

Revision as of 09:57, 7 July 2020

Multilingual Schema Matching for Wikipedia Infoboxes
Authors	Thanh Hoang Nguyen Viviane Pereira Moreira Huong Nguyen Hoa Nguyen Juliana Freire
Publication date	2011
DOI	10.14778/2078324.2078329
Links	Original Preprint

Multilingual Schema Matching for Wikipedia Infoboxes is a scientific work related to Wikipedia quality, published in 2011 and written by Thanh Hoang Nguyen, Viviane Pereira Moreira, Huong Nguyen, Hoa Nguyen and Juliana Freire.

Overview

Recent research has taken advantage of Wikipedia's multilingualism as a resource for cross-language information retrieval and machine translation, and has proposed techniques for enriching its cross-language structure. The availability of documents in multiple languages also opens up new opportunities for querying structured Wikipedia content and, in particular, for enabling answers that straddle different languages. As a step towards supporting such queries, the authors propose a method for identifying mappings between attributes of infoboxes that come from pages in different languages. The approach finds mappings in a completely automated fashion. Because it does not require training data, it is scalable: not only can it be used to find mappings between many language pairs, but it is also effective for languages that are under-represented and lack sufficient training samples. Another important benefit of the approach is that it does not depend on syntactic similarity between attribute names, and thus it can be applied to language pairs that have distinct morphologies. The authors performed an extensive experimental evaluation using a corpus consisting of pages in Portuguese, Vietnamese, and English. The results show that the approach not only obtains high precision and recall but also outperforms state-of-the-art techniques. The authors also present a case study demonstrating that the multilingual mappings they derive lead to substantial improvements in answer quality and coverage for structured queries over Wikipedia content.
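To make concrete what it can mean to match attributes without comparing their names, here is a minimal sketch. It is an illustration only, not the authors' method: the overview does not say which signals the approach uses, so this sketch assumes, purely for demonstration, that attributes are compared by the overlap of their values across infobox pairs describing the same entity. The function names, the threshold, and the toy data are all hypothetical.

```python
from collections import defaultdict


def value_overlap(values_a, values_b):
    """Jaccard overlap between two collections of attribute values."""
    a, b = set(values_a), set(values_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def match_attributes(paired_infoboxes, threshold=0.5):
    """Map each attribute of language 1 to its best-overlapping attribute of language 2.

    paired_infoboxes: list of (infobox_1, infobox_2) dicts, where both dicts
    describe the same entity in two languages (assumed to be paired already).
    Attribute names are never compared; only their values are.
    """
    values_1 = defaultdict(set)
    values_2 = defaultdict(set)
    for idx, (box_1, box_2) in enumerate(paired_infoboxes):
        # Tag each value with the entity index so that equal values on
        # different entities do not count as evidence for a match.
        for attr, value in box_1.items():
            values_1[attr].add((idx, value))
        for attr, value in box_2.items():
            values_2[attr].add((idx, value))

    mappings = {}
    for attr_1, vals_1 in values_1.items():
        best_attr, best_score = None, 0.0
        for attr_2, vals_2 in values_2.items():
            score = value_overlap(vals_1, vals_2)
            if score > best_score:
                best_attr, best_score = attr_2, score
        if best_attr is not None and best_score >= threshold:
            mappings[attr_1] = (best_attr, best_score)
    return mappings


# Toy English/Portuguese infobox pairs (made-up values, for illustration only).
pairs = [
    ({"directed by": "Director A", "release year": "2002", "running time": "130"},
     {"direção": "Director A", "ano": "2002", "duração": "130"}),
    ({"directed by": "Director B", "release year": "1998", "running time": "113"},
     {"direção": "Director B", "ano": "1998", "duração": "113"}),
]
mappings = match_attributes(pairs)
for attr, (target, score) in mappings.items():
    print(f"{attr} -> {target} (overlap {score:.2f})")
```

In this toy setting the attribute names share no characters that would help a string-similarity matcher, yet the values line up. Real infobox values are rarely identical across languages (dates, units, and translated names differ), so a practical system would need normalization and additional evidence beyond this sketch.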

@@ Line 1: / Line 1: @@
+{{Infobox work
+| title = Multilingual Schema Matching for Wikipedia Infoboxes
+| date = 2011
+| authors = [[Thanh Hoang Nguyen]]<br />[[Viviane Pereira Moreira]]<br />[[Huong Nguyen]]<br />[[Hoa Nguyen]]<br />[[Juliana Freire]]
+| doi = 10.14778/2078324.2078329
+| link = http://dl.acm.org/citation.cfm?doid=2078324.2078329
+| plink = https://arxiv.org/pdf/1110.6651
+}}
 '''Multilingual Schema Matching for Wikipedia Infoboxes''' is a scientific work related to [[Wikipedia quality]], published in 2011 and written by [[Thanh Hoang Nguyen]], [[Viviane Pereira Moreira]], [[Huong Nguyen]], [[Hoa Nguyen]] and [[Juliana Freire]].
 == Overview ==
 Recent research has taken advantage of [[Wikipedia]]'s multilingualism as a resource for cross-language [[information retrieval]] and [[machine translation]], and has proposed techniques for enriching its cross-language structure. The availability of documents in [[multiple languages]] also opens up new opportunities for querying structured Wikipedia content and, in particular, for enabling answers that straddle [[different language]]s. As a step towards supporting such queries, the authors propose a method for identifying mappings between attributes of [[infoboxes]] that come from pages in different languages. The approach finds mappings in a completely automated fashion. Because it does not require training data, it is scalable: not only can it be used to find mappings between many language pairs, but it is also effective for languages that are under-represented and lack sufficient training samples. Another important benefit of the approach is that it does not depend on syntactic similarity between attribute names, and thus it can be applied to language pairs that have distinct morphologies. The authors performed an extensive experimental evaluation using a corpus consisting of pages in Portuguese, Vietnamese, and English. The results show that the approach not only obtains high precision and recall but also outperforms state-of-the-art techniques. The authors also present a case study demonstrating that the [[multilingual]] mappings they derive lead to substantial improvements in answer quality and coverage for structured queries over Wikipedia content.
