Difference between revisions of "Cross-Lingual Infobox Alignment in Wikipedia Using Entity-Attribute Factor Graph"

From Wikipedia Quality
(Cross-Lingual Infobox Alignment in Wikipedia Using Entity-Attribute Factor Graph - new page)
 
(+ links)

Revision as of 23:44, 7 August 2019

Cross-Lingual Infobox Alignment in Wikipedia Using Entity-Attribute Factor Graph is a scientific work related to Wikipedia quality, published in 2017 and written by Yan Zhang, Thomas Paradis, Lei Hou, Juanzi Li, Jing Zhang and Hai-Tao Zheng.

Overview

Wikipedia infoboxes contain information about article entities in the form of attribute-value pairs, and are thus a very rich source of structured knowledge. However, because the different language versions of Wikipedia evolve independently, finding correspondences between infobox attributes in different language editions is a promising but challenging problem. In this paper, the authors propose 8 effective features for cross-lingual infobox attribute matching, drawing on categories, templates, attribute labels and values. They propose an entity-attribute factor graph that considers not only individual features but also the correlations among attribute pairs. Experiments on two Wikipedia data sets, English-Chinese and English-French, show that the proposed approach achieves a high F1-measure: 85.5% and 85.4% respectively. The proposed approach finds 23,923 new infobox attribute mappings between English and Chinese Wikipedia, and 31,576 between English and French, based on no more than six thousand existing matched infobox attributes. The authors also conduct an infobox completion experiment on English-Chinese Wikipedia and complement 76,498 pairs of corresponding articles (more than 30% of the existing EN-ZH Wikipedia cross-lingual links) with more than one attribute-value pair.
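To make the idea of a value-based matching feature concrete, the following is a minimal sketch in Python. It is not the authors' implementation, and the overview does not spell out how the 8 features are computed. The function names `normalize` and `value_match_scores`, the normalization rule, and the toy infoboxes are all assumptions made for illustration. The sketch scores an attribute pair by how often the two attributes carry the same normalized value in articles that are already connected by a cross-lingual link.

```python
from collections import Counter
from itertools import product


def normalize(value: str) -> str:
    """Lowercase and keep only alphanumeric characters (hypothetical rule)."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def value_match_scores(entity_pairs):
    """Score each (source_attr, target_attr) pair by value agreement.

    entity_pairs: list of (source_infobox, target_infobox) dicts for
    articles already connected by a cross-lingual link.
    Returns {(source_attr, target_attr): matches / co-occurrences}.
    """
    cooccur = Counter()  # how often both attributes appear on a linked pair
    match = Counter()    # how often their normalized values are identical
    for src, tgt in entity_pairs:
        for (a, va), (b, vb) in product(src.items(), tgt.items()):
            cooccur[(a, b)] += 1
            if normalize(va) == normalize(vb):
                match[(a, b)] += 1
    return {pair: match[pair] / n for pair, n in cooccur.items()}


# Toy data, invented for illustration only.
pairs = [
    ({"population": "1,200", "area": "35 km2"}, {"人口": "1200", "面积": "35km2"}),
    ({"population": "500"}, {"人口": "500", "面积": "12km2"}),
]
scores = value_match_scores(pairs)
```

In this toy run, `population` and `人口` agree on both linked article pairs, while `population` and `面积` never agree. A score like this treats each attribute pair in isolation; the factor graph described in the overview additionally models correlations among attribute pairs, which this sketch does not attempt.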