Difference between revisions of "Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia"

From Wikipedia Quality
(+ wikilinks)
(Infobox)
Line 1: Line 1:
+ {{Infobox work
+ | title = Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia
+ | date = 2008
+ | authors = [[Jong-Hoon Oh]]<br />[[Daisuke Kawahara]]<br />[[Kiyotaka Uchimoto]]<br />[[Jun’ichi Kazama]]<br />[[Kentaro Torisawa]]
+ | doi = 10.1109/WIIAT.2008.317
+ | link = http://dl.acm.org/citation.cfm?id=1486927.1487058
+ }}
 
'''Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia''' is a scientific work related to [[Wikipedia quality]], published in 2008 and written by [[Jong-Hoon Oh]], [[Daisuke Kawahara]], [[Kiyotaka Uchimoto]], [[Jun’ichi Kazama]] and [[Kentaro Torisawa]].
 
  
 
== Overview ==
 
 
The authors present a novel method for discovering missing cross-language links between English and Japanese [[Wikipedia]] articles. They first collect candidate missing cross-language links, that is, pairs of English and Japanese Wikipedia articles that could be connected by a cross-language link. They then select the correct links among the candidates using a classifier trained with various types of [[features]]. The authors' method has three desirable characteristics for discovering missing links. First, the method discovers cross-language links with high accuracy (92% precision at 78% recall). Second, the features used in the classifier are language-independent. Third, the features are generated from resources obtained automatically from Wikipedia, without relying on any external knowledge. In this work, the authors discover approximately $10^5$ missing cross-language links in Wikipedia, almost two-thirds as many as the existing cross-language links in Wikipedia.
 

Revision as of 08:32, 13 February 2021


Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia
Authors
Jong-Hoon Oh
Daisuke Kawahara
Kiyotaka Uchimoto
Jun’ichi Kazama
Kentaro Torisawa
Publication date
2008
DOI
10.1109/WIIAT.2008.317
Links
Original

Enriching Multilingual Language Resources by Discovering Missing Cross-Language Links in Wikipedia is a scientific work related to Wikipedia quality, published in 2008 and written by Jong-Hoon Oh, Daisuke Kawahara, Kiyotaka Uchimoto, Jun’ichi Kazama and Kentaro Torisawa.

Overview

The authors present a novel method for discovering missing cross-language links between English and Japanese Wikipedia articles. They first collect candidate missing cross-language links, that is, pairs of English and Japanese Wikipedia articles that could be connected by a cross-language link. They then select the correct links among the candidates using a classifier trained with various types of features. The authors' method has three desirable characteristics for discovering missing links. First, the method discovers cross-language links with high accuracy (92% precision at 78% recall). Second, the features used in the classifier are language-independent. Third, the features are generated from resources obtained automatically from Wikipedia, without relying on any external knowledge. In this work, the authors discover approximately $10^5$ missing cross-language links in Wikipedia, almost two-thirds as many as the existing cross-language links in Wikipedia.
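
To make the reported accuracy figures concrete, the following minimal sketch shows how precision and recall could be computed for a set of predicted cross-language links against a gold-standard set. It is an illustration only, not the authors' evaluation code: the names Link, precision_recall and f1, and the toy title pairs, are assumptions introduced here. The F1 value derived from the reported 92% precision and 78% recall is likewise a calculation added for illustration and is not a figure stated in the overview.

```python
from typing import Set, Tuple

# A cross-language link is modelled as an (English title, Japanese title) pair.
# This representation is an assumption made for the example.
Link = Tuple[str, str]


def precision_recall(predicted: Set[Link], gold: Set[Link]) -> Tuple[float, float]:
    """Return (precision, recall) of predicted links against gold links."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data, not taken from the paper.
gold = {("Tokyo", "東京"), ("Kyoto", "京都"), ("Osaka", "大阪"), ("Nara", "奈良")}
predicted = {("Tokyo", "東京"), ("Kyoto", "京都"), ("Osaka", "大阪"), ("Kobe", "京都")}

p, r = precision_recall(predicted, gold)

# F1 implied by the precision and recall reported in the overview.
reported_f1 = f1(0.92, 0.78)
print(f"toy precision={p:.2f}, toy recall={r:.2f}, implied F1={reported_f1:.3f}")
```

In the toy data, three of the four predicted pairs are in the gold set, and three of the four gold pairs are recovered, so both measures come out the same. In a real setting the two usually differ, as in the reported 92% and 78%.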