Difference between revisions of "Performing Cross-Language Retrieval with Wikipedia"

From Wikipedia Quality
(+ Infobox work)
(Embed for English Wikipedia, HTML)

Revision as of 07:29, 13 June 2020


Performing Cross-Language Retrieval with Wikipedia
Authors: Péter Schönhofen, András A. Benczúr, István Bíró, Károly Csalogány
Publication date: 2007
Links: Original

Performing Cross-Language Retrieval with Wikipedia is a scientific work related to Wikipedia quality, published in 2007 and written by Péter Schönhofen, András A. Benczúr, István Bíró and Károly Csalogány.

Overview

The authors describe a method that translates queries extended by narrative information from one language to another, with the help of an appropriate machine-readable dictionary and the Wikipedia online encyclopedia. Processing occurs in three steps: first, the authors look up possible translations phrase by phrase using both the dictionary and the cross-lingual links provided by Wikipedia; second, improbable translations, detected by a simple language model computed over a large corpus of documents written in the target language, are eliminated; and finally, further filtering is applied by matching Wikipedia concepts against the query narrative and removing translations not related to the overall query topic. Experiments on the Los Angeles Times 2002 corpus, translating from Hungarian to English, showed that queries generated at the end of the second step were only roughly half as effective as the original queries, primarily due to the limitations of the tools; after the third step, precision improved significantly, reaching 60% of the native English level.
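
The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary entries, cross-language links, target corpus, and concept terms below are toy data invented for illustration, and the function names are hypothetical. The sketch also assumes a unigram frequency model for the second step and assumes the narrative is already available as a set of target-language terms; the overview does not specify either detail.

```python
from collections import Counter

# Toy stand-ins for the real resources; every entry is an illustrative assumption.
DICTIONARY = {"levél": ["leaf", "letter", "folium"], "toll": ["pen", "feather"]}
WIKI_LINKS = {"levél": ["letter"], "toll": ["pen"]}  # hypothetical cross-language links
TARGET_CORPUS = (
    "the letter was written with a pen the leaf fell a feather and a pen"
).split()
# Hypothetical concept -> related terms, standing in for Wikipedia article content.
CONCEPT_TERMS = {
    "leaf": {"plant", "tree", "green"},
    "letter": {"mail", "post", "writing"},
    "pen": {"ink", "writing"},
    "feather": {"bird", "wing"},
}


def candidates(phrase):
    """Step 1: merge dictionary and Wikipedia-link translations, keeping order."""
    merged = []
    for translation in DICTIONARY.get(phrase, []) + WIKI_LINKS.get(phrase, []):
        if translation not in merged:
            merged.append(translation)
    return merged


def lm_filter(translations, corpus_counts, total, min_prob):
    """Step 2: drop translations whose unigram probability is below min_prob."""
    return [t for t in translations if corpus_counts[t] / total >= min_prob]


def topic_filter(translations, narrative_terms):
    """Step 3: keep translations whose concept terms overlap the narrative.

    If no translation overlaps, all are kept so the phrase is not lost.
    """
    related = [t for t in translations if CONCEPT_TERMS.get(t, set()) & narrative_terms]
    return related or translations


def translate_query(phrases, narrative_terms, min_prob=0.01):
    """Run the three filtering steps for each query phrase."""
    counts = Counter(TARGET_CORPUS)
    total = len(TARGET_CORPUS)
    result = {}
    for phrase in phrases:
        step1 = candidates(phrase)
        step2 = lm_filter(step1, counts, total, min_prob)
        result[phrase] = topic_filter(step2, narrative_terms)
    return result


translated = translate_query(["levél", "toll"], {"writing", "post"})
print(translated)
```

With this toy data, the second step removes "folium" because it never occurs in the target corpus, and the third step removes "leaf" and "feather" because their concept terms share nothing with the narrative. The fallback in `topic_filter` is a design choice of this sketch, not something stated in the overview.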

Embed

Wikipedia Quality

Schönhofen, Péter; Benczúr, András A.; Bíró, István; Csalogány, Károly. (2007). "[[Performing Cross-Language Retrieval with Wikipedia]]".

English Wikipedia

{{cite journal |last1=Schönhofen |first1=Péter |last2=Benczúr |first2=András A. |last3=Bíró |first3=István |last4=Csalogány |first4=Károly |title=Performing Cross-Language Retrieval with Wikipedia |date=2007 |url=https://wikipediaquality.com/wiki/Performing_Cross-Language_Retrieval_with_Wikipedia}}

HTML

Schönhofen, Péter; Benczúr, András A.; Bíró, István; Csalogány, Károly. (2007). &quot;<a href="https://wikipediaquality.com/wiki/Performing_Cross-Language_Retrieval_with_Wikipedia">Performing Cross-Language Retrieval with Wikipedia</a>&quot;.