Exploiting a Web-Based Encyclopedia as a Knowledge Base for the Extraction of Multilingual Terminology

Authors: Fatiha Sadat
Publication date: 2012
ISSN: 03029743
ISBN: 978-364233982-0
DOI: 10.1007/978-3-642-33983-7_9
Exploiting a Web-Based Encyclopedia as a Knowledge Base for the Extraction of Multilingual Terminology is a scientific work about Wikipedia quality, published in 2012 and written by Fatiha Sadat.

Overview

Multilingual linguistic resources are usually constructed from parallel corpora, but because such corpora are available only for selected text domains and language pairs, the potential of other resources is being explored as well. This article explores the idea of using multilingual web-based encyclopaedias such as Wikipedia as comparable corpora for bilingual terminology extraction. The author proposes an approach that extracts terms and their translations from different types of Wikipedia link information and data. The next step is to use linguistic-based information to re-rank and filter the extracted term candidates in the target language. Preliminary evaluations of the combined statistics-based and linguistic-based approaches were carried out on different language pairs involving Japanese, French and English. These evaluations showed a clear improvement and good quality in the extracted term candidates, which can be used to build or enrich multilingual anthologies and dictionaries, or to feed a cross-language information retrieval system with expansion terms related to the source query.
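
The two-stage idea described above (collect translation candidates from several kinds of Wikipedia links, then filter and re-rank them) can be illustrated with a minimal sketch. It is not the author's implementation: the toy data, the link-type names (`interlanguage`, `redirects`, `anchors`), the weights, and the `looks_like_term` filter are all assumptions made for illustration, since the overview does not specify them.

```python
from collections import Counter

# Hypothetical toy data: for each English title, the French strings proposed
# by different kinds of links. Link-type names are assumptions, not the paper's.
ARTICLES = {
    "information retrieval": {
        "interlanguage": ["recherche d'information"],
        "redirects": ["recherche d'information", "recherche documentaire"],
        "anchors": ["recherche d'information", "RI"],
    },
    "machine translation": {
        "interlanguage": ["traduction automatique"],
        "redirects": ["traduction automatique"],
        "anchors": ["traduction automatique", "traduction par ordinateur"],
    },
}

# Assumed weights: how much evidence each link type contributes.
LINK_WEIGHTS = {"interlanguage": 3.0, "redirects": 2.0, "anchors": 1.0}


def extract_candidates(source_term, articles=ARTICLES, weights=LINK_WEIGHTS):
    """Statistics-based step: sum the weights of the link types proposing each candidate."""
    scores = Counter()
    for link_type, targets in articles.get(source_term, {}).items():
        for target in targets:
            scores[target] += weights.get(link_type, 0.0)
    return scores


def looks_like_term(candidate):
    """Crude stand-in for a linguistic filter: drop very short strings and all-caps acronyms."""
    return len(candidate) > 2 and not candidate.isupper()


def rank_translations(source_term):
    """Filter the candidates, then sort by descending score (ties broken alphabetically)."""
    scores = extract_candidates(source_term)
    kept = [(term, score) for term, score in scores.items() if looks_like_term(term)]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    for term, score in rank_translations("information retrieval"):
        print(f"{score:4.1f}  {term}")
```

In a real setting the toy dictionary would be replaced by link data taken from a Wikipedia dump or API, and the filter by part-of-speech patterns or a similar linguistic check for the target language; both substitutions are outside what the overview states.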

Embed

Wikipedia Quality

Sadat, Fatiha. (2012). "[[Exploiting a Web-Based Encyclopedia as a Knowledge Base for the Extraction of Multilingual Terminology]]". Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Volume 7614 LNAI, 2012, pp. 88-96. ISBN: 978-364233982-0. ISSN: 03029743. DOI: 10.1007/978-3-642-33983-7_9.

English Wikipedia

{{cite journal |last1=Sadat |first1=Fatiha |title=Exploiting a Web-Based Encyclopedia as a Knowledge Base for the Extraction of Multilingual Terminology |date=2012 |isbn=978-364233982-0 |issn=03029743 |doi=10.1007/978-3-642-33983-7_9 |url=https://wikipediaquality.com/wiki/Exploiting_a_Web-Based_Encyclopedia_as_a_Knowledge_Base_for_the_Extraction_of_Multilingual_Terminology |journal=Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Volume 7614 LNAI, 2012, pp. 88-96}}

HTML

Sadat, Fatiha. (2012). &quot;<a href="https://wikipediaquality.com/wiki/Exploiting_a_Web-Based_Encyclopedia_as_a_Knowledge_Base_for_the_Extraction_of_Multilingual_Terminology">Exploiting a Web-Based Encyclopedia as a Knowledge Base for the Extraction of Multilingual Terminology</a>&quot;. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Volume 7614 LNAI, 2012, pp. 88-96. ISBN: 978-364233982-0. ISSN: 03029743. DOI: 10.1007/978-3-642-33983-7_9.