A Textual Approach based on Passages Using IR-n in WikipediaMM Task 2008


A Textual Approach based on Passages Using IR-n in WikipediaMM Task 2008 is a scientific work related to Wikipedia quality, published in 2008 and written by Sergio Navarro, Rafael Muñoz and Fernando Llopis.

Overview

In this paper, the authors focus on comparing the behaviour of two relevance feedback methods in this task, LCA and PRF, and on checking whether a passage-based information retrieval (IR) system is useful in a competition with small documents. They also add a domain-specific adaptation that decompounds file names written in Camel Case notation into single terms. This decision rests on the belief that the most meaningful information a human assigns to an image file is in the file name itself, so it is important to make these terms visible when they are hidden in a compound file name. Finally, the authors add a geographical query expansion and a visual concept expansion. The baseline run, which used only the passage IR system, obtained 29th place out of 77 runs, and the best run, which used the passage IR system with Camel Case decompounding, obtained 3rd place. On the one hand, this shows the usefulness of a passage-based IR system in this domain; on the other hand, it confirms the belief that file names contain especially meaningful information. With respect to relevance feedback, the results on the suitability of LCA or PRF for the task are contradictory, but the authors found that LCA behaves more robustly than PRF.
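
The Camel Case decompounding step can be illustrated with a minimal Python sketch. The overview does not give the authors' exact splitting rule, so the function name `decompound_file_name` and the regular expression below are assumptions made for illustration only. They show one plausible way to expose the single terms hidden in a compound image file name.

```python
import os
import re
from typing import List

# Assumed splitting rule (not taken from the paper):
#   1. a run of capitals not followed by a lowercase letter (acronym, e.g. "XML")
#   2. an optional capital followed by lowercase letters (e.g. "Golden", "old")
#   3. a run of digits (e.g. "2008")
_TERM = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def decompound_file_name(file_name: str) -> List[str]:
    """Split a file name into single terms (hypothetical helper)."""
    # Drop the extension so "jpg" or "png" is not treated as a term.
    stem, _extension = os.path.splitext(file_name)
    # Separators such as "_" or "-" match none of the alternatives
    # and are therefore skipped.
    return _TERM.findall(stem)


if __name__ == "__main__":
    print(decompound_file_name("GoldenGateBridge.jpg"))
    print(decompound_file_name("XMLFile_old.png"))
```

In a setup like the one described, the resulting terms would presumably be indexed alongside the rest of the textual annotation, so that a query for "bridge" can match a file named `GoldenGateBridge.jpg`.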