Difference between revisions of "Wikipedia Ad Hoc Passage Retrieval and Wikipedia Document Linking"


Revision as of 07:39, 23 October 2020

Wikipedia Ad Hoc Passage Retrieval and Wikipedia Document Linking is a scientific work related to Wikipedia quality, published in 2008 and written by Dylan Jenkinson and Andrew Trotman.

Overview

Ad hoc passage retrieval within Wikipedia is examined in the context of INEX 2007. An analysis of the INEX 2006 assessments suggests that a fixed-size window of about 300 terms is consistently seen, and that returning such a window might be a good retrieval strategy. In runs submitted to INEX, potentially relevant documents were identified using BM25 (trained on INEX 2006 data). For each potentially relevant document, the location of every search term was identified and the center (mean) of those locations computed. A fixed-size window was then centered on this location. A method of removing outliers was also examined, in which all terms occurring more than one standard deviation from the center were considered outliers and the center was recomputed without them. Both techniques were examined with and without stemming.
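
The window-placement step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the pre-tokenized input, the case-insensitive exact matching, the use of the population standard deviation, and the clamping of the window to the document boundaries are all assumptions made for illustration. BM25 ranking and stemming are left out.

```python
from statistics import mean, pstdev


def term_positions(tokens, query_terms):
    """Return the index of every token that matches a query term (assumed: case-insensitive exact match)."""
    wanted = {t.lower() for t in query_terms}
    return [i for i, tok in enumerate(tokens) if tok.lower() in wanted]


def window_center(positions, remove_outliers=False):
    """Mean position of the query terms, optionally recomputed without outliers."""
    if not positions:
        raise ValueError("no query term occurs in the document")
    center = mean(positions)
    if remove_outliers:
        # Assumption: population standard deviation; the source does not say which variant was used.
        spread = pstdev(positions)
        kept = [p for p in positions if abs(p - center) <= spread]
        if kept:
            center = mean(kept)
    return center


def fixed_window(tokens, query_terms, size=300, remove_outliers=False):
    """Return (start, end) token offsets of a window of `size` terms centered on the query terms."""
    positions = term_positions(tokens, query_terms)
    center = window_center(positions, remove_outliers)
    start = int(center) - size // 2
    # Assumption: clamp the window so it stays inside the document.
    start = max(0, min(start, max(0, len(tokens) - size)))
    end = min(len(tokens), start + size)
    return start, end
```

For a document whose query terms cluster near the start with a single stray occurrence near the end, calling `fixed_window(tokens, ["wiki"], remove_outliers=True)` pulls the window back toward the cluster, while the plain mean is dragged toward the stray occurrence.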