Difference between revisions of "Leveraging Wikipedia Characteristics for Search and Candidate Generation in Question Answering"

From Wikipedia Quality
Revision as of 23:30, 6 July 2019

Leveraging Wikipedia Characteristics for Search and Candidate Generation in Question Answering is a scientific work related to Wikipedia quality, published in 2011 and written by Jennifer Chu-Carroll and James Fan.

Overview

Most existing Question Answering (QA) systems adopt a type-and-generate approach to candidate generation that relies on a pre-defined domain ontology. This paper describes a type-independent search and candidate generation paradigm for QA that leverages Wikipedia characteristics. The approach is particularly useful for adapting QA systems to domains where reliable answer type identification and type-based answer extraction are not available. The authors present a three-pronged search approach motivated by the relations that an answer-justifying, title-oriented document may have with the question/answer pair. They further show how Wikipedia metadata such as anchor texts and redirects can be used to extract candidate answers from search results without a type ontology. The authors' experimental results show that these strategies obtained high binary recall in both search and candidate generation on TREC questions, a domain that has mature answer type extraction technology, as well as on Jeopardy! questions, a domain without such technology. This high-recall search and candidate generation approach has also led to high overall QA performance in Watson, the authors' end-to-end system.
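
The idea of extracting candidates from Wikipedia metadata, and of scoring the result with binary recall, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Document` structure, the `generate_candidates` and `binary_recall` functions, the substring filter that drops candidates already present in the question, and the sample data are all assumptions made for illustration. Binary recall is taken here to mean the fraction of questions for which at least one correct answer appears among the candidates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Toy title-oriented document (hypothetical structure, not from the paper)."""
    title: str
    anchor_texts: List[str] = field(default_factory=list)  # link texts found in the document
    redirects: List[str] = field(default_factory=list)     # alternate titles redirecting here


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so that string comparison is lenient."""
    return " ".join(text.lower().split())


def generate_candidates(question: str, hits: List[Document]) -> List[str]:
    """Collect candidates from titles, redirects, and anchor texts of retrieved
    documents. No answer type or ontology is consulted at any point."""
    question_norm = normalize(question)
    seen = set()
    candidates = []
    for doc in hits:
        for text in [doc.title, *doc.redirects, *doc.anchor_texts]:
            key = normalize(text)
            # Assumption: a string that already occurs in the question is
            # unlikely to be its answer, so it is skipped. Duplicates are skipped too.
            if not key or key in seen or key in question_norm:
                continue
            seen.add(key)
            candidates.append(text)
    return candidates


def binary_recall(candidate_lists: List[List[str]], gold_answers: List[str]) -> float:
    """Fraction of questions whose candidate list contains the gold answer."""
    if not gold_answers:
        return 0.0
    answered = sum(
        1
        for candidates, gold in zip(candidate_lists, gold_answers)
        if normalize(gold) in {normalize(c) for c in candidates}
    )
    return answered / len(gold_answers)


# Made-up example data, for illustration only.
question = "This author wrote Moby-Dick in 1851"
hits = [
    Document(
        title="Moby-Dick",
        anchor_texts=["Herman Melville", "Nantucket"],
        redirects=["The Whale"],
    )
]
candidates = generate_candidates(question, hits)
print(candidates)
print(binary_recall([candidates], ["Herman Melville"]))
```

In this sketch the document title is dropped because it already appears in the question, while the redirect and the anchor texts survive as candidates regardless of their semantic type. A real system would still need to rank these candidates, which is outside the scope of this example.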