Difference between revisions of "Extracting Named Entities and Synonyms from Wikipedia for Use in News Search"

From Wikipedia Quality
(wikilinks)
(+ infobox)
{{Infobox work
| title = Extracting Named Entities and Synonyms from Wikipedia for Use in News Search
| date = 2008
| authors = [[Christian Bøhn]]
| link = https://brage.bibsys.no/xmlui/handle/11250/250696
}}
 
'''Extracting Named Entities and Synonyms from Wikipedia for Use in News Search''' is a scientific work related to [[Wikipedia quality]], published in 2008 and written by [[Christian Bøhn]].
 
  
 
== Overview ==
 
News articles commonly focus on [[named entities]]: a news story is usually built around a person, a company, or a similar entity. One challenge from an [[information retrieval]] point of view is that a single entity often has more than one name. As a result, users of news search engines must use exactly the same name for an entity as the articles they are interested in. The use of synonyms to refer to the same entity therefore forms the basis of this thesis. The author explores the idea of using [[Wikipedia]] as a data source for building a large dictionary of named entities and their synonyms. Such a dictionary would be of great interest because it makes it possible to link synonyms to the same entity. The evaluation shows that Wikipedia is well suited as a source of named entities and synonyms, as its semi-structured format aids in recognizing entities and their related synonyms. Using the dictionary in a modified search solution, on the other hand, shows mixed results. One problem with evaluating such a solution is that the precision of the individual synonyms is usually very high for popular entities, and combining different synonyms in the same query ends up giving more weight to results that use multiple synonyms.
 

Revision as of 06:26, 12 April 2021


Extracting Named Entities and Synonyms from Wikipedia for Use in News Search
Authors: Christian Bøhn
Publication date: 2008
Links: Original (https://brage.bibsys.no/xmlui/handle/11250/250696)

Extracting Named Entities and Synonyms from Wikipedia for Use in News Search is a scientific work related to Wikipedia quality, published in 2008 and written by Christian Bøhn.

Overview

News articles commonly focus on named entities: a news story is usually built around a person, a company, or a similar entity. One challenge from an information retrieval point of view is that a single entity often has more than one name. As a result, users of news search engines must use exactly the same name for an entity as the articles they are interested in. The use of synonyms to refer to the same entity therefore forms the basis of this thesis. The author explores the idea of using Wikipedia as a data source for building a large dictionary of named entities and their synonyms. Such a dictionary would be of great interest because it makes it possible to link synonyms to the same entity. The evaluation shows that Wikipedia is well suited as a source of named entities and synonyms, as its semi-structured format aids in recognizing entities and their related synonyms. Using the dictionary in a modified search solution, on the other hand, shows mixed results. One problem with evaluating such a solution is that the precision of the individual synonyms is usually very high for popular entities, and combining different synonyms in the same query ends up giving more weight to results that use multiple synonyms.
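The idea of an entity dictionary and its effect on ranking can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the (canonical name, alias) input format, and the sample entity are all assumptions made for illustration, and the scoring function is a deliberately naive stand-in for a real search engine's ranking.

```python
from collections import defaultdict


def build_synonym_dictionary(pairs):
    """Build an entity dictionary from (canonical_name, alias) pairs.

    The pair format is an assumption for this sketch; it stands in for
    whatever signals are actually extracted from Wikipedia.
    """
    entity_to_names = defaultdict(set)
    for canonical, alias in pairs:
        entity_to_names[canonical].add(canonical)
        entity_to_names[canonical].add(alias)

    # Reverse index: any known name (lowercased) -> canonical entity.
    name_to_entity = {}
    for canonical, names in entity_to_names.items():
        for name in names:
            name_to_entity[name.lower()] = canonical
    return dict(entity_to_names), name_to_entity


def expand_query(query, entity_to_names, name_to_entity):
    """Return every known name for the queried entity, or the query itself."""
    canonical = name_to_entity.get(query.strip().lower())
    if canonical is None:
        return [query]
    return sorted(entity_to_names[canonical])


def naive_score(document, names):
    """Count how many distinct names occur in the document (substring match)."""
    text = document.lower()
    return sum(1 for name in names if name.lower() in text)


# Hypothetical sample data, chosen only for illustration.
pairs = [
    ("IBM", "International Business Machines"),
    ("IBM", "Big Blue"),
]
entity_to_names, name_to_entity = build_synonym_dictionary(pairs)
names = expand_query("big blue", entity_to_names, name_to_entity)

doc_one_name = "IBM reported quarterly earnings."
doc_two_names = "IBM, also known as Big Blue, reported quarterly earnings."
print(names)
print(naive_score(doc_one_name, names), naive_score(doc_two_names, names))
```

Under this naive scoring, the second document ranks higher only because it mentions two names for the same entity, which is the weighting effect described above for queries that combine several synonyms.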