Linking, Searching, and Visualizing Entities for the Swedish Wikipedia

From Wikipedia Quality
Revision as of 07:48, 16 June 2019 by Sarah (talk | contribs) (Basic information on Linking, Searching, and Visualizing Entities for the Swedish Wikipedia)

Linking, Searching, and Visualizing Entities for the Swedish Wikipedia is a scientific work related to Wikipedia quality, published in 2016 and written by Anton Södergren, Marcus Klang, and Pierre Nugues.


In this paper, the authors describe a new system to extract, index, search, and visualize entities in Wikipedia. To carry out the extraction, they designed a high-performance entity linker and used a document model to store the resulting linguistic annotations. The entity linker, HERD, extracts the mentions from text using a string-matching engine and links them to entities with a combination of rules, PageRank, and feature vectors based on the Wikipedia categories. The document model, Docforia, consists of layers, where each layer is a sequence of ranges describing a specific annotation, here the entities. The authors evaluated HERD with the ERD’14 protocol (Carmel et al., 2014) and reached a competitive F1 score of 0.746 on the English development set. They applied HERD to the whole collection of Swedish Wikipedia articles, used Lucene to index the layers, and built a search module to interactively retrieve articles and metadata given a title, a phrase, or a property. The user can then select an entity and visualize its concordance in articles or paragraphs. A demonstration of the entity search and visualization is available online for Swedish.
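
The idea of a layered document model, where each layer is a sequence of ranges over the text, can be illustrated with a minimal Python sketch. This is not Docforia's API and not HERD's linking algorithm: the names `Range`, `Document`, `link_mentions`, and the `"entity"` layer are hypothetical, the entity labels are made up for illustration, and the linker is reduced to a naive dictionary lookup without rules, PageRank, or category features.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Range:
    """One annotation: a character span plus a label (hypothetical structure)."""
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # e.g. an entity identifier


@dataclass
class Document:
    """Text plus named layers; each layer is a sorted list of ranges."""
    text: str
    layers: Dict[str, List[Range]] = field(default_factory=dict)

    def add(self, layer: str, start: int, end: int, label: str) -> None:
        if not 0 <= start < end <= len(self.text):
            raise ValueError("range outside document text")
        ranges = self.layers.setdefault(layer, [])
        ranges.append(Range(start, end, label))
        ranges.sort(key=lambda r: (r.start, r.end))

    def concordance(self, layer: str, label: str,
                    width: int = 20) -> List[Tuple[str, str, str]]:
        """Return (left context, mention, right context) for each matching range."""
        rows = []
        for r in self.layers.get(layer, []):
            if r.label == label:
                left = self.text[max(0, r.start - width):r.start]
                right = self.text[r.end:r.end + width]
                rows.append((left, self.text[r.start:r.end], right))
        return rows


def link_mentions(doc: Document, dictionary: Dict[str, str]) -> None:
    """Naive stand-in for an entity linker: exact string matching only."""
    for surface, entity in dictionary.items():
        pos = doc.text.find(surface)
        while pos != -1:
            doc.add("entity", pos, pos + len(surface), entity)
            pos = doc.text.find(surface, pos + 1)


if __name__ == "__main__":
    doc = Document("Lund University is in Lund, Sweden.")
    # Hypothetical surface-form dictionary; labels are illustrative only.
    link_mentions(doc, {"Lund University": "Lund_University", "Sweden": "Sweden"})
    for row in doc.concordance("entity", "Sweden", width=6):
        print(row)
```

Keeping annotations as ranges in separate layers leaves the original text untouched, so further layers (tokens, sentences, and so on) could be added or indexed independently of the entity layer.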