Mining and Ranking Biomedical Synonym Candidates from Wikipedia


Mining and Ranking Biomedical Synonym Candidates from Wikipedia is a scientific work related to Wikipedia quality, published in 2015 and written by Abhyuday N Jagannatha, Jinying Chen and Hong Yu.

Overview

Biomedical synonyms are important resources for natural language processing (NLP) in the biomedical domain. Existing synonym resources (e.g., the UMLS) are incomplete, and manual efforts to expand and enrich them are prohibitively expensive. The authors therefore develop and evaluate approaches for automated synonym extraction from Wikipedia. Using inter-wiki links, they extract candidate synonym pairs: the anchor text of a link in a Wikipedia page (e.g., “increased thirst”) and the title of the linked page (e.g., “polyuria”). They rank the synonym candidates with word embeddings and pseudo-relevance feedback (PRF). The results show that PRF-based reranking outperformed both the word-embedding-based approach and a strong baseline that uses inter-wiki link frequency. A hybrid method, Rank Score Combination, achieved the best results. The authors' analysis also suggests that medical synonyms mined from Wikipedia can increase the coverage of existing synonym resources such as the UMLS.
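
The candidate extraction step and the link-frequency baseline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that links appear as piped wikilinks of the form [[Page title|anchor text]], that lowercasing is an acceptable normalization, and that raw pair counts stand in for the link-frequency score. The function names and the sample pages are hypothetical and exist only for illustration.

```python
import re
from collections import Counter

# Assumption: candidate pairs come from piped wikilinks, [[Page title|anchor text]].
# An optional "#section" suffix on the target is ignored.
PIPED_LINK = re.compile(r"\[\[([^\[\]|#]+)(?:#[^\[\]|]*)?\|([^\[\]|]+)\]\]")


def extract_pairs(wikitext):
    """Yield (anchor_text, page_title) pairs found in one page of wikitext."""
    for match in PIPED_LINK.finditer(wikitext):
        title = match.group(1).strip().lower()
        anchor = match.group(2).strip().lower()
        # An anchor identical to the title carries no synonym information.
        if anchor and title and anchor != title:
            yield anchor, title


def rank_by_link_frequency(pages):
    """Group candidate anchors by page title, most frequently linked first."""
    counts = Counter()
    for text in pages:
        counts.update(extract_pairs(text))
    by_title = {}
    for (anchor, title), n in counts.items():
        by_title.setdefault(title, []).append((anchor, n))
    for candidates in by_title.values():
        # Sort by descending count, then alphabetically for stable output.
        candidates.sort(key=lambda item: (-item[1], item[0]))
    return by_title


# Hypothetical sample wikitext, made up for this example.
sample_pages = [
    "A [[Myocardial infarction|heart attack]] may follow.",
    "Risk of [[Myocardial infarction|heart attack]] and [[Myocardial infarction|MI]].",
    "See [[Hypertension|high blood pressure]].",
]
ranking = rank_by_link_frequency(sample_pages)
```

In this sketch, each page title maps to its candidate synonyms ordered by how often the anchor text was used to link to that title. The reranking methods named in the overview (word embeddings, PRF, Rank Score Combination) would operate on such candidate lists and are not reproduced here.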