Detecting Domain-Specific Ambiguities: an Nlp Approach based on Wikipedia Crawling and Word Embeddings

From Wikipedia Quality
{{Infobox work
| title = Detecting Domain-Specific Ambiguities: an Nlp Approach based on Wikipedia Crawling and Word Embeddings
| date = 2017
| authors = [[Alessio Ferrari]]<br />[[Beatrice Donati]]<br />[[Stefania Gnesi]]
| doi = 10.1109/REW.2017.20
| link =
}}
 
'''Detecting Domain-Specific Ambiguities: an Nlp Approach based on Wikipedia Crawling and Word Embeddings''' - scientific work related to [[Wikipedia quality]] published in 2017, written by [[Alessio Ferrari]], [[Beatrice Donati]] and [[Stefania Gnesi]].
 
 
== Overview ==
 
 
In the software process, unresolved natural language (NL) ambiguities in the early requirements phases may cause problems in later stages of development. Although methods exist to detect domain-independent ambiguities, ambiguities are also influenced by the domain-specific background of the stakeholders involved in the requirements process. In this paper, the authors aim to estimate the degree of ambiguity of typical computer science words (e.g., system, database, interface) when used in different application domains. To this end, the authors apply a [[natural language processing]] (NLP) approach based on [[Wikipedia]] crawling and word embeddings, a novel technique to represent the meaning of words through compact numerical vectors. The authors' preliminary experiments, performed on five different domains, show promising results. The approach allows an estimate of the variation in meaning of computer science words when used in different domains. Further validation of the method will indicate which words need to be carefully defined in advance by the requirements analyst to avoid misunderstandings when editing documents and dealing with experts in the considered domains.
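The core idea, comparing a word's embedding across domain corpora to quantify how much its meaning shifts, can be sketched as follows. This is a hedged illustration, not the authors' exact pipeline: the vectors below are toy placeholders standing in for embeddings trained on per-domain Wikipedia crawls, and scoring the shift as one minus cosine similarity is one plausible formulation.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    return dot / (norm_u * norm_v)

def ambiguity_score(vec_domain_a, vec_domain_b):
    """Higher score = larger meaning shift between domains, i.e. a word
    that is more likely to be ambiguous across those domains."""
    return 1.0 - cosine_similarity(vec_domain_a, vec_domain_b)

# Toy 3-dimensional vectors for the word "system" as it might be embedded
# in a computer-science corpus versus a medical corpus (hypothetical values;
# real embeddings would have hundreds of dimensions).
system_in_cs = [0.9, 0.1, 0.3]
system_in_medicine = [0.2, 0.8, 0.5]

score = ambiguity_score(system_in_cs, system_in_medicine)
```

A word whose per-domain vectors point in nearly the same direction scores close to 0 (stable meaning), while divergent vectors push the score up, flagging the word as a candidate for explicit definition by the requirements analyst.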
 

Revision as of 07:32, 3 December 2019