An NLP approach for cross-domain ambiguity detection in requirements engineering
Ferrari, A.; Esuli, A.
2019
Abstract
During requirements elicitation, different stakeholders with diverse backgrounds and skills need to communicate effectively to reach a shared understanding of the problem at hand. Linguistic ambiguity due to terminological discrepancies may occur between stakeholders who belong to different technical domains. If not properly addressed, ambiguity can create frustration and distrust during requirements elicitation meetings, and lead to problems at later stages of development. This paper presents a natural language processing approach to identify ambiguous terms between different domains and rank them by ambiguity score. The approach is based on building domain-specific language models, one for each stakeholder's domain. Word embeddings from each language model are compared in order to measure the differences in how a term is used, thus estimating its potential ambiguity across the domains of interest. We evaluate the approach on seven potential elicitation scenarios involving five domains. In the evaluation, we compare the ambiguity rankings produced automatically with those obtained manually by the authors, as well as by multiple annotators recruited through Amazon Mechanical Turk. The rankings produced by the approach reach a maximum Kendall's Tau of 88%. However, for several elicitation scenarios, the application of the approach was unsuccessful in terms of performance. Analysis of the agreement among annotators and of the observed inaccuracies offers hints for further research on the relationship between domain knowledge and natural language ambiguity.
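The core idea described in the abstract (one embedding model per stakeholder domain, with per-term cross-domain comparison and a ranked output) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpora, the top-k neighbourhood-overlap score used as an ambiguity proxy, and the gold ranking are all illustrative assumptions; only the overall shape (domain-specific word2vec models, a ranking of shared terms, a Kendall's Tau check against a human ranking) follows the abstract.

```python
# Sketch only: score shared terms by how little their nearest-neighbour
# sets overlap across two domain-specific word2vec models. The Jaccard-based
# score and the toy corpora are assumptions, not the paper's exact metric.
from gensim.models import Word2Vec
from scipy.stats import kendalltau


def train_domain_model(sentences, dim=50):
    """Train a domain-specific word2vec model on tokenised sentences."""
    return Word2Vec(sentences, vector_size=dim, window=5, min_count=1, epochs=50)


def ambiguity_score(term, model_a, model_b, k=10):
    """Proxy for cross-domain ambiguity: 1 minus the Jaccard overlap of the
    term's top-k neighbours in the two models (higher = more divergent use)."""
    neigh_a = {w for w, _ in model_a.wv.most_similar(term, topn=k)}
    neigh_b = {w for w, _ in model_b.wv.most_similar(term, topn=k)}
    union = neigh_a | neigh_b
    return 1.0 - len(neigh_a & neigh_b) / len(union) if union else 0.0


# Toy corpora standing in for two stakeholder domains (hypothetical data).
cs_corpus = [["deploy", "the", "code", "to", "the", "platform"],
             ["the", "code", "runs", "on", "the", "platform"]]
rail_corpus = [["the", "train", "stops", "at", "the", "platform"],
               ["passengers", "wait", "on", "the", "platform"]]

model_cs = train_domain_model(cs_corpus)
model_rail = train_domain_model(rail_corpus)

# Rank the terms shared by both vocabularies from most to least ambiguous.
shared = sorted(set(model_cs.wv.key_to_index) & set(model_rail.wv.key_to_index))
ranking = sorted(shared,
                 key=lambda t: ambiguity_score(t, model_cs, model_rail),
                 reverse=True)
print(ranking[:5])  # terms most likely to be read differently across domains

# The paper's evaluation compares such automatic rankings to human ones via
# Kendall's Tau; with scipy this is a one-liner (gold ranking hypothetical).
gold_rank = {t: i for i, t in enumerate(ranking)}   # pretend annotator ranking
auto_rank = {t: i for i, t in enumerate(ranking)}
tau, _ = kendalltau([gold_rank[t] for t in shared],
                    [auto_rank[t] for t in shared])
print(f"Kendall's Tau: {tau:.2f}")
```

A neighbourhood-overlap score sidesteps the fact that independently trained embedding spaces are not directly comparable; alignment-based alternatives (e.g. mapping one space onto the other before comparing vectors) would serve the same purpose.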