During requirements elicitation, different stakeholders with diverse backgrounds and skills need to effectively communicate to reach a shared understanding of the problem at hand. Linguistic ambiguity due to terminological discrepancies may occur between stakeholders that belong to different technical domains. If not properly addressed, ambiguity can create frustration and distrust during requirements elicitation meetings, and lead to problems at later stages of development. This paper presents a natural language processing approach to identify ambiguous terms between different domains, and rank them by ambiguity score. The approach is based on building domain-specific language models, one for each stakeholders' domain. Word embeddings from each language model are compared in order to measure the differences of use of a term, thus estimating its potential ambiguity across the domains of interest. We evaluate the approach on seven potential elicitation scenarios involving five domains. In the evaluation, we compare the ambiguity rankings automatically produced with the ones manually obtained by the authors as well as by multiple annotators recruited through Amazon Mechanical Turk. The rankings produced by the approach lead to a maximum Kendall's Tau of 88%. However, for several elicitation scenarios, the application of the approach was unsuccessful in terms of performance. Analysis of the agreement among annotators and of the observed inaccuracies offer hints for further research on the relationship between domain knowledge and natural language ambiguity.

An NLP approach for cross-domain ambiguity detection in requirements engineering

Ferrari A;Esuli A
2019

Abstract

During requirements elicitation, different stakeholders with diverse backgrounds and skills need to effectively communicate to reach a shared understanding of the problem at hand. Linguistic ambiguity due to terminological discrepancies may occur between stakeholders that belong to different technical domains. If not properly addressed, ambiguity can create frustration and distrust during requirements elicitation meetings, and lead to problems at later stages of development. This paper presents a natural language processing approach to identify ambiguous terms between different domains, and rank them by ambiguity score. The approach is based on building domain-specific language models, one for each stakeholders' domain. Word embeddings from each language model are compared in order to measure the differences of use of a term, thus estimating its potential ambiguity across the domains of interest. We evaluate the approach on seven potential elicitation scenarios involving five domains. In the evaluation, we compare the ambiguity rankings automatically produced with the ones manually obtained by the authors as well as by multiple annotators recruited through Amazon Mechanical Turk. The rankings produced by the approach lead to a maximum Kendall's Tau of 88%. However, for several elicitation scenarios, the application of the approach was unsuccessful in terms of performance. Analysis of the agreement among annotators and of the observed inaccuracies offer hints for further research on the relationship between domain knowledge and natural language ambiguity.
2019
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Ambiguity
Domain knowledge
Language models
Natural language processing
NLP
Requirements engineering
Word embeddings
File in questo prodotto:
File Dimensione Formato  
prod_411069-doc_164164.pdf

non disponibili

Descrizione: An NLP approach for cross-domain ambiguity detection in requirements engineering
Tipologia: Versione Editoriale (PDF)
Dimensione 582.52 kB
Formato Adobe PDF
582.52 kB Adobe PDF   Visualizza/Apri   Richiedi una copia
prod_411069-doc_146506.pdf

accesso aperto

Descrizione: An NLP approach for cross-domain ambiguity detection in requirements engineering
Tipologia: Versione Editoriale (PDF)
Dimensione 663.14 kB
Formato Adobe PDF
663.14 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/20.500.14243/386974
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 62
  • ???jsp.display-item.citation.isi??? 49
social impact