Query expansion based on WordNet and Word2Vec for Italian question answering systems

Damiano, E.; Minutolo, A.; Silvestri, S.; Esposito, M.
2018

Abstract

Recently, Question Answering (QA) systems have emerged as efficient solutions for helping users find proper answers to questions pertaining to a specific situation. One of the major modern paradigms for QA is based on Information Retrieval (IR) techniques, where the text of a user question is analyzed in order to extract a collection of relevant keywords, formulate search-engine queries from them, and extract candidate answers from the documents matching the queries. Nevertheless, in semantically complex and rich languages such as Italian, many concepts can be expressed in a variety of distinct linguistic forms. This problem arises in particular when QA is applied to smaller sets of documents pertaining to a closed domain, where an answer might appear only once and its exact wording might differ partially or completely from the one used in the query. To address this issue, this paper proposes a hybrid approach to Query Expansion (QE) in which lexical resources and word embeddings (WEs) are combined to generate synonyms and hypernyms of the relevant words extracted from the user question, and to contextualize this set with respect to both the corpus of interest and the specific question. An experimental session was arranged to compare the proposed QE approach with other techniques and to evaluate its impact on the accuracy of a QA system in extracting proper answers to factoid questions from documents pertaining to the Cultural Heritage domain. The experiments showed the effectiveness of the proposed solution with respect to three different evaluation metrics typically used in the literature.
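
The hybrid expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy lexicon stands in for a lexical resource such as WordNet, the toy vectors stand in for word embeddings trained on the domain corpus, and the filtering rule (cosine similarity to the centroid of the question keywords, with a threshold of 0.9) is an assumption chosen only for illustration. All names (`TOY_LEXICON`, `TOY_EMBEDDINGS`, `expand_query`) are hypothetical.

```python
import math

# Hypothetical stand-in for a lexical resource such as WordNet:
# each keyword maps to candidate synonyms and hypernyms.
TOY_LEXICON = {
    "dipinto": ["quadro", "pittura", "opera"],
    "autore": ["artista", "scrittore", "creatore"],
}

# Hypothetical stand-in for word embeddings trained on the domain corpus.
# A word missing from this table is treated as absent from the corpus.
TOY_EMBEDDINGS = {
    "dipinto": [0.9, 0.1, 0.0],
    "quadro": [0.8, 0.2, 0.1],
    "pittura": [0.85, 0.15, 0.0],
    "opera": [0.5, 0.5, 0.2],
    "autore": [0.6, 0.4, 0.0],
    "artista": [0.7, 0.3, 0.0],
    "scrittore": [0.0, 0.2, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(keywords, lexicon=TOY_LEXICON, embeddings=TOY_EMBEDDINGS,
                 threshold=0.9):
    """Return, for each keyword, the lexicon candidates that survive filtering.

    Assumed filtering rule: a candidate is kept only if it has an embedding
    (corpus context) and is close to the centroid of the question keywords
    (question context).
    """
    known = [embeddings[k] for k in keywords if k in embeddings]
    if not known:
        return {k: [] for k in keywords}
    # Centroid of the keyword vectors, used as a rough question representation.
    centroid = [sum(dim) / len(known) for dim in zip(*known)]
    expanded = {}
    for keyword in keywords:
        candidates = lexicon.get(keyword, [])
        expanded[keyword] = [
            c for c in candidates
            if c in embeddings and cosine(embeddings[c], centroid) >= threshold
        ]
    return expanded


if __name__ == "__main__":
    print(expand_query(["dipinto", "autore"]))
```

In this toy setting, "creatore" is discarded because it has no vector (it does not occur in the assumed corpus), while "opera" and "scrittore" are discarded because they fall below the similarity threshold with respect to the question as a whole. The surviving terms could then be added to the search-engine query alongside the original keywords.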
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR - Sede Secondaria Napoli
ISBN: 9783319698342
ISBN: 9783319698359
Question Answering
Word Embeddings
WordNet
Knowledge Bases
Cultural Heritage
Files in this record:
File: Pubblicazione10.pdf (not available)
License: NOT PUBLIC - Private/restricted access
Size: 833.37 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/503281
Citations
  • PMC ND
  • Scopus 11
  • ISI ND