Automatic identification of domain terms: An approach for Italian

MT Artese;I Gagliardi
2020

Abstract

Building a domain-specific thesaurus fully automatically is a topical problem. This paper presents a novel method for addressing it in Italian. The approach integrates several techniques: machine learning classifiers that operate on a semantic representation of candidate terms, word embedding models that capture the semantics of words, and a computation of each term's degree of specialization. The work is in progress, and the results obtained so far are promising.
Istituto di Matematica Applicata e Tecnologie Informatiche - IMATI - Sede Secondaria Milano
Classification Methods
Word Embedding Models
Probability
Food
Italian Language
Files in this record:
File: prod_433748-doc_155212.pdf
Access: open access
Description: Automatic Identification of Domain Terms: An Approach for Italian
Type: Publisher's version (PDF)
License: Creative Commons
Size: 319.15 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/384396