Automatic identification of domain terms: An approach for Italian

MT Artese;I Gagliardi
2020

Abstract

Creating a fully automated domain-specific thesaurus is a topical problem. This paper presents a novel method that addresses it for the Italian language. The main feature of the approach is the integration of three components: machine learning classification methods that work on the semantic representation of candidate terms, word embedding models that capture the semantics of words, and a computation of the degree of specialization of a term. The work is in progress, and the results obtained so far are promising.
2020
Istituto di Matematica Applicata e Tecnologie Informatiche - IMATI - Sede Secondaria Milano
English
Desislava Paneva-Marinova, Radoslav Pavlov, Peter Stanchev, Detelin Luchev
Digital Presentation and Preservation of Cultural and Scientific Heritage, International Conference Burgas, Bulgaria, 24-26 September 2020, Proceedings
Digital Presentation and Preservation of Cultural and Scientific Heritage 2020
10
251
257
7
https://www.ceeol.com/search/article-detail?id=902665
Anonymous experts
24-26 September 2020
Burgas, Bulgaria
International
Classification Methods
Word Embedding Models
Probability
Food
Italian Language
Electronic
2
open
Artese, Mt; Gagliardi, I
273
info:eu-repo/semantics/conferenceObject
04 Conference contribution::04.01 Contribution in conference proceedings
Files in this record:
File: prod_433748-doc_155212.pdf
Access: open access
Description: Automatic Identification of Domain Terms: An Approach for Italian
Type: Publisher's version (PDF)
License: Creative Commons
Size: 319.15 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/384396