Automatic identification of domain terms: An approach for Italian
MT Artese; I Gagliardi
2020
Abstract
Creating a domain-specific thesaurus fully automatically is a topical problem. This paper presents a novel method that addresses it for the Italian language. The main feature of the approach is the integration of several techniques: machine learning classifiers that operate on the semantic representation of candidate terms, word embedding models that capture the semantics of words, and a computation of each term's degree of specialization. The work is in progress, and the results obtained so far are promising.

Files in this product:
| File | Description | Type | License | Size | Format |
|---|---|---|---|---|---|
| prod_433748-doc_155212.pdf (open access) | Automatic Identification of Domain Terms: An Approach for Italian | Publisher's version (PDF) | Creative Commons | 319.15 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.


