Combining Knowledge graph and LLM to extract thesaural relationship and concepts on Cybersecurity

Elena Cardillo (first author); Alessio Portaro (second author); Maria Taverniti (last author)
2025

Abstract

Interest in the use of knowledge graphs in the cybersecurity domain has grown rapidly in recent years. However, because the domain is highly specific, some issues concerning the robustness of these graphs remain open. This work analyses the results of a case study that exploits a BERT-based model (in particular, its word embeddings) combined with a knowledge graph approach to enhance the population and enrichment of domain-oriented controlled vocabularies (i.e., thesauri). Resources controlled and validated by domain experts, such as thesauri, are essential in high-risk domains like cybersecurity, where robustness and reliability are key factors. A pipeline inspired by Natural Language Processing is presented, including knowledge graph extraction and inference to identify thesaural concepts and relationships. Although early findings suggest the model's potential to enrich controlled vocabularies with novel insights, the accuracy and quality of the extracted entities still underscore the need for in-depth validation by a domain expert to select the candidate concepts and relationships.
Istituto di informatica e telematica - IIT - Sede Secondaria Arcavacata di Rende
Istituto di Calcolo e Reti ad Alte Prestazioni - ICAR
978-3-031-85386-9
978-3-031-85385-2
978-3-031-85388-3
cybersecurity, knowledge graph, thesauri, BERT, relation extraction
Files in this item:
File: 978-3-031-85386-9_11.pdf
Access: authorized users only
Type: Publisher's version (PDF)
License: Creative Commons
Size: 772.14 kB
Format: Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/513312