
Large Language Models Assisting Ontology Evaluation

Lippolis A. S.; Gangemi A.; Blomqvist E.; Nuzzolese A. G.
2026

Abstract

Ontology evaluation through functional requirements—such as testing via competency question (CQ) verification—is a well-established yet costly, labour-intensive, and error-prone endeavour, even for ontology engineering experts. In this work, we introduce OE-Assist, a novel framework designed to assist ontology evaluation through automated and semi-automated CQ verification. By presenting and leveraging a dataset of 1,393 CQs paired with corresponding ontologies and ontology stories, our contributions constitute, to our knowledge, the first systematic investigation into large language model (LLM)-assisted ontology evaluation, and include: (i) evaluating the effectiveness of an LLM-based approach for automatically performing CQ verification against a manually created gold standard, and (ii) developing and assessing an LLM-powered framework to assist CQ verification with Protégé by providing suggestions. We found that automated LLM-based evaluation with o1-preview and o3-mini performs at a level comparable to the average user's performance. Through a user study of the framework with 19 knowledge engineers from eight international institutions, we also show that LLMs can assist manual CQ verification and improve user accuracy, especially when suggestions are correct. Additionally, participants reported a marked decrease in perceived task difficulty. However, we also observed a reduction in human performance when the LLM provided incorrect guidance, revealing a critical trade-off between efficiency and accuracy in assisted ontology evaluation.
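The automated verification described in contribution (i) can be pictured with a minimal sketch, assuming the openai Python package; the model name, prompt wording, and YES/NO verdict format are illustrative assumptions and do not reproduce the paper's actual prompt or pipeline.

    # Hypothetical sketch of automated CQ verification with an LLM.
    # Assumes the `openai` Python package; the model name, prompt wording,
    # and YES/NO verdict format are illustrative, not taken from the paper.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def verify_cq(cq: str, ontology_turtle: str, model: str = "o3-mini") -> str:
        """Ask the model whether the ontology can answer the competency question."""
        prompt = (
            "You are evaluating an OWL ontology against a competency question.\n"
            f"Competency question: {cq}\n\n"
            "Ontology (Turtle serialisation):\n"
            f"{ontology_turtle}\n\n"
            "Can the ontology, as modelled, answer this competency question? "
            "Reply with YES or NO, followed by a one-sentence justification."
        )
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content

    # Hypothetical usage: obtain a verdict for one CQ/ontology pair.
    # verdict = verify_cq("Which pizzas have a spicy topping?", open("pizza.ttl").read())

Comparing such verdicts against a manually created gold standard, as in contribution (i), then reduces to straightforward label agreement.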
Istituto di Scienze e Tecnologie della Cognizione - ISTC
ISBN: 9783032095268
ISBN: 9783032095275
Large Language Models
Ontology Engineering
Ontology Evaluation


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/559844
