Large Language Models Assisting Ontology Evaluation
Lippolis A. S.; Gangemi A.; Blomqvist E.; Nuzzolese A. G.
2026
Abstract
Ontology evaluation through functional requirements—such as testing via competency question (CQ) verification—is a well-established yet costly, labour-intensive, and error-prone endeavour, even for ontology engineering experts. In this work, we introduce OE-Assist, a novel framework designed to assist ontology evaluation through automated and semi-automated CQ verification. By presenting and leveraging a dataset of 1,393 CQs paired with corresponding ontologies and ontology stories, our contributions present, to our knowledge, the first systematic investigation into large language model (LLM)-assisted ontology evaluation, and include: (i) evaluating the effectiveness of an LLM-based approach for automatically performing CQ verification against a manually created gold standard, and (ii) developing and assessing an LLM-powered framework that assists CQ verification in Protégé by providing suggestions. We found that automated LLM-based evaluation with o1-preview and o3-mini performs at a level similar to the average user's performance. Through a user study of the framework with 19 knowledge engineers from eight international institutions, we also show that LLMs can assist manual CQ verification and improve user accuracy, especially when suggestions are correct. Additionally, participants reported a marked decrease in perceived task difficulty. However, we also observed a reduction in human performance when the LLM provided incorrect guidance, revealing a critical trade-off between efficiency and accuracy in assisted ontology evaluation.
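
As an illustration of the kind of automated check the abstract describes, the sketch below prompts an OpenAI reasoning model to judge whether an ontology can answer a given CQ. This is a minimal sketch only, not the OE-Assist implementation from the paper: the prompt wording, the helper name verify_cq, and the YES/NO verdict parsing are assumptions introduced here for illustration.

```python
# Minimal sketch of LLM-based CQ verification (illustrative; not OE-Assist itself).
# Assumptions: the ontology is available as Turtle text, an ontology "story" gives
# context, and a strict YES/NO answer is an adequate verdict format.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def verify_cq(cq: str, ontology_turtle: str, story: str, model: str = "o3-mini") -> bool:
    """Ask the model whether the ontology can answer the competency question."""
    prompt = (
        "You are evaluating an OWL ontology against a competency question.\n\n"
        f"Ontology story:\n{story}\n\n"
        f"Ontology (Turtle):\n{ontology_turtle}\n\n"
        f"Competency question: {cq}\n\n"
        "Answer strictly with YES if the ontology models the concepts and relations "
        "needed to answer the question, otherwise answer NO."
    )
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    verdict = response.choices[0].message.content.strip().upper()
    return verdict.startswith("YES")
```

In a semi-automated setting such as the Protégé integration described above, the returned verdict would be surfaced to the knowledge engineer as a suggestion rather than applied automatically, which is the trade-off the user study examines.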


