CNR Institutional Research Information System

Enhancing multimodal analogical reasoning with logic augmented generation

Lippolis A. S.;Nuzzolese A. G.;Gangemi A.

2026

Abstract

Recent advances in Large Language Models (LLMs) have demonstrated their capabilities across a variety of tasks. However, automatically extracting implicit knowledge from natural language remains a significant challenge, as machines lack direct experience of the physical world. In this setting, semantic knowledge graphs can guide LLMs toward more efficient and explainable results. In this paper, we apply a Logic Augmented Generation framework that represents the explicit content of a text as a semantic knowledge graph and combines it with prompt heuristics to elicit implicit analogical connections. This method generates extended knowledge graph triples representing implicit meaning, enabling systems to reason on unlabeled multimodal data regardless of the domain. We validate our work on three metaphor detection and understanding tasks across five datasets (text-based, domain-specific, and visual), chosen because they require deep analogical reasoning capabilities. The results show that the proposed integrated approach surpasses current baselines and performs better than humans in understanding visual metaphors. It also provides justifications for its reasoning processes, yet remains susceptible to shortcut cues and cross-modal interference. The error analysis discusses issues with existing metaphor datasets and with current evaluation and annotation methods.

Year: 2026
Organizational units: Istituto di Scienze e Tecnologie della Cognizione - ISTC
Keywords: Analogical reasoning; Conceptual blending theory; Large language models; Logic augmented generation
Appears in types: 01.01 Journal article

Files in this item:

There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/584482
