
Synthesizing Evolving Symbolic Representations for Autonomous Systems

Gabriele Sartor, Angelo Oddi, Riccardo Rasconi, Vieri Giuliano Santucci
2026

Abstract

This work presents a novel architecture for an open-ended learning system that integrates intrinsic motivation (IM) and classical planning to enable agents to continuously learn and improve their knowledge over time in an unsupervised fashion. The main goal is to allow the agent to autonomously distill its experience into Probabilistic Planning Domain Definition Language (PPDDL) terms, thereby making causal relationships explicit and supporting automated planning. Starting with a virtually empty set of predefined tasks or goals, the agent harnesses intrinsic motivation to explore the environment autonomously, continuously using and enriching the high-level knowledge acquired through its experience in a virtuous cycle. Experimental evaluation in the Treasure Game domain demonstrates the effectiveness of the proposed approach: starting with only a small set of primitive actions, we show how an agent can autonomously build and refine a high-level representation of the environment. Planning-based strategies grounded in this representation significantly outperform uninformed exploration by reaching intermediate sub-goals more efficiently and substantially reducing the time required to achieve the final objective.
Istituto di Scienze e Tecnologie della Cognizione - ISTC
Keywords: Abstraction; Artificial intelligence; Open-ended learning; PPDDL

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/579882
Notice: the data displayed here have not been validated by the institution.
