A bioinspired hierarchical reinforcement learning architecture for modeling learning of multiple skills with continuous state and actions

Caligiore D; Mirolli M; Parisi D; Baldassarre G
2010

Abstract

Organisms, and especially primates, are able to learn several skills while avoiding catastrophic interference and enhancing generalisation. This paper proposes a novel hierarchical reinforcement learning (RL) architecture with a number of features that make it suitable to investigate such phenomena. The proposed system combines the mixture-of-experts architecture with the neural-network actor-critic architecture trained with the TD(λ) reinforcement learning algorithm. In particular, responsibility signals provided by two gating networks (one for the actor and one for the critic) are used both to weight the outputs of the respective multiple (expert) controllers and to modulate their learning. The system is tested with a simulated dynamic 2D robotic arm that autonomously learns to reach a target in (up to) three different conditions. The results show that the system is able to appropriately allocate experts to tasks on the basis of the differences and similarities among the required sensorimotor mappings.
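
The core mechanism described in the abstract (gating networks whose responsibility signals both weight the experts' outputs and modulate their learning) can be illustrated with a minimal sketch. The code below is not the authors' implementation: it assumes linear value-function experts and a linear softmax gating network on the critic side only, trained with a TD(λ)-style update, and all sizes, constants, and names such as W_gate and td_lambda_step are hypothetical.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and constants (not taken from the paper).
STATE_DIM, N_EXPERTS = 4, 3
ALPHA, GAMMA, LAMBDA = 0.1, 0.99, 0.9   # learning rate, discount factor, trace decay

# Linear value-function experts and a linear gating network for the critic.
W_experts = rng.normal(scale=0.1, size=(N_EXPERTS, STATE_DIM))
W_gate = rng.normal(scale=0.1, size=(N_EXPERTS, STATE_DIM))
traces = np.zeros_like(W_experts)        # one eligibility-trace vector per expert


def responsibilities(state):
    """Gating network: a softmax over experts yields the responsibility signals."""
    logits = W_gate @ state
    z = np.exp(logits - logits.max())
    return z / z.sum()


def critic_value(state):
    """Critic output: responsibility-weighted sum of the expert value estimates."""
    g = responsibilities(state)
    return g @ (W_experts @ state), g


def td_lambda_step(state, reward, next_state):
    """One TD(lambda) update in which each expert learns in proportion to its responsibility."""
    global traces, W_experts
    v, g = critic_value(state)
    v_next, _ = critic_value(next_state)
    delta = reward + GAMMA * v_next - v                     # TD error
    # Responsibility signals modulate learning: experts deemed responsible for the
    # current state accumulate larger eligibility traces and thus larger updates.
    traces = GAMMA * LAMBDA * traces + g[:, None] * state[None, :]
    W_experts += ALPHA * delta * traces
    return delta


# Toy usage: a few random transitions, just to exercise the update rule.
s = rng.normal(size=STATE_DIM)
for _ in range(5):
    s_next = rng.normal(size=STATE_DIM)
    td_lambda_step(s, reward=rng.normal(), next_state=s_next)
    s = s_next

In this sketch the responsibility-weighted traces make the experts most responsible for a state absorb most of the TD-driven change, which is the modulation-of-learning idea the abstract refers to; the paper itself uses neural-network experts and a separate gating network for the actor as well.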
Istituto di Scienze e Tecnologie della Cognizione - ISTC
reinforcement learning architecture
Files in this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/130399