Evolving internal reinforcers for an intrinsically motivated reinforcement-learning robot
Schembri M; Mirolli M; Baldassarre G
2007
Abstract
Intrinsically Motivated Reinforcement Learning (IMRL) has been proposed as a framework within which agents exploit "internal reinforcement" to acquire general-purpose building-block behaviors ("skills") that can later be combined to solve several specific tasks. The architectures proposed so far within this framework are limited in two respects: (1) they use hardwired "salient events" to form and train skills, which limits the agents' autonomy; (2) they are applicable only to problems with abstract states and actions, such as grid-world problems. This paper proposes solutions to these problems in the form of a hierarchical reinforcement-learning architecture that: (1) exploits the ideas and techniques of Evolutionary Robotics to allow the system to autonomously discover "salient events"; (2) uses neural networks to allow the system to cope with continuous states and noisy environments. The paper also begins to explore a new way of producing intrinsic motivations based on the learning progress of skills. The viability of the proposed approach is demonstrated in a simulated robotic scenario.