Dynamic Expectancy: An Approach to Behaviour Shaping Using a New Method of Reinforcement Learning

6th International Symposium on Intelligent Robotic Systems '98 (SIRS98),
Edinburgh, Scotland, UK, July 21-23, 1998

Mark Witkowski
Department of Computer Science,
Queen Mary Westfield College (University of London),
Mile End Road,
London E1 4NS

Abstract

This paper is concerned with issues relating to the source of reward and reinforcement, with potential application to various robot learning and behaviour shaping situations (Dorigo and Colombetti, 1994; Lin, 1991; Maclin and Shavlik, 1996). The conventional approach to behaviour shaping by reinforcement learning is to present "reward" to an animal, animat or robot immediately after it performs some required or desirable activity. It is a commonplace observation in experimental psychology that if a trainer repeats this procedure a sufficient number of times, the behaviour of an animal will come to favour those activities in the circumstances under which they were reinforced.
This paper describes the Dynamic Expectancy Model, a new approach to issues in reinforcement learning that emphasises the role of internally generated "reward" signals, and in which overt behaviour is selected reactively from a policy map created dynamically in response to motivating "goals". The results of two investigations that illustrate these facets of learning and behaviour are presented. It is hoped that this technique will find application in a variety of task areas where animat/robot and man co-operate to address shared tasks.
