UMINF 09.20

Cognition Reversed - Robot Learning from Demonstration

The work presented in this thesis investigates techniques for learning from demonstration (LFD). LFD is a well-established approach to robot learning, where a teacher demonstrates a behavior to a robot pupil. This thesis focuses on LFD where a human teacher demonstrates a behavior by controlling the robot via teleoperation. After the demonstration, the robot should be able to execute the demonstrated behavior under varying conditions.

Several views on representation, recognition, and learning of robot behavior are presented and discussed from a cognitive and computational perspective. LFD-related concepts such as behavior, goal, demonstration, and repetition are defined and analyzed, with a focus on how bias is introduced by the use of behavior primitives. This analysis results in a formalism where LFD is described as transitions between information spaces. Assuming that the behavior recognition problem is partly solved, ways to deal with the remaining ambiguities in the interpretation of a demonstration are proposed.

A total of five algorithms for behavior recognition are proposed and evaluated, including the dynamic temporal difference algorithm Predictive Sequence Learning (PSL). PSL is model-free in the sense that it makes few assumptions about what is to be learned. One strength of PSL is that it can be used for both robot control and recognition of behavior. While many methods for behavior recognition are concerned with identifying invariants within a set of demonstrations, PSL takes a different approach by using purely predictive measures. This may be one way to reduce the need for bias in learning. In its current form, PSL is subject to combinatorial explosion as the input space grows, which makes it necessary to introduce some higher-level coordination for learning complex behaviors in real-world robots.
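
To make the idea of one predictive model serving both control and recognition more concrete, the following is a minimal sketch of a generic variable-length context predictor. It is not the PSL algorithm from the thesis: the class name SequencePredictor, the max_context parameter, and the action symbols F, L, and R (standing for hypothetical discretized forward, left, and right commands) are all assumptions made for illustration. The sketch only shows how a learner that predicts the next event from recent history can propose the next action and also rate how well an observed sequence matches a learned behavior.

```python
from collections import Counter, defaultdict


class SequencePredictor:
    """Toy variable-length context predictor (illustrative, not the thesis's PSL)."""

    def __init__(self, max_context=3):
        self.max_context = max_context
        # Maps a context (tuple of recent symbols) to counts of what followed it.
        self.table = defaultdict(Counter)

    def train(self, sequence):
        """Count, for every context up to max_context long, which symbol came next."""
        for i in range(1, len(sequence)):
            for k in range(1, min(self.max_context, i) + 1):
                context = tuple(sequence[i - k:i])
                self.table[context][sequence[i]] += 1

    def predict(self, history):
        """Return the most frequent successor of the longest known context, or None."""
        for k in range(min(self.max_context, len(history)), 0, -1):
            context = tuple(history[-k:])
            if context in self.table:
                return self.table[context].most_common(1)[0][0]
        return None

    def score(self, sequence):
        """Fraction of symbols in `sequence` that the model predicts correctly."""
        if len(sequence) < 2:
            return 0.0
        hits = sum(
            self.predict(sequence[:i]) == sequence[i]
            for i in range(1, len(sequence))
        )
        return hits / (len(sequence) - 1)


# Hypothetical demonstrations, encoded as sequences of discrete action symbols.
follow = SequencePredictor()
follow.train(list("FFLFFLFFLFFL"))
spin = SequencePredictor()
spin.train(list("RRRFRRRFRRRF"))

# Control: propose the next action given recent history.
next_action = follow.predict(list("FF"))

# Recognition: the behavior whose model predicts the observation best wins.
observed = list("FFLFFL")
scores = {"follow": follow.score(observed), "spin": spin.score(observed)}
recognized = max(scores, key=scores.get)
```

In this toy version, the number of possible contexts grows exponentially with the context length and with the size of the symbol alphabet, which gives a rough intuition for the combinatorial explosion mentioned above.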

The thesis also gives a broad introduction to computational models of the human brain, where a tight coupling between perception and action plays a central role. With a focus on the generation of bias, typical features of existing attempts to explain the ability of humans and other animals to learn are presented and analyzed, from both a neurological and an information-theoretic perspective. Based on this analysis, four requirements for implementing general learning ability in robots are proposed. These requirements provide guidance on how a coordinating structure around PSL and similar algorithms should be implemented in a model-free way.

Authors

Entry responsible: Erik Billing

Page Responsible: Frank Drewes
2020-07-04