Multitask model-free reinforcement learning

Andrew Saxe, Stanford University, Stanford, CA, USA


Conventional model-free reinforcement learning algorithms are limited to performing a single task, such as navigating to one goal location in a maze or reaching one goal state in the Tower of Hanoi block-manipulation problem. It has long been thought that only model-based algorithms could perform goal-directed actions, optimally adapting to new reward structures in the environment. In this work, we develop a new model-free algorithm capable of learning many different tasks simultaneously and mixing them together to perform novel, never-before-seen tasks. Our algorithm has the crucial property that, when performing a blend of previously learned tasks, it provably acts optimally. The algorithm learns a distributed representation of tasks, thereby avoiding the curse of dimensionality afflicting other hierarchical reinforcement learning approaches, such as the options framework, that cannot blend subtasks. This result forces a reevaluation of experimental paradigms that use goal-directed behavior to argue for model-based algorithms.
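The abstract does not spell out the blending mechanism, but one standard way model-free values can compose exactly across tasks is a successor-feature-style decomposition: if rewards are linear in fixed features, the value of any linear blend of task rewards is the same blend of the per-task values (under a fixed policy). The sketch below is purely illustrative of that compositionality idea and is an assumption, not necessarily this paper's algorithm; the transition matrix, features, and weights are made up.

```python
import numpy as np

# Hypothetical 3-state Markov chain induced by a fixed policy (rows sum to 1).
P = np.array([[0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6],
              [0.5, 0.3, 0.2]])
gamma = 0.9

# Each column of phi is the per-state reward for one of two base tasks.
phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [0.5, 0.5]])

# Successor features under the fixed policy: psi = (I - gamma P)^{-1} phi.
# Column i of psi is the value function V_i for base task i.
psi = np.linalg.solve(np.eye(3) - gamma * P, phi)

# A "blended" task mixes the base rewards with weights w.
w = np.array([0.3, 0.7])

# Solving the blended task from scratch...
V_direct = np.linalg.solve(np.eye(3) - gamma * P, phi @ w)

# ...matches linearly blending the already-learned per-task values.
V_composed = psi @ w

assert np.allclose(V_direct, V_composed)
```

The key point this illustrates is that, when values decompose linearly over task features, a new task expressed as a weighted mix of old ones needs no new learning: its value function is immediately available by re-weighting.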

