Controlling Attention in a Memory-Augmented Neural Network to Solve Working Memory Tasks

Abstract

We introduce a memory-augmented neural network, called Differentiable Working Memory (DWM), that captures some key aspects of attention in working memory. We tested DWM on a suite of psychology-inspired tasks, where the model had to develop a strategy solely by processing sequences of inputs and desired outputs. Thanks to novel attention control mechanisms called bookmarks, the model rapidly learned a good strategy, generalizing to sequence lengths up to two orders of magnitude longer than those used for training, and was able to retain, ignore, or forget information based on its relevance. The behavior of DWM is interpretable, which allowed us to analyze its performance on different tasks. Surprisingly, as training progressed, we observed that in some cases the model discovered more than one successful strategy, possibly involving sophisticated use of memory and attention.
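The abstract does not spell out how the bookmark mechanism works, but memory-augmented networks of this kind typically read from external memory with differentiable, content-based attention. The following is a minimal sketch of such a soft read, assuming cosine-similarity addressing with a sharpening parameter; the function names and parameters are illustrative, not the paper's actual interface.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array of scores.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_read(memory, key, beta=1.0):
    """Content-based soft read from an external memory.

    memory : (slots, width) array of stored vectors
    key    : (width,) query vector
    beta   : sharpening strength; larger beta -> more focused attention
    Returns the attention-weighted read vector and the weights.
    """
    # Cosine similarity between the key and every memory slot.
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    scores = (memory @ key) / norms
    # Normalize sharpened scores into differentiable attention weights.
    weights = softmax(beta * scores)
    # The read is a convex combination of memory slots.
    return weights @ memory, weights

# Toy example: 4 one-hot memory slots; the key matches slot 1.
memory = np.eye(4)
key = np.array([0.0, 1.0, 0.0, 0.0])
read, w = attention_read(memory, key, beta=10.0)
```

Because every step (similarity, softmax, weighted sum) is differentiable, gradients flow through the addressing itself, which is what lets such a model learn where to attend purely from input/output sequences.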
