One-shot learning of generative speech concepts

Brenden Lake, Massachusetts Institute of Technology
Chia-ying Lee, Massachusetts Institute of Technology
James Glass, Massachusetts Institute of Technology
Josh Tenenbaum, Massachusetts Institute of Technology

Abstract

One-shot learning -- the human ability to learn a new concept from just one or a few examples -- poses a challenge to traditional learning algorithms, although approaches based on Hierarchical Bayesian models and compositional representations have been making headway. This paper investigates how children and adults readily learn the spoken form of new words from a single example: recognizing arbitrary instances of a novel phonological sequence, and excluding non-instances, regardless of speaker identity and acoustic variability. This ability is an essential step on the way to learning a word's meaning and learning to use it. We develop a Hierarchical Bayesian acoustic model that can learn spoken words from one example, utilizing compositions of phoneme-like units that are the product of unsupervised learning. We compare people and computational models on one-shot classification and generation tasks with novel Japanese words, finding that the learned units play an important role in achieving good performance.
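To make the one-shot classification setting concrete, here is a minimal illustrative sketch (not the authors' model): it assumes each spoken-word token has already been decoded into a sequence of learned phoneme-like unit labels, and assigns a test token to the training exemplar whose unit sequence is closest in edit distance. The word labels and unit sequences below are hypothetical placeholders.

```python
def edit_distance(a, b):
    """Levenshtein distance between two label sequences."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1,                           # deletion
                          d[i][j - 1] + 1,                           # insertion
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))  # substitution
    return d[len(a)][len(b)]

def one_shot_classify(test_units, exemplars):
    """exemplars: dict mapping a word label to one decoded unit sequence."""
    return min(exemplars, key=lambda w: edit_distance(test_units, exemplars[w]))

# Hypothetical decoded unit sequences for two novel words, one exemplar each.
exemplars = {"word_A": [3, 14, 7, 2, 9, 4], "word_B": [5, 2, 8, 11, 6]}
print(one_shot_classify([3, 14, 7, 2, 9, 5], exemplars))  # -> word_A
```

A compositional unit inventory is what makes this kind of generalization possible from a single exemplar: variability across speakers is absorbed by the unit decoder, leaving a compact symbolic sequence to compare.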

Files

One-shot learning of generative speech concepts (0.9 MB)
