Animal Vocalization Generative Network (AVGN): A method for visualizing, understanding, and sampling from animal communicative repertoires

Abstract

We propose a set of machine-learning algorithms that produce a generative, low-dimensional, and visually understandable space of the communicative repertoire of vocal species such as songbirds. Unlike in human speech, where individual elements are well defined and grounded in principled ways, the methods for defining units of animal communication systems are more varied and often rely on human-centric heuristics. Using our method, we can automatically discover latent structure in the vocal repertoire of individuals and use that structure to define well-principled categorical boundaries between vocal elements in communicating species. Further, we can sample from the latent representations to generate novel vocal units that can be used to probe perceptual and physiological representations of communication. We demonstrate two use cases: (1) automated labeling of songbird vocal repertoires, showing novel structure in vocal communication, and (2) a perceptual task demonstrating that behavioral and physiological representational spaces can be biased by contextual information.
