Discovering the Signatures of Joint Attention in Child-Caregiver Interaction

Guido Pusiol, Stanford University
Laura Soriano, Stanford University
Michael C. Frank, Stanford University
Li Fei-Fei, Stanford University

Abstract

Joint attention—when child and caregiver share attention to an object or location—is an important part of early language learning. Identifying when two people are in joint attention is a key practical question for analyzing large-scale video datasets; in addition, identifying reliable cues to joint attention may provide insights into how children accomplish this feat. We use techniques from computer vision to identify features related to joint attention from both egocentric and fixed-camera videos of children and caregivers interacting with objects. We find that the presence of caregivers' faces in the child's egocentric view and the motion of objects in the fixed camera both correlate with human-annotated joint attention. We use a classifier to predict joint attention from these features and find some initial success; in addition, classifier performance is substantially increased by interpolating features across automatically extracted "attention chunks" in the egocentric video.
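
To make the approach concrete, here is a minimal sketch in Python of how such a classifier could be assembled from per-frame features. The feature names, the 20-frame chunk length, the chunk-mean smoothing, and the synthetic data are all illustrative assumptions for exposition; this is not the authors' actual feature-extraction or interpolation pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_frames = 2000

# Stand-ins for automatically extracted per-frame features.
face_present = rng.integers(0, 2, n_frames)    # caregiver face detected in egocentric view
object_motion = rng.random(n_frames)           # object motion magnitude in the fixed camera
chunk_id = np.repeat(np.arange(n_frames // 20), 20)  # hypothetical 20-frame attention chunks

# Spread sparse face detections across each chunk (chunk-mean smoothing),
# loosely analogous to interpolating features within attention chunks.
face_smoothed = np.array([face_present[chunk_id == c].mean() for c in chunk_id])

X = np.column_stack([face_smoothed, object_motion])
y = rng.integers(0, 2, n_frames)   # placeholder joint-attention labels (random here)

clf = LogisticRegression()
print("mean CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())

With real annotations in place of the random labels, the chunk-level smoothing step is where the reported gain from interpolating across attention chunks would enter.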



