Child-directed speech is often temporally organized such that successive utterances refer to the same topic. This type of extended discourse on the same referent has been shown to possess several verbal signatures that could facilitate learning. Here, we identify multiple non-verbal correlates of extended discourse that could also aid learning. Multimodal analyses show that during extended discourse episodes, toddlers and parents exhibit greater sustained attention to objects and greater coordination between their behaviors. The results highlight the interconnections among multiple aspects of the language-learning environment, and suggest that parents’ speech may both shape and be shaped by non-verbal processes. Implications for understanding how the learning environment influences development are discussed.