Cross-situational cues are relevant for early word segmentation


Existing models of infant word learning have mainly assumed that the learner can segment words from speech before grounding them in their referential meanings, and segmentation itself has been treated as largely independent of meaning acquisition. In this paper, we argue that situated cues, such as visually perceived concrete objects or actions, are not only important for word-to-meaning mapping but also useful for pre-linguistic word segmentation, thereby helping the learner to bootstrap the language learning process. We present a model in which joint acquisition of proto-lexical segments and their meanings maximizes the referential quality of the lexicon, and in which learning can occur without any a priori knowledge of the language or its linguistically relevant units. We investigate the behavior of the model using a computational implementation of statistical learning and show successful word segmentation under varying degrees of referential uncertainty.

