Bayesian inference as a viable cross-linguistic word segmentation strategy: It’s all about what’s useful

Lawrence Phillips, University of California, Irvine
Lisa Pearl, University of California, Irvine

Abstract

Identifying useful items from fluent speech is one of the first tasks children must accomplish during language acquisition. Typically, this task is described as word segmentation, with the idea that words are the basic useful unit that scaffolds future acquisition processes. However, it may be that other useful items are available and easy to segment from fluent speech, such as sub-word morphology and meaningful word combinations. A successful early learning strategy for identifying words in English is statistical learning, implemented via Bayesian inference (Goldwater et al., 2009; Pearl et al., 2011; Phillips & Pearl, 2012). Here, we test this learning strategy on child-directed speech from seven languages, and discover it is effective cross-linguistically, especially when the segmentation goal is expanded to include these other kinds of useful units. We also discuss which useful units are easy to segment from the different languages using this learning strategy, as the useful unit varies across languages.
