Outputs as inputs: Sequential Models of the Products of Infant ‘Statistical Learning’ of Language

Abstract

To explore whether current notions of statistically based language learning could successfully scale to infants’ linguistic experiences “in the wild”, we implemented a statistical-clustering word-segmentation model (Saffran et al., 1997) and sent its outputs to an implementation of a “frame”-based form-class tagger (Mintz, 2003) and, separately, to a simple word-order heuristic parser (Gervain et al., 2008). We tested this pipeline model on various input types, ranging from highly idealized (orthographic words) to more naturalistic (resyllabified corpora). We ask how these modeled capacities work together when their input is the noisy output of upstream word-finding processes, a scenario that more closely resembles the one infants face in language acquisition.
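To make the "outputs as inputs" idea concrete, the following is a minimal sketch of such a pipeline, not the implementation used in the paper. It assumes a transitional-probability segmenter that posits a word boundary wherever the forward probability between adjacent syllables falls below a fixed threshold, and a frame collector that groups each word by the pair of words immediately surrounding it. All function names, the threshold value, and the toy syllable corpus are hypothetical, and the word-order parser is not covered.

```python
from collections import Counter


def transitional_probs(utterances):
    """Estimate forward transitional probabilities P(b | a) for adjacent syllables."""
    pair_counts = Counter()
    first_counts = Counter()
    for syllables in utterances:
        for a, b in zip(syllables, syllables[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


def segment(syllables, tp, threshold):
    """Split a syllable sequence into words at low-probability transitions.

    The fixed threshold is an assumption made for illustration only.
    """
    if not syllables:
        return []
    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        if tp.get((a, b), 0.0) < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words


def collect_frames(segmented, min_count=1):
    """Group each word by its (preceding word, following word) frame."""
    frames = {}
    for words in segmented:
        for left, middle, right in zip(words, words[1:], words[2:]):
            frames.setdefault((left, right), Counter())[middle] += 1
    return {f: c for f, c in frames.items() if sum(c.values()) >= min_count}


# Hypothetical toy corpus: four trisyllabic nonsense words in varying orders.
utterances = [
    "tu pi ro go la bu bi da ku".split(),
    "go la bu tu pi ro pa do ti".split(),
    "bi da ku pa do ti tu pi ro".split(),
    "pa do ti bi da ku go la bu".split(),
]

tp = transitional_probs(utterances)
# The segmenter's output becomes the frame collector's input.
segmented = [segment(u, tp, threshold=0.75) for u in utterances]
frames = collect_frames(segmented)
```

In this toy corpus the within-word transitions are perfectly predictable, so segmentation is clean. With naturalistic input the segmenter would under- and over-segment, and those errors would propagate directly into the frames, which is the interaction the pipeline is meant to expose.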

