Learning Cross-linguistic Word Classes through Developmental Distributional Analysis

Abstract

In this paper, we examine the success of developmental distributional analysis in English, German and Dutch. We embed the mechanism for distributional analysis within an existing model of language acquisition (MOSAIC) that encodes increasingly long utterances, and compare the results against a measure of ‘noun richness’ in child speech. We show that, cross-linguistically, the mechanism’s success in building an early noun class is inversely related to the complexity of the determiner and noun gender system, and that merging the determiners gives very similar results across languages. These results suggest that children may represent grammatical categories at multiple levels of abstraction, reflecting both the larger category and its finer structure.

