Semantic Typology and Parallel Corpora: Something about Indefinite Pronouns


Patterns of crosslinguistic variation in the expression of word meaning are informative about semantic organization, but most methods for studying them are labor-intensive and obscure the gradient nature of concepts. We propose an automatic method for extracting crosslinguistic co-categorization patterns from parallel texts and explore the properties of the resulting data as a potential source for automatically creating semantic representations for cognitive modeling. We focus on indefinite pronouns, comparing our findings against a study based on secondary sources (Haspelmath 1997). We show that using automatic methods on parallel texts contributes to more cognitively plausible semantic representations.
