What causes category-shifting in human semi-supervised learning?

Abstract

In categorization tasks involving both labeled and unlabeled data, humans have been shown to make use of the underlying distribution of the unlabeled examples. Humans have also been shown to be sensitive to shifts in this distribution, changing their predicted classifications in response. It is not immediately obvious what causes these changes, that is, which specific properties of the distribution humans are sensitive to. Assuming a parametric model of human categorization learning, we can ask which parameters or sets of parameters humans fix after exposure to labeled data and which remain adjustable to fit subsequent unlabeled data. We formulate models describing the different parameter sets to which humans may be sensitive, and construct a dataset that optimally discriminates among these models. Experimental results indicate that humans are sensitive to all parameters, with the closest model fit being an unconstrained version of semi-supervised learning using expectation maximization.
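As a rough illustration of the fixed-versus-adjustable question, the sketch below assumes a one-dimensional, two-category Gaussian mixture: parameters are first estimated from labeled data, then expectation maximization is run on unlabeled data with only a chosen subset of parameters allowed to change. This is not the authors' implementation. The function names (`fit_labeled`, `em_unlabeled`, `boundary`), the `free` argument, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

NAMES = ("mean", "var", "weight")

def fit_labeled(x, y):
    """Estimate per-category mean, variance, and weight from labeled 1-D data."""
    params = {name: np.zeros(2) for name in NAMES}
    for k in (0, 1):
        xk = x[y == k]
        params["mean"][k] = xk.mean()
        params["var"][k] = xk.var() + 1e-6  # small floor avoids zero variance
        params["weight"][k] = len(xk) / len(x)
    return params

def posterior(x, mean, var, weight):
    """Category responsibilities for each point under the Gaussian mixture."""
    dens = weight * np.exp(-(x[:, None] - mean) ** 2 / (2 * var))
    dens = dens / np.sqrt(2 * np.pi * var)
    return dens / dens.sum(axis=1, keepdims=True)

def em_unlabeled(x, init, free=NAMES, n_iter=200):
    """Run EM on unlabeled data, updating only the parameters named in `free`."""
    mean, var, weight = (init[name].copy() for name in NAMES)
    for _ in range(n_iter):
        resp = posterior(x, mean, var, weight)   # E-step
        nk = resp.sum(axis=0)                    # M-step, restricted to `free`
        if "mean" in free:
            mean = (resp * x[:, None]).sum(axis=0) / nk
        if "var" in free:
            var = (resp * (x[:, None] - mean) ** 2).sum(axis=0) / nk + 1e-6
        if "weight" in free:
            weight = nk / len(x)
    return {"mean": mean, "var": var, "weight": weight}

def boundary(params):
    """Grid point between the two means where the posterior is closest to 0.5."""
    grid = np.linspace(params["mean"][0], params["mean"][1], 2001)
    post = posterior(grid, params["mean"], params["var"], params["weight"])
    return grid[np.argmin(np.abs(post[:, 1] - 0.5))]

# Synthetic data (an assumption of this sketch): labeled categories centered
# at -1 and 1, unlabeled data drawn from the same shapes shifted right by 1.
rng = np.random.default_rng(0)
x_lab = np.concatenate([rng.normal(-1, 0.5, 50), rng.normal(1, 0.5, 50)])
y_lab = np.repeat([0, 1], 50)
x_unl = np.concatenate([rng.normal(0, 0.5, 500), rng.normal(2, 0.5, 500)])

init = fit_labeled(x_lab, y_lab)
fixed = em_unlabeled(x_unl, init, free=())   # all parameters held fixed
full = em_unlabeled(x_unl, init)             # unconstrained variant
```

Comparing `boundary(init)` with `boundary(full)` shows how far the category boundary moves when every parameter is free, while passing a subset such as `free=("mean",)` gives one of the constrained alternatives that such a study could compare against.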

