Information versus reward in a changing world

Daniel Navarro, University of Adelaide
Ben Newell, University of New South Wales


How do people solve the explore-exploit trade-off in a changing environment? In this paper we present experimental evidence from an "observe or bet" task, comparing people's behavior in a changing environment to their behavior in an unchanging one. We present a Bayesian analysis of the observe or bet task and show that human judgments are consistent with that analysis. However, we find that people's behavior is most consistent with a Bayesian model that assumes a rate of change higher than the true rate in the task. We argue that this tendency results from asymmetric consequences: assuming that the world changes more often than it really does is not very costly, whereas assuming too low a rate of change can carry much more severe consequences.



