Skilled Bandits: Learning to Choose in a Reactive World

Abstract

In uncertain environments we must balance our need to gather information with our desire to exploit current knowledge. This is further complicated in reactive environments where actions produce long-lasting change. In three experiments, we investigate how people learn to make effective decisions from experience in a dynamic four-armed bandit task. In contrast to the diminishing rewards found in most previous studies, options were framed as skills that developed greater rewards when chosen. We find that most individuals learn effective strategies for coping with reactive environments. We present a psychological model positing that decision makers move through three distinct processing phases, and show that it accounts for key behavioral patterns across experiments.

