Flexible theft and resolute punishment: Evolutionary dynamics of social behavior among reinforcement-learning agents

James MacGlashan, Brown University, Providence, Rhode Island, USA
Michael Littman, Brown University, Providence, Rhode Island, USA
Fiery Cushman, Brown University, Providence, Rhode Island, USA

Abstract

Existing models of the evolution of social behavior typically involve innate strategies such as tit-for-tat. Yet both behavioral and neural evidence indicates a substantial role for learned social behavior. We explore the evolutionary dynamics of two simple social behaviors among learning agents: theft and punishment. In our simulation, agents employ Q-learning, a common reinforcement-learning algorithm. Agents reproduce in proportion to the objective rewards they accrue, but the subjective reward function that guides learning and action evolves by natural selection. We find that agents typically evolve a bias to punish thieves that is strong enough that it cannot be unlearned. Agents also typically evolve a bias to abstain from theft, but this bias is weak enough to permit rapid learning. This flexibility allows would-be thieves to exploit non-punishers. Finally, we show qualitatively similar results in a behavioral experiment with human participants: flexible theft, but resolute punishment.
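
As a rough illustration of the setup described in the abstract, the sketch below separates the two reward signals: learning is driven by a subjective reward, while reproduction depends on objective reward. It is a minimal sketch under stated assumptions, not the authors' implementation. The additive per-action bias, the Gaussian mutation, and all names and parameter values (`q_update`, `subjective_reward`, `reproduce`, `alpha`, `gamma`, `mutation_sd`) are hypothetical and do not come from the paper.

```python
import random


def subjective_reward(objective_reward, action, bias):
    """Objective payoff plus an evolved, action-specific bias.

    The additive form is an assumption made for illustration only.
    """
    return objective_reward + bias.get(action, 0.0)


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; `reward` is the subjective reward."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def reproduce(population, fitness, rng, mutation_sd=0.1):
    """Fitness-proportional selection over bias dictionaries.

    `fitness` holds the objective reward each agent accrued. Negative values
    are clipped to zero, and a small constant keeps every weight positive.
    Offspring inherit a parent's biases with Gaussian mutation (assumed).
    """
    weights = [max(f, 0.0) + 1e-9 for f in fitness]
    parents = rng.choices(population, weights=weights, k=len(population))
    return [
        {action: value + rng.gauss(0.0, mutation_sd) for action, value in parent.items()}
        for parent in parents
    ]


# Usage: punishing costs the agent 1.0 objectively, but an evolved bias of
# 3.0 makes it subjectively rewarding, so its learned Q-value rises.
actions = ["punish", "ignore"]
q = {}
r = subjective_reward(-1.0, "punish", {"punish": 3.0})
q_update(q, "victim", "punish", r, "victim", actions)

rng = random.Random(0)
next_generation = reproduce([{"punish": 3.0}, {"punish": 0.0}], [5.0, 1.0], rng)
```

In this sketch a large enough bias can outweigh any objective cost the agent experiences, which is one way to picture a disposition that cannot be unlearned, whereas a small bias is easily overridden by experience.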



