How do people explore to gain rewards in uncertain dynamical systems? Within a reinforcement learning paradigm, control normally involves trading off exploration (i.e., trying out actions to gain more knowledge about the system) against exploitation (i.e., using current knowledge of the system to maximize reward). We study a novel control task in which participants must steer a boat on a grid, and we assess whether they explore strategically to earn higher rewards later on. We find that participants explore strategically yet conservatively: they explore more when mistakes are less costly, and they practice actions that will be needed later on.