A crucial component of event recognition is understanding the roles that people and objects take: did the boy hit the girl, or did the girl hit the boy? We often make these categorizations from visual input, but do we automatically analyze the world in terms of event structure even when our attention is otherwise occupied? In two experiments, participants made speeded gender judgments for a continuous sequence of male-female interaction scenes. Even though gender was orthogonal to event roles (the Agent could be either male or female), a switching cost was observed when the target character’s role reversed from trial to trial, regardless of whether the actors, events, or side of the target character differed. Crucially, this effect held even when nothing in the task required attention to the relationship between actors. Our results suggest that extraction of event structure in visual scenes is a rapid and automatic process.