We investigated how constraints on the concurrent visual context modulate the use of prior gender and action cues and of stereotypical knowledge during situated language comprehension. Participants saw videos of hands performing actions, then inspected pictures of two potential agents (one female, one male) while listening to German OVS sentences. Unlike in Rodríguez et al. (2015), the concurrent context also included a picture of the videotaped object and a ‘competitor object’. Fixations to the agents' faces were measured during comprehension. We manipulated the match between the videotaped actions and those described by the sentence, and the stereotypicality match between the described actions and the gender of the videotaped agents. We replicated the preference for the target agent's face (gender-matching the videotaped hands). Action-verb mismatch effects emerged earlier and were modulated by stereotypicality. The results suggest that the visual availability of objects during comprehension facilitates the activation of representations from both recent and unseen events, favoring stereotypical expectations.