When referring to a target object in a visual scene, speakers are assumed to consider some visible distractor objects to be more relevant than others. However, previous research testing this assumption has mainly relied on offline measures of visual attention, such as the occurrence of overspecification in speakers’ target descriptions. In the current study, we therefore combine online (eye-tracking) and offline (overspecification) measures of attention to study how perceptual grouping affects scene perception and reference production. We manipulated three grouping principles: region of space, type similarity, and color similarity. All three factors showed effects, either on eye movements (region of space), on overspecification (color similarity), or on both (type similarity). The results for type similarity provide direct evidence for a close link between scene perception and reference production.