According to current theories of multisensory integration, the weighting of different modalities depends on the reliability of the involved sensory estimates. Top-down modulations have been studied to a lesser degree. Furthermore, it is still debated whether working memory maintains multisensory information in a distributed, modality-specific fashion or as an integrated representation. To investigate whether multisensory integration is modulated by task relevance, and to probe the nature of the working memory representations, we combined an object interaction task with a size estimation task in immersive virtual reality. During the object interaction, we induced a multisensory conflict between seen and felt grip aperture. Both visual and proprioceptive size estimates showed a clear modulation by the experimental manipulation. The results thus suggest that multisensory integration is not driven by reliability alone but is also biased by task demands. Furthermore, multisensory information appears to be represented by means of interacting modality-specific representations.
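The reliability-based weighting referred to above is commonly formalized as inverse-variance (maximum-likelihood) cue combination: each modality's estimate is weighted by its reliability, i.e. the inverse of its variance. The following is a minimal illustrative sketch of that standard model, not the analysis used in this study; the function and variable names are hypothetical.

```python
def combine_estimates(est_visual, var_visual, est_proprio, var_proprio):
    """Reliability-weighted fusion of two sensory estimates.

    Weights are proportional to reliability (1/variance), so the more
    reliable modality dominates the combined percept.
    """
    rel_v = 1.0 / var_visual      # reliability of the visual estimate
    rel_p = 1.0 / var_proprio     # reliability of the proprioceptive estimate
    w_v = rel_v / (rel_v + rel_p)
    w_p = 1.0 - w_v
    fused = w_v * est_visual + w_p * est_proprio
    fused_var = 1.0 / (rel_v + rel_p)  # combined estimate is less variable
    return fused, fused_var

# Equal reliabilities -> simple average of the two estimates
print(combine_estimates(10.0, 1.0, 14.0, 1.0))  # (12.0, 0.5)

# A noisier proprioceptive signal shifts the fused estimate toward vision
print(combine_estimates(10.0, 1.0, 14.0, 3.0))
```

Note that in this scheme the fused variance is always smaller than either single-modality variance, which is the usual benefit attributed to integration; the abstract's point is that task demands can bias the weights away from this purely reliability-determined solution.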