Valence Weakly Constrains the Information Density of Messages

David Vinson, University of California, Merced, Merced, California, United States
Rick Dale, University of California, Merced


Recent analyses of language as a transmission medium have fruitfully applied information theory to sequences of words. In most cases, the information contained in a word is defined as a function of that word’s local context (e.g., its probability conditioned on the preceding word). A central assumption in much of this work is that context plays an important role. For example, the hypothesis of uniform information density (Jaeger, 2010) cannot be tested without some notion of context. We sought a structured corpus in order to explore how a broader notion of context might shape the observed information density of messages. Specifically, how might a language user’s affective state influence their language use? We used a database of over one hundred thousand consumer reviews that includes an assortment of user-related variables. These variables, such as the overall rating of a review (used here as a proxy for the user’s affect), appear to be related to basic information-theoretic measures, such as the average amount and the variability of the observed information of a review’s words. We discuss these results in terms of the broader context that may shape the information structure of messages, and relate these findings to existing theories.
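
As an illustration of the measures named above, the following minimal sketch computes the information of each word as the negative log probability of that word given the preceding word, and then summarizes a review by the mean and standard deviation of those values. It is not the authors' implementation: the toy reviews, the start marker `<s>`, the add-one smoothing, and all function names are assumptions introduced only for this example.

```python
import math
from collections import Counter
from statistics import mean, pstdev

START = "<s>"  # hypothetical start-of-review marker (an assumption)


def train_bigram(token_lists):
    """Count contexts and (previous word, word) pairs over tokenized reviews."""
    context_counts = Counter()
    bigram_counts = Counter()
    for tokens in token_lists:
        padded = [START] + tokens
        context_counts.update(padded[:-1])             # every word that acts as a context
        bigram_counts.update(zip(padded, padded[1:]))  # (previous word, word) pairs
    return context_counts, bigram_counts


def word_information(tokens, context_counts, bigram_counts, vocab_size):
    """Per-word information in bits: -log2 P(word | previous word).

    Add-one smoothing is an assumption made here so that unseen pairs
    receive a nonzero probability; the abstract does not specify one.
    """
    padded = [START] + tokens
    bits = []
    for prev, word in zip(padded, padded[1:]):
        p = (bigram_counts[(prev, word)] + 1) / (context_counts[prev] + vocab_size)
        bits.append(-math.log2(p))
    return bits


def review_summary(tokens, context_counts, bigram_counts, vocab_size):
    """Average amount and variability (standard deviation) of information."""
    bits = word_information(tokens, context_counts, bigram_counts, vocab_size)
    return mean(bits), pstdev(bits)


# Toy (rating, tokens) pairs standing in for consumer reviews (invented data).
reviews = [
    (5, "great food great service".split()),
    (1, "bad food".split()),
]
token_lists = [tokens for _, tokens in reviews]
vocab_size = len({w for tokens in token_lists for w in tokens})
context_counts, bigram_counts = train_bigram(token_lists)

for rating, tokens in reviews:
    avg, sd = review_summary(tokens, context_counts, bigram_counts, vocab_size)
    print(rating, round(avg, 3), round(sd, 3))
```

With a real corpus, the per-review mean and standard deviation could then be compared across rating levels to examine how the rating relates to each measure.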


