Explanation Versus Prediction: Statistical Differences in Detecting Fraudulent Events Do Not Necessarily Have Predictive Power

Abstract

A large body of research in the cognitive sciences relies on examining statistical differences. While examining differences can help explain behavior, a statistically reliable difference does not necessarily have predictive power. Yet understanding behavior involves both explaining and predicting it. As a case in point, the current study used a naturalistic email dataset to examine statistical differences and predictive power in fraudulent activities. Differences in first- and third-person pronoun use between liars and people telling the truth are widely reported in the literature. The current study tested for an effect of fraudulent events on pronoun use in emails from the Enron corpus and additionally applied a machine learning approach to estimate whether pronoun use predicts fraud. While the ratio between first- and third-person pronoun use was related to fraud, this construct did not have predictive power. The study highlights an important conclusion for the cognitive sciences: the importance of not only testing for differences but also applying predictive models. In this way, it can be determined whether a construct that has an effect on an outcome can also predict that outcome.
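The gap between a reliable difference and predictive power can be illustrated with a small simulation. The sketch below is not the study's analysis and does not use the Enron corpus: it assumes a hypothetical standardized pronoun-ratio feature, a small group difference of 0.1 standard deviations, and a simple held-out threshold classifier as a stand-in for a machine learning model. All names and parameter values are illustrative assumptions.

```python
import random
import statistics
from math import sqrt

# Assumption: simulated data, NOT the Enron corpus. The feature is a
# hypothetical standardized pronoun ratio with a small group difference.
N_PER_GROUP = 20000
EFFECT_SIZE = 0.1  # assumed mean difference in standard-deviation units

rng = random.Random(0)
truthful = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
fraud = [rng.gauss(EFFECT_SIZE, 1.0) for _ in range(N_PER_GROUP)]


def welch_t(a, b):
    """Welch's t statistic for the difference in means (b minus a)."""
    se = sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.fmean(b) - statistics.fmean(a)) / se


def held_out_accuracy(a, b):
    """Fit a midpoint threshold on the first half of each group and
    score it on the second half (a simple stand-in for a classifier)."""
    half_a, half_b = len(a) // 2, len(b) // 2
    train_a, test_a = a[:half_a], a[half_a:]
    train_b, test_b = b[:half_b], b[half_b:]
    mean_a, mean_b = statistics.fmean(train_a), statistics.fmean(train_b)
    cut = (mean_a + mean_b) / 2
    b_is_higher = mean_b > mean_a  # learn the direction from training data
    correct = sum((x > cut) != b_is_higher for x in test_a)
    correct += sum((x > cut) == b_is_higher for x in test_b)
    return correct / (len(test_a) + len(test_b))


t_stat = welch_t(truthful, fraud)
accuracy = held_out_accuracy(truthful, fraud)

print(f"Welch t statistic: {t_stat:.2f}")
print(f"Held-out classification accuracy: {accuracy:.3f}")
```

Under these assumptions the test statistic is large because the sample is large, while the held-out accuracy stays close to the 50% chance level because the two distributions overlap almost entirely. The same reasoning is why an effect of a construct on an outcome has to be checked separately with a predictive model.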

