Reading to young, pre-literate children is associated with better language and reading outcomes, but the underlying mechanisms are poorly understood. This work aims to clarify those mechanisms. We hypothesized that vocabulary diversity and sentence complexity differ between picture books and child-directed speech, and we set out to quantify those differences. We built a corpus consisting of the text of 100 picture books that caregivers might read to pre-literate children. We then compared the vocabulary distribution and the frequency of certain complex sentence types in that corpus with those in child-directed speech from the CHILDES corpus. We found that picture books contained more unique word types for a given number of tokens and a higher proportion of complex sentences. Shared book reading may therefore contribute to improved language outcomes by exposing children to words and sentence structures that they would not otherwise encounter.
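To make the vocabulary comparison concrete, the following is a minimal sketch of counting unique word types at a fixed number of tokens, so that samples of different lengths are compared on equal footing. The tokenizer, the helper names `tokenize` and `types_at`, and the two toy strings are assumptions made for illustration; they are not taken from the picture-book corpus, from CHILDES, or from the analysis pipeline used in this work.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplistic, assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def types_at(tokens, n):
    """Return the number of unique word types among the first n tokens."""
    if n > len(tokens):
        raise ValueError("sample has fewer than n tokens")
    return len(set(tokens[:n]))


# Toy strings invented for illustration; not drawn from either corpus.
book_text = "the tiny owl blinked at the silver moon above the sleepy forest"
speech_text = "look at the ball look at the ball do you want the ball"

book_tokens = tokenize(book_text)
speech_tokens = tokenize(speech_text)

# Compare both samples at the same token count, since type counts grow with sample size.
n = min(len(book_tokens), len(speech_tokens))
book_types = types_at(book_tokens, n)
speech_types = types_at(speech_tokens, n)

print(f"tokens compared: {n}")
print(f"book types: {book_types}, speech types: {speech_types}")
```

Truncating both samples to the same length is only one way to control for sample size; averaging type counts over many random samples of `n` tokens would be a reasonable alternative.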