We present a novel modeling framework for representing category exemplars and features. The approach treats each category exemplar as a probability distribution over a hierarchically structured graph. The model jointly learns each exemplar's mixture across categories in the graph and a feature representation for each node in the graph, including nodes for which no data is directly observed. We apply the model to two distinct types of data: animal-by-feature matrices and Wikipedia documents. We demonstrate that the model learns feature representations for nodes that are not assigned any data (i.e., it generalizes to new categories). The model also improves the specificity of feature representations for nodes with observed data by explaining away more general features, attributing them to parent nodes. Finally, we show that the model captures additional psychological aspects of concept representation, such as typicality ratings.