Older adults (OA) need to make many important and difficult decisions. Often, there are too many options to explore exhaustively, which creates the ubiquitous tradeoff between exploration and exploitation. How do OA handle these complex tradeoffs? We investigated age-related shifts in solving exploration-exploitation tradeoffs as a function of the complexity of the choice environment. Participants played four-option and eight-option bandit problems, with the numbers of gambles and the average rewards displayed on the screen. OA reliably performed worse in the more complex choice environment and also deviated more from an optimality model (Thompson sampling), which keeps track of uncertainty beyond just the mean or the last reward. OA seem to process important information sub-optimally in more complex choice environments, suggesting limited representations of future rewards. This interpretation is consistent with multiple contexts in the complex literature on cognitive aging, in particular with challenges in the maintenance of goal-directed learning.
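To illustrate what it means for Thompson sampling to track uncertainty rather than only a mean or the last reward, the following is a minimal sketch in Python. It is not the model fitted in the study: it assumes binary (0/1) rewards with a Beta posterior per option, and the function name `thompson_sampling` and its parameters `true_probs`, `n_trials`, and `seed` are hypothetical names chosen for this example.

```python
import random


def thompson_sampling(true_probs, n_trials, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling on a multi-armed bandit.

    Assumption for illustration only: each option pays 1 with a fixed
    probability (true_probs) and 0 otherwise.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # observed rewards of 1 per option
    failures = [0] * n_arms   # observed rewards of 0 per option
    choices = []

    for _ in range(n_trials):
        # Draw one plausible reward rate per option from its posterior,
        # Beta(successes + 1, failures + 1). Rarely sampled options have
        # wide posteriors, so they are still picked occasionally.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Choose the option whose sampled rate is highest.
        arm = max(range(n_arms), key=samples.__getitem__)

        # Observe a simulated reward and update that option's counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        choices.append(arm)

    return choices, successes, failures


if __name__ == "__main__":
    # Hypothetical eight-option environment.
    probs = [0.2, 0.3, 0.4, 0.5, 0.55, 0.6, 0.65, 0.7]
    choices, wins, losses = thompson_sampling(probs, n_trials=500)
    print("times each option was chosen:",
          [choices.count(a) for a in range(len(probs))])
```

Because each choice depends on a random draw from the full posterior, options with few observations keep a chance of being selected, and exploration fades as the counts grow. A rule that used only the running mean or the last reward would not have this property.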