Making better decisions: explore versus exploit

Every day we face choices between doing something whose outcome we already know and trying something new. A classic example is choosing a restaurant on any given night. You can go to a place where you know you will get a good meal, or try the new place you’ve been curious about. The first gives you good food but no new information; the second gives you new information, which may turn out to be a bad meal or a new favorite.

Our intuition tells us that this is what life is about: known versus unknown, novelty versus comfort, risk versus enjoying what we know we love. While it might seem simple, it turns out to be remarkably difficult to set a strategy for making these decisions. So much depends on context, and on personal traits and biases that may tilt us one way or the other.

Christian and Griffiths

How to choose?

In their popular book Algorithms to Live By, Brian Christian and Tom Griffiths explain that computer scientists have been working on balancing this tradeoff for half a century. In computing, exploration means gathering information while exploitation means using the information you have to get a predictably positive result.

The key insight from computer science is that decisions aren’t made in isolation; it’s not just about the next decision, but about all the decisions you will make about the same options in the future. This, in many ways, is what makes the explore/exploit dilemma so valuable to understand: it embodies a conflict evident in all human action. The value of trying new things can only go down over time because our opportunities to savor the results dwindle. We run out of time and the world changes. The flip side, according to Christian and Griffiths, is that the value of exploitation can only go up.

Explore when you will have time to use the resulting knowledge, exploit when you’re ready to cash in. The interval makes the strategy.

Christian and Griffiths
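To make “the interval makes the strategy” concrete, here is a minimal sketch in Python using the restaurant example. It is not taken from the book. All the numbers are made up, and it assumes, as a simplification, that a single visit reveals whether the new place is great or bad, after which you stick with the better option for the remaining nights.

```python
def value_of_exploiting(nights_left, known_payoff):
    """Total enjoyment from going to the familiar place every remaining night."""
    return known_payoff * nights_left


def value_of_exploring(nights_left, known_payoff, p_great, great_payoff, bad_payoff):
    """Expected total enjoyment from trying the new place tonight.

    Assumption (illustrative only): one visit reveals the new place's quality.
    If it is great, you keep going there; if it is bad, you return to the
    familiar place for the remaining nights.
    """
    if nights_left <= 0:
        return 0.0
    if_great = great_payoff * nights_left
    if_bad = bad_payoff + known_payoff * (nights_left - 1)
    return p_great * if_great + (1 - p_great) * if_bad


def best_choice(nights_left, known_payoff=0.7, p_great=0.5,
                great_payoff=0.9, bad_payoff=0.3):
    """Return 'explore' or 'exploit' for the given number of nights left."""
    explore = value_of_exploring(nights_left, known_payoff, p_great,
                                 great_payoff, bad_payoff)
    exploit = value_of_exploiting(nights_left, known_payoff)
    return "explore" if explore > exploit else "exploit"


if __name__ == "__main__":
    # Hypothetical horizons: your last night in town versus a longer stay.
    for nights in (1, 3, 10):
        print(nights, best_choice(nights))
```

With these made-up payoffs, a single remaining night favors the familiar restaurant, while a ten-night stay favors trying the new one: the longer the interval, the more time there is to use what you learn.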

In general, people tend to over-explore, favoring what’s new over what’s best. But as the world grows more complex and things change faster (new products, new competitors, new ideas), exploitation may no longer be a “sure bet.” In that case, purely exploring can be a rational choice.

Human learning is a paradox. As we learn more, we become less open to information from the environment. When we work together, each person brings a different set point for the explore/exploit tradeoff. We may think of someone as risk averse when, in fact, they have a different intuition about the need for new information versus the cost of gathering it. Another person may be highly novelty-seeking in all pursuits, and their restlessness pulls the team’s decisions toward over-exploring, creating tension in a different way.

In these situations it can be helpful to remember:

  • It’s not so much about the next individual decision as about all the decisions to be made about the same options in the future. Examine the broader context and timing.
  • If humans have a bias, it’s to over-explore. Listening to those who have “seen this movie before” or have other ideas for exploitation can reveal under-exploited data.
  • Exploring can be costly, which means failure has to be OK. Seen through the lens of explore/exploit, failure is the engine for learning; without failure, you’re just recycling what’s already known.
  • When the world is changing fast, there may be no choice but to explore. Simply put, what may once have been a “sure bet” is now too unpredictable. If the data indicate that old strategies are no longer delivering predictably good results, start talking about switching to exploration.

This article is one in a series of hacks, tips and tricks for making better decisions.
