Cautions on being 'experiment driven'

I hear a lot of people say they want their startup to be more ‘experiment driven’. I’ve encouraged that behaviour a lot myself. Take bets, see how they progress, gather data and use the results to validate or invalidate some sort of hypothesis.

If executed well, this approach can be a path to discovering things you’d never find by incrementally iterating on what you’re already doing. If executed poorly, experiments can be both a bottleneck and an excuse to delay changes.

I got a lot of my bias for an experiment-driven process from an early career in academia. During my PhD, everything was an experiment. Step one was coming up with a hypothesis, then designing some way to prove yourself right or wrong. You look at the data, check that it’s not just a fluke and then deem the experiment a success. Successful experiments add evidence to a larger idea and the cycle continues.

If you’re building a business, the process is fairly similar. You have an idea of how to improve some part of the business. Usually it’s a metric like customer acquisition cost (CAC) or churn, but it could just be a feeling of “people will like this more”. You come up with a small way to test that idea: maybe a customer interview or an MVP in the early stages; as you mature, it may be a full-blown A/B test in your product. You let the test run its course and look at the data to work out if your CAC did in fact decrease. If it did, you roll out the change to everyone; if it didn’t, you scrap the test and try something else.
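The “look at the data and decide” step above can be sketched as a standard two-proportion z-test. This is a generic illustration, not a specific product’s method; the function name and numbers are hypothetical.

```python
# Sketch of the decide step: did the variant beat the control?
# Uses a one-sided two-proportion z-test (standard method; names illustrative).
from math import sqrt
from statistics import NormalDist

def better_than_control(conv_a, n_a, conv_b, n_b, alpha=0.05):
    """True if variant B's conversion rate beats control A at significance alpha."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that A and B are the same
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    return z > NormalDist().inv_cdf(1 - alpha)  # one-sided: is B better?

# Roll out if True, scrap and try something else if False:
print(better_than_control(100, 1000, 150, 1000))  # clear winner
print(better_than_control(100, 1000, 101, 1000))  # indistinguishable noise
```

In practice you’d also check the size of the lift, not just the binary verdict, before rolling anything out.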

If only it were so simple.

To start with, there are plenty of ways that experimentation can go wrong. If you don’t have the budget or the usage to gather significant results quickly, you can be blocked waiting. There are also plenty of experiments that end up showing no real difference in outcomes. Lastly, if run poorly, an experiment-heavy culture can become a blocker for making decisions.

Any experiment can have one of three outcomes. One, the change you made is better. Two, the change you made is worse. Three, there’s not really enough data to know, so for all intents and purposes they’re the same.

Statistical significance tests require enough data to show the change was not just random chance. The bigger the change, the less data you need. The reality, though, is that most of the experiments you run show small steps up or down. When this is the case, you’ll need to wait for the numbers to come in. Ideally, you’re not also changing related things at the same time, or you risk muddying the waters of causality.
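To put rough numbers on “bigger change, less data”, here’s a standard two-proportion sample-size calculation. This is my illustration of the general principle, not anything prescribed in the text; the baseline rates chosen are arbitrary.

```python
# Approximate samples needed per arm to detect a conversion-rate change
# (standard two-proportion power formula; baseline rates are illustrative).
from statistics import NormalDist

def samples_per_group(p_base, p_new, alpha=0.05, power=0.8):
    """Rough n per arm to detect p_base -> p_new at the given alpha and power."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z(power)           # desired statistical power
    var = p_base * (1 - p_base) + p_new * (1 - p_new)
    return (z_alpha + z_beta) ** 2 * var / (p_base - p_new) ** 2

# A tiny lift (10% -> 10.2%) vs. a big one (10% -> 12%):
print(round(samples_per_group(0.10, 0.102)))  # hundreds of thousands per arm
print(round(samples_per_group(0.10, 0.12)))   # a few thousand per arm
```

The tiny lift needs roughly a hundred times more users than the big one, which is exactly why small-step experiments stall a low-traffic startup.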

Generally, if you’re a growing startup, you have two levers to increase sample size. If you’re testing something to do with marketing or onboarding, you can spend more money to fill the funnel. For most other tests, the only lever you have is time. Budget and patience can both be hard to come by in a startup, and I’ve seen many experiments cancelled early because of this.

Sometimes, cancelling the experiment is the right choice.

Many changes ultimately don’t shift metrics in a way that matters. Once you’ve saturated the market, a 2% change in some metric can be game-changing, but early startup go-to-market (GTM) is about finding the 20% or 100% shifts along the way. It might be new positioning or new features that shift these metrics significantly. You’re still working out your product and business, so there’s no reason to believe your current metrics are anywhere near their maximum.

When this is the case you’re better off aiming for velocity.

In most startups, there are big shifts in metrics waiting to be found, and running experiments remains a good way to find them. Because big changes require fewer data points, once you stumble across one the difference will be obvious quickly. You’ll still need a minimum of patience to avoid chasing shadows, but you can’t let experiments become blockers to making choices.

The worst thing you can do is let the sentiment of “I don’t know, let’s run an A/B test” become a crutch to avoid making key decisions. Your job as a leader is to make the important choices your business needs to grow. Waiting for the market to make those choices for you through a never-ending series of A/B tests reduces you to a Silicon Valley ‘Hotdog/Not Hotdog’ AI, labelling each experiment ‘good change’ or ‘bad change’.

The word ‘taste’ has been thrown around a lot recently, often as a vague defense against things AI can do, but in this context it is fairly important. Unless you have a truly astronomical budget, you will not be able to A/B test your way from nothing to the Fortune 500. Early choices, vision, mission and key decisions all need to be made quickly and explicitly. The biggest threat to any startup is running out of time, and waiting for every decision to be irrefutable at p < 0.05 significance is a surefire way to run out.