Why do we assume experimentation is about financial gain?

The over-simplification and virality of click-bait case-study headlines, designed for commercial gain and a wider audience, have led us to associate experimentation with financial gain. I blame Optimizely, the $300m button and WhichTestWon.

This is part of a 4-article series on whether, and how, you can attribute revenue to experimentation:

  1. Why do we assume experimentation is about financial gain?
  2. No, you can’t accurately attribute, nor forecast, revenue to experimentation. Here’s why.
  3. Yes. We need to attribute revenue to experimentation. Here’s how: Part 1
  4. Yes. We need to attribute revenue to experimentation. Here’s how: Part 2

A beautiful juxtaposition. A real email from a real client, and one that I'll never forget. Not because of the unusual request to attribute revenue to individual and cumulative solutions, and then annualise them.

But because the sender went on to state that they were off on a 10-day TV advertising shoot (while clearly understanding my reticence about providing financial returns against investment).

It left me curious. If our contract renewal was so dependent on the attribution of 48 solutions, how must the bigger advertising agencies feel? You know, with the cost, and subsequent lack of attribution, of a national TV advertising campaign.

“…people hire brand agencies from New York with double-digit rebranding and they don’t care at all about ROI. They hire someone like us for AB tests and they start discussing if the uplift is 3.1% or 3.2%.” André Morys

OK, we’re being slightly flippant. I won’t comment on how much Beyoncé cost for starring in her Pepsi commercial all those years ago (it was $50 bloody million). But I am curious how people assume that we can easily attribute and annualise revenue to an experiment. Or why they expect to get a financial gain from experimentation, period.

Perhaps it’s the assumption of CRO managers wearing white lab coats, and the belief that we’re smart?

“We are supposed to be scientists with white lab coats and nerdy people — we should not only know what the result is, but exactly how much it’s going to pay back over the next 3 years.” Craig Sullivan

Or perhaps it’s the fact we’re number people. And therefore surely we must know what that return will be, because we deal with numbers all the time?

“Optimisers in the industry have created a culture and influence that’s born out of numbers. It’s all numbers numbers numbers. We put our blinkers on and tear off in the direction of those numbers” Chris Callaghan

I sound bitter, I know. It’s just that I firmly believe experimentation was not designed for the purpose of revenue attribution. It never was. You might say that the notion of an experiment was designed to prevent scurvy. The beauty of AB testing comes from validation: a level of confidence in whether your hypothesis is true. Why should that come with a financial gain?

To assign a financial figure to such an art is a gross over-simplification and a commercialised view of what experimentation is designed to do. Much like the recent European Super League proposal commercialising the history and art of football in the Premier League. That’s probably the subconscious inspiration for this writing.

Hell, I ask, where did that notion of experimentation and financial benefit even come from?

“Should experimentation be about revenue? It’s nonsensical to judge every aspect of your business on revenue just as it’s nonsensical to judge experimentation on just revenue.” Hazjier Pourkhalkhali

Why do we think like this?

Why? Why is it so common for people to think that we can accurately attribute a revenue gain to AB tests, or to changes we make on site? Before the why comes the how. How did this phenomenon come about?

I’ll go into why that is the case in further musings next week, and the week after. And, like my writings on prioritisation, we must first understand why we hold the preconception that revenue can easily be attributed to experimentation, before we understand what to do about it.

AB testing is over-simplified

It has to be, in order to be more widely accessible.

Experimentation is a complex methodology where a lot of things can go wrong, and the fate of a poor feature or bad experience rests on the shoulders of a decision. To accelerate adoption, we have had to over-simplify the methodology in and of itself, in order to evangelise it. Are we, therefore, our own worst enemy?

Perhaps this is, indeed, our fault. We have made a rod for our own backs.

As the experts, we set the expectation for the rest of the industry. “There are so many stupid case studies out there that directly attribute the experiment impact — people that do stupid tests that have a 70% confidence level, a 70% uplift, and create €70m for a client.” André Morys

André isn’t wrong. Does showing a ‘related products’ bar above the reviews section really increase the revenue of my entire site by 28%? Or does moving the words “men” and “women” to either side of the logo in the navigation really increase site revenue by 9.4%?

* Those are real examples. I was going to link to them, but thought I'd be thoroughly told off for outing them.
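André's point about 70% confidence levels can be made concrete with a quick simulation (my own illustrative sketch, not from the article, using a standard two-proportion z-test): run a batch of A/A tests, where both variants share the same true conversion rate, and count how often a "70% confidence" threshold still declares a winner.

```python
import random
from math import erf, sqrt

random.seed(42)

def simulate_aa_test(n=2000, rate=0.05):
    """Simulate one A/A test (no real difference) and return its p-value."""
    a = sum(random.random() < rate for _ in range(n))  # conversions, variant A
    b = sum(random.random() < rate for _ in range(n))  # conversions, variant B
    # Pooled two-proportion z-test
    p_pool = (a + b) / (2 * n)
    se = (2 * p_pool * (1 - p_pool) / n) ** 0.5
    if se == 0:
        return 1.0
    z = abs(a / n - b / n) / se
    # Two-sided p-value from the normal CDF (erf-based, stdlib only)
    return 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))

trials = 1000
# "70% confidence" means declaring a winner whenever p < 0.30
false_wins = sum(simulate_aa_test() < 0.30 for _ in range(trials))
print(f"'Winners' at 70% confidence: {false_wins / trials:.0%}")  # roughly 30%
```

In other words, at that threshold roughly three in ten experiments where nothing changed at all would still be reported as winners, which is why case studies built on such tests tell us very little.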

Going back to the days of WhichTestWon: an online game that reported on individual experiments, asking users to choose which variant ‘won’ and subsequently revealing the results of that experiment as £ and conversion % uplift.

When did we start attributing financial gain?

Optimizely might be one of the first.

Their ‘first’ piece of marketing was an absolute barn-stormer that they used for years to come. And so they should have. I genuinely think it was a brilliant, engaging experiment. I understand why they don’t use it anymore, given that a certain Republican may have divided judgement somewhat.

How Obama raised $60m by running a simple experiment

The experiment was really nice from a marketing perspective: it was gamified, well-known (used, I think, in almost every Optimizely presentation I saw between 2010 and 2017), demonstrated a surprising series of results, and was engaging and topical. But most importantly, it fed the belief that a series of four variations unlocked the key to a disproportionate revenue gain of $60m. It’s in the title. It grabs your attention and simplifies a complex concept.

Attributing financial gain to experimentation has long been a staple of AB testing tools and agencies alike (including ourselves; more on that later).

Oh, wait, there’s the $300m button, too (published a year earlier than the Obama example). Much like a Manchester United transfer rumour, no one knows who it was for, but some say Best Buy. In essence, the button copy was changed — whether within an experiment or not, we do not know — replacing the “Register” button with “Continue as Guest” and including some reassuring copy about what would happen next.

And where are we now?

The popularity and exposure of AB testing has led to a viral effect. If one company saw a $300m uplift from changing the wording of a button, then can those effects be replicated on a different website or platform?

The answer is assumed to be yes. And one case study (folly or not) spawns another, which spawns another, which gets the attention of other professionals who try the same on their website, creating a domino effect of what we now know today as ‘best practice’.

That best practice is in both the solution and the approach. The focus on the output, the numbers, and therefore the perceived measurability of what we do as a practice has created a perfect storm of “a practice that generates unprecedented revenue and returns” and one that is “accurately measurable”.

“Because we say everything is measurable online — now people also believe that you can put an exact number on it. But that’s not possible. That’s the downside of believing that you can measure everything online.” Annemarie Klaassen

The sheer fact that the industry has framed experimentation around what Morys calls “hard ROI” (which he goes on to say is a “big pain in the ass”) means that those who purchase such services and tools, or consume that content, are conditioned to think in the same way.

People are becoming more and more educated, I believe, but the damage has been done. We need to make the case for why these case studies aren’t reliable, and inform our stakeholders.

Am I being overly precious?

Sometimes, I admit, I easily slip into evangelical practitioner mode. It’s easy to do. To fly the flag for those experimenting, rather than truly understanding how a single methodology fits into a wider business purpose, or accepting the maturation of a practice into a commercial field. That being said, I will never accept an ESL (European Super League) bid. We must be emotionally mature enough to appreciate how the piece fits into the jigsaw.

“It’s an area of grace; I don’t expect the FD to be as knowledgeable about experiment-driven product development as I am; that’s an unrealistic expectation.” Matt Lacey

At the end of the day, we’re just talking about attribution. And it’s been a difficult, almost subjective, nut to crack. Actually, we’re talking about humans.

“CRO is not a pure science. You can’t truly separate out all your variables. We’re dealing with things that are just messy. We’re testing with humans, and humans are complex.” Matthew Edgar

I’ll go into this in more detail in the next two or three posts.

Summary

It is difficult, if not impossible, to provide an accurate ROI for experimentation. The over-simplification and virality of click-bait headlines for case studies designed for commercial gain and a wider audience has led to associating financial gain to an output.

Should we do this? No.

Do we have to? Counterintuitively, I believe so.

To add to Craig Sullivan’s quote above, we need to appreciate the business value of experiments. That doesn’t have to be a financial gain, but I believe it does help stakeholders understand impact. How that is communicated without being over-evangelical is something I’m delving into next week and the week after.

“There’s no point being scientific about testing if you’re also not scientific about measuring the business value of experiments.” Craig Sullivan

We need to help our stakeholders make informed decisions. Any attempt to quantify that is challenging, yes; regardless, they will still want to look at the overall impact:

  1. because financial gain is, ultimately, what they associate performance with
  2. because of the preconception that an experiment is directly linked to revenue gain

In the next article, I’ll be looking at why we can’t accurately attribute revenue to experimentation.

This is part of a 4-article series on whether, and how, you can attribute revenue to experimentation:

  1. Why do we assume experimentation is about financial gain?
  2. No, you can’t accurately attribute, nor forecast, revenue to experimentation. Here’s why.
  3. Yes. We need to attribute revenue to experimentation. Here’s how: Part 1
  4. Yes. We need to attribute revenue to experimentation. Here’s how: Part 2

Don’t miss my spurious thoughts on optimisation

I currently work just two days a week (and am thoroughly enjoying it). For the other three days, in a bid to keep myself off Twitter and away from reading about cryptocurrency bull markets, I learn, write and share. For nothing more than creating debate and enjoyment 😎

Subscribe to my newsletter at https://optimisation.substack.com/. New articles come out every Friday morning at 7am for you early risers.

Stories and advice within the world of conversion rate optimisation. Founder @ User Conversion. Global VP of CRO @ Brainlabs. Experimenting with 2 x children