Why aren't your experiments working?

Last week we covered how to design better experiments.

This week, let's tackle something equally important: executing them without letting politics, bias, or wishful thinking corrupt your results.


Get buy-in before you begin

When someone finally asks about stakeholder alignment before launching a test...

The quality of your experiment matters less than your stakeholders' belief in the results.

As humans, we're hard-wired to reject evidence that challenges our existing beliefs. For corporate innovators, this means your stakeholders will always find ways to dismiss test results they don't like - unless you've involved them from the start.

Here's how to get better at bringing them on-side:

  1. Map decision-makers and influencers for your project
  2. Involve critics early and get them to define success criteria with you
  3. Document your agreed-upon thresholds before testing
  4. Create a pre-mortem to flush out potential objections

A retail client of ours learned this the hard way. Their experiment showed clear evidence a new concept wouldn't work. But because they hadn't agreed with their sponsor on what "wouldn't work" meant before starting, the results kicked off months of debate instead of decisive action.

When we ran the next test, we started differently:

  • Got the key sponsor to define success metrics with us
  • Documented their biggest concerns up front
  • Agreed on the exact go/no-go thresholds
  • Built answering their questions into the test design

When the data came in below the threshold (again!), the project was killed in one meeting. No debates, no politics.
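
If it helps to make "agreed thresholds" concrete, you can write them down as data the whole team signs off on, so the decision is mechanical once results arrive. Here's a minimal sketch in Python - the metric names and numbers are made up for illustration, not from the client example above:

  # Hypothetical go/no-go criteria, agreed and documented before the test starts.
  AGREED_THRESHOLDS = {
      "signup_conversion_rate": 0.04,   # go only if >= 4%
      "week_4_retention": 0.25,         # go only if >= 25%
  }

  def go_no_go(results: dict) -> str:
      """Return 'go' only if every agreed metric clears its threshold."""
      misses = [
          name for name, floor in AGREED_THRESHOLDS.items()
          if results.get(name, 0.0) < floor
      ]
      return "go" if not misses else f"no-go (missed: {', '.join(misses)})"

  # Example: the data comes in below threshold, so the call is already made.
  print(go_no_go({"signup_conversion_rate": 0.021, "week_4_retention": 0.31}))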


Build the minimum viable test

Your test setup vs what you actually needed...

The biggest mistake we see teams make when executing experiments is building more than they need to answer their question. And it's not just wasteful - it actively damages your results.

More complexity means more variables, more variables mean more noise, and noise makes it harder to spot real signals. Yet teams consistently over-build their tests.

Here's how to strip things right back and build the minimum viable test:

  1. Start with your hypothesis
  2. List every element needed to test it
  3. Ruthlessly eliminate everything else
  4. Build the simplest version that could give you an answer

A fintech client of ours wanted to test if users would trust AI for investment advice. Their initial plan was to build a full robo-advisor platform at a cost of £400,000 over 6 months.

We stripped it right back:

  • We spun up a simple landing page describing the AI advisor
  • Added a "Deploy £5,000" button
  • Tracked click-through rates and captured email addresses
  • Interviewed users who clicked

It took us two weeks, and we learned - before building anything - that users wouldn't trust AI with their money.
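
To give a sense of how little code a fake-door test like this needs, here's a minimal sketch of a click-and-email tracker in Python using Flask. The route names and log file are hypothetical, not the setup we actually used:

  # Minimal fake-door tracker: log clicks on the "deploy" button and capture
  # emails from interested users. Routes and file names are illustrative.
  import csv
  from datetime import datetime, timezone

  from flask import Flask, request

  app = Flask(__name__)
  LOG_FILE = "fake_door_events.csv"

  def log_event(event: str, email: str = "") -> None:
      """Append one timestamped row so every interaction can be traced later."""
      with open(LOG_FILE, "a", newline="") as f:
          csv.writer(f).writerow([datetime.now(timezone.utc).isoformat(), event, email])

  @app.route("/deploy-click", methods=["POST"])
  def deploy_click():
      # The button deploys nothing; we only record the intent and ask for an email.
      log_event("deploy_click")
      return ('<form method="post" action="/interest">'
              '<input name="email" placeholder="Your email">'
              '<button>Keep me posted</button></form>')

  @app.route("/interest", methods=["POST"])
  def interest():
      log_event("email_captured", request.form.get("email", ""))
      return "Thanks - we'll be in touch."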

Key questions for your test setup:

  • What's the simplest way to test your hypothesis?
  • What could you remove from it without affecting the core test?
  • Could you manually simulate parts instead of building them?


Make it real

Deploying your experiment without contaminating the results...

You need to make your tests real enough to get genuine responses but controlled enough to get clean data.

Most teams get this backwards. They either make their test so "experimental" that users don't behave naturally or so polished that they can't isolate what's working.

Here's how you need to think about this:

  1. Create as close to real-world conditions as possible
  2. Control for outside variables
  3. Measure actual behaviour (not intentions)
  4. Document everything that could affect your results

We helped a healthcare company test a new patient monitoring service. But instead of building the tech, we:

  • Had nurses manually update patient data
  • Made it look automated to users
  • Tracked real behaviour and outcomes
  • Documented every manual intervention

Three weeks later, we knew exactly how patients would use the service - before writing a line of code.

Deployment checklist:

  • Are users experiencing this as they would in reality?
  • Have you controlled for external factors?
  • Are you measuring real behaviour?
  • Can you trace every interaction?


Watch and learn

When your experiment starts generating unexpected data...

The most valuable insights come from watching experiments unfold in real time. But you need to know what to watch for and how to adjust without invalidating your results.

The principle is simple: monitor enough to spot problems and opportunities, but not so much that you're tempted to interfere unnecessarily and disturb the test.

Here's what you need to do to monitor your experiments more effectively (there's a minimal code sketch after the list):

  1. Set up early warning metrics
  2. Define acceptable ranges for key indicators
  3. Create clear intervention triggers
  4. Document every adjustment
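
As a sketch of what this can look like in practice, here's a small Python example with hypothetical metric names and ranges - a daily check that flags anything outside the agreed bounds, plus a log for every adjustment you make:

  # Minimal monitoring sketch: illustrative metric names and ranges, checked
  # once a day so you spot problems early without constantly poking at the test.
  from datetime import date

  ACCEPTABLE_RANGES = {
      "daily_signups": (20, None),       # alert below 20; no upper bound
      "conversion_rate": (0.01, 0.15),   # outside this range is suspicious
      "payment_drop_off": (None, 0.60),  # alert above 60%
  }

  adjustment_log: list[dict] = []  # every intervention gets written down

  def check_metrics(today: dict[str, float]) -> list[str]:
      """Return intervention triggers for any metric outside its agreed range."""
      alerts = []
      for name, (low, high) in ACCEPTABLE_RANGES.items():
          value = today.get(name)
          if value is None:
              alerts.append(f"{name}: no data collected")
          elif (low is not None and value < low) or (high is not None and value > high):
              alerts.append(f"{name}={value} outside range ({low}, {high})")
      return alerts

  def log_adjustment(reason: str, change: str) -> None:
      """Document every adjustment so you can explain the results later."""
      adjustment_log.append({"date": date.today().isoformat(),
                             "reason": reason, "change": change})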

A media client was testing a new subscription model with us. During monitoring, we spotted something odd: massive variance in conversion rates at different times.

Here's what we saw:

  • US users converting at 3x the rate of other users
  • Mobile users dropping off at payment
  • Price sensitivity varying massively by region

We adjusted the test to explore these patterns without compromising the core experiment.

Here's what you need to look out for:

  • What indicates the test is working as designed?
  • What would signal something's wrong?
  • What patterns might reveal new opportunities?
  • What changes would invalidate your results?


Capture what matters

Collecting the right data without disturbing your experiment...

The key thing to remember in data collection is you can't go back and get what you didn't capture. But if you collect too much, you'll drown in noise. The goal isn't to collect everything. It's to collect the specific data that could prove you wrong.

Data collection framework:

  1. Start with your hypothesis
  2. Identify evidence that could disprove it
  3. Add context needed to understand results
  4. Create backup collection methods

An e-commerce client we worked with last year was testing a new checkout flow for a subset of users. But instead of just tracking conversion rates, we captured:

  • Every user action in the flow
  • Drop-off points and recovery attempts
  • User frustration signals
  • Technical performance data

When conversions were lower than expected, we had everything needed to understand why.
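
If it helps to picture what "capture the context, not just the conversion" means, here's a minimal event-logging sketch in Python. The event and field names are hypothetical, not the client's actual schema:

  # Minimal event capture: record every user action with enough context to
  # explain a disappointing headline metric later. Field names are illustrative.
  import json
  import time
  import uuid

  def capture_event(session_id: str, event: str, **context) -> dict:
      """Build one structured event row and append it to a JSON-lines log."""
      row = {
          "session_id": session_id,
          "event": event,            # e.g. "checkout_started", "payment_failed"
          "timestamp": time.time(),
          **context,                 # device, step, error codes, latency, etc.
      }
      with open("checkout_events.jsonl", "a") as f:
          f.write(json.dumps(row) + "\n")
      return row

  # Example: one frustrated user, captured with context rather than just a miss.
  sid = str(uuid.uuid4())
  capture_event(sid, "checkout_started", device="mobile", step=1)
  capture_event(sid, "payment_failed", step=3, error="card_declined", latency_ms=4200)
  capture_event(sid, "rage_click", element="pay_button", count=5)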

This is what you need to do to avoid the same mistakes:

  • Be clear on what could disprove your hypothesis
  • Think about all the context you'd need to help you understand your results
  • Document all the things you think your stakeholders will ask to see

What this means for you

Good execution is about intentionality. Every choice in your experiment setup, deployment, and monitoring should tie back to answering your core question.

Before your next experiment:

  • Get stakeholder agreement on paper
  • Strip your test to its absolute minimum
  • Plan your monitoring triggers
  • Define your must-have data points

Next week: How to analyse experiment results and turn them into decisions that stick.

Want help pressure-testing your experiment execution? Grab 15 minutes with me here.
