Prioritizing for Experimentation Teams: A Transparency, Progress, and Outcome-Centered Approach We Call ADVIS'R

This article is based on an interview published on thegood.com.

When The Good started doing digital experience optimization, we were early practitioners. There were no flourishing experimentation communities and no plethora of think-pieces about how to do it “right.”

We had to hire smart people and trust that they would figure it out.

Most consulting jobs rely on both smarts and experience to succeed. But at that time, the industry was very short on applicants with experience. So our prioritization process developed out of trial and error. The smart people who worked here were witnessing the beginning of optimization as a field of practice, and they were writing the rules as they played them.

15+ years later, we can truly say we have a “process” that gets us great results. It’s called the ADVIS’R method and it works well for teams like ours: outsourced optimization teams who need a lot of transparency, quick results, and a focus on outcomes.

In this article, I'll detail our prioritization process and what it incentivizes.

Who is the ADVIS'R prioritization method for?

For the most part, we work with teams who are already running experiments, want to develop a more systematic approach to experimentation, and have oversight from a decision-maker who wants transparency into the process.

So, we needed something that worked well in that context. We developed the ADVIS’R method accordingly.

The ADVIS’R prioritization model is an acronym for the following:

  • Appropriate
  • Doable
  • Valuable
  • Important
  • Speedy
  • Ready

Some metrics are binary and some use the stoplight system (red-yellow-green).
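
Before walking through each step, here’s a minimal sketch of how a scorecard for a single concept could be represented in code. The field names and the Stoplight enum are one illustrative encoding, not a formal spec.

```python
from dataclasses import dataclass
from enum import Enum

class Stoplight(Enum):
    RED = 1
    YELLOW = 2
    GREEN = 3

@dataclass
class AdvisrScore:
    """One test concept scored against the ADVIS'R criteria."""
    name: str
    appropriate: bool     # binary: is this suited for testing at all?
    doable: bool          # binary: can we actually build and run it?
    valuable: Stoplight   # stoplight: expected business value
    important: Stoplight  # stoplight: value to users and what we'd learn
    speedy: Stoplight     # stoplight: time to launch and reach significance
    ready: bool           # binary: assets in hand, no roadmap conflict

    def qualifies(self) -> bool:
        """A concept only enters the roadmap when every binary gate passes."""
        return self.appropriate and self.doable and self.ready
```

The stoplight fields stay deliberately coarse; as the rest of this article explains, the point is to spark conversation about trade-offs, not to crank out a precise score.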


Let's walk through it from the top

Appropriate

Appropriate is a binary metric used to determine whether a concept is better suited for testing than for straight implementation or further consideration.

All of the following filters have to be a “yes” for a concept to go to the next step:

  • Is it risky? If there is absolutely no risk, it’s probably not suited for testing and you can assess it for implementation instead.
  • Is it in a priority testing area? Many people will have ideas about what needs to be “solved,” so we generally check that the concept sits within the conversion funnel. If it’s not going to affect conversion metrics, it’s likely not suited for experimentation, because you need something to measure.
  • Would it reach significance within the designated time allowance? Every team has a different tolerance for test duration. I know teams that will let a test run for six months, and others that only want to prioritize initiatives that will see significance in two weeks. Having this litmus ensures folks are talking about their tolerance up-front (a rough way to sanity-check duration is sketched after this list).
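
That last duration question doesn’t have to be a gut call. The sketch below is one rough way to sanity-check it, using the common 16 × p(1−p) / MDE² sample-size rule of thumb (roughly 95% confidence and 80% power); the traffic and conversion numbers are purely illustrative.

```python
def estimated_weeks_to_significance(
    baseline_rate: float,   # e.g. 0.03 for a 3% conversion rate
    relative_mde: float,    # minimum detectable effect, e.g. 0.10 for +10%
    weekly_visitors: int,   # traffic entering the experiment each week
    n_variants: int = 2,
) -> float:
    """Rough duration estimate using the common 16 * p(1-p) / delta^2
    sample-size rule of thumb (two-sided test, ~95% confidence, ~80% power)."""
    absolute_mde = baseline_rate * relative_mde
    n_per_variant = 16 * baseline_rate * (1 - baseline_rate) / absolute_mde ** 2
    return n_per_variant * n_variants / weekly_visitors

# Example: 3% baseline, +10% relative lift, 20,000 visitors/week, 2 variants
# comes out to roughly 5 weeks, which a "two weeks only" team would screen out.
print(round(estimated_weeks_to_significance(0.03, 0.10, 20_000), 1))
```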

Doable

In this step, we ask, “Are we capable of running this experiment?”

Sometimes a test is simply not feasible. Maybe the platform can’t support it, or the hit to page load time would be too great.

If the answer is no, then abort, full stop. Pivot to a measurement method other than A/B testing. The concept might be better suited for rapid testing, so you can still get validation, but it’s not going to be tested on the live site.

Once you determine that an initiative is appropriate for experimentation and doable, then you can move on to assessing the value, importance, and speed.


Valuable

The first stoplight metric in our framework is value.

Value is often the first thing folks think about when they think of experimentation. Most higher-ups think the primary benefit of experimentation is that, if done correctly, you get measurable data to confidently assess the value of a new treatment or mitigate the risk of loss.

Some companies like the Wall Street Journal use a simple calculation to determine value. Our value assessment does something similar.

We determine value based on four factors:

  • How many users will see the treatment? More is generally better.
  • Does the test cater to high-intent users? If the test is very top of the funnel, like on blog pages, it’s never going to be as valuable as a test on product pages. We prioritize tests that occur where purchase decisions are more likely to be made.
  • What is the potential lift in KPI? Once you’ve been testing for a while, it becomes easier to estimate the potential impact your experiment might have. We use online calculators to weigh potential uplift against audience size (a rough sketch of that arithmetic follows this list), but just knowing that a 1% lift for a large, high-intent audience is worth more than a 1% lift for an equally sized, low-intent audience is a good start.
  • How much evidence do we have that this test might win? Based on our experience, certain tests are almost always a slam dunk and others are unproven. If we’ve seen something work before, that’s a good reason to give it a higher rating.
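
Those calculators mostly do some version of the arithmetic below: weigh the potential lift against audience size and value per conversion. The function and the numbers here are an illustration, not our internal model.

```python
def projected_monthly_value(
    monthly_audience: int,   # users who will see the treatment
    baseline_cvr: float,     # current conversion rate for that audience
    expected_lift: float,    # relative lift, e.g. 0.01 for +1%
    avg_order_value: float,  # revenue per conversion
) -> float:
    """Extra revenue per month if the expected lift actually holds."""
    extra_conversions = monthly_audience * baseline_cvr * expected_lift
    return extra_conversions * avg_order_value

# The same 1% lift is worth far more on high-intent pages than low-intent ones,
# even with identical audience sizes, because the baseline conversion rate differs.
blog_pages = projected_monthly_value(100_000, 0.005, 0.01, 80)     # $400/month
product_pages = projected_monthly_value(100_000, 0.04, 0.01, 80)   # $3,200/month
print(blog_pages, product_pages)
```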

Through the value-scoring exercise, we prioritize tests that have a demonstrated track record and will move the needle for the most users in a high-intent stage of the conversion journey.

Importance

While value to the business is, of course, what business is all about, solutions only work if they work for users. So importance is the next stoplight score in our system.

Importance takes into account both how impacted users would be by the change and what we might learn via this initiative.

A test is considered important based on 1) the amount of evidence showing that the current digital experience hinders user goals and 2) the potential to learn about the audience as an outcome of the experiment.

To determine importance, we consider the following (one rough way to turn these answers into a rating is sketched after the list):

  • Will we learn something new about our audience?
  • Will this support decision-making on a near-term initiative?
  • How much evidence do we have that this is an issue worth solving?
  • Do we have loud detractors?

Measuring these things when we quantify “importance” ensures we don’t lose sight of customer experience in pursuit of revenue.

Another firm might not think of user needs as “important”, but one of our core values is “impact over income”, so we take user satisfaction, ease of use, and accessibility seriously.

Measuring the mounting evidence in support of an initiative is our way of staying connected to the people on the other side of the screen and advocating for people who aren’t in the room.

Speed

The last stoplight score in our framework is speed.

The speed score answers the question, “How long until we see the ‘value’ analyzed in the value step?”

It’s a blended time calculation that takes into account both labor and opportunity costs.

When we are thinking about the speed score we consider:

  • Total time investment (asset collection & collaboration, design, development)
  • Time-to-significance

We don’t actually assign a numerical value or number of hours to speed because, borrowing from the Agile discipline, we know that people tend to have a variable sense of time, and that people get a little skittish when they sense their time is being monitored closely.

So instead, we just use a simple litmus: fast, slow, or average.

We keep it simple because all we want to do is assess the investment being made. Assuming that any amount of time spent on one initiative is borrowed from another, all else being equal, we would weight a fast initiative higher in our roadmap.

It’s also a keen metric for an agency to keep an eye on because, in our experience, clients don’t like waiting around for development to produce a prototype or for results to reach significance.
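
For teams that do want to encode the litmus in a tool, a minimal sketch might look like the following. The “small/medium/large” effort call and the week cutoffs are placeholders to be tuned to your own tolerance, not thresholds we prescribe.

```python
from enum import Enum

class Speed(Enum):
    FAST = "fast"
    AVERAGE = "average"
    SLOW = "slow"

def speed_rating(build_effort: str, weeks_to_significance: float) -> Speed:
    """build_effort is a rough 'small' / 'medium' / 'large' call from the team,
    so nobody's hours are being tracked; the week cutoffs are placeholders."""
    if build_effort == "small" and weeks_to_significance <= 2:
        return Speed.FAST
    if build_effort == "large" or weeks_to_significance > 6:
        return Speed.SLOW
    return Speed.AVERAGE

print(speed_rating("small", 1.5))  # Speed.FAST
print(speed_rating("medium", 8))   # Speed.SLOW
```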


Readiness

The last step in our method is to assess readiness. This one is again binary. It’s basically a yes/no based on whether we have all the assets in hand and whether there is any roadmap conflict.

After doing this for so many years, we recognized that things were getting held up in development because an antsy team member was pushing them through too early.

Readiness is a good one to keep on the checklist because it helps you slow your roll a bit and prevent projects from getting hung up midstream due to missing assets or test audience conflict.

When we assess readiness, we:

  • Check for Location & Targeting Conflict — We always stratify the test targeting so that there is no page-level conflict. We would not run multiple tests on the homepage at the same time, to the same audience, for instance (a simple conflict check is sketched after this list).
  • Asset readiness — We recommend against prioritizing a test that does not have the appropriate assets ready to be developed. For instance, if a new set of product images is needed to run the test, we wouldn’t prioritize that test until the images are ready for production.
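
The targeting-conflict check is mechanical enough to automate. The helper below is a hypothetical sketch rather than tooling we actually ship: it flags any two queued tests that overlap on both page and audience targeting.

```python
from dataclasses import dataclass

@dataclass
class PlannedTest:
    name: str
    pages: set[str]      # page targets, e.g. URL paths or patterns
    audiences: set[str]  # audience segments, e.g. {"mobile", "returning"}

def targeting_conflicts(tests: list[PlannedTest]) -> list[tuple[str, str]]:
    """Pairs of tests that overlap on both page and audience targeting."""
    conflicts = []
    for i, a in enumerate(tests):
        for b in tests[i + 1:]:
            if a.pages & b.pages and a.audiences & b.audiences:
                conflicts.append((a.name, b.name))
    return conflicts

queue = [
    PlannedTest("hero-copy", {"/"}, {"all"}),
    PlannedTest("nav-simplify", {"/"}, {"all"}),          # same page, same audience
    PlannedTest("pdp-gallery", {"/products"}, {"all"}),   # different page, no conflict
]
print(targeting_conflicts(queue))  # [('hero-copy', 'nav-simplify')]
```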


Why are some items binary (yes/no) and some stoplight (red-yellow-green)?

It’s all about optimizing for the desired outcomes and creating an amazing teaching tool.

We’ve found that rigid numerical scorecard methods are too simplistic and don’t take into account the appetite of our stakeholders. Plus, practitioners hate them because they take away their autonomy.

Our goals as a consultancy and partner are to make visible progress, get results, and improve the end-user experience.

The best thing about the ADVIS’R prioritization method is that it teaches practitioners how to think rather than telling them what to do. The stoplight system spurs conversations about difficulty, feasibility, user needs, and the trade-offs and opportunity costs of any decision.

Obviously, initiatives that are scored highly in all three (value, importance, and speed) easily float to the top.

But it’s what you do next that makes a good strategy, and there’s freedom in working each lever based on your company culture, priorities, organizational maturity, etc.

How do you prioritize based on the scores provided for Value, Importance, and Speed?

At the beginning of an engagement, speed is how we earn trust. Showing tests launched and closed tends to quickly prove that we’re working when we say we are, so we generally weigh speed a little more heavily early in a relationship.

But we can’t defer high-importance and high-value initiatives for long, because ultimately we’ll be measured both on tangible outcomes like revenue earned and on how much better the site “feels,” which is a proxy for change. Flashy changes don’t tend to be speedy.

In general, our approach is to stratify tests to cover the bases:

  • Keep at least one high-speed project going at any time, which shows progress
  • Work on something high-value so we’re progressing toward tangible metrics like revenue won
  • Make progress on the important things: the flashy, satisfying changes users are most likely to care about

We find that stakeholders are usually the most satisfied when they see a blended mix of high-value, high-importance, and high-speed initiatives in the works.
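
If you wanted to turn that blend into a default sort order, one illustrative option is a phase-dependent weighting over the three stoplight scores. The weights and point values below are assumptions for the sake of the example, not a formula we prescribe.

```python
STOPLIGHT_POINTS = {"red": 1, "yellow": 2, "green": 3}

# Hypothetical weights: speed counts double early on, value and importance later.
PHASE_WEIGHTS = {
    "early_engagement": {"value": 1.0, "importance": 1.0, "speed": 2.0},
    "established":      {"value": 1.5, "importance": 1.5, "speed": 1.0},
}

def roadmap_order(candidates: list[dict], phase: str) -> list[dict]:
    """Sort already-qualified concepts (appropriate, doable, ready) by a
    weighted blend of their value / importance / speed stoplight ratings."""
    weights = PHASE_WEIGHTS[phase]

    def score(concept: dict) -> float:
        return sum(weights[k] * STOPLIGHT_POINTS[concept[k]] for k in weights)

    return sorted(candidates, key=score, reverse=True)

backlog = [
    {"name": "checkout-redesign",    "value": "green", "importance": "green",  "speed": "red"},
    {"name": "free-shipping-banner", "value": "green", "importance": "yellow", "speed": "green"},
]
print([c["name"] for c in roadmap_order(backlog, "early_engagement")])
# ['free-shipping-banner', 'checkout-redesign']
```

In practice the sorted list is a conversation starter, not the decision; the blended mix across all three levers is what keeps stakeholders satisfied.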

Why does the ADVIS’R method work for us?

It helps program managers understand what makes a good test (appropriate and doable), assess opportunity cost (between value, importance, and speed), and only prioritize tests that are truly ready.

Our prioritization method is literally called ADVIS’R, partially as a nod to the fact that it works well in an outsourced context where we need to incentivize visibility, progress, and outcomes.

It’s not the simplest process, but it gets us our desired results, so we love it!

Final Advice

Do what works. Change it up if it doesn’t. Forget dogma.

The best way to find a prioritization method that fits your experimentation practice is to just try something. Everyone can keep offering you advice, but unless you actually put a method through some trial and error, you won’t know if it fits your organization’s needs.

