Introducing Feature Impact Analysis - measuring product impact beyond just usage and retention
Connor Joyce
Senior UXR at Microsoft | Writer, Speaker, Advisor | Ex-Twilio, BetterUp, Deloitte
Feature Impact Analysis
Digital products have transformed our modern economy. We are surrounded by devices that grab our attention, provide us with information, and request our time. This growth has occurred as companies maximize usage above all else. Yet it has also sparked a backlash over the effects these technologies have on all aspects of society. Pundits and everyday users alike are beginning to request a detailed accounting of the costs that accompany the benefits they receive from their favorite digital services.
What has prevented a thorough analysis of the impact of modern technologies is the lack of accurate measurement of behavioral outcomes. Because they are easy to calculate, usage and retention have become the default standard for assessing a "successful" product. Developing more nuanced outcome metrics requires further defining the intention of the feature and the impact its usage is meant to create. Behavioral science insights have played a pivotal role in driving engagement and retention, and the field now has a new responsibility: assisting in the development of outcome metrics.
Coinciding with the public's demand for measuring the impact of digital tools is the growth of products that directly attempt to assist users in changing their behaviors. Financial services assist users in generating savings; health products encourage behaviors that help clients lose weight and embrace mindfulness; productivity applications prevent distractions and facilitate better task management. These products' intentional design to create behavioral change has further emphasized the need for product teams to measure their features' impact.
Technologies designed to change behavior intend to alter a user's actions outside of that digital environment. When usage and retention serve as the success metrics, the product team is assuming that the feature serves its designed purpose. Instead, teams must dig deeper and precisely lay out the behaviors they want the user to change, both inside the product and in the user's physical environment. Only with these measurements can a team reliably claim that their product changes outcomes.
The remainder of this article will explore a new process designed to measure the outcomes created by features, which I call the Feature Impact Analysis. It intends to help behavioral science teams and product managers alike develop and use behavioral metrics to measure their features' impact. To aid in describing this process, I will use a hypothetical product called Gymmy, whose purpose is to get users to visit the gym.
Levels of Outcome Metrics
Gymmy is an application that encourages users to get to the gym and report what types of exercises they did, helping them progress towards health goals. It requires the user to log what they did at the gym, but it can also use location and Apple Health data to infer that they showed up and did something while there. The specific feature we will analyze is the goal-setting feature. Here the user is encouraged to record a goal and the small steps needed to get there; daily check-ins then track how the user's actions are contributing to that outcome.
The desired behavioral change of this feature is the adoption of a goal-setting mindset and continued gym attendance, and the desired outcome is the user feeling a sense of accomplishment as they journey towards their ideal state. Focusing only on usage of the goal-setting feature, the team could see whether the user turned it on and whether they filled anything out, but this does little to show whether an actual behavioral change occurred. Only by analyzing behavioral outcome data, such as daily entries logged and sentiment towards the goal, can effectiveness truly be measured.
There are four main categories of data valuable for measuring a feature's effectiveness. Usage is the easiest to calculate but relies on the most significant assumption: that the feature has its intended effect. Next is customer satisfaction (CSAT), captured through simple surveys asking how much someone enjoys the element or how likely they are to recommend it. CSAT scores get closer to understanding whether the feature changes the desired behavior, but users may enjoy using something that is not ultimately helpful. The third tier is direct outcomes, which measure the actual actions an individual takes within the application or feature. These are harder to come by but much closer to a measurement of real change. Lastly, behavioral outcomes, meaning any data that evidence the actual change pursued through use of the application, are the preeminent outcome metric.
Teams have traditionally relied on usage as the default measurement because diving into behavioral outcomes requires a deeper connection to the specific results desired through use. Even when there is a desire to measure behavioral outcomes, a lack of data or an inability to construct metrics may lead teams to abandon the effort. This challenge is where a mixture of the behavioral data captured through usage and the theory of behavioral science helps create proxies that are as close to the desired outcome as possible.
The business connection framework is a simple tool to help develop these proxy metrics. It starts with identifying the business outcome that a feature is trying to drive. The goal-setting feature aims to alter the user's mindset toward pursuing activities that progress them toward their desired results. One must then connect this business outcome to the behaviors that will make it happen. The connected primary behavior here is increasing daily physical activity and the benefit those activities bring to the user. From there, each related behavior needs a proxy created for it.
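The framework can be sketched as a simple data structure. The outcome, behaviors, and proxies below restate the article's hypothetical Gymmy example; the field names are my own illustration, not a prescribed schema:

```python
# A sketch of the business connection framework for Gymmy's goal-setting
# feature: business outcome -> behaviors -> candidate proxy metrics.
# All entries are illustrative, drawn from the running example.
framework = {
    "business_outcome": "Users adopt a goal-setting mindset and keep visiting the gym",
    "behaviors": [
        {
            "behavior": "Increase daily physical activity",
            "proxies": ["workouts logged per week",
                        "gym visits inferred from location data"],
        },
        {
            "behavior": "Reflect on progress toward the goal",
            "proxies": ["daily check-ins completed",
                        "sentiment score from an in-app survey"],
        },
    ],
}

# The framework is complete when every behavior has at least one
# measurable proxy attached to it.
assert all(b["proxies"] for b in framework["behaviors"])
```

Writing the framework down in this form makes the completeness check explicit: any behavior left without a proxy is a gap in measurement.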
Utilizing de-identified user behaviors captured through product usage is generally the best path for creating outcome proxies. Start with the data collected through the feature itself, then follow with any downstream actions captured. Ideally, the team will end with at least one proxy for each behavior the feature attempts to change. For the goal of increasing daily physical activity, proxies could be built from user-generated information, data captured through integrations, or downstream data such as self-reports of how frequently one attended the gym. Building proxies for subjective outcomes, such as increased reflection, is challenging but still possible. Again, user-generated content is an option, but this is also an ideal opportunity to deploy a survey or another tool to capture user feedback.
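As a rough sketch of turning captured usage data into outcome proxies, the snippet below rolls a hypothetical event log up into per-user proxy metrics. The event names and record shape are invented for illustration and do not reflect a real Gymmy data model:

```python
from collections import defaultdict
from datetime import date

# Hypothetical de-identified event log: (user_id, event_date, event_type).
events = [
    ("u1", date(2024, 1, 1), "goal_checkin"),
    ("u1", date(2024, 1, 2), "goal_checkin"),
    ("u1", date(2024, 1, 2), "workout_logged"),
    ("u2", date(2024, 1, 1), "workout_logged"),
]

def proxy_metrics(events):
    """Roll raw events up into per-user behavioral proxies:
    distinct check-in days (goal-setting mindset) and workouts
    logged (physical activity)."""
    checkin_days = defaultdict(set)
    workouts = defaultdict(int)
    for user, day, kind in events:
        if kind == "goal_checkin":
            checkin_days[user].add(day)
        elif kind == "workout_logged":
            workouts[user] += 1
    users = set(checkin_days) | set(workouts)
    return {u: {"checkin_days": len(checkin_days[u]),
                "workouts_logged": workouts[u]} for u in users}

print(proxy_metrics(events))
```

The same pattern extends to downstream data such as integration events or self-reports: each new event type simply feeds another proxy column.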
Introducing the Feature Impact Analysis
With the business connection framework filled in and the proxy measures developed, the next step is to compare feature users to non-users, which I term the Feature Impact Analysis (FIA). Performing an FIA first requires gathering user data, ideally in the thousands; successful analyses I have been a part of had around 15,000 users in each category, with some upwards of 50,000. Ideally, the analysis team will gather only metadata, or at minimum the data will be anonymized to prevent any privacy issues. Coarsened Exact Matching has proved to be the best method for measuring the effect, yielding two groups: a treatment group that used the feature and a control group that did not. With this approach, the research team's goal is to make the two groups as similar as possible by selecting variables related to the outcome metrics of interest, ideally matching on more than one variable. Once the groups are established, the analysis compares treatment against control.
Continuing our example of Gymmy's goal-setting feature, two groups would be created by matching on app usage frequency, the types of workouts users inputted, and the device they most commonly use to access the application. The Gymmy feature team would then look at the difference in exercise minutes completed between those who used the feature and those who did not. This is the behavioral data portion, measuring the change in direct outcomes; to push it further, the team could incorporate a survey asking users to rate their sentiment about their health. By aggregating these attitudinal results by group, the team could determine whether the feature helps drive actual behavioral outcomes.
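A minimal sketch of how such a matched comparison might be computed follows. This is a deliberately simplified, hand-rolled version of Coarsened Exact Matching on a single variable with made-up numbers; a production analysis would match on several coarsened variables at once and use samples orders of magnitude larger:

```python
from collections import defaultdict
from statistics import mean

# Made-up records: (used_feature, weekly_app_sessions, exercise_minutes).
users = [
    (True, 5, 210), (True, 2, 140), (True, 6, 230),
    (False, 5, 150), (False, 2, 120), (False, 9, 300),
]

def coarsen(sessions):
    # Coarsen the continuous matching variable into low/medium/high strata.
    return "low" if sessions < 3 else "medium" if sessions < 7 else "high"

def matched_effect(records):
    """Estimate the feature's effect on the outcome: exact-match on the
    coarsened strata, drop strata lacking both groups (no common support),
    then average within-stratum differences weighted by treated count."""
    strata = defaultdict(lambda: {"treated": [], "control": []})
    for used, sessions, minutes in records:
        strata[coarsen(sessions)]["treated" if used else "control"].append(minutes)
    matched = [s for s in strata.values() if s["treated"] and s["control"]]
    n_treated = sum(len(s["treated"]) for s in matched)
    return sum(len(s["treated"]) * (mean(s["treated"]) - mean(s["control"]))
               for s in matched) / n_treated

print(round(matched_effect(users), 1))  # → 53.3 extra exercise minutes
```

Note how the "high" stratum is discarded because it contains only non-users; restricting the comparison to strata with both groups is what makes the matched estimate more credible than a raw difference in means.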
How this will empower teams
This Feature Impact Analysis yields a new way to understand how features are aiding users in accomplishing their goals. In our example product, the team could validate that the feature achieves its original intent of encouraging increased physical activity and reflection. The process also offers the ability to dive deeper into the mechanisms through which the feature creates this positive effect. But the benefits of this analysis do not end there.
The research team can also utilize the metrics developed for further studies. The ideal place to start is looking at similarities among users who have the most significant benefit from the feature. For example, are those who report enjoying going to the gym more likely to return to the habit after taking a break for vacation or sickness? If so, are those users, in turn, likely to benefit even more from a nudge to return strong? Upon creating a profile of customers who benefit the most, targeting the feature can occur by utilizing messaging which highlights their unique potential for benefit.
Similarly, these metrics can help identify areas where the feature may have unintended consequences, for example, users who see a decrease in physical activity due to the extra time required to log everything. This type of analysis can help catch cases where a feature is causing a behavioral change in the opposite direction of its intent. Connecting back to one of the primary reasons measuring outcomes is becoming ever more critical, empowering a team with these metrics can help make the case that benefits outweigh costs and catch issues before they get out of control.
Utilizing outcome metrics for personalization and unintended-consequence analysis is only helpful if the feature team can iterate on designs and test them with the outcomes in mind. A/B tests and similar forms of experimentation have grown rapidly in usage over the past decade, yet many of these tests still use engagement as the outcome variable. Leveraging behavioral outcome metrics instead allows the research to determine which version of a feature is most effective at driving change, not just most likely to catch one's attention or interest.
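To illustrate an A/B test scored on a behavioral outcome rather than engagement, the sketch below compares weekly exercise minutes between two hypothetical variants of the goal-setting feature using a normal-approximation two-sample test. The data and variant names are invented; a real experiment would use far larger samples and a proper experimentation platform:

```python
from math import erf, sqrt
from statistics import mean, stdev

# Hypothetical weekly exercise minutes (the behavioral outcome, not clicks)
# for users exposed to two variants of the goal-setting feature.
variant_a = [120, 150, 90, 200, 160, 110, 140, 170]
variant_b = [180, 210, 150, 240, 200, 170, 190, 220]

def outcome_ab_test(a, b):
    """Return the difference in mean outcome (b minus a) and a two-sided
    p-value from a normal-approximation two-sample test."""
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    z = (mean(b) - mean(a)) / se
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return mean(b) - mean(a), p

diff, p = outcome_ab_test(variant_a, variant_b)
print(diff, p)
```

The decision rule is the same as any A/B test; only the scorecard changes, from engagement to the behavior the feature was built to move.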
Looking Forward
Analyst teams can gain many new capabilities and insights through this analysis, which has continued to expand across features. The process starts with defining, during design, the behaviors intended to be changed and the proxies that will measure them. Then, during feature development, the team can leverage the new outcome metrics to gauge outcomes as soon as usage begins. Scaling this type of thinking generates additional value by moving it earlier in the process rather than waiting until development is complete to test effectiveness.
Meanwhile, the research team also ensures that the measured behaviors connect to the business outcomes through attitudinal data collected from users. Ideally, these surveys will be displayed within the user's flow, close to the point of value, to capture true feelings about the experience. Generating feedback in this way is essential because behavioral data explain only what happened, while attitudinal data explain why.
Understanding how features create outcomes will become ever more necessary as the industry fills with products designed to change behavior. Whether readers choose to replicate this process wholly or leverage the frameworks independently, the primary intention of this analysis is to help teams focus on the behavioral outcomes of their features. I hope the Feature Impact Analysis serves as a basis for teams trying to measure their behavioral outcomes.
Shout out to V R Ramanan, Ahad Chaudhry, and Range Wang for being outstanding collaborators at Microsoft to help create and realize this work!