Mastering Outliers in Marketing Mix Modelling (MMM): What You Need to Know
Wei Hutchinson, PhD
Marketing Analytics Consultant | Marketing Data Scientist | Quantitative Research Expert | MMM Specialist | Python | R | SQL | Digital Marketer | Educator | Enhancing Lives through Data Science & AI | 10+ Years
Hello there, Dr Wei here! I know, I know—I haven’t discussed MMM for a while. Consider this a follow-up to our previous conversation about the impending ‘cookie monster’—the shift to a cookieless world and how MMM is emerging as a resilient alternative. Today, we’re diving into a crucial aspect of MMM that can make or break your analysis: outliers.
What Are Outliers?
Imagine you're at a gathering of average-height people, and suddenly, a professional basketball player walks in. At over seven feet tall, they tower over everyone else in the room. That height would be an outlier—something that stands out because it’s significantly different from the norm.
In most situations, our first instinct when encountering outliers is to either remove them (which is like asking the basketball player to leave the room) or to transform them so they fit in better with the rest of the data (perhaps by asking everyone else to stand on a chair!). While outliers might represent legitimate and important events, they can distort your results if not handled correctly. That’s why in MMM, we need to be a bit more nuanced in our approach to outliers—neither simply ignoring them nor letting them skew our insights.
What’s Different About Outliers in MMM?
In the context of MMM, outliers are often tied to specific marketing activities. These aren’t just occasional anomalies; they can be frequent and are sometimes integral to understanding your marketing efforts. Here are some examples of such marketing activities:
These scenarios show that outliers are not just anomalies—they are often the result of significant marketing efforts that you can’t afford to ignore. But simply removing them isn’t the solution. So, what should you do?
Key Factors to Consider for Transformation
As a new MMM consultant, one of your first tasks is to figure out how to manage these outliers effectively. Before we dive into the specifics of different regression methods and transformations, let’s go over what you need to consider:
2. Magnitude of Outliers:
Consider how extreme your outliers are. Are they simply mild deviations, or do they represent significant spikes? Extreme outliers might benefit from methods that keep every observation while limiting its pull: Winsorisation caps values beyond chosen percentiles, and Huber Regression down-weights observations with large residuals. For less severe outliers, a milder transformation like square root or log might suffice.
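To make the capping idea concrete, here is a minimal Winsorisation sketch in plain Python. The weekly spend figures, the 10th/90th percentile cut-offs, and the helper names `percentile` and `winsorise` are all illustrative assumptions, not values or functions from any particular MMM library.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def winsorise(values, lower_pct=5.0, upper_pct=95.0):
    """Cap values below/above the chosen percentiles; nothing is removed."""
    ordered = sorted(values)
    low_cap = percentile(ordered, lower_pct)
    high_cap = percentile(ordered, upper_pct)
    return [min(max(v, low_cap), high_cap) for v in values]


# Hypothetical weekly ad spend with one campaign spike in the last week.
weekly_spend = [100, 110, 95, 105, 120, 98, 102, 115, 108, 900]

# Assumed cut-offs: cap at the 10th and 90th percentiles.
capped = winsorise(weekly_spend, lower_pct=10, upper_pct=90)
print(capped)
```

The spike week stays in the data set, but its value is pulled down to the upper cap, so it can no longer dominate the fit. The choice of percentiles is a judgement call and worth testing for sensitivity.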
3. Relationship Between Variables:
Look at the relationships between your variables. Are they linear or non-linear? Transformations such as log or Box-Cox can help linearise relationships, which is beneficial for regression models used in MMM. However, if the relationships are already linear, you might opt for milder transformations, or none at all, to avoid distorting them.
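As a rough illustration of linearising, the sketch below builds a synthetic diminishing-returns curve (sales proportional to the square root of spend, an assumption made purely for the example) and fits a straight line after taking logs of both variables. The helper `ols_slope_intercept` is a hypothetical name for a plain least-squares fit.

```python
import math


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic data (assumption): sales = 50 * spend ** 0.5, i.e. diminishing returns.
spend = [10, 20, 50, 100, 200, 500, 1000]
sales = [50 * s ** 0.5 for s in spend]

# Taking logs of both sides turns the curve into a straight line:
# log(sales) = log(50) + 0.5 * log(spend)
log_spend = [math.log(s) for s in spend]
log_sales = [math.log(v) for v in sales]

elasticity, log_intercept = ols_slope_intercept(log_spend, log_sales)
print(round(elasticity, 3), round(math.exp(log_intercept), 3))
```

In a log-log fit like this, the slope reads as an elasticity: a 1% change in spend goes with roughly a slope-percent change in sales. Real data will be noisier, and a plain log needs strictly positive values.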
4. Purpose of the Analysis:
Think about what you’re trying to achieve. Are you focused on making your model as interpretable as possible, or are you aiming for the highest predictive accuracy? For interpretability, you might prefer milder transformations. For accuracy, more aggressive options like Box-Cox could be necessary.
5. Multicollinearity:
Now, here’s a tricky little beast that can throw a wrench into your MMM analysis—multicollinearity. This is what happens when your variables start getting too cosy with each other, creating a tangled web of relationships that make it tough to figure out who’s actually driving the results.
For example, investments in different media channels—like TV, digital, and radio—often move in sync because they’re usually part of a coordinated campaign. Then you’ve got the economic indicators like GDP, CPI (Consumer Price Index), and interest rates, which tend to dance together since they all reflect the broader economic environment. When these variables are highly correlated, it can be a real headache trying to pinpoint the individual impact of each one.
One of the biggest challenges we face with multicollinearity is figuring out which channel is really pulling the strings. If TV and digital spending are both high, how do we know which one is driving sales? This uncertainty can lead to overestimating the impact of one channel while underestimating another, which can really mess up your budget allocation.
When you’re up against multicollinearity, tread carefully with your transformations. Techniques like log or Box-Cox can sometimes help by normalising the data or compressing the range, making the relationships less tangled. But watch out—sometimes these transformations can make the problem worse by introducing new correlations you didn’t see coming.
So, what’s the workaround? One approach is to aggregate those highly correlated variables before you start transforming them. For example, you could lump all your media spend into a single composite index or roll up the economic indicators into a broader economic index. This can help simplify your model, reduce the multicollinearity, and make it a lot easier to see what’s really going on. The trade-off is that you can no longer separate the contribution of each individual channel inside the index. And fair warning: even with all this effort, completely untangling multicollinearity is no walk in the park. It’s one of those persistent challenges in MMM that we just have to navigate carefully.
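Here is a small sketch of that workflow: first check how strongly two channels move together, then roll the correlated channels into one composite index. The spend series are made up, and averaging z-scores is just one assumed way of building an index. Weighting by spend share would be another.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def zscores(values):
    """Standardise a series to mean 0 and (population) standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


# Hypothetical weekly spend for three channels run as one coordinated campaign.
tv = [100, 120, 150, 130, 170, 200]
digital = [52, 58, 77, 66, 84, 101]
radio = [20, 25, 31, 27, 33, 41]

# Step 1: confirm the channels really do move in sync.
tv_digital_corr = pearson(tv, digital)

# Step 2 (assumed design): average the standardised series into one media index.
media_index = [
    sum(parts) / len(parts)
    for parts in zip(zscores(tv), zscores(digital), zscores(radio))
]
print(round(tv_digital_corr, 3), [round(v, 2) for v in media_index])
```

Standardising first stops the largest-budget channel from dominating the index simply because of its scale.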
6. Data Type and Scale:
Consider whether your data is continuous or categorical. Most transformations are designed for continuous data. If your data spans a wide range, transformations like log or Box-Cox can help compress the range and reduce the influence of extreme values.
7. Model Assumptions:
Does your model assume normally distributed errors or constant variance (homoscedasticity)? If so, transformations like log or square root that normalise your data and stabilise variance can be particularly helpful.
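A quick way to see variance stabilisation at work is to compare the spread of low-volume and high-volume weeks before and after a square root. The numbers below are invented so that the spread grows with the level, as it often does with count-like data, and `variance_ratio` is a hypothetical helper.

```python
import math
import statistics

# Invented weekly sales: the spread grows with the level.
low_weeks = [7.0, 10.0, 13.0]
high_weeks = [968.0, 1000.0, 1032.0]


def variance_ratio(low, high):
    """How many times more variable the high-volume weeks are."""
    return statistics.pvariance(high) / statistics.pvariance(low)


raw_ratio = variance_ratio(low_weeks, high_weeks)
sqrt_ratio = variance_ratio(
    [math.sqrt(v) for v in low_weeks],
    [math.sqrt(v) for v in high_weeks],
)
print(round(raw_ratio, 1), round(sqrt_ratio, 2))
```

On the raw scale the high-volume weeks are far more variable. After the square root the two groups have a similar spread, which is closer to what a constant-variance assumption expects.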
8. Ease of Implementation:
Finally, consider the complexity and computational resources required. Simpler transformations like log or square root are easier to implement, especially in production environments, whereas Box-Cox requires strictly positive data and an estimated lambda parameter, which means more preprocessing.
Real-Life Scenarios of Outlier Handling
To bring these concepts to life, let's look at some real-world MMM scenarios:
1. Retail Industry: Optimising Holiday Campaigns
2. Financial Services: Promoting a New Credit Card with a Flexible Payment Scheme
3. Tech Industry: Launching a New Feature on a Dating App
Scenario: A popular dating app is rolling out a new feature that helps users find friends or business partners, in addition to romantic matches. To ensure the success of this new feature, the company needs to optimise its marketing mix across various channels.
Outlier Handling: Winsorisation is applied to cap extreme values in digital ad spend during major tech conferences and influencer promotions, preventing these spikes from skewing the results.
Action: By reallocating budget from traditional ads to influencer marketing and targeted social media campaigns, the app sees a 10% increase in feature adoption and engagement.
Final Thoughts
Handling outliers isn’t just about cleaning up your data—it’s about making sure your MMM analysis is accurate and reliable. By carefully considering the type of outliers, the nature of your data, and your specific goals, you can choose the right transformation method to improve your model’s performance.
#MarketingMixModelling #MMM #Outliers #DataTransformation #MarketingStrategy #DataDrivenInsights #DataScience
Join the Nerdy Marketing Scientists Community
If you’ve enjoyed diving into the world of Marketing Mix Modelling with me, why not stay connected? Follow me on LinkedIn for more insights, strategies, and the latest in marketing analytics. Let’s connect, share ideas, and grow our networks together. And hey, if you’re as passionate about marketing data science as I am, be sure to subscribe to my newsletter, Nerdy Marketing Scientists, where I explore all the nerdy details that help you stay ahead in the ever-evolving marketing landscape.
Stay informed, stay connected, and let’s keep the conversation going!
P.S. If you are unfamiliar with the transformation methods, I have made a table for you which provides a detailed comparison of different approaches, helping you decide which method is best suited for your specific scenario:
Data Scientist | Machine Learning | Marketing Science
6 months ago: Hello, can we apply Adstock and Saturation directly to transformed variables? And how do we interpret the coefficients at the end when we apply a log transformation to the data?