Building a Decision Framework for Selecting Machine Learning Models as a Product Manager
Maie ElZeiny
Senior Product Manager at Zalando | AI/ML/Data Products - MSc Business Analytics
As a product manager, you're always trying to find ways to solve your customers' problems - and sometimes, machine learning (ML) is the right tool for the job. But with so many different models out there, how do you figure out which one actually fits your problem?
In this post, I'll break down a simple and structured way to find the right ML solution for your customer's needs. We'll use a fashion e-commerce example (assortment optimization) to see how to balance accuracy, interpretability, and scalability, so you can make informed decisions and work well with your data science team. Let’s dive in!
ML Model Selection Framework
This framework includes six steps:
- Understand the Problem
- Define the Objective and Metrics
- Evaluate the Data
- Categorize the Problem Type
- Collaborate with Data Scientists
- Iterate and Test
Let’s break it down!
Step 1: Understand the Problem
The first mistake many PMs make? Diving into ML without a crystal-clear understanding of the problem! Start by understanding the business impact and stakeholder needs. Ask yourself:
- Who will use the model? (Buyers, marketers, operations teams)
- What decisions need automation or support?
- What is the business impact? (Revenue increase, cost savings, operational efficiency)
- Are there practical constraints? (Real-time processing, interpretability requirements)
Example: Demand Forecasting in Fashion E-Commerce
Challenge: Some fashion categories, such as sneakers, are frequently out of stock, while others, like formal shoes, remain unsold for months, leading to lost sales and increased inventory costs.
Goal: Use ML to recommend the optimal assortment mix for each region and season to reduce stockouts and overstocking.
Who will use the model?
- Buyers and Merchandisers: Optimize assortment mix and ensure high-demand products are stocked appropriately.
- Marketing Teams: Adjust promotional strategies based on anticipated demand for different products.
What decisions need automation or support?
- Stock Allocation: Determine the right quantity of each product to stock in each region.
- Assortment Planning: Identify which products should be prioritized based on seasonal trends and customer preferences.
Practical constraints
- Real-time processing is not required; demand forecasting can operate on a daily or weekly batch update basis.
- Interpretability is moderately important; buyers and inventory teams need clear explanations for recommendations.
- Scalability is crucial; the model must work across multiple regions and product categories, incorporating external factors such as weather, social media trends, and competitor pricing.
- Data availability varies; newly launched products may have limited historical data, requiring alternative approaches like trend-based forecasting or content-based recommendations.
Impact
- Increased revenue by ensuring popular items remain in stock.
- Cost savings through minimized excess inventory.
- Operational efficiency by reducing manual inventory planning efforts.
- Improved customer satisfaction by ensuring the availability of desired products.
Common Pitfalls & How to Overcome Them
- Jumping to ML without a clear problem statement → Validate the business need before investing in ML.
- Ignoring stakeholder alignment → Ensure end-users (e.g., buyers, supply chain teams) understand and trust the model.
Step 2: Define the Objective and Metrics
Clear goals are your north star! Without them, it’s hard to know if ML is worth the investment. Define what success looks like:
- The model’s objective (e.g., predict demand, detect fraud, recommend products)
- Success metrics (e.g., accuracy, precision/recall, revenue impact)
Example: Setting Metrics for Demand Forecasting
Objective: Create a demand prediction model to guide the assortment planning process.
Metrics:
- Forecast accuracy measured by Mean Absolute Percentage Error (MAPE), aiming for MAPE < 10% (see the sketch after this list).
- Revenue uplift of over 5% from optimized assortments in A/B testing.
- Stock efficiency improvements, reducing excess inventory by 15%.
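To make the MAPE target concrete, here is a minimal Python sketch of how forecast error could be computed; the helper function and the sales numbers are purely illustrative, not part of any real pipeline.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, expressed as a percentage."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    mask = actual != 0  # skip zero-sales periods to avoid division by zero
    return np.mean(np.abs((actual[mask] - forecast[mask]) / actual[mask])) * 100

# Illustrative numbers: weekly units sold vs. what the model predicted
actual = [120, 95, 140, 60]
forecast = [110, 100, 150, 55]
print(f"MAPE: {mape(actual, forecast):.1f}%")  # target from above: below 10%
```

In practice, your data science team would compute this per region and product category, since an aggregate MAPE can hide poor performance on niche segments.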
Common Pitfalls & How to Overcome Them
- Focusing only on accuracy → Business impact (e.g., revenue, stock efficiency) matters more than model precision.
- Not aligning on success metrics → Ensure leadership and teams agree on what "good performance" looks like.
Step 3: Evaluate the Data
Garbage in, garbage out. No ML model can work well without high-quality data. As a PM, you do not need to clean data yourself, but you must ask the right questions and collaborate with data teams.
1. Identify Relevant Data Sources
Start by mapping out where the data is coming from:
- Internal Sources: CRM data, sales records, user behavior logs.
- External Sources: Market trends, weather APIs, social media sentiment.
- Real-Time vs. Historical Data: Is the data static (historical) or does it need real-time updates?
2. Assess Data Availability
Before committing to an ML project, confirm that you actually have enough data:
- How many historical records are available? (e.g., 3 years of transaction data)
- Are there gaps in key variables? (e.g., missing customer demographics)
- Are new products or categories missing historical sales data?
3. Evaluate Data Quality
Bad data leads to bad models. Work with data teams to check the following (a lightweight sketch of these checks appears after the list):
- Completeness: Are key features missing too many values?
- Consistency: Do product names, IDs, and customer records match across datasets?
- Accuracy: Are there obvious errors (e.g., negative prices, duplicate transactions)?
- Timeliness: Is the data up-to-date and reflective of current business conditions?
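To give a feel for what these checks look like in practice, here is a lightweight pandas sketch; the table layout and column names (order_id, product_id, price, order_date) are assumptions made up for illustration.

```python
import pandas as pd

# Toy stand-in for a real transactions table; in practice this comes from
# your sales database (columns here are illustrative).
sales = pd.DataFrame({
    "order_id":   [1, 2, 2, 3, 4],
    "product_id": ["A", "B", "B", "C", "Z"],
    "price":      [59.9, 79.0, 79.0, -5.0, 120.0],
    "order_date": pd.to_datetime(["2024-11-01", "2024-11-02", "2024-11-02",
                                  "2024-11-03", None]),
})
catalog = pd.DataFrame({"product_id": ["A", "B", "C"]})

# Completeness: share of missing values per column
print(sales.isna().mean())

# Consistency: sales rows whose product ID has no match in the catalog
print("Unmatched product IDs:",
      (~sales["product_id"].isin(catalog["product_id"])).sum())

# Accuracy: obvious errors such as negative prices or duplicated orders
print("Negative prices:", (sales["price"] < 0).sum())
print("Duplicate orders:", sales.duplicated(subset=["order_id"]).sum())

# Timeliness: how recent is the latest record?
print("Most recent transaction:", sales["order_date"].max())
```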
4. Analyze Data Distribution
Not all datasets are evenly distributed. Check for the following (a short profiling sketch follows the list):
- Skewness: Are most data points concentrated in one range? (e.g., most products have low sales, but a few are bestsellers)
- Imbalance: Does one class dominate? (e.g., 95% of transactions are non-fraudulent)
- Seasonality: Does demand spike at certain times? (e.g., holiday shopping trends)
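A short profiling sketch along the same lines (again with made-up data and columns) can surface skew and seasonality before any modeling starts.

```python
import pandas as pd

# Toy sales history (product_id, quantity, order_date); columns are illustrative
sales = pd.DataFrame({
    "product_id": ["A", "A", "B", "C", "C", "C", "D"],
    "quantity":   [500, 450, 20, 15, 10, 12, 5],
    "order_date": pd.to_datetime(["2024-10-07", "2024-12-02", "2024-10-14",
                                  "2024-11-04", "2024-11-11", "2024-12-09",
                                  "2024-12-16"]),
})

# Skewness: a handful of bestsellers vs. a long tail of slow movers
units_per_product = sales.groupby("product_id")["quantity"].sum()
print("Skewness:", units_per_product.skew())
print("Volume share of the top product:",
      units_per_product.max() / units_per_product.sum())

# Seasonality: monthly demand profile to spot holiday or seasonal spikes
monthly = sales.groupby(sales["order_date"].dt.to_period("M"))["quantity"].sum()
print(monthly)
```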
5. Detect and Mitigate Bias
Some features may unintentionally encode bias, leading to unfair models. Watch for the following (a simple representation check is sketched after the list):
- Proxy Bias: Features like ZIP codes may indirectly represent race or income.
- Underrepresented Groups: Does the dataset lack diversity in key categories?
- Fairness Metrics: Run demographic parity checks, and use explainability tools like SHAP or LIME to audit which features drive predictions.
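As one simple example, a representation check in pandas can reveal whether some segments are barely present in the data or are recommended far less often than average; the customer_segment and recommended fields below are hypothetical.

```python
import pandas as pd

# Toy log of assortment recommendations; customer_segment and recommended
# are hypothetical fields used only to illustrate the checks.
recs = pd.DataFrame({
    "customer_segment": ["mainstream", "mainstream", "mainstream",
                         "niche_style", "niche_style", "plus_size"],
    "recommended":      [1, 1, 0, 0, 1, 0],
})

# Representation: is any segment barely present in the data?
print(recs["customer_segment"].value_counts(normalize=True))

# Demographic-parity-style check: each segment's recommendation rate relative
# to the overall rate; values far from 1.0 deserve a closer look.
rate_by_segment = recs.groupby("customer_segment")["recommended"].mean()
print(rate_by_segment / recs["recommended"].mean())
```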
Example: Evaluating the Data for Demand Forecasting
Challenge: The company has three years of sales data but lacks real-time external factors like weather conditions or social media trends, which impact seasonal product demand.
Step 1: Identify Relevant Data Sources
- Historical Data: Sufficient past sales data exists (three years) for most product categories.
- External Factors Missing: No real-time market trend data, limiting model accuracy for seasonal demand.
Step 2: Assess Data Availability
- Historical Data Volume: The company has three years of transaction data.
- Data Gaps Identified:
- Newly Launched Products: No historical sales data, making forecasting harder.
- External Factors Missing: Seasonal demand drivers like weather, social sentiment, and market trends are absent.
Step 3: Evaluate Data Quality
- Completeness: Some product categories (especially new launches) lack sufficient historical data.
- Consistency: Product IDs, names, and transaction records must be standardized across datasets.
- Accuracy: No major anomalies detected, but duplicate transactions and incorrect pricing (e.g., negative prices) need review.
- Timeliness: Sales data is up-to-date, but external market trends lack real-time updates.
Step 4: Analyze Data Distribution
- Skewness: A few products dominate sales, while many have low demand. The forecasting model must ensure that it does not over-prioritize bestsellers at the expense of niche products.
- Imbalance: Some product categories have too little data (e.g., new products, niche fashion items), which could result in unreliable forecasts for these segments.
- Seasonality: Demand fluctuates based on season and trends. The model must incorporate external indicators like holiday sales spikes or sudden fashion trends.
Step 5: Detect and Mitigate Bias
- Underrepresented Groups: The dataset may not reflect diverse customer segments (e.g., niche fashion styles may be overlooked).
- Market Bias: If only past sales trends are used, the model may fail to capture new fashion trends accurately.
Impact:
- Improved forecast accuracy by incorporating external data sources.
- Better demand predictions for new product categories, reducing overstocking risk.
- Minimized bias by auditing feature importance and adjusting dataset balance.
Common Pitfalls & How to Overcome Them
- Not validating data quality early → Ensure completeness, accuracy, and consistency before modeling.
- Ignoring seasonality and trends → Factor in external events (e.g., holidays, weather) that impact demand.
- Overlooking bias in data → Ensure diverse customer representation in the training data.
Step 4: Categorize the Problem Type
After you've checked out your data, the next step is picking the right ML model. As a PM, you won't need to code the model yourself, but you'll need to figure out which approach best fits the business problem.
- Supervised vs. Unsupervised Learning: If labeled data exists, use supervised learning (classification, regression). Otherwise, use unsupervised learning (clustering, anomaly detection).
Choose the Right Model Type: Ask yourself:
- Is the goal to predict a specific outcome (e.g., will a customer churn)? → Use Classification.
- Do you need to forecast a continuous value (e.g., next month’s revenue)? → Use Regression.
- Are you grouping similar entities (e.g., segmenting customers by behavior)? → Use Clustering.
- Do you need to identify patterns in time-sensitive data (e.g., predicting product demand fluctuations)? → Use Time Series models.
- Are you looking for outliers (e.g., fraud detection)? → Use Anomaly Detection.
- Do you need personalized recommendations (e.g., Netflix-style content suggestions)? → Use Recommendation Systems.
- Are you working with text, images, or large unstructured data? → Consider NLP, LLMs, or Computer Vision models.
Example: Categorizing Demand Forecasting Models
Challenges:
- Some products have rich historical sales data, making traditional time-series forecasting effective.
- Newly launched products lack historical data, requiring content-based or trend-based forecasting techniques.
ML Approach:
- Use time series forecasting (e.g., ARIMA) or gradient-boosted models on lagged demand features (e.g., XGBoost) for products with historical data, as sketched below.
- Apply alternative approaches (e.g., similarity-based methods) for new products.
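To illustrate the first approach, here is a minimal ARIMA baseline built with statsmodels on a synthetic weekly series; treat it as a sketch of the idea, not the production pipeline your team would ship.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Synthetic stand-in for a product's weekly sales history; in practice this
# series would come from your transaction data.
weeks = pd.date_range("2023-01-01", periods=104, freq="W")
rng = np.random.default_rng(42)
units = 200 + 30 * np.sin(np.arange(104) * 2 * np.pi / 52) + rng.normal(0, 10, 104)
weekly = pd.Series(units, index=weeks)

# A simple ARIMA baseline; in practice the data science team tunes the order
# or switches to gradient boosting (e.g., XGBoost) on lagged demand features.
fitted = ARIMA(weekly, order=(1, 1, 1)).fit()
print(fitted.forecast(steps=4))  # expected demand for the next four weeks
```

For newly launched products without history, the team might instead borrow the demand curve of similar existing items (the similarity-based approach mentioned above), which is not shown here.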
Common Pitfalls & How to Overcome Them
- Choosing the wrong ML approach → Use business questions to guide model selection.
- Ignoring problem complexity → Start simple, then iterate if needed.
- Ignoring interpretability needs → If stakeholders need explanations, avoid black-box models.
Step 5: Collaborate with Data Scientists
Creating an ML model requires everyone to work together. As a PM, your job is to turn business needs into ML requirements and make sure everyone's on the same page so the model actually delivers.
Key collaboration strategies:
- Define success criteria together.
- Bridge the gap between business and ML by ensuring alignment.
- Work together on model selection and interpretability trade-offs.
- Communicate model results in business terms. (Instead of "recall is 0.87," say "the model correctly catches 87% of returning customers.")
- Plan for continuous improvement.
Example: Aligning Interpretability and Accuracy
Challenge: Leadership prefers explainable models, but data scientists suggest deep learning approaches that offer higher accuracy but lower interpretability.
Solution: Balance interpretability and accuracy by using explainable models where needed while leveraging advanced models when interpretability is less critical.
Common Pitfalls & How to Overcome Them
- Misaligned expectations → Set success criteria upfront.
- PMs overpromising to leadership → Align feasibility with data science teams first.
- PMs getting too technical → Focus on impact, not algorithms; let data scientists handle the model details.
- Assuming data scientists understand the business context → Give data teams real-world use cases to fine-tune assumptions.
- Stakeholder resistance to ML → Prioritize interpretability and business impact over complexity.
- Overcomplicated models with no ROI → Keep models as simple as possible while meeting goals.
Step 6: Iterate and Test
An ML model is never truly "finished"—it needs continuous monitoring, testing, and refinement to stay relevant. Business environments change, customer behaviors shift, and models degrade over time.
Steps for iteration and testing:
- Train and validate the model with historical data.
- Deploy in a controlled environment before scaling.
- Monitor performance over time to detect drift.
- Iterate based on feedback and new business needs.
Example: Post-Deployment Monitoring
Challenge: Model accuracy declines over time due to changes in customer behavior and supply chain disruptions.
Solution:
- Automate model monitoring to detect drift (a simple drift check is sketched after this list).
- Retrain the model periodically.
- Conduct region-specific A/B testing to account for market variations.
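A drift check can be as simple as comparing the live forecast error against the error measured at launch; the baseline and threshold values in this sketch are assumptions for illustration.

```python
import numpy as np

def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    mask = actual != 0
    return np.mean(np.abs((actual[mask] - forecast[mask]) / actual[mask])) * 100

# Assumed values: the error measured at launch and how much degradation we tolerate
BASELINE_MAPE = 8.0
RETRAIN_FACTOR = 1.5  # retrain once error grows 50% beyond the launch baseline

def check_for_drift(actual, forecast):
    current = mape(actual, forecast)
    if current > BASELINE_MAPE * RETRAIN_FACTOR:
        print(f"MAPE {current:.1f}% exceeds threshold - trigger retraining")
    else:
        print(f"MAPE {current:.1f}% within expected range")

# Illustrative week: logged forecasts vs. the sales that actually materialized
check_for_drift(actual=[120, 95, 140, 60], forecast=[100, 80, 170, 45])
```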
Common Pitfalls & How to Overcome Them
- Deploying too quickly → Always validate the model in a controlled environment first.
- Ignoring business feedback → Ensure domain experts review model results.
- Overfitting to historical data → Regularly validate on new data.
- Ignoring model drift → Set up monitoring dashboards to track performance over time.
- Delaying retraining → Automate retraining triggers when key metrics degrade.
Summary of the Framework
- Understand the Problem: Define the customer pain point.
- Define Objectives and Metrics: Specify success criteria.
- Evaluate the Data: Assess the availability and quality of relevant data.
- Categorize the Problem: Select the appropriate ML approach.
- Collaborate on Model Selection: Balance interpretability, accuracy, and business constraints.
- Iterate and Test: Use feedback loops to refine the solution.
Final Thoughts for Product Managers
This framework helps product managers drive machine learning initiatives with confidence by:
- Aligning ML with business goals.
- Communicating effectively with data teams.
- Avoiding overcomplicated models that do not deliver impact.
- Ensuring ML models continuously improve and drive revenue.
What has been your biggest challenge working with ML as a product manager?