How Many Segments in Latent Class MNL for CBC or MaxDiff Data?
Choice-Based Conjoint (CBC) and MaxDiff (best-worst scaling) are excellent tools for needs-based or attitudinal-based segmentations in survey research. They avoid scale-use bias, which plagues traditional rating scales (allowing for more accurate comparisons, such as between Germans and Brazilians, or between people with low rating tendencies vs. high rating tendencies within the same country). These methods also capture more information (signal) from respondents, as choice-based tasks encourage greater cognitive engagement than standard rating questions.
Now that you've run a latent class MNL analysis on either CBC or MaxDiff data, how do you determine the optimal segmentation solution? Optimal in terms of what? Should you choose a 2-group solution? A 4-group solution?
Recommendations for Practitioners
- Clean Your Data First – Conduct data cleaning and consistency checks, removing randomly behaving respondents. Otherwise, a latent class (segment) may emerge that consists primarily of these "bad" respondents, showing poor fit and little variation across their raw latent class scores.
- Business Needs Take Priority – Choose a segmentation that helps solve the client's business problem. A 2-group solution is often too simplistic for most segmentation strategies, while a 7+ group solution may be too complex for an organization to grasp and implement. This often leads us to examine solutions in the range of 3-6 groups.
- Use BIC as a Guide, Not a Rule – The Bayesian Information Criterion (BIC) is one of the most commonly used fit statistics for evaluating model performance; lower BIC values indicate better fit. However, client needs and interpretability often outweigh statistical fit. Even if a 4-group solution has a better (lower) BIC score, the client's needs may be better served by a 3- or 5-group solution.
- Analyze Segment Differences – Assign respondents to the group they are most likely to belong to using Latent Class MNL (this happens automatically in Sawtooth's program). This allows you to create a new segment membership variable and use it as a banner (column) variable in your cross-tab analysis. Examine how segments respond to survey questions to identify meaningful managerial differences. Consider which segments are larger, more profitable, more easily targeted by your client's strengths/messaging, or more likely to switch from a competitor's product (often identified using a conjoint simulator).
This activity can take a lot of time/effort! Researchers from Radius will be showing how AI can help with the heavy lifting in evaluating different segmentation solutions, based on researcher aims and client needs at the May 7-9, 2025 Sawtooth Research Conference in New Orleans, LA, USA. To see the full program or to register, visit: https://events.sawtoothsoftware.com/conference/2025/agenda
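The assignment step in the recommendations above can be sketched in a few lines of Python. This is only an illustration, not Sawtooth's implementation: the function name `assign_segments` and the small table of membership probabilities are hypothetical, and the sketch assumes you have exported each respondent's probability of belonging to each class.

```python
import numpy as np

def assign_segments(posterior):
    """Assign each respondent to the class with the highest membership probability.

    posterior: array of shape (n_respondents, n_classes); rows sum to 1.
    Returns 1-based segment labels and the winning probability per respondent.
    """
    posterior = np.asarray(posterior, dtype=float)
    segment = posterior.argmax(axis=1) + 1   # 1-based labels for cross-tabs
    max_prob = posterior.max(axis=1)         # how decisive each assignment is
    return segment, max_prob

# Hypothetical membership probabilities for four respondents, three classes.
posterior = np.array([
    [0.80, 0.15, 0.05],
    [0.10, 0.30, 0.60],
    [0.34, 0.33, 0.33],   # nearly a three-way tie
    [0.05, 0.90, 0.05],
])

segment, max_prob = assign_segments(posterior)

# Segment sizes, ready to use as a banner variable summary.
sizes = np.bincount(segment, minlength=posterior.shape[1] + 1)[1:]
print(segment, sizes)
```

The `segment` vector is the new membership variable to carry into cross-tabs. Looking at `max_prob` alongside it is one optional way to spot respondents whose assignment is close to a coin flip, such as the third row here.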
A Meta-Analysis of BIC Scores for CBC Studies
I recently analyzed 15 real-world CBC datasets, recording BIC statistics for 1-10 segment solutions (a 1-segment solution is equivalent to an aggregate logit analysis). While this meta-analysis did not include MaxDiff data, the results should still be applicable.
The average BIC curve across the 15 datasets followed this pattern:
Most datasets showed improving BIC values (lower is better) up to around 7 segments, after which the BIC scores worsened. On average, the BIC statistic suggests that 7 segments might be "optimal" in a purely statistical sense. However, for guiding market segmentation strategies, a 7-segment solution may be impractical. In our experience, clients typically benefit more from 3-6 group solutions.
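For readers who want to see how a BIC curve like this is built, here is a minimal sketch using the standard definition, BIC = -2 × log-likelihood + k × ln(n). Everything numeric in it is hypothetical and does not come from the 15 datasets. It also makes two assumptions: that each class has its own set of utilities plus one class-size parameter for all but one class, and that n counts choice tasks across respondents. Check your software's documentation for the exact parameter count and n it uses.

```python
import math

def n_params(n_classes, n_utils):
    """Free parameters: one utility vector per class, plus (n_classes - 1) class sizes."""
    return n_classes * n_utils + (n_classes - 1)

def bic(log_lik, k, n_obs):
    """Standard BIC: -2 * log-likelihood + k * ln(n). Lower is better."""
    return -2.0 * log_lik + k * math.log(n_obs)

# Hypothetical study: 10 utilities per class, 500 respondents x 12 tasks.
n_utils = 10
n_obs = 500 * 12  # assumption: n is the total number of choice tasks

# Hypothetical log-likelihoods for 1- to 6-class solutions.
log_liks = {1: -7800.0, 2: -7400.0, 3: -7150.0, 4: -7050.0, 5: -7010.0, 6: -6990.0}

bics = {g: bic(ll, n_params(g, n_utils), n_obs) for g, ll in log_liks.items()}
best = min(bics, key=bics.get)

for g in sorted(bics):
    print(f"{g} classes: BIC = {bics[g]:.1f}")
print("Lowest BIC at", best, "classes")
```

In this made-up example the log-likelihood keeps improving with every added class, but the penalty term grows too, so the BIC turns upward after 4 classes.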
Some BIC curves strongly suggest an optimal number of segments, such as for the D95082 dataset, where the best BIC score appears at 3 segments:
In contrast, other datasets, such as the Cessna1 dataset, do not show a clear "best" number of segments based on BIC alone (at least up to 10 segments):
The BIC score never bottoms out (reverses) within the 1-10 segment range.
Final Thoughts on BIC and Segmentation
BIC should be used as a flexible guideline rather than a strict rule for determining the number of segments. We often look for "elbows" in the BIC curve: points where the improvement in fit slows significantly. However, client needs and the ability to derive a compelling story from the data should take precedence over BIC scores.
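One simple way to make the "elbow" idea concrete is to find the point where the BIC improvement drops off most sharply. The sketch below is just one heuristic among several, not a method taken from Sawtooth's software. The function name `find_elbow` and the BIC values are hypothetical, chosen to mimic a curve that keeps improving without ever bottoming out.

```python
def find_elbow(bics):
    """Return the number of segments where BIC improvement slows the most.

    bics: dict mapping number of segments -> BIC, for consecutive segment counts.
    The elbow is the g that maximizes
        (BIC[g-1] - BIC[g]) - (BIC[g] - BIC[g+1]),
    i.e. a big gain from reaching g, followed by a small gain from going beyond it.
    """
    gs = sorted(bics)
    best_g, best_drop_off = None, float("-inf")
    for prev_g, g, next_g in zip(gs, gs[1:], gs[2:]):
        gain_into = bics[prev_g] - bics[g]   # improvement from adding segment g
        gain_after = bics[g] - bics[next_g]  # improvement from adding one more
        drop_off = gain_into - gain_after
        if drop_off > best_drop_off:
            best_g, best_drop_off = g, drop_off
    return best_g

# Hypothetical BIC curve that never reverses within the range examined.
bics = {1: 15700.0, 2: 14900.0, 3: 14400.0, 4: 14300.0, 5: 14250.0, 6: 14220.0}

elbow = find_elbow(bics)
print("Elbow at", elbow, "segments")
```

Here the lowest BIC is at 6 segments, but most of the improvement has been captured by 3, which is where the heuristic places the elbow. An elbow found this way is a starting point for the interpretability discussion, not a verdict.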
Additionally, we want latent class solutions to be stable. By default, Sawtooth's software tries five different random starting points and selects the best-fitting solution. This approach reduces the risk of settling on a suboptimal, non-reproducible segmentation.
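The logic of multiple random starts can be shown with a toy example. This sketch does not reproduce Sawtooth's estimation routine: `toy_fit` is a hypothetical stand-in that climbs a bumpy one-dimensional surface instead of estimating class utilities, and `best_of_random_starts` is an illustrative wrapper. The point is only the pattern: run the same estimation from several starting points, keep the best fit, and check how many starts agree with it.

```python
import math
import random

def toy_fit(seed):
    """Hypothetical stand-in for one latent class estimation run.

    Climbs a surface with several local maxima from a random starting point
    and returns (log_likelihood, solution).
    """
    rng = random.Random(seed)
    x = rng.uniform(-4.0, 4.0)

    def surface(v):
        return math.sin(3.0 * v) - 0.1 * v * v  # several local maxima

    step = 0.01
    for _ in range(2000):
        # Simple hill climbing: move to a neighbour only if it improves the fit.
        best_neighbour = max((x - step, x, x + step), key=surface)
        if best_neighbour == x:
            break  # local maximum reached
        x = best_neighbour
    return surface(x), x

def best_of_random_starts(fit, n_starts=5, base_seed=0):
    """Run `fit` from several seeds; keep the best and count agreeing starts."""
    runs = [fit(base_seed + i) for i in range(n_starts)]
    best = max(runs, key=lambda run: run[0])
    # Starts that reached (nearly) the same fit as the winner.
    n_agree = sum(1 for ll, _ in runs if abs(ll - best[0]) < 1e-3)
    return best, n_agree, runs

best, n_agree, runs = best_of_random_starts(toy_fit, n_starts=5)
print(f"Best fit {best[0]:.4f}; {n_agree} of {len(runs)} starts agree")
```

If only one start out of several reaches the best fit, that is a hint the solution may be hard to reproduce, and trying more starting points is a reasonable next step.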
Conclusion
Choosing the optimal number of segments in a latent class MNL analysis of CBC or MaxDiff data is both a statistical and strategic decision. While BIC provides valuable guidance, it should not be the sole determinant. The practical utility of the segmentation for the client's business problem and the interpretability of the results are crucial and, in our experience, often lead to solutions in the range of 3-6 segments. Ultimately, the best segmentation is one that not only reflects the data structure but also delivers actionable insights that drive strategic decision-making.
End notes: Many thanks to Keith Chrzan, who heads Sawtooth's analytical consulting division, for his suggestions on this little article. If the goal of the latent class MNL is to maximize predictive accuracy, solutions with greater complexity (many more classes) often work better. Here, we assume that interpretability and guiding segmentation strategy carry more weight than predictive accuracy. But you can have both: for predictive accuracy, we'd recommend building market simulators using individual-level MNL models such as mixed logit or HB-MNL. The latent class segments then become filters (segmentation variables) within the market simulator built using individual-level estimates.
Marketing Research Consultant- RETIRED!
3 weeks ago
Hey Bryan, thanks for the good work. But here is a strongly held belief of mine that will never change, the hill I will die on: There can never be a statistical diagnostic that tells you the right number of segments. The right number of segments is and always will be driven by your marketing budget and strategy, the number of unique customer profiles you have the bandwidth to effectively target, serve, and satisfy.