Determining the Right Sample Size for Reliable Validation of Your SaaS Business Model
Aleksei Kozionov
SaaS products for manufacturing | Value from Data & AI | PhD in Computer Science
Introduction
In this article, I describe the process of determining the sample size for interviews and experiments used in validating SaaS business models, and the implications of sample size on associated risks.
In today’s increasingly data-driven business landscape, the validation of B2B SaaS business models still relies on traditional tools such as customer interviews and experiments (Talking to Humans, Testing with Humans), especially at the early stages of validating the initial idea or business model. While low- and high-fidelity prototyping and data insights support this process in later stages, interviews and deep experiments with customers remain indispensable (Hypothesis-Driven Entrepreneurship: The Lean Startup).
The rationale behind the predominance of interviews and experiments in B2B SaaS validation is clear: the quest for repeatable and scalable patterns amidst the complexity of business problems and processes, compounded by the unique characteristics of each enterprise. Only through direct engagement with process owners, users, and decision-makers can the necessary insights be collected.
Despite their importance, determining the requisite number of customer experiments (or interviews) can be a challenge even for seasoned Product Managers. For example, suppose we are considering a market of 100 production plants. The question arises: how many confirmations are needed to validate the business hypothesis? Surprisingly, the answers vary widely among Product Managers.
In response to this ambiguity, I propose a short guide that draws upon my expertise in statistics, experience in conducting A/B tests for ML systems, and track record in validating and investing in B2B SaaS products. This guide will outline a methodology for determining the sample size necessary for Product Managers to validate their business hypotheses.
By bridging the gap between theoretical statistical concepts and practical application in the realm of B2B SaaS validation, this guide aims to empower Product Managers to make informed decisions and drive successful outcomes for their businesses.
Part #1: fundamentals of statistics for sample size selection
Let’s take a simple example. Imagine that you are a doctor living in a small village of 500 inhabitants and you have discovered the spread of a disease affecting all demographics. Suppose you estimate that around 30% of the population is affected. To confirm this with 80% confidence and a margin of error of ±5%, you need to calculate the necessary sample size.
Fortunately, statistics solved this problem long ago. More detail on selecting a sample size and justifying quality metrics for experiments is available (sample size justification), as are resources with a strong focus on A/B tests and even a sample size calculator.
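To estimate the disease’s prevalence accurately, you use a proportion sample size formula that considers the total population size (N), the estimated proportion affected (p), the desired confidence level (Z-score), and the margin of error (E). Specifically:
$$n_0 = \frac{Z^2 \, p \, (1 - p)}{E^2}$$
where:
n_0 is the required sample size for an infinite population
Z is the Z-score corresponding to the desired confidence level (about 1.28 for 80%)
p is the estimated proportion of the population affected (0.30 here)
E is the margin of error (0.05 here)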
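This formula assumes an infinite population. However, because the village population is finite (500 people), we need to apply the finite population correction:
$$n = \frac{n_0}{1 + \frac{n_0 - 1}{N}}$$
where n is the corrected sample size, n_0 is the sample size from the infinite-population formula, and N is the population size (500 here).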
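Let’s calculate the required sample size. For 80% confidence, Z ≈ 1.28; with p = 0.30 and E = 0.05, the infinite-population formula gives a sample size of about 138, and the finite population correction for N = 500 reduces this to about 108.
To be 80% confident that the prevalence of the disease in your village is 30% ± 5%, you should test approximately 108 people out of the population of 500. This sample size gives a statistically reliable estimate within the specified margin of error.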
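The village calculation can be reproduced with a short script. This is a minimal sketch, assuming a two-sided confidence level and the standard finite population correction; the function names `z_score` and `sample_size` are illustrative and not taken from the article.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided Z-score for a confidence level such as 0.80."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def sample_size(population: int, proportion: float,
                confidence: float, margin: float) -> float:
    """Sample size for estimating a proportion in a finite population."""
    z = z_score(confidence)
    # Infinite-population sample size.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # Finite population correction.
    return n0 / (1 + (n0 - 1) / population)


# Village example: 500 inhabitants, 30% prevalence, 80% confidence, ±5%.
n = sample_size(population=500, proportion=0.30, confidence=0.80, margin=0.05)
print(f"Required sample size: about {round(n)} people")
```

The result is left unrounded inside the function so that the caller can decide whether to round to the nearest integer, as the article does, or round up for a slightly more conservative figure.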
In the end, to determine the sample size n we need to know the following parameters: the population size (N), the estimated proportion (p), the confidence level (which sets the Z-score), and the margin of error (E).
Now let’s carry this simple case over to the validation of a business model we are considering for investment.
Part #2: applying these principles to the unique context of B2B SaaS validation
Assume we are evaluating a business model for a SaaS product tailored for Automotive OEMs that manufacture passenger vehicles with an annual output exceeding 20,000 units. Our initial business operations will be launched in Germany.
The primary objective of this validation is to ascertain whether the customer needs, which our SaaS product intends to meet, actually exist within this target market. This scenario mirrors the earlier example involving disease prevalence; here, however, the ‘condition’ we’re assessing is the presence of a specific business need among potential customers.
To determine the appropriate sample size for this validation, and to apply the statistical formula we’ve discussed, it is essential to accurately define the following three parameters: the population size (N), the estimated population proportion (p), and the margin of error (E).
By meticulously specifying these parameters, we can employ the adjusted sample size formula to ensure our business model validation is both statistically reliable and relevant to our market entry strategy in Germany.
Determining the Appropriate Market Estimation for Sample Size
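When preparing to validate a business model, especially in specialized sectors such as SaaS for Automotive OEMs, one must first decide which market estimate to use as the population size. The terms TAM, SAM, and SOM are pivotal in this decision-making process:
TAM (Total Addressable Market): the total demand for the product if every potential customer were reached.
SAM (Serviceable Available Market): the part of the TAM that the business can actually serve with its product and operational reach.
SOM (Serviceable Obtainable Market): the part of the SAM that the business can realistically capture given competition and its current market share.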
For the purpose of validating a SaaS business model for Automotive OEMs, the most relevant metric is the SAM. This measure accurately reflects the subset of the market that not only has the need for the product but is also within the operational reach of the business, thus providing a practical target for initial market entry and hypothesis testing. Unlike SOM, SAM is not limited by current market share or competitive dynamics, which might unduly narrow the scope of potential customers without reflecting the full extent of market needs.
Applying SAM to Our Scenario: In our specific case, the SAM consists of passenger vehicle production plants operated by Automotive OEMs within the DACH region (Germany, Austria, and Switzerland), supplemented by an additional 10% from other parts of Europe to account for broader market engagement. This effectively places our target population size at around 50 plants. This figure may vary slightly based on brand representation and specific geographical focus but serves as a solid basis for conducting a statistically valid market validation.
Determining Population Proportion and Margin of Error for Customer Need Validation
Estimating Customer Need: The crucial question in validating our SaaS business model is to estimate the proportion of the target population that exhibits the specific business need our product addresses. The most robust method to ascertain this is through a detailed analytical estimation, which involves understanding the intricacies of the customer’s problems and how they align with our solution. An alternative, more straightforward approach is to utilize established B2B sales benchmarks to gauge customer interest during the sales process, specifically looking at the conversion rates at critical stages.
Sales Process and Lead Qualification: In our sales funnel, a pivotal point of interest is the transition of leads to the Sales Qualified Lead (SQL) stage. At this stage, a lead is considered qualified based on certain criteria, primarily the presence of a need and the authority to make purchasing decisions, although not necessarily having confirmed budget and timing (commonly referred to by the acronym BANT: Budget, Authority, Need, Time). The subsequent opportunity stage focuses more on our ability to close the sale rather than on the customer’s initial need, which is the primary focus of our hypothesis validation.
Using Sales Benchmarks to Estimate Need: According to SaaS sales benchmarks, the progression from SQL to an actual sales opportunity (Opp) typically occurs with a probability of 30–40%. For the purpose of our validation, we can reasonably set the estimated population proportion (where the hypothesis of customer need is positive) at an average of 35%. This figure strikes a balance between the lower and upper bounds of our observed conversion rates, providing a realistic estimate of the market segment that could potentially benefit from our product.
The sales process stages and the relevant one (taken from here: https://firstpagesage.com/seo-blog/b2b-saas-funnel-conversion-benchmarks-fc/ )
Setting the Margin of Error: To accommodate potential variances in our estimation and ensure robustness in our business model validation, we set a margin of error at 5%. This allows us to account for uncertainties and fluctuations in customer behavior and market conditions without significantly compromising the accuracy of our validation process.
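Because the benchmark gives a range of 30–40% and not a single value, it can help to check how much the choice of p matters. The sketch below assumes the population of about 50 plants from the SAM estimate, the 5% margin of error, and an 80% confidence level (the level used in the village example); the helper name `sample_size` is illustrative.

```python
from statistics import NormalDist


def sample_size(population: int, proportion: float,
                confidence: float, margin: float) -> float:
    """Proportion sample size with finite population correction."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    return n0 / (1 + (n0 - 1) / population)


POPULATION = 50    # approximate SAM, in plants
CONFIDENCE = 0.80  # assumed confidence level
MARGIN = 0.05      # margin of error

# Lower bound, midpoint, and upper bound of the benchmark range.
sizes = {p: sample_size(POPULATION, p, CONFIDENCE, MARGIN)
         for p in (0.30, 0.35, 0.40)}

for p, n in sizes.items():
    print(f"p = {p:.0%}: n = {n:.1f}")
```

Across the whole 30–40% range the required sample changes by only about one interview, so under these assumptions the midpoint of 35% is a low-risk choice.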
Estimating sample size
To confirm the customer need hypothesis with different levels of statistical confidence and precision, we apply the previously discussed formula for sample size in proportion studies. Below are the results for varying confidence levels and margins of error, based on the assumptions specified for our customer validation scenario:
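1. High precision (low error margin = 5%):
1a) 80% Confidence: n ≈ 37.7, so about 38 customers
1b) 70% Confidence: n ≈ 33.3, so about 33 customers
2. Low precision (high error margin = 10%):
2a) 80% Confidence: n ≈ 21.6, so about 22 customers
2b) 70% Confidence: n ≈ 16.6, so about 17 customers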
These calculations provide a robust framework for determining how many customers you should validate with to have a statistically reliable estimate, guiding investment decisions and product development strategies with empirical evidence.
Let’s summarize in the table
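The four scenarios can be recomputed and printed as a small table with the sketch below. It assumes the values stated in the text (a population of about 50 plants and an estimated proportion of 35%); the names `sample_size` and `rows` are illustrative.

```python
from statistics import NormalDist


def sample_size(population: int, proportion: float,
                confidence: float, margin: float) -> float:
    """Proportion sample size with finite population correction."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    return n0 / (1 + (n0 - 1) / population)


POPULATION = 50    # approximate SAM, in plants
PROPORTION = 0.35  # estimated share of customers with the need

# One row per (margin of error, confidence level) combination.
rows = [
    (margin, confidence,
     sample_size(POPULATION, PROPORTION, confidence, margin))
    for margin in (0.05, 0.10)
    for confidence in (0.80, 0.70)
]

print(f"{'Margin':>8} {'Confidence':>11} {'Customers':>10}")
for margin, confidence, n in rows:
    print(f"{margin:>8.0%} {confidence:>11.0%} {round(n):>10}")
```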
In statistical sample size calculations, the margin of error (E) and the confidence level (reflected in the Z-score) both influence the required number of observations. However, changes in the margin of error tend to have a more pronounced impact on the sample size than changes in the confidence level. This follows from the formula for the infinite-population sample size, in which E enters as an inverse square:
$$n_0 = \frac{Z^2 \, p \, (1 - p)}{E^2}$$
For example, halving the margin of error from 10% to 5% quadruples n_0, assuming all other factors remain constant. In contrast, increasing the confidence level from 70% to 80% raises Z only from about 1.04 to about 1.28, which increases n_0 by roughly 53%. For a small finite population the correction dampens both effects (with N = 50 and p = 0.35, the 80% confidence sample grows from about 22 to about 38, not fourfold), but the margin of error remains the stronger lever.
This analysis demonstrates why precision (as dictated by the margin of error) is a more sensitive parameter in sample size calculations than confidence (as adjusted by the Z-score). Understanding this dynamic is crucial for researchers and analysts when designing studies or validation tests to achieve both practical and statistically robust outcomes.
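The difference in sensitivity can be checked numerically. This sketch compares the infinite-population sample size before any finite population correction, assuming p = 0.35; the function names are illustrative.

```python
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided Z-score for a confidence level such as 0.80."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def n_infinite(proportion: float, confidence: float, margin: float) -> float:
    """Sample size for a proportion, assuming an infinite population."""
    return z_score(confidence) ** 2 * proportion * (1 - proportion) / margin ** 2


P = 0.35

# Effect of halving the margin of error (10% -> 5%) at 80% confidence.
ratio_margin = n_infinite(P, 0.80, 0.05) / n_infinite(P, 0.80, 0.10)

# Effect of raising confidence (70% -> 80%) at a 10% margin of error.
ratio_confidence = n_infinite(P, 0.80, 0.10) / n_infinite(P, 0.70, 0.10)

print(f"Halving the margin multiplies n0 by {ratio_margin:.2f}")
print(f"Raising confidence multiplies n0 by {ratio_confidence:.2f}")
```

Both ratios are independent of p, since p(1 - p) cancels out; only the Z-scores and margins matter.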
Part #3: Practical Usage of Sample Size Calculations
Understanding Risk and Decision-Making Implications
When we apply the 70% confidence level with a 10% margin of error, practical usage suggests that interviewing 17 customers might be sufficient to gauge if the need is relevant for between 25% and 45% of the potential customer base. This broader range implies a higher risk compared to a more precise approach:
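The same formula can be read in reverse: given a number of interviews already planned, which margin of error does it buy? The sketch below is an algebraic inversion of the finite-population sample size formula, assuming N = 50 and p = 0.35 as in the scenario above; `margin_of_error` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sample: int, population: int,
                    proportion: float, confidence: float) -> float:
    """Margin of error achieved by a given sample from a finite population."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Finite population correction factor.
    fpc = (population - sample) / (population - 1)
    return z * sqrt(proportion * (1 - proportion) / sample * fpc)


P = 0.35
e = margin_of_error(sample=17, population=50, proportion=P, confidence=0.70)
print(f"Margin of error: {e:.1%}")
print(f"Plausible range for the need: {P - e:.0%} to {P + e:.0%}")
```

With 17 interviews the margin comes out close to 10%, which matches the 25% to 45% range quoted above.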
Strategic Implications for Business Decisions
The choice between these two sampling strategies, low precision versus high precision, depends on the specific goals of the validation effort:
Empirical guidance for Product Managers
Below, I propose an empirical framework that suggests the number of interviews needed for both low and high precision validations, tailored to the customer segment size:
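As a rough illustration of how such guidance scales with segment size, the sketch below applies the sample size formula to several segment sizes. It is not the author’s original table: the segment sizes are hypothetical, p = 0.35 is carried over from the scenario above, and "low precision" and "high precision" are taken to mean 70% confidence with a 10% margin and 80% confidence with a 5% margin respectively.

```python
from statistics import NormalDist


def sample_size(population: int, proportion: float,
                confidence: float, margin: float) -> float:
    """Proportion sample size with finite population correction."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    return n0 / (1 + (n0 - 1) / population)


PROPORTION = 0.35                         # assumed share with the need
SEGMENT_SIZES = (20, 50, 100, 200, 500)   # hypothetical segment sizes
LOW = {"confidence": 0.70, "margin": 0.10}    # low precision
HIGH = {"confidence": 0.80, "margin": 0.05}   # high precision

# Map each segment size to (low-precision n, high-precision n).
table = {
    size: (sample_size(size, PROPORTION, **LOW),
           sample_size(size, PROPORTION, **HIGH))
    for size in SEGMENT_SIZES
}

print(f"{'Segment':>8} {'Low precision':>14} {'High precision':>15}")
for size, (low, high) in table.items():
    print(f"{size:>8} {round(low):>14} {round(high):>15}")
```

The required number of interviews grows much more slowly than the segment itself, and it levels off as the segment becomes large.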
What comes next?
In upcoming articles, I will write about building SaaS business cases, unit economics and their metrics, and how to validate them.
PS
Thanks to Simon Weiss and Felix Hajdinjak for reviewing the article!
References