Costing Product Features

It is a scenario you have seen, or been in yourself, many times. A product manager pops by your desk saying "we want to build feature X, here are some designs, can we have a rough ballpark?" Of course, the question is not directly about cost but about how many sprints it will take the team to build the feature (which can easily be converted into cost).

In many ways that is the easy part, and yet consider all the anecdotes of businesses getting it wrong, or very wrong.

The harder part comes when we ask "and how much will it cost us to operate that feature?" Even focusing only on the hard costs the company will have to pay to third-party services (computing, hosting, serving etc.), the answer is far from straightforward. Moving beyond the "it depends" answer is not a linear, short, or easy path.

The nature of the feature is important, but let's try to stay on a generic plane. On that basis, the operational costs of a feature almost always involve the following dimensions:

  1. Computing
  2. Networking
  3. Storage

Things get even more interesting once we start decomposing each dimension into its potential options, and thus into the equivalent questions for product and/or engineering.

Computing

Starting with computing: do we expect the volume of users / requests to be covered by computing resources we already have in place (e.g. existing AWS EC2 instances)? Do we need to set up new infrastructure before we even serve a single request, such as a new Kubernetes cluster? Or are we using a serverless model, where there is an incremental cost per usage but no baseline cost?
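To make the baseline vs. incremental trade-off concrete, here is a minimal sketch. The function names and every price in it are hypothetical placeholders (not real AWS prices), and it assumes a flat per-request serverless price with no free tier.

```python
import math


def serverless_monthly_cost(requests: int, price_per_million: float) -> float:
    """Pay-per-use cost: nothing is paid when there are no requests."""
    return requests / 1_000_000 * price_per_million


def break_even_requests(baseline_per_month: float, price_per_million: float) -> int:
    """Monthly request volume at which serverless costs as much as the fixed baseline."""
    return math.ceil(baseline_per_month / price_per_million * 1_000_000)


# Hypothetical numbers for illustration only.
BASELINE_PER_MONTH = 300.0   # always-on cluster, paid before the first request
PRICE_PER_MILLION = 2.0      # serverless price per million requests

print(break_even_requests(BASELINE_PER_MONTH, PRICE_PER_MILLION))
print(serverless_monthly_cost(10_000_000, PRICE_PER_MILLION))
```

Below the break-even volume the serverless model is cheaper under these assumptions; above it, the fixed baseline starts to pay for itself.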

How does the decision on computing affect the other dimensions before we even serve users? For example, what is our logging and data retention policy, and does it work for this new feature?

Computing may involve using other services too, e.g. an OpenSearch cluster, which again has baseline costs that could be avoided if, say, the current relational database could be used instead.

Non-functional requirements around computing, such as availability and latency, also influence networking, for example by shaping a solution design that spans multiple availability zones.

Networking

It is unlikely that product feature requests directly dictate networking (apart perhaps from location blocks and regions of presence). Networking will probably be shaped most by non-functional requirements, availability, governance, and constraints around the services selected in a solution (e.g. not all AWS services are available in all regions, so if you intend to use a service in specific regions, that will influence your networking decisions and your corresponding costs). The one area where product would influence networking costs massively is the serving of assets (quality, size and caching).
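As a rough illustration of how asset size and caching drive networking cost, the sketch below estimates monthly data transfer cost. It is a simplification under stated assumptions: `cache_hit_ratio` is taken to be the share of requests that never reach the billed transfer path (for example, served from the browser cache), 1 GB is treated as 1000 MB, and the price per GB is a hypothetical placeholder.

```python
def monthly_egress_cost(
    requests: int,
    avg_asset_mb: float,
    cache_hit_ratio: float,
    price_per_gb: float,
) -> float:
    """Estimate the monthly transfer cost of serving assets."""
    billable_requests = requests * (1 - cache_hit_ratio)  # requests that incur transfer
    gb_out = billable_requests * avg_asset_mb / 1000      # MB -> GB (decimal)
    return gb_out * price_per_gb


# Hypothetical inputs: 1M asset requests, 2 MB average asset, half served from cache.
full_quality = monthly_egress_cost(1_000_000, 2.0, 0.5, 0.10)
compressed = monthly_egress_cost(1_000_000, 1.0, 0.5, 0.10)
print(full_quality, compressed)
```

In this model the cost scales linearly with asset size and with the share of uncached requests, which is why product decisions on quality, size and caching matter so much.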

Storage

Including databases in this dimension, the first questions that form are what kind of database (for example RDS vs. DynamoDB vs. Neptune) and what model, serverless or not. Let's not discount the very real possibility that a solution uses more than one type of data storage plus a separate caching solution.

Files / media, both incoming to and outgoing from a feature, are relevant as well (if applicable to the feature under discussion), together with the retention and archive policy.

The operational cost of a feature is only one side of the coin; the value of the feature is the other.

Let's say for argument's sake that you could have a rather "generic" framework, and the feature you have costed ensures 90% completion of abandoned shopping baskets. Isn't the value of that same feature different for a vendor with an average basket of £10 than for one with an average basket of £300 (stationery vs. furniture, for example)?
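A quick sketch of that arithmetic, using the 90% recovery rate and the £10 / £300 baskets from the example. The figure of 1,000 abandoned baskets per month is an assumption added purely for illustration, as is the function name.

```python
def monthly_feature_value(
    abandoned_baskets: int,
    recovery_rate: float,
    avg_basket_value: float,
) -> float:
    """Revenue recovered per month by completing abandoned baskets."""
    return abandoned_baskets * recovery_rate * avg_basket_value


ABANDONED_PER_MONTH = 1_000  # assumed volume, identical for both vendors
RECOVERY_RATE = 0.9          # 90% completion, from the example above

stationery = monthly_feature_value(ABANDONED_PER_MONTH, RECOVERY_RATE, 10.0)
furniture = monthly_feature_value(ABANDONED_PER_MONTH, RECOVERY_RATE, 300.0)
print(stationery, furniture, furniture / stationery)
```

With identical volumes and an identical operating cost, the feature is worth about 30 times more to the furniture vendor, simply because of the basket size.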

Conversely, isn't the value of a feature that allows a company to operate in Europe in compliance with GDPR a given if the company only has a European presence?

The bottom line is that estimating the operational cost of product features falls (mostly) into one of two categories:

  • The feature and the operational conditions are so simple that the effort of modelling costs outweighs the benefit of trying to predict them; a napkin calculation will do.
  • The feature and/or the operational conditions are so complex that it is not possible to model costs to a reasonable degree of certainty with a napkin calculation, and a detailed, iterative analysis is needed, unless the business is prepared to find out the hard way.

The "magic" seasoned professionals bring is knowing which category they are working with. As for a generic framework here is the one I prescribe:

  1. List all the product & user variables relevant to your product feature (no. of users, sessions, DAU, peaks, min/max/average request/response size, data retention etc.). The list goes on and on and is feature dependent, e.g. are you using a gen AI service? If so, request size changes to number of tokens etc.
  2. Design your solution and, for each service/component, enumerate:
     - What service/component you use
     - Why you use it (link it back to a requirement)
     - How (much) you use it (all this info comes from #1)
  3. Based on #1 and #2 generate sub costs for all components
  4. Identify which sub components drive the largest proportion of costs and challenge the "why" behind their usage. Negotiate with product and evaluate whether changing product parameters can result in changes to components and thus cost savings (there are always trade-offs)
  5. Repeat till you reach an agreed solution and cost to take forward
  6. Build, launch, review costs and optimise (or adjust if your model was wrong)

At face value it may seem simple, but step #2 is anything but. It is not uncommon for relatively simple solutions to utilise 10+ Amazon services (or equivalent), and each can easily have 5+ cost dimensions, so you end up with an itemised feature cost of 50+ lines.
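Steps #1 to #4 of the framework can be sketched as a small itemised cost model. This is a toy under stated assumptions, not a real pricing model: the components, unit prices, variable names and the `CostLine` structure are all hypothetical, and storage ignores data accumulating over the retention period.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class CostLine:
    component: str     # what service/component is used (step 2)
    reason: str        # why: the requirement it traces back to (step 2)
    units: float       # how much: monthly usage derived from step 1 variables
    unit_price: float  # hypothetical price per unit

    @property
    def cost(self) -> float:
        return self.units * self.unit_price


def build_cost_lines(variables: dict) -> list[CostLine]:
    """Step 3: turn product variables into sub costs per component."""
    requests = variables["monthly_users"] * variables["requests_per_user"]
    stored_gb = requests * variables["kb_stored_per_request"] / 1_000_000
    return [
        CostLine("compute", "serve requests", requests / 1_000_000, 2.00),
        CostLine("storage", "retain request data", stored_gb, 0.10),
        CostLine("search cluster", "full-text lookup", 1, 150.00),  # fixed baseline
    ]


def rank_by_share(lines: list[CostLine]) -> list[tuple[str, float, float]]:
    """Step 4: order components by their share of the total cost."""
    total = sum(line.cost for line in lines)
    ranked = sorted(lines, key=lambda line: line.cost, reverse=True)
    return [(line.component, line.cost, line.cost / total) for line in ranked]


# Step 1: hypothetical product & user variables.
variables = {"monthly_users": 50_000, "requests_per_user": 200, "kb_stored_per_request": 5}
for component, cost, share in rank_by_share(build_cost_lines(variables)):
    print(f"{component}: {cost:.2f} ({share:.0%})")
```

In this made-up example the fixed-price search cluster dominates the total, so it is the first "why" to challenge with product, along the lines of the OpenSearch vs. relational database question above.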

As food for thought to instil the above thinking, and as an exercise for the reader, here is one (simple?) feature to cost:

  • A Google query. User input in, the output we are all so familiar with out. How much does it cost to run one query?

Agree / disagree? Do you have a different and better framework? I would love to hear your thoughts in the comments.

Cai Parry-Jones

Data Engineer @ Revolut | SQL | Python | Databases | Big Data

1y

Great article Vas! I'd add another operational cost to consider: labour. For example, devops engineers for maintaining the new infrastructure, QA engineers to ensure the feature remains bug-free, and data teams for BI reporting. The extra work for each team may not be huge, but if a team is near full capacity it may be the final straw that forces them to make a new hire.

More articles by Vasileios Fasoulas