Challenges Faced while Building Personalisation Engines
Aayush Agrawal
Co-Founder | Competitor Intelligence for Gaming | Fraud Detection for FSI | Gen AI Automation for Law Firms | Outcomes as a Service using AI
This is Part 2/5 in a series of posts related to building Recommender Systems. Personalization via Recommender Systems has never been more important for businesses which want to acquire new customers and retain the existing ones.
Accenture reports that 9 out of 10 customers are more likely to shop with brands that provide relevant offers & recommendations!
In our previous article, we looked at what recommender systems are and how you can build one for your business. In this post, let’s go through some of the major challenges teams face while building recommender systems. Some of them are business oriented, while others are technical in nature.
Common Challenges faced while building Recommender Systems and their solutions are discussed below:
Now let's dive into these challenges in detail.
1. Capital Investment
Recommender Systems are a big investment in terms of the time, money and expertise needed to build an effective solution. It could take a team anywhere from 2-6 months to build a recommendation engine, depending on the complexity of the solution. The journey is not over once the engine is live either: it will need constant monitoring & refinement, resulting in continual costs. You'll need the support of Data Engineers and Data Scientists / Machine Learning Engineers to develop & deploy the solution. Assuming an average salary of ~$125K per year per person, a team of 4 people for 6 months of development ($250K) followed by 2 people for 6 months of maintenance/enhancement ($125K) adds up to a cost of $375K.
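The staffing estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: the function name `staffing_cost` and the `(headcount, months)` phase layout are illustrative assumptions, and the figures are the ones quoted above.

```python
def staffing_cost(avg_annual_salary, phases):
    """Sum salary cost over phases given as (headcount, months) pairs."""
    return sum(
        avg_annual_salary * headcount * months / 12
        for headcount, months in phases
    )

# 4 people for 6 months of development, then 2 people for 6 months of maintenance
total = staffing_cost(125_000, [(4, 6), (2, 6)])
print(f"${total:,.0f}")  # $375,000
```

Changing the headcount or duration of each phase gives a quick feel for how sensitive the first-year cost is to team size.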
There are infrastructure, compute and data related costs on top of these. Typically these systems could end up costing anywhere from $500k - $2M per year for an enterprise based on scale & complexity.
2. Development Time
While real-time recommender systems offer many advantages, they are considerably harder to build and maintain. Many teams start with batch processing for their first few experiments, which lets them test the technology and iterate until they are ready to adopt industry-standard architectures such as the Two-Tower approach. These systems should also improve over time as more data is collected and the models evolve towards more complex frameworks, so the models need to be retrained & calibrated regularly to keep serving optimal results. This constant evolution makes maintenance & productionisation very challenging when operating at a large scale. We will discuss these steps in detail in a subsequent blog post. Such a system involves the development of various complex sub-components, each of which takes days or weeks to complete.
3. Complex Integration Process
Integrating a custom-built recommender system is very different from integrating an off-the-shelf product.
If you are buying a product off the shelf, integration can be a very complex affair for a number of reasons: insufficient domain-specific customization, incompatible UI/UX, backend compatibility issues, scalability issues, lack of explainability & control etc. Not only will your data team need to do a lot of work to get your data into a specified format, but you also won't be able to customise the recommendations, which will prevent you from unlocking the true potential of the system.
On the other hand, deploying a custom-made system as a microservice (for example via APIs) is not easy. Your team will have to develop, deploy & maintain each and every part of the system in your environment. In this case, productionizing the ML System is much harder than simply consuming the output of an application.
4. Lack of Data
Good Recommender systems analyse item data & customers’ behavioural data to find similar patterns and trends which enable them to suggest relevant products. AI & ML systems thrive on data: the more data we have, the better the outcome will be.
5. Change in Behavioral Trends
Constant changes in user preferences & in the business itself require the algorithm as well as the data to keep up. Recency of data is a very important factor in dealing with changing patterns, which is why a lot of Tech Startups these days opt for Real-Time recommendations instead of Batch processing. In many cases, continuous automated retraining of models in real time is important for serving relevant recommendations. TikTok’s Monolith is a very good example of a system which is dynamic in nature.
6. Bias
Different types of cognitive bias play a very important role in how recommendations impact conversions. The system should take into account how marketing, brand recognition & discounts will affect a customer’s choice. For example, consider 2 plain black shirts of similar material: one is priced at $75, while the other was initially $100 but is now also priced at $75 after a discount of 25%. It’s highly likely that most customers would choose the 2nd product (the discounted one), because customers perceive products with a higher original price as better or more luxurious.
It's very important to understand these effects and train the model to perform well not just generally but also in cases where biases might impact a user's decision. Exposing the model to more data & stress testing the results will help your team ensure that the effect of bias is mitigated. You can also leverage Synthetic Data Generation, or collect more data by rolling out A/B tests, in order to understand its effects better and learn from them.
7. Privacy Concerns
The more the algorithm knows about the customer, the better it will perform in terms of accuracy. Customers are, however, not comfortable sharing personal information given the risk of data breaches. It is therefore important to balance the needs of data-hungry algorithms against privacy and safety concerns. These days, a lot of teams work with data which is not personally identifiable, which alleviates a lot of risk but also degrades performance a little.
8. Scalability
If the need dictates, a good recommender system should be able to handle very large datasets with millions of users and items. Various music streaming platforms, OTT Apps, Gaming platforms and Retailers have use-cases which mandate building reliable systems that can operate at this scale. Not a lot of teams & companies in the world have the technical expertise and skill required to build these systems reliably. These teams resort either to distributed computing frameworks like Apache Spark/Hadoop, or to precomputing recommendations and caching them for cases like frequently viewed items or popular categories to speed up the process. Companies like TikTok have been experimenting with collisionless embedding tables, expirable embeddings and frequency filtering to reduce memory consumption, along with a production-ready online training architecture & fault-tolerant systems, which has yielded them huge gains.
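The "precompute and cache" idea mentioned above can be sketched as follows. This is a minimal illustration, not a production design: the function names, the `(category, item_id)` event format and the plain in-memory dictionary standing in for a real cache are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def precompute_popular(view_events, top_n=3):
    """Batch step: count views per category and keep the top_n items of each."""
    counts = defaultdict(Counter)
    for category, item_id in view_events:
        counts[category][item_id] += 1
    return {
        category: [item for item, _ in counter.most_common(top_n)]
        for category, counter in counts.items()
    }

def recommend(cache, category, fallback=()):
    """Serving step: a dictionary lookup, with a fallback for unseen categories."""
    return cache.get(category, list(fallback))

# Hypothetical view events: (category, item_id)
events = [("shoes", "s1"), ("shoes", "s2"), ("shoes", "s1"), ("hats", "h1")]
cache = precompute_popular(events, top_n=1)
print(recommend(cache, "shoes"))  # ['s1']
```

The expensive counting happens offline on a schedule, so the request path only pays for a lookup. The trade-off is freshness: the cache is only as recent as the last batch run.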
9. Overfitting
This is a common challenge in Machine Learning. It happens when the model learns to fit the training data very closely but is not able to generalise well to unseen data. This could lead to recommending only very popular items or over-recommending items that users have already interacted with. Data Scientists commonly use regularisation techniques to penalise large model weights & prevent overfitting. Cross-validating a model's performance on held-out data is also a way of testing whether a model is overfit or not.
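To make the two ideas above concrete, here is a deliberately tiny sketch of L2 (ridge) regularisation and k-fold cross-validation in pure Python. It assumes a one-feature linear model without an intercept, which is far simpler than any real recommender; the function names are illustrative. A one-parameter model can hardly overfit, so the point here is only the mechanics: the penalty shrinks the weight, and the error is always measured on data the model was not fitted on.

```python
def fit_ridge_1d(xs, ys, l2):
    """Closed-form ridge fit for y ~ w * x; a larger l2 shrinks w towards 0."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + l2)

def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_fold_mse(xs, ys, l2, k=5):
    """Average held-out error: fit on k-1 folds, evaluate on the remaining one."""
    n = len(xs)
    errors = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        w = fit_ridge_1d(train_x, train_y, l2)
        errors.append(mse(w, test_x, test_y))
    return sum(errors) / k

xs = list(range(1, 11))
ys = [2 * x for x in xs]
for l2 in (0.0, 10.0, 100.0):
    print(l2, fit_ridge_1d(xs, ys, l2), k_fold_mse(xs, ys, l2))
```

In practice the penalty strength is usually chosen by comparing cross-validated error across several candidate values, as the loop at the end hints at. A training error that is much lower than the cross-validated error is the typical sign of overfitting.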
10. Diversity
In cases where users might want to discover new & exciting items and not just the popular ones, like on Spotify or Netflix, recommender systems need to maintain a trade-off between accuracy (recall) and diversity metrics like entropy and novelty. In some cases, especially in music recommendation systems, serendipity-based recommendations, which recommend unexpected items that are relevant to the user’s interests (like unpopular songs with similar musical characteristics to the user’s favourite music), also work wonders.
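As a rough illustration of the diversity metrics mentioned above, the sketch below computes one common formulation of each: Shannon entropy of how often each item appears across all recommendation lists, and novelty as the mean self-information (-log2 of popularity) of the recommended items. The function names and the toy inputs are assumptions, and other definitions of these metrics exist.

```python
import math
from collections import Counter

def recommendation_entropy(rec_lists):
    """Shannon entropy (bits) of item exposure across all recommendation lists.

    Higher values mean exposure is spread over more items.
    """
    counts = Counter(item for recs in rec_lists for item in recs)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mean_novelty(recs, item_popularity):
    """Mean self-information of the recommended items.

    item_popularity maps item -> share of users who interacted with it, in (0, 1].
    Rarely consumed items score higher.
    """
    return sum(-math.log2(item_popularity[item]) for item in recs) / len(recs)

# Every user gets the same two items: exposure is split evenly between them
print(recommendation_entropy([["a", "b"], ["a", "b"]]))      # 1.0
# A popular item (50% of users) and a less popular one (25% of users)
print(mean_novelty(["a", "b"], {"a": 0.5, "b": 0.25}))       # 1.5
```

Tracking these alongside recall makes the accuracy versus diversity trade-off visible: a model that only ever recommends the same handful of hits will show low entropy and low novelty, even if its recall looks good.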
Yugen.ai can help your organization overcome the challenges discussed above and implement a recommender-system-based personalization strategy. We are a team of engineers & Data Scientists who are not just driven by passion: we also provide an integrated offering of our services coupled with our own ML Platform, which helps us expedite your journey to ROI!
Our team at Yugen.ai has been working with clients in the Retail & AdTech space to help them personalize their customers’ journeys. In most cases, the journey starts with simpler systems which are refreshed at a lower frequency, but as we start seeing ROI, we move towards more complex architectures & real-time frameworks. All these initiatives, coupled with good execution, have helped us increase the monetization rate of some products by as much as 50%.
If you are looking to solve similar problems in your domain or are looking to identify how AI can help your team get closer to your goals, reach out to me at [email protected] and let’s have a chat! Please share your insights and thoughts in the comments below!