AI/Machine Learning and contextual personalization
This article introduces Amazon Personalize, a fully managed machine learning service for use cases that require contextual personalization and recommendations. The service lets businesses add performant, real-time personalization and recommendations to their applications with a few API calls.
Because Amazon Personalize is fully managed, businesses can focus on their use case and need not worry about the hardware, data pipeline, or machine learning workflows, all of which are handled by Amazon Web Services (AWS).
The service is built on deep learning algorithms and uses recipes that draw on two decades of experience with personalization and recommendation systems at Amazon.com. Beyond Amazon.com's pioneering work in 1998, three examples of businesses that have built competitive differentiation on automated user personalization are Netflix (Top-10 household recommendations), Spotify (Discover Weekly playlist), and YouTube (What to watch next).
Personalization at scale for digital content requires machine learning, whether the goal is direct ROI or simply delighting the user. Beyond delight, contextual personalization that powers recommendations benefits the business through both cost savings and incremental revenue. When businesses continuously calibrate their offers to their customers' preferences, the potential for attrition decreases, which increases retention. Businesses can potentially increase their revenue by:
1) adding matching-product recommendations to their customers' purchase confirmations,
2) sharing other customers' views ("trending items you might like"), and
3) understanding contextual likes and dislikes (ad/offer suppression and generation based on a view into abandoned shopping carts).
As discussed in Recommender Systems: The Textbook, in order to achieve the broader business-centric goal of increasing revenue, the common operational and technical goals of recommender systems are as follows:
- Relevance: The most obvious operational goal of a recommender system is to recommend items that are relevant to the user at hand.
- Novelty: A recommendation is truly helpful when the recommended item is something that the user has not seen in the past.
- Serendipity: The items recommended are unexpected.
- Diversity: The recommended list contains items of different types.
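Two of these goals are easy to put a number on. The sketch below uses simplified, illustrative definitions (they are assumptions made for this example, not the textbook's formal metrics and not something Amazon Personalize reports): novelty as the share of recommended items the user has not seen, and diversity as the number of distinct categories per recommended item. All item IDs and genres are made up.

```python
from typing import Dict, List, Set


def novelty(recommended: List[str], already_seen: Set[str]) -> float:
    """Fraction of recommended items the user has not interacted with before."""
    if not recommended:
        return 0.0
    unseen = [item for item in recommended if item not in already_seen]
    return len(unseen) / len(recommended)


def diversity(recommended: List[str], category_of: Dict[str, str]) -> float:
    """Number of distinct categories divided by the number of recommended items."""
    if not recommended:
        return 0.0
    categories = {category_of[item] for item in recommended}
    return len(categories) / len(recommended)


# Hypothetical data: four recommended movies, one of which the user already watched.
recs = ["m1", "m2", "m3", "m4"]
seen = {"m1"}
genres = {"m1": "drama", "m2": "drama", "m3": "comedy", "m4": "sci-fi"}

print(novelty(recs, seen))      # 3 of the 4 items are new to the user
print(diversity(recs, genres))  # 3 distinct genres across 4 items
```

Relevance and serendipity are harder to score this way, because they depend on how the user actually responds to the recommendation.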
Working with the service: Before using Amazon Personalize, you must have an Amazon Web Services (AWS) account. Once you have an AWS account, you can access Amazon Personalize through the Amazon Personalize console, the AWS Command Line Interface (AWS CLI), or the AWS SDKs. Amazon Personalize consists of three related components:
- Amazon Personalize — Use this to create, manage, and deploy solutions
- Amazon Personalize Events — Use this to record user events for further training of solutions. You can use the Amazon Personalize console to get event ingestion code that you can use to record events. The event ingestion SDK includes a JavaScript library for recording events from web client applications. It also includes a library for recording events in server code. When you record an event, Amazon Personalize uses it to update the associated solution.
- Amazon Personalize Runtime — Use this to get recommendations from a campaign (deployed solution). For more information, see Getting Recommendations.
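As a rough illustration of the Events component, the sketch below assembles the arguments for a PutEvents call as exposed by the AWS SDK for Python (boto3). The helper names `build_put_events_request` and `record_event` are hypothetical, and the tracking ID, user, session, and item values are placeholders; the tracking ID would come from an event tracker created for your dataset group.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def build_put_events_request(tracking_id: str, user_id: str, session_id: str,
                             item_id: str, event_type: str) -> Dict[str, Any]:
    """Assemble keyword arguments for the PutEvents operation."""
    return {
        "trackingId": tracking_id,
        "userId": user_id,
        "sessionId": session_id,
        "eventList": [{
            "eventType": event_type,                 # e.g. "click" or "watch"
            "itemId": item_id,
            "sentAt": datetime.now(timezone.utc),    # when the event happened
        }],
    }


def record_event(events_client: Any, **event: str) -> None:
    """Send one event; events_client is expected to expose put_events()."""
    events_client.put_events(**build_put_events_request(**event))


# Assumed usage with boto3 (not executed here):
#   import boto3
#   events = boto3.client("personalize-events")
#   record_event(events, tracking_id="<tracking-id>", user_id="u1",
#                session_id="s1", item_id="m42", event_type="click")
```

Keeping the request builder separate from the call makes the payload easy to inspect or log before anything is sent to AWS.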
Importing data: Customers can import training data into Amazon Personalize by creating a dataset with the AWS console or by using the AWS SDK. Datasets are created within a dataset group that contains data for related solutions. Amazon Personalize recognizes three types of datasets. Each dataset type has an associated schema. Each dataset has a set of required fields and reserved keywords.
- User - This dataset is intended to provide metadata about users. This includes information such as age, gender, and loyalty membership, among others, which can be important signals in personalization systems.
- Item - This dataset is intended to provide metadata about your items. This includes information such as price, SKU type, and availability, among others.
- User-Item Interactions - This dataset is intended to provide the interactions between users and items, along with the type of interaction and the timestamp. Each record corresponds to an event type, such as click, watch, or add to cart. Of the three dataset types, only the User-Item Interactions dataset is required.
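To make the schema idea concrete, here is a minimal sketch of a User-Item Interactions schema in the Avro-style JSON format that Amazon Personalize uses, plus a small check that a CSV header covers every schema field. The helper `missing_columns` and the sample row are hypothetical; EVENT_TYPE is included to match the interaction type described above, but check the current documentation for which fields are required.

```python
import csv
import io
import json
from typing import Any, Dict, List

# Interactions schema: user, item, interaction type, and Unix timestamp.
INTERACTIONS_SCHEMA: Dict[str, Any] = {
    "type": "record",
    "name": "Interactions",
    "namespace": "com.amazonaws.personalize.schema",
    "fields": [
        {"name": "USER_ID", "type": "string"},
        {"name": "ITEM_ID", "type": "string"},
        {"name": "EVENT_TYPE", "type": "string"},
        {"name": "TIMESTAMP", "type": "long"},
    ],
    "version": "1.0",
}


def missing_columns(csv_text: str,
                    schema: Dict[str, Any] = INTERACTIONS_SCHEMA) -> List[str]:
    """Return schema fields that have no matching column in the CSV header."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    return [f["name"] for f in schema["fields"] if f["name"] not in header]


# The schema is passed to the service as a JSON string.
schema_json = json.dumps(INTERACTIONS_SCHEMA)

sample = "USER_ID,ITEM_ID,EVENT_TYPE,TIMESTAMP\nu1,m42,click,1546300800\n"
print(missing_columns(sample))  # empty list when every field has a column
```

Running a check like this before import catches header mismatches locally instead of at dataset import time.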
Recipes, solutions, metrics, and deployment: Once data has been imported into a dataset, it can be used to create and train a model, known as a solution. A solution is a personalization model trained on the data you provide in your dataset. The model is trained using a recipe: an algorithm plus the data processing steps that allow recommendations to be made from the input data. Amazon Personalize supports a number of built-in recipes and can automatically choose the most appropriate one based on its analysis of the training data. Alternatively, customers can choose which recipe to train the model with. Each recipe has its own use case, and customers should choose the one that best fits their needs.
After the solution is trained, the metrics created during training can be evaluated. They give an indication of the solution's performance. The console shows the metrics and allows the solution to be retrained as needed.
A deployed solution is able to make recommendations for users. To deploy a solution, customers create a campaign in the console or through the API by calling CreateCampaign. After creating a campaign, they are able to get recommendations. Amazon Personalize provides two operations:
- GetRecommendations returns a list of recommended items for a specified user or item, based on the trained model. For example, movies can be recommended for users signed in to a website.
- GetPersonalizedRanking re-ranks a supplied list of items for a specific user.
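The two runtime operations can be sketched as thin wrappers around the boto3 `personalize-runtime` client methods `get_recommendations` and `get_personalized_ranking`. The wrapper names, the campaign ARN, and the user and item IDs are placeholders for illustration; the client is passed in so the functions can be exercised with a stub.

```python
from typing import Any, List


def recommend_for_user(runtime: Any, campaign_arn: str, user_id: str,
                       num_results: int = 10) -> List[str]:
    """Return recommended item IDs for a user from a deployed campaign."""
    response = runtime.get_recommendations(
        campaignArn=campaign_arn,
        userId=user_id,
        numResults=num_results,
    )
    return [entry["itemId"] for entry in response["itemList"]]


def rerank_for_user(runtime: Any, campaign_arn: str, user_id: str,
                    candidate_items: List[str]) -> List[str]:
    """Return the candidate items re-ordered for this user."""
    response = runtime.get_personalized_ranking(
        campaignArn=campaign_arn,
        userId=user_id,
        inputList=candidate_items,
    )
    return [entry["itemId"] for entry in response["personalizedRanking"]]


# Assumed usage with boto3 (not executed here):
#   import boto3
#   runtime = boto3.client("personalize-runtime")
#   recommend_for_user(runtime, "<campaign-arn>", "u1", num_results=5)
#   rerank_for_user(runtime, "<ranking-campaign-arn>", "u1", ["m1", "m2", "m3"])
```

Re-ranking generally needs a campaign trained with a ranking recipe, so the two calls would normally point at different campaigns.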
For test purposes, customers can use the console to get recommendations from campaigns. For monitoring purposes, Amazon Personalize is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or AWS service in Amazon Personalize. CloudTrail captures a subset of API calls for Amazon Personalize as events, including calls from the Amazon Personalize console and code calls to the Amazon Personalize APIs.
Practical issues addressed: Customers can benefit from the wide range of available algorithms designed to overcome common problems, such as:
- cold starts - three cases: 1) new community: at the start-up of the recommender, a catalogue of items might exist, but almost no users are present, and the lack of user interactions makes it very hard to provide reliable recommendations; 2) new item: a new item is added to the system, and it might have some content information but no interactions yet; and 3) new user: a new user registers and has not yet provided any interactions, so it is not possible to provide personalized recommendations.
- popularity biases - popular items are recommended frequently and less popular ones rarely, if at all. However, less popular, long-tail items are precisely those that are often desirable recommendations.
- sparse datasets - Real-world datasets are often sparse. For example, as of March 2015, creators filming in YouTube Spaces had produced over 10,000 videos, which generated over 1 billion views, and YouTube had more than 1 billion users. Treating each view as a distinct user-video pair, about 1 billion of the roughly 10 trillion possible pairs (1 billion users times 10,000 videos) are filled, a density of about 0.0001. However, YouTube Spaces is only a fraction of the whole site, where the number of videos exceeds a billion. If an average user watches about 1,000 videos (and many users have watched nothing), the density drops to roughly 1,000 / 1,000,000,000 = 0.000001.
- evolving intent - the need to capture how the user's intent changes as they engage with the recommendation process.
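The back-of-the-envelope sparsity estimate can be checked in a few lines. This sketch assumes every view is a distinct user-video pair, which overstates how full the matrix really is; the function name `interaction_density` is made up for this example, and the figures are the rounded ones quoted in the sparse datasets item.

```python
def interaction_density(num_interactions: float, num_users: float,
                        num_items: float) -> float:
    """Fraction of all possible user-item pairs that have an interaction."""
    return num_interactions / (num_users * num_items)


# YouTube Spaces: ~1 billion views, ~1 billion users, ~10,000 videos.
spaces = interaction_density(1e9, 1e9, 1e4)

# Whole site: ~1 billion users watching ~1,000 videos each, ~1 billion videos.
whole_site = interaction_density(1_000 * 1e9, 1e9, 1e9)

print(f"YouTube Spaces density: {spaces:.0e}")      # on the order of 1e-4
print(f"Whole-site density:     {whole_site:.0e}")  # on the order of 1e-6
```

The per-user view gives the same answer for the whole site: 1,000 watched videos out of about a billion is one in a million.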
ML experience is not required to use Amazon Personalize! The service is designed for easy integration into applications by app developers. Specifically, with AutoML, Amazon Personalize will automatically select the best algorithm and tuning to create a custom-trained model based on the data provided. Amazon Personalize knows which algorithms will work best for different catalog sizes and item characteristics based on comparative learning across Amazon music and video. By automatically experimenting with multiple models and choosing the best one, Amazon Personalize aims to ensure that the most effective personalization model is deployed for the specific customer use case. Additionally, this approach provides good default hyperparameters for a large range of application scenarios.
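As a hedged sketch of how the AutoML choice might look in code, the functions below use the boto3 `personalize` client operations `create_solution`, `create_solution_version`, `describe_solution_version`, and `create_campaign`. The helper names, the polling interval, and the campaign naming are assumptions for illustration, and parameter support may have changed since this article was written, so check the current API reference before relying on `performAutoML`.

```python
import time
from typing import Any, Dict, Optional


def build_solution_request(name: str, dataset_group_arn: str,
                           recipe_arn: Optional[str] = None) -> Dict[str, Any]:
    """Use AutoML when no recipe is given, otherwise train with that recipe."""
    request: Dict[str, Any] = {"name": name, "datasetGroupArn": dataset_group_arn}
    if recipe_arn is None:
        request["performAutoML"] = True
    else:
        request["recipeArn"] = recipe_arn
    return request


def train_and_deploy(personalize: Any, name: str, dataset_group_arn: str,
                     recipe_arn: Optional[str] = None,
                     poll_seconds: float = 60.0) -> str:
    """Create a solution, train one version, and deploy it as a campaign."""
    solution_arn = personalize.create_solution(
        **build_solution_request(name, dataset_group_arn, recipe_arn)
    )["solutionArn"]
    version_arn = personalize.create_solution_version(
        solutionArn=solution_arn
    )["solutionVersionArn"]

    # A campaign can only be created once training has finished.
    while True:
        status = personalize.describe_solution_version(
            solutionVersionArn=version_arn
        )["solutionVersion"]["status"]
        if status == "ACTIVE":
            break
        if status == "CREATE FAILED":
            raise RuntimeError(f"Training failed for {version_arn}")
        time.sleep(poll_seconds)

    return personalize.create_campaign(
        name=f"{name}-campaign",
        solutionVersionArn=version_arn,
        minProvisionedTPS=1,
    )["campaignArn"]


# Assumed usage with boto3 (not executed here):
#   import boto3
#   arn = train_and_deploy(boto3.client("personalize"), "movies", "<dataset-group-arn>")
```

Passing a recipe ARN instead of relying on AutoML gives more predictable behavior when you already know which recipe fits the use case.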
Building a personalization system that generates effective, real-time, contextual recommendations involves several hard technology challenges, including setting up robust data pipelines for catalog, user, and user activity stream data; experimenting with and hand-tuning different algorithms; running multiple A/B experiments; and serving recommendations at scale with low latency. Amazon Personalize automates all of these steps, allowing customers to deploy systems automatically. It examines customer datasets, automatically picks the right algorithm, and creates a custom model that is tailor-made for each customer. Behind the scenes, Amazon Personalize automatically performs the right data transformations, picks training and test datasets, trains and tunes multiple models, and chooses the best one, i.e. the one that optimizes for customer-defined goals such as increased click-through rate, time spent, or conversions. Each of these steps is designed based on the lessons learned from personalization and recommender systems that have been deployed across Amazon.
Amazon Personalize exemplifies the company's commitment to invent and simplify working backwards from customer requests. It is aligned with the drive to take technology that was only in reach of a small number of well-funded organizations and make it as broadly distributed as possible. Amazon Web Services has done that successfully with computing, storage, analytics, and databases and is taking the exact same approach with machine learning - and in this specific case simplifying incorporation of contextual personalization and recommendation capability.
BTW, if you would like to try Amazon Personalize with the popular MovieLens dataset example, please see here.
About the Author: Madhu cherishes the opportunity to learn and collaborate. Note that what is expressed by Madhu here is of his own interest and is in no way reflective of his employer. Please reach out to Madhu either privately or by posting your comments below.