Recommendation systems

Recommendation systems

Recommendation systems play an important role in e-commerce, banking, and other businesses nowadays. Based on the historical data of the client's transactions, the recommendation system will offer to a client a specific product, which will be purchased with a high probability. This approach will increase sales. There are different algorithms that exist. For instance, collaborative, content-based and hybrid recommendation engines are widely used today. A competitive solution is based on an affinity analysis with an applied apriori algorithm, which was used in this work.

Before we start with apriori we want to cluster our clients based on their purchase behavior. Imagine we have a dataset that contains information about each transaction such that transaction id, client id, product id, amount of a transaction. Based on this data we want to define Recency, Frequency and Monetary parameters which describe our customers. Then, applying classical segmentation algorithms, such as K-means, for instance, we can obtain clusters for our clients as it is shown below. We can clearly identify 3 different clusters.

No alt text provided for this image

Clusterization is an effective way to learn about clients and their behavior. Clusterization can be performed based on static or dynamic features. Therefore, it can be also a mix of static and dynamic parameters. Static variables are age, gender, location, married status, children, while dynamic features are transactions in a bank, purchases on online stores, for example. In addition, we can obtain their derivatives such as recency, frequency, and monetary parameters. This will help us to capture those clients who is active, loyal, a new one or those customers whom the business is about to lose and they need attention such as a call, sms, email or push notifications.

On the next step, we want to define so-called Rules based on the monthly market basket of each client. Applying the apriori algorithm we can obtain dependency between antecedent, what a client is purchasing right now, and consequent, what we can offer to the client based on historical data. Such dependencies look like the table below:

No alt text provided for this image

So, for clusters from cl_1 up to cl_9 in this case, we defined rules between antecedent and consequent with corresponding probabilities. Now, knowing which cluster our client belongs to and what product she or he buys as an antecedent we can offer a corresponding consequent that will be purchased with a probability defined as confidence in the table.

To sum up, two important data science problems have been described above. The clusterization is the first which helps to the "business" to better understand their clients and to keep them active. On the other hand, the recommendation system which helps to increase sales by offering additional products to already purchased one.

要查看或添加评论,请登录

Boris Kushnarev的更多文章

社区洞察

其他会员也浏览了