BxD Primer Series: Matrix Factorization Recommendation Models
Hey there!
Welcome to the BxD Primer Series, where we cover topics such as machine learning models, neural nets, GPT, ensemble models, and hyper-automation in a ‘one-post-one-topic’ format. Today’s post is on Matrix Factorization Recommendation Models. Let’s get started:
The What:
Matrix factorization is a model-based technique used in recommendation systems to uncover the latent factors that underlie the observed user-item interactions in a dataset (check our post on building a user-item interaction matrix here).
The factorization process involves breaking down the user-item matrix into two separate matrices: a user matrix and an item matrix. Each row of the user matrix represents a user, and each column represents a latent factor. Each row of the item matrix represents an item, and each column represents a latent factor. The dot product of a user vector and an item vector gives an estimate of that user's rating for the item.
Matrix factorization models are typically trained on a dataset of user-item interactions, such as ratings or clicks. The goal is to learn the factor matrices that minimize the error between predicted and actual ratings in the training data. Once the model is trained, it can make personalized recommendations by predicting a user's ratings for items they haven't yet interacted with.
The How:
Below are the general steps involved in training a matrix factorization model:
Note: Measuring recommendation system performance is already covered in this post.
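As an illustrative sketch of such a training loop (assuming squared-error loss minimized by SGD over the observed ratings; the function name, hyperparameters, and example matrix below are my own, not from this post):

```python
import numpy as np

def train_mf(R, k=2, lr=0.01, reg=0.02, epochs=200, seed=0):
    """Factorize rating matrix R (0 = missing) into U (m x k) and V (n x k)
    by SGD on squared error over the observed entries, with L2 regularization."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = 0.1 * rng.standard_normal((m, k))
    V = 0.1 * rng.standard_normal((n, k))
    observed = np.argwhere(R > 0)                # (user, item) index pairs
    for _ in range(epochs):
        rng.shuffle(observed)                    # visit ratings in random order
        for i, j in observed:
            ui = U[i].copy()
            err = R[i, j] - ui @ V[j]            # error on one known rating
            U[i] += lr * (err * V[j] - reg * ui)
            V[j] += lr * (err * ui - reg * V[j])
    return U, V

# Tiny user-item matrix: rows = users, cols = items, 0 = unrated.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
U, V = train_mf(R)
pred = U @ V.T          # fills in predictions for the unrated cells too
```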
Matrix Factorization Algorithms:
We have explained four major matrix factorization methods in this section:
Singular Value Decomposition (SVD)
In a rating matrix R of size m×n, where m is the number of users and n is the number of items, each entry r_ij represents the rating of user i for item j.
SVD decomposes matrix R into three matrices U, Σ, and V, such that R = UΣV^T
U represents user factors, V represents item factors, and Σ is a diagonal matrix whose entries represent the strengths of the factors. The number of retained factors, k, is a parameter that needs to be optimized using a cost function and an optimization technique.
Rating r_ij for user i and item j is predicted as follows:

r̂_ij = (UΣV^T)_ij = Σ_{f=1..k} u_if · σ_f · v_jf

Where,

- u_if is the f-th latent factor of user i (from row i of U)
- σ_f is the f-th singular value (the f-th diagonal entry of Σ)
- v_jf is the f-th latent factor of item j (from row j of V)
SVD may not be well-suited to data with missing values or highly sparse data. Often the k value is selected empirically from the data, but using Mean Squared Error (MSE) as the cost function with SGD as the optimization technique is a better option.
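A minimal SVD-based prediction sketch with NumPy (mean-imputing the missing entries first, since plain SVD needs a complete matrix; the example matrix and k are illustrative):

```python
import numpy as np

# Ratings with 0 = missing; impute each missing cell with that user's mean.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
mask = R > 0
row_means = np.nanmean(np.where(mask, R, np.nan), axis=1, keepdims=True)
R_filled = np.where(mask, R, row_means)

# Full SVD, then keep only the top-k singular values/vectors.
U, s, Vt = np.linalg.svd(R_filled, full_matrices=False)
k = 2                                            # number of latent factors kept
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]    # rank-k rating estimates

print(R_hat[0, 2])   # predicted rating of user 0 for (unrated) item 2
```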
Alternating Least Squares (ALS)
ALS factorizes a sparse user-item interaction matrix R into two lower-dimensional matrices: a user latent feature matrix U (m×k) and an item latent feature matrix V (n×k). It does not require an external optimization technique and uses mean squared error as its default cost function.
It works by iteratively solving for either U or V while fixing the other matrix.
Fix V, solve for U:

u_i = (V^T V + λI)^(-1) V^T r_i

Where,

- u_i is the k-dimensional latent vector of user i
- r_i is the i-th row of R, restricted to the items user i has rated (V is likewise restricted to those items' rows)
- λ is a regularization parameter and I is the k×k identity matrix
Fix U, solve for V:

v_j = (U^T U + λI)^(-1) U^T r_j

Where,

- v_j is the k-dimensional latent vector of item j
- r_j is the j-th column of R, restricted to the users who rated item j (U is likewise restricted to those users' rows)
ALS is effective for sparse datasets and is computationally efficient.
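The two alternating closed-form solves above can be sketched as follows (a minimal dense implementation; λ, k, and the example matrix are illustrative):

```python
import numpy as np

def als(R, k=2, reg=0.1, iters=20, seed=0):
    """ALS on a rating matrix where 0 marks a missing entry. Each user
    (item) vector is the closed-form ridge-regression solution over that
    user's (item's) observed ratings."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    U = rng.random((m, k))
    V = rng.random((n, k))
    mask = R > 0
    I = np.eye(k)
    for _ in range(iters):
        for i in range(m):                       # fix V, solve for u_i
            obs = mask[i]
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + reg * I, Vo.T @ R[i, obs])
        for j in range(n):                       # fix U, solve for v_j
            obs = mask[:, j]
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + reg * I, Uo.T @ R[obs, j])
    return U, V

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4],
              [0, 1, 5, 4]], dtype=float)
U, V = als(R)
pred = U @ V.T          # predicted ratings, including the missing cells
```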
Note: If a constraint is added that all elements of U and V must be non-negative (positive or zero), ALS becomes Non-negative Matrix Factorization (NMF), which is useful where interpretability of the factor values matters, in scenarios such as image processing or text analysis.
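A minimal NMF sketch using Lee–Seung multiplicative updates (one common NMF algorithm, not necessarily the constrained-ALS variant described above; the data and hyperparameters are illustrative):

```python
import numpy as np

def nmf(X, k=2, iters=500, seed=0, eps=1e-9):
    """NMF via Lee–Seung multiplicative updates: minimizes ||X - W H||_F^2
    while keeping every element of W and H non-negative."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        # Multiplicative updates preserve non-negativity by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# A non-negative matrix with an exact rank-2 non-negative factorization.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [1.0, 1.0, 1.0],
              [3.0, 5.0, 7.0]])
W, H = nmf(X, k=2)
```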
Probabilistic Matrix Factorization (PMF)
In PMF, each entry of user-item interaction matrix is modeled as a Gaussian distribution with a mean and variance that depend on latent factors. In other words, instead of modeling the user-item interaction matrix as a deterministic matrix, PMF models each entry as a probabilistic distribution. This allows PMF to provide a more accurate estimate of the probability that a user will interact with an item they have not yet seen.
For a mathematical understanding of this concept, read our previous edition on Bayes models here and watch this video on Coursera here.
In summary, the steps for PMF are as follows:
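For reference, under Gaussian observation noise and zero-mean Gaussian priors on the latent vectors, the MAP estimate in PMF reduces to minimizing a regularized squared error (a standard result; here Ω denotes the set of observed entries, and σ², σ_U², σ_V² are the noise and prior variances):

```latex
\min_{U, V} \;\; \frac{1}{2} \sum_{(i,j) \in \Omega} \left( r_{ij} - u_i^{\top} v_j \right)^2
\;+\; \frac{\lambda_U}{2} \sum_{i} \lVert u_i \rVert^2
\;+\; \frac{\lambda_V}{2} \sum_{j} \lVert v_j \rVert^2,
\qquad \lambda_U = \frac{\sigma^2}{\sigma_U^2}, \quad \lambda_V = \frac{\sigma^2}{\sigma_V^2}
```

This is why PMF is often trained with the same SGD machinery as plain regularized matrix factorization, with the prior variances acting as regularization strengths.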
Bayesian Personalized Ranking Matrix Factorization (BPRMF)
BPRMF is recommended for use with binary preference data, such as like/dislike or click/no-click data. It is not suitable for use with explicit rating data, where the goal is to predict a numerical rating score.
Check the comprehensive paper explaining the mathematics of BPRMF here.
In summary, the steps for BPRMF are as follows:
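A minimal BPR-MF sketch (pairwise sampling with SGD on the BPR objective; the dataset, function name, and hyperparameters are illustrative, not from the paper):

```python
import numpy as np

def train_bpr(pos, n_users, n_items, k=2, lr=0.05, reg=0.01,
              steps=20000, seed=0):
    """BPR-MF on implicit feedback. `pos` maps each user to the set of
    items they interacted with. Each step samples a (user, observed item,
    unobserved item) triple and ascends ln sigmoid(x_ui - x_uj) with L2
    regularization, so observed items get ranked above unobserved ones."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    users = [u for u in pos if pos[u]]
    for _ in range(steps):
        u = users[rng.integers(len(users))]
        items = tuple(pos[u])
        i = items[rng.integers(len(items))]
        j = int(rng.integers(n_items))
        while j in pos[u]:                       # resample until unobserved
            j = int(rng.integers(n_items))
        uu, vi, vj = U[u].copy(), V[i].copy(), V[j].copy()
        x = np.clip(uu @ (vi - vj), -35, 35)     # x_uij = x_ui - x_uj
        g = 1.0 / (1.0 + np.exp(x))              # derivative of ln sigmoid
        U[u] += lr * (g * (vi - vj) - reg * uu)
        V[i] += lr * (g * uu - reg * vi)
        V[j] += lr * (-g * uu - reg * vj)
    return U, V

# Two taste clusters: users 0-1 click items 0-1, users 2-3 click items 2-3.
pos = {0: {0, 1}, 1: {0, 1}, 2: {2, 3}, 3: {2, 3}}
U, V = train_bpr(pos, n_users=4, n_items=4)
scores = U @ V.T        # higher score = item ranked higher for that user
```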
The Why:
Reasons to use matrix factorization technique for recommendation models:
The Why Not:
Reasons to not use matrix factorization:
Time for you to support:
In coming posts, we will cover one more recommendation model: Hybrid Recommender Systems.
After that, we will start with time series models such as ARIMA, Exponential Smoothing (ES), SARIMA, Vector Autoregression (VAR), Prophet, and Hidden Markov Models.
Let us know your feedback!
Until then,
Have a great time!