Movie Recommendation System via Machine Learning

Mahdi Karami

Looking for Opportunities in Numerical Simulations, Software Development, ML, and Data Science

Published: February 5, 2023

Note! The full version of the article and the implementation of the algorithm in R (programming language) are available on GitHub.

Introduction

A recommendation system is a technology that uses machine learning algorithms to suggest items to users based on their ratings and preferences. These systems analyze users' behavior, preferences, and rating history, as well as information about the items themselves, to propose personalized recommendations.

Movie recommendation systems are widely used by streaming services such as Amazon and Netflix to capture their users' tastes and suggest matching movies and series. The main technical challenge for recommendation systems is data sparsity: across a large catalog of movies and a large user base, most of the possible movie-user pairs have no rating. Only a small fraction of users rate any given movie, and each user rates only a very small fraction of the movies.

Database Inspection

MovieLens provides a database of 10M movie ratings, given by roughly 70,000 users to roughly 10,000 movies. The ratings range from 0.5 (the worst) to 5 (the best) in increments of 0.5. The plots below display the distribution of the number of ratings received by movies, as well as the number of ratings given by users.

Figure: Number of Ratings Received by Movies
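As a rough check on the sparsity described in the introduction, the approximate counts quoted above can be turned into a density figure. This is a minimal Python sketch, not taken from the author's R implementation; the rounded counts are the ones stated in the text, so the result is only an order-of-magnitude estimate.

```python
# Approximate dataset dimensions as quoted in the text (rounded values).
n_ratings = 10_000_000
n_users = 70_000
n_movies = 10_000

# Every user could in principle rate every movie.
possible_pairs = n_users * n_movies

# Fraction of user-movie pairs that actually carry a rating.
density = n_ratings / possible_pairs
sparsity = 1.0 - density

print(f"possible user-movie pairs: {possible_pairs:,}")
print(f"density: {density:.2%}, sparsity: {sparsity:.2%}")
```

With these rounded counts, fewer than 2% of the possible user-movie pairs have a rating.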

There are about 20 different genres that describe the content of movies. Each movie can be classified into multiple genres.
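Because one movie can carry several genres, the genre field usually has to be split before it can be counted or modeled. The sketch below assumes the genres are stored as a single pipe-separated string per movie; the sample rows, titles, and the `split_genres` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical sample rows: (movie_id, title, genres).
# Assumption: genres are stored as one pipe-separated string per movie.
movies = [
    (1, "Movie A", "Adventure|Comedy"),
    (2, "Movie B", "Drama"),
    (3, "Movie C", "Comedy|Romance|Drama"),
]

def split_genres(genre_field, sep="|"):
    """Split a multi-genre field into a list of individual genre labels."""
    return [g for g in genre_field.split(sep) if g]

# Count how many movies fall under each individual genre.
genre_counts = Counter(
    genre for _, _, field in movies for genre in split_genres(field)
)

for genre, count in genre_counts.most_common():
    print(genre, count)
```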

Methodology

As in any machine learning task, the available database is divided into two parts: a train set (90%) and a test set (10%). The former is used to build the predictive model, while the latter is used to estimate its accuracy.
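The 90/10 split can be sketched in a few lines of Python. This is an illustrative random split under assumed names (`train_test_split`, the `seed` argument, and the toy `ratings` list are not from the original article), and it does not reproduce the author's R code.

```python
import random

def train_test_split(ratings, test_fraction=0.10, seed=42):
    """Randomly split a list of rating records into (train, test)."""
    rng = random.Random(seed)      # fixed seed keeps the split reproducible
    shuffled = ratings[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Toy data: 100 hypothetical (user, movie, rating) records.
ratings = [(user, movie, 3.0) for user in range(10) for movie in range(10)]
train, test = train_test_split(ratings)

print(len(train), len(test))
```

In practice it is also worth checking that every user and movie in the test set appears in the train set, since a bias cannot be estimated for an unseen user or movie.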

A linear model can be assumed to estimate the rating value as follows.
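Written out with assumed notation, the model has the form shown below. The symbols are chosen here for illustration and are not taken from the original: $y_{u,i}$ is the rating given by user $u$ to movie $i$, $\mu$ the overall average rating, $b_i$ the movie bias, $b_u$ the user bias, $b_g$ the bias of the genre (or genre combination) $g$ of movie $i$, and $\varepsilon_{u,i}$ the estimation error.

```latex
y_{u,i} = \mu + b_i + b_u + b_g + \varepsilon_{u,i}
```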

The terms on the right-hand side of the formula indicate, respectively, the average rating over the whole database, the bias of a specific movie, the bias of each user, the bias related to the genres, and the estimation error relative to the actual rating. Least-squares estimation is used to find the unknown coefficients by minimizing the sum of squared errors.
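One way to make the fitting step concrete is sketched below in Python. Instead of solving the full least-squares problem, this sketch estimates each bias in turn as the mean of the residual left by the previous terms, which is a common approximation for very large rating tables; it is an assumption here, not necessarily what the author's R code does. The record layout `(user, movie, genre, rating)`, the function names, and the toy data are all hypothetical.

```python
from collections import defaultdict
from math import sqrt

def mean_by_key(pairs):
    """Average the values grouped by key; pairs yields (key, value)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, value in pairs:
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

def fit_biases(train):
    """train: list of (user, movie, genre, rating). Returns (mu, b_i, b_u, b_g)."""
    mu = sum(r for *_, r in train) / len(train)
    # Movie bias: mean residual after removing the overall average.
    b_i = mean_by_key((m, r - mu) for _, m, _, r in train)
    # User bias: mean residual after removing average and movie bias.
    b_u = mean_by_key((u, r - mu - b_i[m]) for u, m, _, r in train)
    # Genre bias: mean residual after removing all previous terms.
    b_g = mean_by_key((g, r - mu - b_i[m] - b_u[u]) for u, m, g, r in train)
    return mu, b_i, b_u, b_g

def predict(model, user, movie, genre):
    """Sum the fitted terms; unseen users, movies, or genres get zero bias."""
    mu, b_i, b_u, b_g = model
    return mu + b_i.get(movie, 0.0) + b_u.get(user, 0.0) + b_g.get(genre, 0.0)

def rmse(model, data):
    """Root-mean-squared error of the predictions on data."""
    squared = [(r - predict(model, u, m, g)) ** 2 for u, m, g, r in data]
    return sqrt(sum(squared) / len(squared))

# Hypothetical toy ratings, for illustration only.
train = [
    ("u1", "m1", "Drama", 4.0),
    ("u1", "m2", "Comedy", 3.0),
    ("u2", "m1", "Drama", 5.0),
    ("u2", "m2", "Comedy", 2.0),
]
model = fit_biases(train)
print(model[0], model[1], rmse(model, train))
```

Each added term can only explain what the previous terms left over, so the order in which the biases are fitted affects the estimates.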

Regularization

Regularization is a technique widely used in machine learning to prevent overfitting by penalizing large coefficients in the predictive model.
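As an illustration, the penalized least-squares problem for the movie term alone can be written as below. The notation is assumed rather than taken from the original: $y_{u,i}$ is the rating of movie $i$ by user $u$, $\mu$ the overall average, $b_i$ the movie bias, $U_i$ the set of users who rated movie $i$, $n_i$ its size, and $\lambda$ the regularization parameter. The exact penalty used by the author is not stated, so this is a sketch of the standard squared penalty.

```latex
\min_{b}\; \sum_{u,i} \left( y_{u,i} - \mu - b_i \right)^2 + \lambda \sum_i b_i^2
\qquad\Longrightarrow\qquad
\hat{b}_i(\lambda) = \frac{1}{\lambda + n_i} \sum_{u \in U_i} \left( y_{u,i} - \mu \right)
```

For a movie with many ratings ($n_i \gg \lambda$) the estimate is close to the plain average residual, while for a movie with few ratings it is shrunk toward zero.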

To apply regularization, a part of the train set is set aside as a validation set. It does not contribute to building the model and is used only to estimate the RMSE (root-mean-squared error). The optimum value of the regularization parameter (lambda) is the one that minimizes the RMSE on the validation set.
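The lambda search can be sketched as a simple grid search in Python. To keep the example short, only the movie bias is regularized; the function names, the candidate grid, and the toy ratings are hypothetical and do not reproduce the author's R code.

```python
from collections import defaultdict
from math import sqrt

def regularized_movie_bias(train, mu, lam):
    """Shrunken movie bias: sum of residuals divided by (count + lambda)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for _, movie, rating in train:
        sums[movie] += rating - mu
        counts[movie] += 1
    return {m: sums[m] / (counts[m] + lam) for m in sums}

def validation_rmse(train, validation, lam):
    """Fit on train with the given lambda, then score on validation."""
    mu = sum(r for _, _, r in train) / len(train)
    b_i = regularized_movie_bias(train, mu, lam)
    squared = [(r - (mu + b_i.get(m, 0.0))) ** 2 for _, m, r in validation]
    return sqrt(sum(squared) / len(squared))

def best_lambda(train, validation, candidates):
    """Return the candidate lambda with the lowest validation RMSE."""
    return min(candidates, key=lambda lam: validation_rmse(train, validation, lam))

# Hypothetical toy data: one well-rated movie and one rated only once.
train = [
    ("u1", "popular", 4.0), ("u2", "popular", 4.0),
    ("u3", "popular", 4.0), ("u4", "popular", 4.0),
    ("u5", "obscure", 5.0),
]
validation = [
    ("u6", "popular", 4.0), ("u7", "popular", 3.5),
    ("u8", "obscure", 4.5), ("u9", "obscure", 4.0),
]
candidates = [0, 1, 2, 3, 4, 5, 10]

lam = best_lambda(train, validation, candidates)
print(lam, validation_rmse(train, validation, lam))
```

In this toy setup, the single 5.0 rating of the rarely rated movie is pulled toward the overall average, which lowers the validation error compared with no regularization (lambda = 0).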

Results

The results of the linear model with different options (i.e., considered terms in the modeling formula) are shown in the table and figure below.

Note! The full version of the article and the implementation of the algorithm in R (programming language) are available on GitHub.
