Tutorial on Random Forest and Parameter Tuning in R

Manish Saraswat

Senior Machine Learning Engineer

Published: December 14, 2016

Introduction

Random Forest is one of the most versatile machine learning algorithms available today. With its built-in ensembling capacity, building a decent generalized model (on any dataset) gets much easier. However, I've seen people use random forest as a black-box model; i.e., they don't understand what's happening beneath the code. They just code.

In fact, coding is the easiest part of machine learning. If you are new to machine learning, the random forest algorithm should be at your fingertips.

In this article, I'll explain the concepts behind random forest and bagging. For ease of understanding, I've kept the explanation simple yet thorough. I've used the mlr and data.table packages to implement bagging and random forest with parameter tuning in R. You'll also learn the techniques I used to improve model accuracy from ~82% to 86%.

Table of Contents

What is the Random Forest algorithm?
How does it work? (Decision Tree, Random Forest)
What is the difference between Bagging and Random Forest?
Advantages and Disadvantages of Random Forest
Solving a Problem
Parameter Tuning in Random Forest

Do leave a comment to share suggestions or what you've learned while working with random forests. I'd love to know!

Ariel Novelli

Machine Learning/ AI Consultant

8 years ago

Hi Manish, thanks for your posts! I'm wondering if you know where I can find some info, tutorials, or posts about manipulating data in a Hadoop system using R. Thanks. Ariel.
