Tutorial on Random Forest and Parameter Tuning in R
Introduction
Random Forest is one of the most versatile machine learning algorithms available today. With its built-in ensembling capacity, the task of building a decent generalized model (on any dataset) gets much easier. However, I've seen people use random forest as a black-box model; i.e., they don't understand what's happening beneath the code. They just code.
In fact, coding is the easiest part of machine learning. If you are new to machine learning, the random forest algorithm should be at your fingertips.
In this article, I'll explain the complete concept of random forest and bagging. For ease of understanding, I've kept the explanation simple yet enriching. I've used the mlr and data.table packages to implement bagging and random forest, with parameter tuning, in R. You'll also learn the techniques I've used to improve model accuracy from ~82% to 86%.
Table of Contents
- What is the Random Forest algorithm?
- How does it work? (Decision Tree, Random Forest)
- What is the difference between Bagging and Random Forest?
- Advantages and Disadvantages of Random Forest
- Solving a Problem
- Parameter Tuning in Random Forest
Do drop in your comments to share suggestions or anything you've learned while working with random forests. I'd love to know!
Machine Learning/ AI Consultant
8 years ago: Hi Manish, thanks for your posts! I'm wondering if you know where I can find some info, tutorials, or posts about manipulating data in a Hadoop system using R. Thanks. Ariel.