About Random Forest Algorithms
Dishant Kharkar
Physics Faculty | Data Science & AI Enthusiast | Educator & Analyst | Exploring the Intersection of Physics & Data Science
What is Random Forest?
Before diving into the Random Forest algorithm, we first have to understand the ensemble technique and bagging.
Ensemble Learning:
Ensemble learning combines the predictions of several base models to produce a single prediction that is more robust than any individual model's. In this article, we study bagging, the ensemble technique on which the Random Forest algorithm is built.
Bagging:
Bagging (bootstrap aggregating) trains each model on a random sample of the data drawn with replacement and then combines their outputs. The bagging process involves the following steps:
Step-1: Draw multiple bootstrap samples (random samples with replacement) from the training set.
Step-2: Train a separate model on each bootstrap sample.
Step-3: Aggregate the predictions of all models — majority vote for classification, averaging for regression.
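The bagging procedure can be sketched with scikit-learn's `BaggingClassifier`. This is a minimal illustration, assuming scikit-learn is installed; the dataset here is synthetic.

```python
# Minimal bagging sketch: 25 decision trees, each trained on a bootstrap sample.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic dataset for illustration only.
X, y = make_classification(n_samples=200, n_features=8, random_state=42)

bagger = BaggingClassifier(
    DecisionTreeClassifier(),  # base model; each tree sees a different sample
    n_estimators=25,
    bootstrap=True,            # sample with replacement (the "bootstrap" in bagging)
    random_state=42,
)
bagger.fit(X, y)
print(bagger.score(X, y))
```

Predictions from the 25 trees are combined by majority vote, which is exactly the aggregation step described above.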
Why use Random Forest, when we have a Decision tree?
A single Decision Tree tends to overfit its training data: small changes in the data can produce a very different tree. Random Forest reduces this variance by averaging many decorrelated trees, which usually gives better accuracy on unseen data. However, it's important to note that Decision Trees also have their advantages, such as being easier to interpret and visualize, and they can capture specific relationships between features and the target variable. In some cases, a Decision Tree may be sufficient if interpretability is crucial, the dataset is small, or there are specific requirements that favour a single-tree model.
How does the Random Forest algorithm work?
Random Forest works in two phases: the first is to create the random forest by combining N decision trees, and the second is to make predictions by querying each tree created in the first phase.
The working process can be explained in the steps below:
Step-1: Select K random data points from the training set.
Step-2: Build a decision tree on the selected data points (subset).
Step-3: Choose the number N of decision trees you want to build.
Step-4: Repeat Steps 1 & 2 until N trees have been built.
Step-5: For a new data point, collect the prediction of each decision tree and assign the point to the category that wins the majority of votes.
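The five steps above can be sketched from scratch. This is an illustrative toy implementation, not production code: it uses scikit-learn's `DecisionTreeClassifier` for the individual trees and a synthetic dataset.

```python
# From-scratch sketch of the Random Forest steps: bootstrap, fit, repeat, vote.
from collections import Counter

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

N_TREES = 15                                      # Step 3: choose N
forest = []
for _ in range(N_TREES):                          # Step 4: repeat Steps 1 & 2
    idx = rng.integers(0, len(X), size=len(X))    # Step 1: bootstrap sample
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=0)
    tree.fit(X[idx], y[idx])                      # Step 2: fit a tree on the subset
    forest.append(tree)

def predict(x):
    # Step 5: majority vote across all trees
    votes = [int(t.predict(x.reshape(1, -1))[0]) for t in forest]
    return Counter(votes).most_common(1)[0][0]

print(predict(X[0]))
```

Note the `max_features="sqrt"` setting: each split considers only a random subset of features, which is what decorrelates the trees and distinguishes Random Forest from plain bagging.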
The working of the algorithm can be better understood with the following example:
Example: Suppose a dataset contains multiple fruit images and is given to the Random Forest classifier. The dataset is divided into subsets, and each subset is given to one decision tree. During the training phase, each decision tree learns from its subset; when a new data point arrives, each tree produces a prediction, and the Random Forest classifier outputs the majority result as the final decision.
Important Hyperparameters in Random Forest:
Hyperparameters are used in Random Forests either to enhance the predictive power of the model or to make the model faster.
Hyperparameters to Increase the Predictive Power:
n_estimators - the number of trees the algorithm builds before taking the majority vote.
max_features - the maximum number of features considered when splitting a node.
min_samples_leaf - the minimum number of samples required in a leaf node.
bootstrap - True: sampling with replacement; False: sampling without replacement (each tree sees the whole dataset).
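These predictive-power settings can be sketched with scikit-learn's `RandomForestClassifier` (the parameter names are scikit-learn's; the values below are illustrative, not recommendations, and the dataset is synthetic):

```python
# Sketch of the main predictive-power hyperparameters of a Random Forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=400, n_features=12, random_state=1)

clf = RandomForestClassifier(
    n_estimators=200,      # more trees -> more stable majority vote
    max_features="sqrt",   # features considered at each split
    min_samples_leaf=2,    # larger values smooth each tree
    bootstrap=True,        # True: sample with replacement; False: full dataset
    random_state=1,
)
clf.fit(X, y)
print(round(clf.score(X, y), 2))
```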
Hyperparameters to Increase the Speed:
n_jobs - the number of processors the engine is allowed to use (-1 means all available cores).
random_state - makes the model's output replicable for a fixed set of hyperparameters and training data, so runs need not be repeated.
oob_score - uses out-of-bag samples for validation, avoiding a separate validation pass over held-out data.
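A minimal sketch of the speed-oriented settings, again using scikit-learn's parameter names (`n_jobs`, `random_state`, and `oob_score` are real scikit-learn options; the dataset is synthetic):

```python
# Sketch of speed-related Random Forest settings in scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=7)

clf = RandomForestClassifier(
    n_estimators=100,
    n_jobs=-1,        # train trees in parallel on all available cores
    oob_score=True,   # score on out-of-bag samples; no separate validation split
    random_state=7,   # reproducible runs avoid wasteful re-training
)
clf.fit(X, y)
print(round(clf.oob_score_, 2))
```

Because each tree only sees a bootstrap sample, roughly one third of the rows are "out of bag" for any given tree, and those rows give a free validation estimate.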
Advantages of Random Forest:
- Reduces overfitting compared to a single decision tree, since the errors of individual trees average out.
- Handles both classification and regression tasks, as well as high-dimensional data.
- Provides feature-importance estimates as a by-product of training.
- Performs well even without heavy hyperparameter tuning.
Limitations and Considerations:
- Less interpretable than a single decision tree; the combined model is effectively a black box.
- Training and prediction become slower and use more memory as the number of trees grows.
- Struggles to extrapolate in regression, since predictions are averages of training targets.
Applications of Random Forest:
Random Forest has found extensive applications across various domains:
- Banking and finance: credit-risk scoring and fraud detection.
- Healthcare: disease prediction from patient records.
- E-commerce: product recommendation and customer-churn prediction.
- Remote sensing: land-cover classification from satellite imagery.
Conclusion:
Random Forest is a powerful and versatile ensemble method: by combining many decision trees trained on bootstrap samples and voting on their predictions, it delivers strong accuracy with a low risk of overfitting, at the cost of some interpretability and speed.
For model implementation in Python: https://github.com/Dishantkharkar/Machine_learning_Models/blob/main/Decision%20tree%20Random%20forest/Decision_Tree_RandomForest.ipynb
If you learned something from this blog, make sure you give it a like.
Will meet you in some other blog; till then, peace.
Thank you!