Exploring Oblique Decision Trees in Python
Yeshwanth Nagaraj
Democratizing Math and Core AI // Levelling playfield for the future
Oblique Decision Trees are an advancement in the field of decision tree algorithms, known for their ability to create more complex boundaries than traditional axis-aligned trees. Unlike conventional decision trees that split data based on a single feature at each node, oblique decision trees use linear combinations of features, making them particularly effective for high-dimensional data.
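The difference between the two split types can be sketched in a few lines of NumPy. The weights and threshold below are illustrative values chosen by hand, not learned by any tree algorithm; the point is only that an oblique split tests a linear combination of features, while an axis-aligned split tests one feature at a time.

```python
import numpy as np

# Toy 2-D points; imagine the true class boundary is the diagonal
# x1 + x2 = 1, which no single axis-aligned split can capture.
X = np.array([[0.2, 0.3], [0.9, 0.8], [0.1, 0.95], [0.7, 0.1]])

# Axis-aligned split (conventional tree): test a single feature.
axis_aligned = X[:, 0] <= 0.5

# Oblique split: test a linear combination w . x <= t.
# w and t are hand-picked here for illustration, not fitted.
w = np.array([1.0, 1.0])
t = 1.0
oblique = X @ w <= t

print(axis_aligned)  # splits on x1 alone
print(oblique)       # splits on the diagonal x1 + x2 <= 1
```

A single oblique node separates the diagonal boundary exactly, whereas an axis-aligned tree would need a staircase of several splits to approximate it.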
Python Example
from sklearn.datasets import load_diabetes
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import cross_val_score
from scikit_obliquetree.HHCART import HouseHolderCART
from scikit_obliquetree.segmentor import MSE, MeanSegmentor

# load_boston was removed in scikit-learn 1.2; load_diabetes is a
# drop-in regression dataset for this example.
X, y = load_diabetes(return_X_y=True)

# Bagging ensemble of HHCART oblique trees: each tree splits on
# linear combinations of features found via Householder reflections.
reg = BaggingRegressor(
    HouseHolderCART(MSE(), MeanSegmentor(), max_depth=3),
    n_estimators=100,
    n_jobs=-1,
)
print('CV Score:', cross_val_score(reg, X, y))
Advantages and Disadvantages
Advantages:
- Better handling of complex, non-linear decision boundaries.
- Effective in high-dimensional spaces.

Disadvantages:
- Potentially more computationally intensive.
- Can be harder to interpret than axis-aligned trees, since each split involves multiple features.
The concept of oblique decision trees is not attributed to a single inventor; it evolved as a sophisticated variation of decision trees in machine learning, with influential early systems such as OC1 by Murthy, Kasif, and Salzberg (1994).