Tools for Smart/Lazy Data Scientists (ft. LazyPredict)
As a data scientist, you don't need to write piles of code just to see how your models perform. Building each additional model means following the same procedure all over again, and a smart person looks for a better way when a task involves that much repetition.
In that spirit, this article looks at a library named LazyPredict. It is not the only library that offers this kind of functionality. Alternatives include:
- tpot
- h2o automl
- auto keras
- Auto-sklearn
- AutoML
- MLBox
- PyCaret (my personal favorite)
- Uber Ludwig
Some of these libraries also let you tune your model with one more line of code. LazyPredict has no such feature, but it does its one job well: fitting many models without any parameter tuning so you can compare them quickly. Let's see how to use LazyPredict for classification and regression on some sample datasets.
Before writing any code, install the library. You might run into dependency errors during installation, but there's no need to panic. Install the packages that are missing and you should be good to go.
pip install lazypredict
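If the install does complain about dependencies, a quick check like the one below can show which packages are actually missing. This is only a sketch: the package list is an assumption about what LazyPredict needs, not something taken from its documentation, and find_missing is a helper name made up for this example.

```python
import importlib.util


def find_missing(packages):
    """Return the names in `packages` that Python cannot find for import."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


# Assumed import names of LazyPredict's dependencies.
# Edit this list to match whatever the error message mentions.
ASSUMED_DEPENDENCIES = ["sklearn", "pandas", "numpy", "tqdm", "joblib", "lightgbm", "xgboost"]

missing = find_missing(ASSUMED_DEPENDENCIES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```

Keep in mind that the import name and the pip name can differ. For example, the module sklearn is installed with pip install scikit-learn.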
Implementing Classification
import lazypredict
from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load the breast cancer dataset as features X and labels y
data = load_breast_cancer()
X = data.data
y = data.target

# Hold out half of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

# Fit every available classifier and collect the scores
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

# In a notebook, this displays the results table
models
If everything runs without errors, you'll see a table of results on your screen, with one row per model. Within seconds you have the scores for every model side by side, and you can pick the one that best fits your problem statement.
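Which model is "best" depends on the metric you care about, so it can help to rank the results explicitly. The sketch below works on a plain dictionary of scores. The helper name best_model, the column names and the numbers are all placeholders made up for illustration, not real LazyPredict output.

```python
def best_model(scores, metric, higher_is_better=True):
    """Return the name of the model with the best value for `metric`.

    `scores` maps model name -> {metric name -> value}.
    """
    pick = max if higher_is_better else min
    return pick(scores, key=lambda name: scores[name][metric])


# Placeholder numbers for illustration only, not real LazyPredict output.
example_scores = {
    "model_a": {"Accuracy": 0.95, "Time Taken": 0.40},
    "model_b": {"Accuracy": 0.97, "Time Taken": 1.20},
    "model_c": {"Accuracy": 0.90, "Time Taken": 0.05},
}

print(best_model(example_scores, "Accuracy"))                            # model_b
print(best_model(example_scores, "Time Taken", higher_is_better=False))  # model_c

# Assumed usage with the results table, if it is a pandas DataFrame
# indexed by model name:
# scores = models.to_dict("index")
# best_model(scores, "Accuracy")
```

Passing higher_is_better=False covers metrics such as training time or error, where a smaller value wins.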
Implementing Regression
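from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

# Note: load_boston was removed in scikit-learn 1.2, so this call needs an older version
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=13)
X = X.astype(np.float32)

# Use the first 90% of the shuffled rows for training and the rest for testing
offset = int(X.shape[0] * 0.9)
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

# Fit every available regressor and collect the scores
reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

# In a notebook, this displays the results table
models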
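Both snippets leave custom_metric set to None. As far as I can tell, this parameter accepts a function that takes the true and predicted values and returns a number, which then appears as an extra column in the results. Treat that as an assumption and check it against your installed version. Here is a minimal sketch of such a function, with mean_abs_error as a made-up name:

```python
def mean_abs_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    errors = [abs(float(t) - float(p)) for t, p in zip(y_true, y_pred)]
    return sum(errors) / len(errors)


# Sanity check on hand-made values (not real model output)
print(mean_abs_error([3.0, 5.0, 2.0], [2.0, 5.0, 4.0]))  # prints 1.0

# Assumed usage, mirroring the regression call above:
# reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=mean_abs_error)
# models, predictions = reg.fit(X_train, X_test, y_train, y_test)
```

The function is plain Python, so you can test it on its own before handing it to the library.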
Smart people learn from their mistakes. But the real sharp ones learn from the mistakes of others.