Tools for Smart/Lazy Data Scientists (ft. LazyPredict)

Rithwik Chhugani

Data Scientist | Data & AI Consultant

Published: December 22, 2020

As a data scientist, you don't need to write tons of code just to see how your models perform. Building each additional model means repeating the same procedure, and when a task has too many repetitive steps, a smart person looks for a better way.

In that spirit, this article discusses a library named LazyPredict. It is not the only library that offers this kind of functionality. Its competitors include:

TPOT
H2O AutoML
AutoKeras
Auto-sklearn
AutoML
MLBox
PyCaret (my personal favorite)
Uber Ludwig

Some of the aforementioned libraries also let you tune your model with one more line of code. LazyPredict does not offer that feature, but it is good at what it is supposed to do. Let's see how to use LazyPredict for classification and regression on some sample datasets.

Before we start coding, we need to install the library. During installation you might run into dependency errors, but there is no need to panic. Just install the packages that are missing and you should be good to go.

pip install lazypredict
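If the install complains about dependencies, a quick check like the following can list what is still missing. This is only a sketch: the package list is an assumption about what LazyPredict needs, not something taken from its documentation, so adjust it to match the errors you actually see.

```python
from importlib.util import find_spec

# Import name -> pip name. This list is an ASSUMPTION for illustration;
# edit it to match the packages your installation reports as missing.
ASSUMED_PACKAGES = {
    "lazypredict": "lazypredict",
    "sklearn": "scikit-learn",
    "pandas": "pandas",
    "numpy": "numpy",
    "tqdm": "tqdm",
    "joblib": "joblib",
    "xgboost": "xgboost",
    "lightgbm": "lightgbm",
}


def missing_packages(packages):
    """Return the pip names of packages whose import name cannot be found."""
    return [
        pip_name
        for import_name, pip_name in packages.items()
        if find_spec(import_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Still missing, install with: pip install " + " ".join(missing))
    else:
        print("All listed packages were found.")
```

The helper only looks for top-level modules, so it does not check versions. It tells you what to install, not whether the installed release is compatible.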

Implementing Classification

import lazypredict
from lazypredict.Supervised import LazyClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X = data.data
y = data.target

# Hold out half of the samples for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=123)

# fit() trains each classifier and returns a table of scores plus the predictions
clf = LazyClassifier(verbose=0, ignore_warnings=True, custom_metric=None)
models, predictions = clf.fit(X_train, X_test, y_train, y_test)

models

If everything runs without errors, models displays a table comparing the performance of every classifier that was trained. Within seconds you have results for all the models and can select the one that best fits your problem statement.
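To appreciate what that single fit call saves you, here is a rough sketch of the manual loop you would otherwise write with scikit-learn alone. The three models and the scaling step are chosen here purely for illustration. They are assumptions, not a description of what LazyPredict does internally.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=123
)

# A hand-picked, illustrative set of candidates.
candidates = {
    "LogisticRegression": LogisticRegression(max_iter=1000),
    "KNeighborsClassifier": KNeighborsClassifier(),
    "RandomForestClassifier": RandomForestClassifier(random_state=123),
}

results = []
for name, estimator in candidates.items():
    # Scale the features, fit on the training half, score on the held-out half.
    pipeline = make_pipeline(StandardScaler(), estimator)
    pipeline.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, pipeline.predict(X_test))
    results.append((name, accuracy))

# Sort so the best-scoring model comes first, like a small leaderboard.
results.sort(key=lambda item: item[1], reverse=True)
for name, accuracy in results:
    print(f"{name}: {accuracy:.3f}")
```

Every extra model means another entry in the dictionary and another import, which is exactly the repetition the library takes off your hands.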

Implementing Regression

from lazypredict.Supervised import LazyRegressor
from sklearn import datasets
from sklearn.utils import shuffle
import numpy as np

# Note: load_boston was removed in scikit-learn 1.2, so this line needs an older release
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=13)
X = X.astype(np.float32)

# Use the first 90% of the shuffled rows for training and the rest for testing
offset = int(X.shape[0] * 0.9)
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

reg = LazyRegressor(verbose=0, ignore_warnings=False, custom_metric=None)
models, predictions = reg.fit(X_train, X_test, y_train, y_test)

models
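One caveat: scikit-learn removed load_boston in version 1.2, so the regression snippet fails on newer releases. Below is a sketch of the same workflow on load_diabetes, a dataset that still ships with scikit-learn. The dataset swap, the use of train_test_split, and the helper name run_lazy_regressor are assumptions made for this example, not part of the original walkthrough.

```python
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

# load_diabetes stands in for the Boston data used in the original example.
X, y = load_diabetes(return_X_y=True)

# Keep 10% of the rows for testing, mirroring the 90/10 split above.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=13
)


def run_lazy_regressor():
    """Fit LazyRegressor on the split above and return its table of scores."""
    # Imported inside the function so the data preparation runs even
    # on a machine where lazypredict is not installed yet.
    from lazypredict.Supervised import LazyRegressor

    reg = LazyRegressor(verbose=0, ignore_warnings=True, custom_metric=None)
    models, predictions = reg.fit(X_train, X_test, y_train, y_test)
    return models
```

Calling run_lazy_regressor() returns the same kind of comparison table as before, one row per regressor.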

Smart people learn from their mistakes. But the real sharp ones learn from the mistakes of others.
