Are You Wasting Your Time Learning "Python for Data Science"?

Vivek Chaudhary

Transforming Regional Ai | Leading @dyota labs

Published: March 14, 2020

Why are you not getting a job as a Data Scientist?

It's simple: learning alone is not the key to landing a job as a Data Scientist. Experience combined with learning makes your ML models fit better and improves your chances of cracking the interview. When you start a course, you usually don't apply it practically, and that is where learning goes wrong: learning without experience. Instead, try out what you have learned while working with a data set, and pick up whatever else you need at the moment you need to apply it. Learn and apply at the same time. Believe me, you will improve the way you learn and be able to explain things in your own way.

Data Science is more about passion, curiosity and, importantly, research with patience. A good way to learn is to look at an existing project, understand its problem statement and learn to analyze the code first. Work through at least 5 projects, then take on a different problem statement; that way you can measure your learning progress through experience.

Many of you are wasting your time learning Python and libraries like pandas, numpy etc., and then start searching for a job or internship in the Data Science domain. Expecting that to be enough is a myth, because you won't get one that way. You have to know which skills to use, and where, as a Data Scientist. Instead, focus on core Python, which can help you in many ways. (Remember: analyzing code is more difficult than writing it, so understand the concepts and practice as much as you can with different problem statements.)

Believe it or not, no one is perfect at coding. The difference is that experienced people know which graph they need to plot, search for it on Google and then implement it.

We should thank bamboolib.com for making our work easier; it can save up to 12 hours when solving a particular problem statement.

What is Bamboolib?

Bamboolib is a python package for easy data exploration & transformation with pandas. You can use it with Jupyter Notebook or JupyterLab. ... All transformations come with full keyboard control, making bamboolib the first GUI loved both by pandas-savvy users as well as Python novices.

Why do we need such tools?

You know the answer very well. When you want to plot a graph of one parameter or of several columns, you search for the code and implement it (around 90% of people do the same). That is not wrong, but we waste around 12 to 15 hours doing it. With bamboolib we can save that time and work out better insights to proceed with.

You may have a question: will getting a job be easy now?

No, it won't. Anyone can use the tool and plot a graph, but as a Data Scientist you should think first and research according to your problem statement. Only then can you make the best use of such tools, which are really tools for those who love research and statistics.

Even a high school student knows statistics (mean, mode, median, standard deviation etc.), but what difference can you make with statistics as a Data Scientist?

The crazy part is when you are asked how you will use mean, mode and median.

You will answer with the "formula", but think for a moment: what is the difference between you and a high school student then? :)
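To make the point concrete, here is a small sketch using Python's standard `statistics` module. The numbers and city names are invented for illustration and do not come from any data set in this article. It shows one way a data scientist might go beyond the formula: preferring the median over the mean when a column contains an outlier, and using the mode for a categorical column.

```python
import statistics

# Hypothetical salaries (illustrative values only); the last one is an outlier.
salaries = [30, 32, 35, 31, 33, 500]

# The mean is pulled far upward by the single outlier.
mean_salary = statistics.mean(salaries)

# The median stays close to what a typical row looks like.
median_salary = statistics.median(salaries)

# Hypothetical categorical column: the mode is the natural "typical value".
cities = ["Pune", "Mumbai", "Pune", "Delhi", "Pune"]
most_common_city = statistics.mode(cities)

print("mean:", mean_salary)
print("median:", median_salary)
print("mode:", most_common_city)
```

If you had to fill missing salaries in such a column, the median would usually be the safer choice than the mean, which is the kind of answer that goes beyond the formula.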

Let's start with a step-by-step walkthrough of how to use bamboolib on Kaggle (you can also work with JupyterLab or Jupyter Notebook):

First, open a Kaggle kernel. At the upper right you will find an option to enable internet, as shown in the figure below:

Then use the code shown in the image above to import bamboolib in the Kaggle kernel, and run it.

After that code has run successfully, refresh the page and wait a few seconds.

After the refresh, use the code shown in the image above to import the data set.

Then call the data frame that you initialized, as shown above.

Once that code has run successfully you will see the option "bamboolib UI". Click it and you will see the view shown in the image above.

Here the "Sex" column has two unique values, female and male. In Python we would have to convert this column into two columns, or into one column of 0s and 1s where 0 represents female and 1 represents male.

Just search for 'one hot encoding' in the transformation search bar and click it. A panel will pop up on the right side; choose the column to which you want to apply one hot encoding, as shown in the figure below.

As shown in the image above, when you select a column name you see two options. The first option converts female to 0 and male to 1; the other option, 'create dummy for missing values', splits the sex column into two parts, male and female.

It's up to you which one to take; here we have chosen the first option.
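For readers who want to see what this step looks like in plain pandas, here is a minimal sketch. The tiny data frame is a hypothetical stand-in for the data set, and this is not necessarily the code that bamboolib generates; it only illustrates the two outcomes described above (one 0/1 column versus one column per category).

```python
import pandas as pd

# Hypothetical stand-in for the "sex" column of the data set.
df = pd.DataFrame({"sex": ["female", "male", "male", "female"]})

# Outcome 1: a single column where 0 = female and 1 = male.
df["sex_code"] = df["sex"].map({"female": 0, "male": 1})

# Outcome 2: one indicator column per category (classic one hot encoding).
dummies = pd.get_dummies(df["sex"], prefix="sex", dtype=int)

print(df.join(dummies))
```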

Let's see what has changed, as shown in the image below.

The 'sex' column has changed to 0 for female and 1 for male. You can also change the name of the column, for example from 'sex' to "Gender".

You just have to double click on the 'sex' column and another small window will pop up at the right, as shown in the figure.

Change the name and click on rename to see the result. In the summary part you can see all unique values along with the percentage of missing values, so that you can decide how to handle missing values.

The column name has changed successfully, as you can see in the image below, and the values have changed to 0 and 1.
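A rough pandas counterpart of the rename and the column summary could look like the sketch below. The sample values, including the one missing entry, are made up for illustration.

```python
import pandas as pd

# Hypothetical encoded column with one missing value.
df = pd.DataFrame({"sex": [0, 1, 1, None, 0]})

# Rename the column from "sex" to "Gender".
df = df.rename(columns={"sex": "Gender"})

# Share of each unique value, counting missing values (NaN) as their own group.
value_share = df["Gender"].value_counts(normalize=True, dropna=False)

# Percentage of missing values in the column.
missing_pct = df["Gender"].isna().mean() * 100

print(value_share)
print("missing %:", missing_pct)
```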

Next there is another column, "birth_year", which we want to turn into an age. Let's see how, with a simple formula: dx["Age"] = 2020 - dx["birth_year"]. To implement this, search for 'New Column Formula' in the transformation search bar of the "bamboolib UI" option, as shown below.

As you can see in the image below, type the formula and click on execute.

Then we can see the change in the image below, where the new column "Age" has been created. (You have to scroll to the far right, as new columns are appended at the end by default.) You can now delete the column 'birth_year'.
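The same two steps in plain pandas, as a small sketch. The names `dx`, `birth_year` and `Age` follow the article; the three rows are hypothetical.

```python
import pandas as pd

# Hypothetical rows; only the column name comes from the article.
dx = pd.DataFrame({"birth_year": [1990, 1985, 2000]})

# Formula from the article; 2020 is the year the article was written.
dx["Age"] = 2020 - dx["birth_year"]

# The original column is no longer needed once "Age" exists.
dx = dx.drop(columns=["birth_year"])

print(dx)
```

Hard-coding 2020 keeps the example identical to the formula above; in your own work you may prefer to derive the reference year from the data set's collection date.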

We have now seen some of the data cleaning you can do with bamboolib, and trust me, we have covered only about 2% of it. You can do much more, so what are you waiting for?

Give it a try!!

Now, as shown in the image below, click on 'Explore DataFrame'. You will find that you can plot almost every type of visualization, with rich information from the data set.

After you click on 'Explore DataFrame', as seen in the image below, 'Glimpse' gives you an overview of each column in the data set, 'Columns' shows you all the parameters present in the data set, 'Predictor patterns' shows you the relations between parameters, and 'Correlation Matrix' shows you which parameters are strongly correlated, so that you can use this information to proceed further.
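If you want to reproduce a per-column overview and a correlation matrix without the UI, a minimal pandas sketch might look like this. The column names and values are hypothetical, and the output is only an approximation of what the 'Glimpse' and 'Correlation Matrix' tabs display.

```python
import pandas as pd

# Hypothetical data; "score" is constructed to rise linearly with "age".
df = pd.DataFrame({
    "age": [20, 30, 40, 50],
    "score": [2, 4, 6, 8],
    "sex": ["female", "male", "male", "female"],
})

# Per-column overview: data type, number of missing values, number of unique values.
overview = pd.DataFrame({
    "dtype": df.dtypes.astype(str),
    "missing": df.isna().sum(),
    "unique": df.nunique(),
})

# Pairwise Pearson correlation, restricted to the numeric columns.
corr = df.select_dtypes(include="number").corr()

print(overview)
print(corr)
```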

If you click on 'create plot', the screen below will pop up, where you can select the type of plot you want and the parameters to plot. As seen in the figure, we have selected 'Histogram' and 'infection_reason'. You can also plot one variable against another by adding a second parameter name under 'Add property'.

As soon as you select it, you will see the output below within a few seconds. There you can see that a large number of 'NaN' values are present, and you have to figure out how you will deal with these null values.
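The numbers behind such a histogram can be checked in pandas as well. In the sketch below only the column name `infection_reason` comes from the article; the values are invented, and filling nulls with an "unknown" label is just one possible strategy, not a recommendation from the original walkthrough.

```python
import pandas as pd

# Hypothetical values for the "infection_reason" column, with several nulls.
df = pd.DataFrame({
    "infection_reason": ["visit", None, "contact", None, "visit", None],
})

# Frequency of each category, keeping NaN as its own bucket.
counts = df["infection_reason"].value_counts(dropna=False)

# Number of missing entries in the column.
n_missing = int(df["infection_reason"].isna().sum())

# One simple option (an assumption for illustration): label nulls explicitly.
filled = df["infection_reason"].fillna("unknown")

print(counts)
print("missing:", n_missing)
```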

You can do much more with bamboolib; we have only gone through the basics.

As shown in the figure below, we are plotting a histogram with another column.

The output is shown in the image below.

So, what are you waiting for?

If you are about to start your career, to be very frank, do not waste your money on expensive courses before knowing whether Data Science is your cup of tea.

Do connect with me and schedule a free meeting to start your career as a Data Scientist (drop a mail to [email protected]).

Rameez Mohammad

Pgp diplamo at International School of Engineering (INSOFE)

4 years ago

Could you please share the program schedule and details

Mukesh Kumar

Lead Data Scientist - Analytics & AI

5 years ago

Bamboolib is a good tool but their pricing is little bit high..i feel..#personalopinion

Shubham Agarwal

A-SPICE & ALM/PLM (PTC Integrity, PTC Windchill, PTC ThingWorx & Siemens Polarion) Implementation Consultant.

5 years ago

Ankita Kapoor Srijan Srivastava

Aparajita Ojha

5 years ago

Bhavesh salvi

Sourav B.

Software Engineer

5 years ago

vivek chaudhary Great Work! Impressive
