The Making of a Data Scientist: How I Became One
At the end of last year I decided I was going to become a machine learning expert. Before that, I had created this positive vibe that made me feel like I could accomplish anything, and I had put in place the habits I knew I needed if I was going to be a high-performance individual. So I had no doubt that I was going to hack AI and would soon be consulting on very big projects; well, I am still hoping. What I did next was register for Jose Portilla's ML course on Udemy and start on my daily routine of waking up at 5 a.m. to put in one hour of focused study.
In the beginning, it was all a breeze. Installing Python, running the first app, installing the libraries using pip: this was a walk in the park, something I could basically do in my sleep. Then came the more interesting parts. First up was exploring the libraries and the basics of data science and machine learning. I worked with NumPy and Pandas, and learned how and why we do exploratory data analysis. I found that my experience with Excel helped me get the hang of Pandas really quickly.
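For anyone curious what that first pass looks like in practice, here is a minimal sketch of the kind of exploratory steps I mean; the file name and its columns are placeholders, not data from the course.

```python
import pandas as pd

# Load a CSV into a DataFrame (the file name here is just a placeholder)
df = pd.read_csv("sales.csv")

# First-pass exploration: shape, column types, missing values, summary statistics
print(df.shape)
print(df.info())
print(df.isnull().sum())
print(df.describe())

# A quick look at the first few rows, much like scanning the top of an Excel sheet
print(df.head())
```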
After that came data visualisation using Matplotlib, Seaborn, Plotly, and Cufflinks. It is absolutely amazing to see the kind of graphs you can create with one line of code in Python using these libraries. The graphs really do help you make better sense of the data at hand, which then allows you to decide what to do with it. Sometimes, the relationships that emerge during exploratory data analysis and visualisation can prove invaluable, both to the data scientist and to the corporation.
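To give a flavour of those one-line graphs, here is a rough sketch using Seaborn and Matplotlib. It leans on Seaborn's bundled "tips" dataset purely so the example is self-contained; it is not from the course material.

```python
import seaborn as sns
import matplotlib.pyplot as plt

# Seaborn ships with small example datasets, so this runs as-is
tips = sns.load_dataset("tips")

# One line each: a distribution plot, a scatter plot with a fitted regression line,
# and a grid of every numeric column plotted against every other
sns.histplot(tips["total_bill"])
plt.show()

sns.regplot(x="total_bill", y="tip", data=tips)
plt.show()

sns.pairplot(tips)
plt.show()
```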
At this point, I was dying for some actual machine learning action. You know, I wanted to see something think for itself, predict when I am going to make my first million from data science, or something like that! It took discipline to stick with the course outline and laboriously work through the geographical plotting section. I struggled with choropleth maps for two days and found them highly abstracted: what I was doing in code seemed to have no relation to the magic that showed up on screen when I hit run. Anyway, if need be, I can now create one with some help from our old friends Google and Stack Overflow.
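For completeness, here is roughly what a basic choropleth looks like with Plotly; the country codes and values below are made up purely for illustration.

```python
import plotly.graph_objects as go

# Made-up values keyed by ISO-3 country codes, purely for illustration
trace = dict(
    type="choropleth",
    locations=["UGA", "KEN", "TZA"],
    locationmode="ISO-3",
    z=[10, 20, 30],
    colorbar={"title": "Some metric"},
)

layout = dict(
    title="A minimal choropleth sketch",
    geo={"scope": "africa"},
)

fig = go.Figure(data=[trace], layout=layout)
fig.show()
```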
Eventually, after about four weeks of resisting the urge to skip the introduction and jump straight into machine learning models, I was gently introduced to linear regression. You cannot imagine my shock on realising that this was stuff I already knew from my college statistics course. I actually took the time to go through the companion textbook for the course, An Introduction to Statistical Learning with Applications in R (ISLR), and confirmed that it was indeed the same old regression. Take some known input vector (X1) and its known output vector (Y1), and find a curve of best fit that generalises the relationship between X1 and Y1. Bam! That curve of best fit is your first machine learning model: it can predict the value of an unknown Y2 for some other input X2!
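In scikit-learn terms, that X1/Y1 to X2/Y2 story is only a few lines. The synthetic data below is just to keep the sketch self-contained; the course uses its own datasets.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Known inputs X1 and known outputs Y1 (synthetic, roughly y = 3x + noise)
rng = np.random.default_rng(42)
X1 = rng.uniform(0, 10, size=(100, 1))
Y1 = 3 * X1.ravel() + rng.normal(0, 1, size=100)

# Fit the "curve of best fit" -- the first machine learning model
model = LinearRegression()
model.fit(X1, Y1)

# Predict the unknown Y2 for some new inputs X2
X2 = np.array([[2.5], [7.0]])
Y2 = model.predict(X2)
print(Y2)
```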
But then things started to happen too fast. I learned about the bias-variance tradeoff and the need for cross-validation. There was logistic regression, the KNN algorithm, Decision Trees and Random Forests, and the related concepts of bagging and boosting. I relied very heavily on the textbook to understand the inner workings of these algorithms and concepts before trying them out practically with scikit-learn. As such, progress was rather slow at this stage of my learning. And even though I can work with SVMs, K-Means clustering, or TensorFlow, I know there is a lot of depth still unknown to me.
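As a rough illustration of how those pieces fit together in scikit-learn, the sketch below cross-validates a random forest on one of the library's built-in datasets; nothing here is specific to the course exercises.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# A built-in dataset keeps the example self-contained
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# A random forest is essentially bagging over decision trees, with extra feature randomness
clf = RandomForestClassifier(n_estimators=200, random_state=42)

# Cross-validation gives a less optimistic estimate of generalisation error than a
# single train/test split -- the practical face of the bias-variance tradeoff
scores = cross_val_score(clf, X_train, y_train, cv=5)
print(scores.mean(), scores.std())

# Fit on the training split and check performance on held-out data
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```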
In my next article, I will write about the lessons I learned while trying to deploy my first ML model on a CentOS server using Flask and a Python virtual environment. Someone also requested that I write about the work ethic that allows me to explore this much while holding down an 8-to-5 job. Elico Sifuma, Nelson Mwangala, I, and others are supporting and developing for over ten banks that are integrated, or integrating, into the agency banking system in Uganda, which is the first of its kind globally. Maybe someday I should also write about the lessons we have learnt so far.
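As a small preview of that deployment, here is a bare-bones sketch of the kind of Flask prediction endpoint I mean; the model file name and the expected feature layout are assumptions for illustration only, and the full story belongs in the next article.

```python
# app.py -- a bare-bones prediction endpoint; "model.joblib" and the
# expected feature layout are assumptions for illustration only
import joblib
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)
model = joblib.load("model.joblib")

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON body like {"features": [1.2, 3.4, 5.6]}
    features = request.get_json()["features"]
    prediction = model.predict(np.array(features).reshape(1, -1))
    return jsonify({"prediction": prediction.tolist()})

if __name__ == "__main__":
    # In production this sits behind uWSGI and Nginx rather than the dev server
    app.run(host="0.0.0.0", port=5000)
```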
Nevertheless, at this point I feel confident enough to start working on data science and machine learning projects. I have the necessary skills to do exploratory data analysis and visualisation. I can work with a number of machine learning models and deploy them using Nginx, Flask, and uWSGI. Further, I now know what to Google to find the answers I do not have, or to fix the inevitable bug or two. But, most importantly, I can communicate deep insights in various ways to help non-technical people understand and use the secrets that the data is telling us.