Ain’t No Such a Thing as a "Citizen Data Scientist"?
Credit: https://unsplash.com/photos/unRkg2jH1j0

Dear Aspiring Data Scientist,

Before you start using ‘low code’ or ‘drag & drop’ data science tools, please learn the fundamentals.

Why aspire to be a ‘Citizen Data Scientist’ when you can truly become a ‘Data Scientist’?

Don’t get swayed by the fancy titles like ‘Citizen Data Scientist.’ It is funny that so much hard selling is happening in data science.

I mean, just because we know how to use a thermometer or operate a blood pressure monitor, should we start calling ourselves ‘Citizen Doctors’?

Image credit: KDnuggets.com

Strategy: downplay the difficulty of doing data science!

Downplaying the difficulty of doing data science is not healthy. Many sellers of ‘become a data scientist in 1 month’ courses and of ‘low code data science solutions’ use this strategy.

The ‘low code/no-code solution’ sellers will often argue that one can gain intuition by *doing* things. The counter-argument is that using a low code/no-code solution is like using a calculator: before one can operate a calculator meaningfully, one needs numeracy skills. Learning the fundamentals of data science is like acquiring numeracy skills.

Image credit: https://www.sciencenewsforstudents.org/article/animals-can-do-almost-math

Why do 85% of Data Science projects fail? (Hint: no skin in the game)

85% of Data Science projects in the enterprise fail because people assume data science is easy to do and end up doing it wrongly. The realization often comes late.

Many fall victim to ‘become a data scientist in 1 month’ or ‘in 6 months’ type courses and then wonder why they are not being hired.

The market is the ultimate truth-teller.

It somehow knows who the good players are and operates an excellent filtering mechanism. The reason is that the market is made up of companies that have ‘skin in the game.’

Companies that have ‘skin in the game’ don’t gamble. They hire genuine talent. Anyone can run a simple ‘skin in the game’ test on their own work by asking one question: would I use this machine learning classifier myself?

I came across a LinkedIn post where a person had built a heart disease prediction model using one of the low code libraries. The real question is whether that person would use that model on his or her own kith and kin.

Also, the real utility of heart disease prediction or earthquake prediction lies not in predicting that the event will happen with x% certainty, but in predicting WHEN it will happen.

No model can predict this ‘temporal’ part accurately.

Doing Data Science is easy. Or is it?

One of the reasons data science seems *easy to do* is that many algorithms can be fit in 2–3 lines of code. There is simply no intellectual pain.

Compare this to programming. A person has to think about the syntax, design patterns, and logic. When things go astray in programming, there are multiple checkpoints in the form of error alerts such as syntax errors, compiler errors, and runtime errors. One gets an immediate reality check on how good or bad a programmer one is. As a result, no one goes about calling themselves a ‘citizen software engineer.’

On the flip side, when it comes to data science, there is no equivalent of a runtime or syntax error. There are no warning signs that say one can’t apply a particular algorithm to the data. There is no immediate reality check in data science.
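To make the missing reality check concrete, here is a minimal sketch in Python. It fits a straight line by ordinary least squares to synthetic data that follows y = x². The data and the helper names `fit_line` and `r_squared` are invented for illustration, and the hand-written fit only stands in for what a library would do in one call. Nothing fails: the fit returns a slope and an intercept without complaint, and only a deliberate check such as R² shows that the line explains none of the variation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (illustrative helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line (illustrative helper)."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic, clearly non-linear data: y = x^2 on a symmetric range.
xs = list(range(-5, 6))
ys = [x ** 2 for x in xs]

# The "2-3 lines" that fit the model. No error or warning is raised.
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)

print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
# prints: slope=0.00 intercept=10.00 R^2=0.00
```

The fitted line is flat at the mean of y, so it is no better than guessing the average every time. The code ran cleanly anyway, and spotting the problem requires knowing to look at the residuals or R² in the first place.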

This is one reason why people who claim that ‘learning the fundamentals is not important’ get off scot-free. It is also why fancy but harmful titles like ‘Citizen Data Scientist’ arise.

The above criticism might sound rude/bitter, but it is all in the hope that one day we can all say 85% of Data Science projects succeed rather than fail.

I would also encourage the readers to read the articles below:

https://medium.com/@luis.moreira.matias/zero-stack-data-scientist-part-i-beginnings-1691afa2b510

https://www.kdnuggets.com/2016/03/mirage-citizen-data-scientist.html

Your comments and opinions are welcome.

Thank you.

Crystal Sassmann

Junior Data Analyst | Passionate in driving business performance & solving problems | Python | MySQL | AWS

1 year ago

100%, wish I had read this before doing a 5-month DS course. However, I think a 5-month DA bootcamp can actually produce some decent DAs if they already have relevant domain knowledge.

Dan Harris

Chief Revenue Officer (CRO) @ Cloudaeon | Leading GTM and revenue growth. Host of the Data Leaders Executive Lounge.

4 years ago

Insightful read! I’m seeing many ‘citizen data scientists’ churned out in 6-week postgrad courses and posted directly into industry. Low cost, yes, but high value? I’m not so sure. Thanks for sharing.

Eric King

Fractional AI Strategist

4 years ago

My team had laughed at the CDS moniker for years. We even conducted a live online clinic with KDnuggets' Gregory Piatetsky-Shapiro on this very topic. At the same time, I believe the high failure rate of analytics projects you cite is due more to strategic, social, mindset and cultural issues than to technical ones. I'm not sure that even technically competent data scientists have a grip on, or interest in, the critical soft-skills side of analytics. In fact, our live online analytics clinics revealed that experienced data scientists fully expect and even enjoy spending a large portion of their modeling time on preparing messy data. But even if it means career advancement, they are fully averse to working with messy humans! This will lead to a lack of SME-driven validation, organizational alignment, proper deployment & monitoring, and most importantly adoption. Without adoption, otherwise great models just die on the vine.

Satish H

Senior Business Analyst at Quantiphi | Expert in Machine Learning Algorithms

4 years ago

Thanks for sharing, Venkat. I actually wonder how people believe in 3-month or 6-month courses when there are lots of fundamentals to learn, which alone take around 6 months, and only then do algorithms and implementation come into the picture.

OTMANE EL ALOI

Data engineer @ TotalEnergies Digital Factory

4 years ago

It's quite similar to doing data science without mastering the maths behind it! It looks as if data science is just a game of turning dials up and down.
