Data Science Industrialization: What Is Next?
François Rosselet
Data Architect @ Cargill | AI, DataOps, Data Mesh, AWS, Snowflake, Knowledge Graphs, Agentic GenAI
"Data Scientist: the sexiest job of the 21st century". That is how data science was perceived in 2012, and I am writing it again: 2012. Why? Because in 2012, connecting data from ultra-siloed systems was a real out-of-the-box feat, with fantastic potential for adding value to many businesses.
In 2012, the data scientist looked much like the IT guy of the 80's: a single person handling all the IT problems, no frontend, no backend, just IT. So it was with the data scientist of 2012, using inference techniques and machine learning to learn ever more from data, outside the classical tools that were tailored to solve already known problems.
What has happened since then? Data science went through the entire Gartner hype cycle, and like the IT guy of the 80's, the data scientist has been split into a number of distinct roles: machine learning engineer, data engineer, DevOps engineer, BI developer, etc. For agility reasons? Of course not; enterprises are still fond of Taylor's and Ford's model: take a task, divide it into the optimal number of micro-tasks, have each executed within the optimal timeframe, and so on.
Source: iStockphoto
Good: enterprises started to industrialize data science. Some of them eventually built a fully operational data science supply chain in Taylor's style: data engineers taking care of making data flow optimally, machine learning engineers building optimal predictive models, DevOps engineers pushing all of it to production, optimally. Where is the data scientist? As a multi-skilled generalist, he naturally lost a lot of his flavor in the middle of this crowd of hyper-specialists.

What really happened inside enterprises? From what I have seen, data science simply became an additional step in the workflow, or an additional silo inside the enterprise. This is where data science started to slide down the trough of disillusionment of the Gartner cycle.
And now? What needed months to be implemented on-premises now takes a few hours or less to launch in the cloud. Data science and AI are available as highly available, resilient, scalable micro-services, with no need to reinvent the wheel: the GAFAMs have probably gathered the best talents in these fields, and they are making data science available to everybody through powerful micro-services. Recently, the AutoML service from Google Cloud Platform was ranked within the top 5 of a Kaggle competition...
What should we learn from that? That the sexiest job of this century was built by multi-skilled people who thought out of the box, guided by their intuition and following their own vision to build something that did not exist. This is why nobody has ever succeeded in correctly defining what data science is: I have seen hundreds of different Venn diagrams trying to impose a synthesis of what data science is, which is simply absurd in a world where everybody is chasing a single version of the truth. Here is my version:
What to conclude? Maybe we should be careful as we industrialize data science: doing it on-premises makes less and less sense today, unless there is no other choice. We should also be careful as we transfer data science into a supply chain, because:
- Technologies are evolving faster than most implementation processes, so the risk of being outdated by the time you reach production is really high unless you stick to the latest agile and DevOps methods and tools.
- Do not forget that a supply chain is also a cycle. Is your infrastructure agile enough to follow and accommodate the latest technologies in your business? Are your human resources agile enough to adapt to the next paradigm shift? Because one is likely to come...
More generally speaking, we should definitely give more credit to soft skills nowadays. Data science was disruptive; other disruptions are surely coming, and they will have nothing to do with technical skills.