Data Scientists Vs. ML Engineers

There’s often confusion between the roles of Data Scientists and Machine Learning Engineers. Although the two work closely together and overlap in expertise and experience, the roles serve quite different purposes. Essentially, we are differentiating between Scientists, who seek to understand the science behind their work, and Engineers, who seek to build something that others can access. Both roles are extremely important, and at some companies they are interchangeable: Data Scientists at certain organizations may carry out the work of a Machine Learning Engineer, and vice versa.

To make the distinction clear, I’ll split the differences into 3 categories: 1) Responsibilities, 2) Expertise, and 3) Salary Expectations.

Responsibilities

Data Scientists follow the Data Science Process, also referred to as the Blitzstein & Pfister workflow. Blitzstein and Pfister originally created the framework to teach students of the Harvard CS 109 course how to approach Data Science problems.

The Data Science process consists of 5 key stages:

  • Stage 1: Understanding the Business Problem
  • Stage 2: Data Collection
  • Stage 3: Data Cleaning & Exploration
  • Stage 4: Model Building
  • Stage 5: Communicate and Visualize Insights

The majority of the work performed by Data Scientists happens in the research environment. Here, Data Scientists perform tasks to better understand the data so they can build models that best capture the data’s inherent patterns. Once they’ve built a model, the next step is to evaluate whether it meets the project's desired outcome. If it does not, they repeat the process until it does, and only then hand the model over to the Machine Learning Engineers.
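The build, evaluate, and repeat loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data, the candidate thresholds, the 0.9 accuracy target, and the function names are all made up for the example, and a real project would evaluate on held-out data rather than the data used to pick the model.

```python
# Toy data: (feature, label) pairs. Values are invented for illustration.
DATA = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 0),
        (2.5, 1), (3.0, 1), (3.5, 1), (4.0, 1)]
TARGET_ACCURACY = 0.9  # hypothetical "desired outcome" for the project


def accuracy(threshold, data):
    """Fraction of rows where 'feature > threshold' matches the label."""
    correct = sum(1 for x, y in data if int(x > threshold) == y)
    return correct / len(data)


def iterate_until_good_enough(candidates, data, target):
    """Try candidate models in turn; stop at the first that meets the target."""
    for threshold in candidates:
        score = accuracy(threshold, data)
        if score >= target:
            return threshold, score
    return None, 0.0  # nothing met the target: revisit an earlier stage


threshold, score = iterate_until_good_enough(
    [0.75, 1.25, 2.25, 3.25], DATA, TARGET_ACCURACY
)
print(threshold, score)
```

Each pass through the loop stands in for one round of modelling and evaluation; only a model that clears the target would be handed over.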

Machine Learning Engineers are responsible for creating and maintaining the Machine Learning infrastructure that lets them deploy the models built by Data Scientists to a production environment. They therefore typically work in two environments: the development environment, where they reproduce the machine learning pipeline that Data Scientists built in the research environment, and the production environment, where the model is made accessible to other software systems and/or clients.
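As a rough sketch of what "reproducing the pipeline" can mean in practice, the Python snippet below hands a model from one environment to the other as a JSON artifact and checks that it gives the same predictions after loading. Everything here is an assumption for illustration: the threshold model, the parameter names, and the artifact layout are hypothetical, and real teams usually rely on dedicated tooling for this step.

```python
import json


def predict(params, x):
    """Apply the model: 1 if the feature exceeds the learned threshold, else 0."""
    return int(x > params["threshold"])


# Research environment: the Data Scientist exports the model parameters
# together with a few reference predictions (hypothetical values).
research_params = {"threshold": 2.25}
reference_inputs = [1.0, 2.0, 3.0]
reference_outputs = [predict(research_params, x) for x in reference_inputs]
artifact = json.dumps({
    "params": research_params,
    "reference_inputs": reference_inputs,
    "reference_outputs": reference_outputs,
})  # in practice this would be written to a file or a model store

# Development environment: the ML Engineer loads the artifact and confirms
# it reproduces the research results before exposing it in production.
handoff = json.loads(artifact)
reproduced = [predict(handoff["params"], x) for x in handoff["reference_inputs"]]
pipeline_reproduced = reproduced == handoff["reference_outputs"]
print(pipeline_reproduced)
```

Shipping reference predictions alongside the parameters gives the engineer a simple check that nothing changed between the two environments.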

Essentially, Machine Learning engineers are responsible for maintaining the ML infrastructure that allows them to deploy and scale the models built by Data Scientists. Data Scientists, in turn, are users of the infrastructure that Machine Learning engineers build.

Expertise

People confuse the two roles because their skills overlap in many places. For example, both Data Scientists and Machine Learning engineers are expected to have good knowledge of:

  • Supervised & Unsupervised Learning
  • Machine Learning & Predictive Modelling
  • Mathematics and Statistics
  • Python (or R)

These overlaps have led some organizations, particularly smaller organizations and startups, to merge the roles into one. As a result, some organizations have Data Scientists doing the work of Machine Learning engineers and others have Machine Learning engineers doing the work of Data Scientists, which only adds to the confusion amongst practitioners.

However, there are some key differences between the expertise required for each role.

Data Scientists are typically extremely good data storytellers. Some would argue that this trait makes them much more creative than Machine Learning engineers. Another difference is that Data Scientists may use tools like PowerBI and Tableau to share insights with the business, and they don't necessarily need to use Machine Learning to do so.

Couples who make up for each other's deficiencies are generally stronger, and the same applies here. The skills described above may be weak points for the Machine Learning engineer, who is instead expected to have a strong foundation in computer science and software engineering. Machine Learning engineers are expected to know Data Structures & Algorithms and to understand the fundamental components that go into creating deliverable software.

Given that background, it's not unusual for a Machine Learning engineer to also have a good grasp of another programming language such as Java, C++, or Julia.
