Hidden Aspect About Being a Data Scientist


Data Science is an emerging field that properly commenced a couple of decades back and has picked up momentum in the last decade. As more and more businesses realize its potential, it is continuously evolving and changing its shape and form. With Data Science being thought of as a game-changer, and an increasing number of players trying to capitalize on it and prove its value in driving businesses, it is not surprising that ‘Data Scientist’ is thought to be one of the ‘coolest’ jobs today. In 2012, Harvard Business Review published an article titled “Data Scientist: The Sexiest Job of the 21st Century”.

With such a focus, this profile naturally attracts a lot of talent, and people standing at career crossroads start formulating their idea of what it means to be a Data Scientist. The so-called ‘sexy’ aspects are much-publicized and frequently talked about; however, there are other aspects that are essential to the job and yet not well known. I am going to reflect upon both the well-known and the less-well-known (or hidden) aspects of being a Data Scientist, based on my journey so far. Let’s begin by understanding what it means to be a Data Scientist.

Data Scientist

A Data Scientist, simply put, is someone who studies data (i.e. practices data science): someone who applies logical (or scientific) algorithms and frameworks to make sense of data; someone who processes, analyzes, models, and interprets data of any kind to derive meaningful insights and help solve business problems (and, in most cases, identify some more!).

A Data Scientist is someone who plays with data and makes it talk.

The skills needed to be a good data scientist vary from organization to organization, but they are definitely not limited to the aspects we constantly hear or read about. There are parts of the role that go much deeper than what is widely known.

I am going to divide these aspects (and therefore skills needed) into three components, by no means comprehensive, based on my reflections:

  1. The core aspects: we all hear/read about these.
  2. The layer below: the must-have implicit parts embedded in the day-to-day of the role.
  3. The hidden aspects: doing these ‘well’ will set you up for success.

Let's dive in.

1. The Core Aspects..

These can be thought of as the cream at the top of a cup of hot chocolate, which we see upfront: the most attractive, talked-about, and well-publicized parts of being a data scientist. Don’t we all love the visual of an individual sitting in front of 3–4 screens filled with all sorts of charts, tables, numbers, and lines of code, who presses a few keys on the keyboard and things update on all the screens? We have seen it in movies and it attracts us all. These can be thought of as the must-have skills to play with the data.

1.1 Building Models

Perhaps the most popular aspect of being a data scientist is that you get to build models that have the power to predict future outcomes. Data scientists try out a plethora of supervised and unsupervised machine learning models, which need to be optimized to achieve a desired outcome. This means having sound conceptual and practical knowledge of the ‘cool’ mathematical, statistical, and machine learning concepts, which not just any ‘John Doe’ has, and this makes ‘data scientist’ an enviable job in today’s era.
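To make "building a model that predicts outcomes" a little more concrete, here is a minimal sketch in Python using only the standard library. The data, the variable names (`ad_spend`, `units_sold`), and the choice of a straight-line model are all assumptions made up for illustration; real projects would typically use a dedicated library and far more data. It fits a line on the first few observations and then checks the error on observations the model has not seen.

```python
# Illustrative sketch only: data and names are invented for this example.

def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Predict y for a single x using a (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


def mean_absolute_error(model, xs, ys):
    """Average absolute gap between predictions and actual values."""
    return sum(abs(predict(model, x) - y) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: weekly ad spend (in thousands) vs. units sold.
ad_spend = [1, 2, 3, 4, 5, 6, 7, 8]
units_sold = [12, 14, 17, 19, 22, 23, 27, 29]

# Hold out the last two weeks to see how the model does on unseen data.
train_x, test_x = ad_spend[:6], ad_spend[6:]
train_y, test_y = units_sold[:6], units_sold[6:]

model = fit_line(train_x, train_y)
test_error = mean_absolute_error(model, test_x, test_y)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, hold-out MAE={test_error:.2f}")
```

The hold-out split is the part worth noticing: a model is judged on data it was not fitted on, which is what "optimizing towards a desired outcome" usually means in practice.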

1.2 Exploratory Data Analysis

One super-exciting aspect of being a data scientist is exploring the underlying data. Understanding the data inside out, looking at it from all possible angles, and exploring the relationships (and the lack of them) between variables all help a data scientist become an expert in understanding human behavior within their domain.
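As a rough illustration of what exploring relationships between variables can look like, the sketch below summarizes each column of a tiny, made-up customer table and computes the Pearson correlation for every pair of columns. The records and the column names (`visits`, `basket_value`, `age`) are hypothetical, and only the Python standard library is used.

```python
from statistics import mean, median, stdev

# Hypothetical customer records, invented for this sketch.
customers = [
    {"visits": 2, "basket_value": 18.0, "age": 34},
    {"visits": 5, "basket_value": 42.5, "age": 29},
    {"visits": 1, "basket_value": 9.0, "age": 51},
    {"visits": 8, "basket_value": 70.0, "age": 42},
    {"visits": 4, "basket_value": 35.0, "age": 23},
    {"visits": 6, "basket_value": 51.0, "age": 38},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Pivot the row-wise records into one list of values per column.
columns = {name: [row[name] for row in customers] for name in customers[0]}

# Step 1: a quick summary of each variable on its own.
for name, values in columns.items():
    print(f"{name}: mean={mean(values):.1f}, median={median(values):.1f}, "
          f"stdev={stdev(values):.1f}, min={min(values)}, max={max(values)}")

# Step 2: how each pair of variables moves together (or does not).
names = list(columns)
correlations = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        correlations[(a, b)] = pearson(columns[a], columns[b])
        print(f"corr({a}, {b}) = {correlations[(a, b)]:.2f}")
```

A correlation close to zero is as informative as a strong one: it is the "non-relation" that tells you which variables are unlikely to help explain each other.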

1.3 Data Visualization

Data visualization refers to the skill of plotting your data summaries in the form of charts, tables, or plots, with the aim of making the data easier to understand and interpret. In practice, it also helps us spot edge cases, look for nuances in the data, and find trends; these are very useful while discussing our analysis with stakeholders. Visualizations enable you to understand the data and eventually tell a story.

1.4 Data Manipulation

Data manipulation refers to the reading, processing, and modification of information (i.e. data) to make it easier to read, interpret, and understand. It can mean reading well-formatted, semi-formatted, or even unformatted files, cleaning them left, right, and center, quality-checking them, and combining multiple data sources horizontally or vertically, all to ensure that the underlying data makes sense, is fit to analyze, and can be used in models.
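The steps above (read, clean, quality-check, combine vertically and horizontally) can be sketched with Python's standard `csv` module. The three small extracts, their column names, and the helper functions `read_rows` and `clean_sales` are all hypothetical and exist only for this illustration; in day-to-day work a data-frame library would usually do the heavy lifting.

```python
import csv
import io

# Hypothetical raw extracts, invented for this sketch. Note the stray
# spaces, the missing amount, and the non-numeric amount.
january_csv = """customer_id, amount
 101, 20.50
102,15
103,
"""
february_csv = """customer_id,amount
101,30.00
104,abc
102, 12.25
"""
segments_csv = """customer_id,segment
101,loyal
102,new
"""


def read_rows(text):
    """Read CSV text into dicts, stripping whitespace from headers and values."""
    reader = csv.reader(io.StringIO(text))
    header = [h.strip() for h in next(reader)]
    return [dict(zip(header, (v.strip() for v in row))) for row in reader if row]


def clean_sales(rows):
    """Split rows into those with a numeric amount and those without."""
    clean, rejected = [], []
    for row in rows:
        try:
            clean.append({"customer_id": row["customer_id"],
                          "amount": float(row["amount"])})
        except (KeyError, ValueError):
            rejected.append(row)
    return clean, rejected


jan_clean, jan_rejected = clean_sales(read_rows(january_csv))
feb_clean, feb_rejected = clean_sales(read_rows(february_csv))

# Vertical combine: stack the two months on top of each other.
sales = jan_clean + feb_clean

# Horizontal combine: left join of sales onto the segment lookup.
segment_by_id = {r["customer_id"]: r["segment"] for r in read_rows(segments_csv)}
enriched = [{**row, "segment": segment_by_id.get(row["customer_id"], "unknown")}
            for row in sales]

# Quality check: every input row is either kept or explicitly rejected.
assert len(sales) + len(jan_rejected) + len(feb_rejected) == 6

for row in enriched:
    print(row)
print("rejected:", jan_rejected + feb_rejected)
```

Keeping the rejected rows, rather than silently dropping them, is a deliberate choice here: it makes the quality check possible and gives you something concrete to take back to whoever owns the source data.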


2. The layer below..

When we look a little deeper, we see that there are some key aspects of being a good data scientist that are not called out explicitly, but are things every data scientist does on a continuous basis in order to succeed.

2.1 Brainstorming

As data scientists, we are actively involved in our own projects but also indirectly involved in our team’s projects, adding value by acting as QA/review support or as a senior consultant. Brainstorming can be done at various levels. Be it our own projects, our teammates’ projects, or even other teams’ projects, we all learn from each other by talking about projects, discussing and debating methodologies, writing analytical proposals, diving into the detail of the business problems at hand, and thinking about them from all possible angles. This is best done the old-fashioned way with pens and whiteboards, although with the rise of virtual interactions it can be done on virtual whiteboards as well. These are among the most important interactions we have as data scientists.

Brainstorming is to Data Scientists what fuel is to a car: it keeps them running.


2.2 Repeatable tasks and Delta developments

Data Scientists often develop solutions that are useful for business stakeholders and need to be looked at repeatedly for decision making. That is, there is often a need to reproduce these solutions (with or without minor tweaks) to help the business. As these solutions ‘mature’, they reach a stage where they can be productized or picked up by maintenance/reporting teams. However, before this ‘maturity’ stage arrives, data scientists need to ‘maintain’ these solutions themselves. This interim period varies from organization to organization, but it is fair to say that there are times when every data scientist has to maintain some ‘repeatable’ tasks that are essential for the business. It is important to continuously look for small, incremental opportunities to automate these solutions and make them hands-free. This might not be super exciting, but it is essential to keep the business running.

2.3 Interpretation and Storyboarding

This is probably the most important task for a data scientist, and one that is easy to ignore, but it must not be ignored. No matter how good the data science algorithm is, it needs correct interpretation, contextualization, and eventually explanation by data scientists in order to get their stakeholders’ buy-in. Only after the stakeholders buy in can the analysis be put into action, i.e. start creating the impact it was designed to create. The key idea is to convey your analysis in the form of a story and convince your stakeholders to buy in.

2.4 Multiple Stakeholder management

Data science projects often involve a lot of stakeholders: some are business stakeholders who are responsible for making decisions based on your projects, while others are team-specific or task-specific technical stakeholders. The crux is that many people are involved, in some shape or form, in your exciting projects.

You win and lose as a group.

Managing these stakeholders so that they stay updated on the project’s developments, milestones, and roadblocks is essential, and constant communication plays a key role here.


3. The Hidden Aspects..

Just like a deep learning model, there are ‘hidden’ aspects to being a good Data Scientist.

3.1 Desk Research — Self-learning

Self-learning is an essential component of a data scientist’s day. Self-learning in the form of face-to-face training, online courses, reading whitepapers, contributing to discussion forums, or attending workshops/seminars on relevant topics all helps data scientists grow their skill set, learn about the nuances of data science techniques, and eventually become better at what they do. This enables data scientists to try out different approaches to solving a business problem and then pick the best one.

3.2 Knowledge sharing

As data scientists, it is important to continuously keep learning and evolving. Interacting with other Data Science teams within the business is a very good opportunity to understand what everyone else is doing, what new ways/methods they are adopting to solve old problems, or what new problems they are addressing.

There is a sea of knowledge around you in your organization; the question is how much you can absorb.

It is an effective way of staying on top of the evolving practical applications of data science. Also, knowledge sharing is a two-way street, and it is essential to contribute to the growth of the community by sharing the knowledge you have gained through projects, training, and self-learning. This has some PR benefits for you within your organization, as it helps you raise your profile. Some may debate whether that is necessary, but in my opinion it helps a lot in growing your career profile as an expert and in evolving your knowledge base.

3.3 Developing and nurturing relationships

As a data scientist, along with interacting with your immediate team members, you will also need to interact with several other teams. You will be exposed to other data science teams within or outside your function, product teams, data engineering teams, client management teams, etc. Data Science teams need to work with all these departments to serve their clients. Developing relationships with other departments in the business is very important, as understanding what they do helps us understand more about the business as a whole and connect the different pieces of the puzzle everybody is working on to achieve the broader business objectives. It also presents opportunities to seek out and import best practices from other teams. It is a very good way to avoid working in silos and to think about the bigger picture.

3.4 Networking — We all grow together

A good network means you get to know about other areas of the business, get more exposure to different teams, methodologies, and data science techniques, learn about the practicality and applicability of data science algorithms, and build a well-rounded knowledge of the overall solutions used in your domain (and possibly other domains too). A portion of networking happens organically when you interact with fellow data scientists or other departments within the business in key meetings (as explained above), but informal chats around the coffee machine, over lunch, while waiting outside meeting rooms, or at company-wide and department-wide socials all help you broaden your network.

We all learn and grow along with our network.



Let's summarize..

While knowledge of statistical, mathematical, and machine learning concepts and the ability to implement them via R/Python/SAS/SQL etc. form the foundation of a data scientist, intellectual curiosity, communication, business acumen, teamwork, and networking are also essential to succeeding in your role.

It is imperative to focus on both the technical aspects and the softer aspects of the role.

This article was first featured on the Towards AI Medium blog.

