Advice for Aspiring Data Scientists

Data scientists are in high demand. For those thinking about getting into the field, here is some advice.

Don’t try to learn it all at once

Take your time. Don’t try to learn multiple languages at once (e.g., both Python and R), and don’t start with something complex like machine learning. Make sure you learn the basics before you jump into something more difficult.

This was a lesson that Vaibhav Gedigeri from Honeywell had to learn early on. “That’s the first mistake I made: I wanted to put my hands on everything. I was like a small kid. You know, you give them a lot of toys and they are like, ‘I want that toy.’ That’s exactly what I was doing. I wanted to learn Python, I wanted to learn SAS, I wanted to learn big data, Spark, etc., and it got me nowhere. Then I decided I had to focus on one or two tools; since then I decided I will only work on open source tools. I gave up on SAS and focused pretty much on Python and R, and I progressed to moving into Spark.”

Vaibhav Gedigeri - Data Scientist at Honeywell

I can personally relate to this. When I started learning R, the first project I gave myself was to pull down some Twitter data using R. I ran into an issue that was blocking me, and I talked it over with a data science colleague. She mentioned that she had done this with Python before and had a script that could help me. When I asked for it, she told me to stick with R, since Python would come with its own issues and I was better off learning one language at a time. I’m very glad I listened to her, because within an hour of our conversation I was able to resolve the issue and complete my project. Side note: the issue turned out to be that my Twitter App wasn’t working properly; it wasn’t even an issue with R!

Learn the underlying logic

It is important to understand what takes place under the hood of the algorithms you use; this can help you avoid erroneous output. You will be able to address issues as they come up and actually understand what you are trying to do.

“To answer the business problem is more important than the tool knowledge. It is important that you know the tools and how to use them, but only focusing on the tools, only focusing on how to start programming in Python or other tools, is not as important. I mean, that is a very simple part. What is important is to understand how to reduce the problem, how do you know which variables to choose, how do you know which algorithm to apply, when to use machine learning and when not to use it. This is the real value of a data scientist.

It is still important to start off with fundamentals. I started with two years of reading about mathematics and statistics, and it actually helped me to understand the mechanism of how algorithms work. When I know what’s under the hood and how it works, it makes me feel more comfortable.”

Vaibhav Gedigeri - Data Scientist at Honeywell

Learn by doing projects

There are several places where you can find data sets and projects to work on; Kaggle is one of them.

Study job descriptions of the types of roles you would want to have and get an idea of what is required. This way you will be able to know what skills are needed and what experiences are preferred. You can basically work backwards to engineer your own skills based on what is in demand.

When you work on projects or Kaggle competitions, document the steps you took and your approach, as well as the lessons learned, to build up a project portfolio that you can show to recruiters and hiring managers. It is also a great way to keep track of what you are learning and to apply it in future projects.

Andrew Paul Acosta, from Milesius Capital Resources, has the following advice: “Number one, you need to practice. So, find a project to work on. You need to understand how to solve data problems; and then once you figure that out, move on to another one. The second thing is to keep updated on current technologies, languages and so on. You don’t have to know everything about R or Python, but know enough about the libraries available and the uses of the tools, and have an opinion. Lastly, you need to have a natural curiosity; don’t be satisfied with the answers that you find, continue to ask questions and be curious.”

Andrew Paul Acosta - Data Scientist at Milesius Capital Resources, LLC

For additional advice from data scientists, you can read Journey to Data Scientist (link to book on Amazon).

Bennett Bullock

Senior NLP Engineer

7 years ago

Learn the holy trinity of math - linear algebra, regression, and probability. Also, don't call yourself a "Data Scientist". The term went out of fashion years ago.

Duygu Altinok

NLP Researcher | Author

7 years ago

Stop using the word "aspiring", sounds really very dumb.

Andi Shehu, PhD

Senior Applied AI Scientist

7 years ago

Solid advice!
