How to break into Data Science the easy way

Scratch that; there’s not an easy way.

Data science has become a hot topic over the past few years alongside machine learning. The rise of machine learning has made data king, and as a result, there is a huge demand for data scientists. Becoming a data scientist through formal education is a product of the times, and getting into the industry in the modern era requires a bit of work.

Traditional Route

I am a data scientist, but I do not have a degree that says Data Science. I became a data scientist as a result of working with a lot of data. I have a Ph.D. from the prehistoric days when one had to read academic papers to implement machine learning. Traditionally, if you wanted to become a data scientist, you became a scientist first. Then you dealt with so much data that you became a data scientist just to be able to analyze it all. You started by majoring in a STEM field in undergrad, and then you went to a graduate program. The program wasn’t focused on data science, but as a result of your research, you handled a lot of data.

After years of working with data in graduate school and then in one or more jobs in your field, you will have built up the acumen of a data scientist. You will have learned techniques for analyzing data and for being confident in your results. This experience comes from working on academic papers and then applying those skills in industry.

Modern Era

Now the industry is hot! It seems like everyone wants in, and the prevalence of nano-degrees has given many newcomers the impression that there are shortcuts to becoming a data scientist.

There is not an easier, softer way to become a data scientist. One of the problems with trying to shortcut the process is that you don’t learn how to look at data. Even though the concepts can be taught in short order, you need to look through lots of data, design data collections, collect data, clean data, train on data, analyze data, do failure analysis, and repeat. Graduate school is a great vehicle for this process because you have to produce something better than what is currently out there.

No Shortcuts

Toy data sets are easily available these days, and even machine learning algorithms can be had essentially off the shelf. This makes it feel easy to take the training wheels off and apply the same tools to new datasets.

Therein lies the problem: with a nano-degree, you may feel like you have accomplished something great and learned a great deal, but you’ve only been introduced to the field. A nano-degree in data science is great for someone already steeped in data through graduate school or their day job.

At some point in repeating the cycle of collecting, cleaning, and analyzing data, one evolves into a data scientist.

Many people also get master’s degrees in data science, and while being a generalist is great, I still prefer someone who has depth in at least one field. If you want to break into data science, consider getting a degree in an area that interests you and uses a lot of data. You can learn a lot of the data science material on the side or as part of the journey.

The best data scientists are the ones who love data throughout their lives. I know that sounds like I am saying some people have a natural inclination towards it; that has been my experience, even though it is not everyone’s. I have found ways to use data to improve my life, such as budgeting, buying a car, determining when to leave a company, making espresso, and assessing the impact of the articles I write. This is very natural to me and doesn’t feel like work.

Something to Consider

The majority of the people who were data scientists before the past 2 or 3 years had master’s degrees or Ph.D.s. They (we) can see the difference between people with skin-deep expertise and people with real depth in a field.

Even a newly graduated Ph.D. will not be called senior for a few years. So if you come out of a master’s or a bachelor’s, do a nano-degree, and within a year or two have the title of Senior Data Scientist, I’m skeptical. Even though you may want to think you have the same depth as other senior members of the field, most likely you don’t. That’s okay; just be realistic about your skill level.

There’s a reason my hiring search for a data scientist in 2018 failed: I couldn’t find a good one. One could argue I passed over good candidates, but hiring is usually by committee consensus. Everyone on my interview panel has a master’s degree, and half have a Ph.D. They want to work with reliable people whose skills they trust, so they err on the side of saying no. Out of 100 applicants, 40 phone interviews, and 6 in-person interviews, I ended up hiring nobody.
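To put the hiring funnel from that search into proportions, here is a minimal Python sketch. The four counts come straight from the paragraph above; the stage labels and the `stage_rates` helper are my own illustrative names, not anything the original search used.

```python
# Funnel counts from the 2018 search described above.
# Stage labels are illustrative; only the numbers come from the article.
funnel = [
    ("applicants", 100),
    ("phone interviews", 40),
    ("in-person interviews", 6),
    ("hires", 0),
]


def stage_rates(stages):
    """Return (stage, count, share of previous stage, share of first stage)."""
    first = stages[0][1]
    prev = first
    rows = []
    for name, count in stages:
        # Guard against division by zero if a stage is empty.
        step = count / prev if prev else 0.0
        overall = count / first if first else 0.0
        rows.append((name, count, step, overall))
        prev = count
    return rows


rates = stage_rates(funnel)
for name, count, step, overall in rates:
    print(f"{name:<22}{count:>4}  {step:>6.0%} of previous  {overall:>6.0%} of applicants")
```

Read this way, 40% of applicants reached a phone interview, 15% of those reached an in-person interview (6% of all applicants), and none were hired.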

In graduate school, my advisor told me that advisors have to be careful about whom they graduate with a Ph.D. because a new Ph.D. could be graduating other Ph.D.s within a few years. The cycle is short, and unqualified candidates will dilute the field. The same is true in data science: as less qualified people enter the field, they will gladly let in more people of similar caliber.

In Closing

Part of the difficulty of breaking into Data Science is simply the years it takes grinding away before people trust you to do the job of a data scientist. There’s no free lunch and no shortcuts, so work hard on some interesting problems, ingest all that data, and one day, you will form a cocoon and pop out a data science butterfly.

---------------------

If you like, follow me on Twitter and YouTube, where I post videos of espresso shots on different machines and other espresso-related stuff. You can also find me on LinkedIn.

Further readings of mine:

Data Science: Essentials

Abandon Ship: How a startup went under

Dissertation Regret

Part of the Team

How to Interview a Company

Thoughts on Leaving

A Day in the Life of a Data Scientist

Design of Experiment: Data Collection

John Vicente

Professional Services industry at Gartner | ex-Deloitte, IBM, Intel

5 years ago

Nice graphics to supplement the article! I was wondering why, in the flow chart, the “enough” decision point went back to clean data instead of collect more data... that seemed to make more sense, but I am just curious. Another curiosity I had was that the article didn’t seem to focus much on pragmatism and applicability. I’ve met plenty of “data scientists” in my own practice, and especially while consulting, who offered near zero value to clients. They got stuck on which neural net or SVM they used and couldn’t persuade a business person of the validity or use of the model. But as you said, deep knowledge in the domain you practice is key because programming, stats programs, and ML algos are pretty easy to pick up in comparison. Also, I would love to learn how your espresso improved as a result of your experiments!

Bonnie Barrilleaux

Data science | Ex-LinkedIn

5 years ago

Whatever the easy way is, I’m pretty sure my 10 year PhD+postdoc wasn’t it.

More articles by Dr. Robert McKeon Aloe

  • Ph.D. Interviews

    I have interviewed mostly Ph.D.

  • ML: Examining the Test Set

    I recently saw a post where someone said “Never touch your test set.” The theory was that you (as the algorithm…

  • Privacy in Machine Learning: PII

    Privacy is not a value explicitly written into the US Constitution, but the essentials are there. As a democratic…

  • Mastering LinkedIn

    Account Creation I never had a LinkedIn account until I was searching for a job, and then I only paid attention to it…

  • Withdrawing a Conference Paper

    In graduate school, I tried all sorts of optimizations aimed at making my face matcher work better and faster. I found…

  • Thoughts on Leaving

    Relax, I’m not leaving my current job right now. I’ve been writing about many different aspects of my work experience…

  • Crashing the Student Computer Lab

    In my last year of graduate school at Notre Dame, I used over 1,000,000 computer hours or just over 114 years of…

  • Presentation Essentials

    I have fallen asleep in my fair share of presentations, and I’ve worked hard at making sure my presentations are not…

  • Design of Experiment: Data Collection

    Anyone can collect data; some people can collect good data. The key theme to any good data collection is data…

  • Preserving LinkedIn for Professionalism

    I recently saw a discussion on LinkedIn about LinkedIn possibly becoming more like Facebook and how that was…