7 reasons for rejecting Data Scientist Job Applicants!
- Machine Learning “Skills” under Technical Skills: The moment you list every machine learning algorithm in existence, from linear regression to neural networks, and for some reason throw in LSTM and GRU as well, you may come off as someone who is just trying to get past the ATS, with no real proof that you know any of them. The way around this is to let your projects show what you are capable of instead of listing everything you know. As Richard Feynman put it, there's a big difference between knowing the name of something and knowing something.
- Kaggle Projects are your only projects: I am not going to bitch about this a lot, given that Kaggle is the main source of practice for people who don’t have, or aren’t pursuing, a degree in data science. But we know how Kaggle works. I am not generalizing here, but there are three ways people approach Kaggle problems.
A. Let’s learn and do: These are the people who take the dataset and start to play around with it. Their main objective is to get a feel for how to solve data science problems. These people make up about 20%.
B. Ummm… that’s how Kaggle works, I guess: These people are amazing. I was one of them at first. A few years back I wanted to get into machine learning so badly that I registered for a Kaggle competition, downloaded the dataset, and applied logistic regression. The only things I imported were scikit-learn and numpy, and I barely looked at the data. I got results for logistic regression and was placed somewhere on the leaderboard. Then I tried SVM and got better results. Then I tried neural networks and just kept tweaking hyperparameters until the results improved… and I got bored after a while. This is a terrible way to do data science, or anything.
C. Let’s change the… kernel: The main objective of these people is to show off a Kaggle leaderboard position on their resume. They take someone else’s kernel, make some changes (mostly cranking up the number of hidden units, adding hidden layers, etc.) and make sure they get a score just good enough to land them in the top 5–10%. Then they list this as their headline project. That tells me one of two things: i) you are good at solving problems, but you may never have had experience framing or creating a new data science problem, or ii) you are utterly bullshitting by tweaking someone else’s kernel.
So the final point is that it is okay to list Kaggle projects on your resume. But make sure each one is something you did from scratch and that you can answer any kind of question about it, or put your Kaggle projects in a separate section of your resume.
- Let’s make a big deal out of everything: When you are in graduate school, if you make the right choices in your course selection, you get to do good assignments that help you understand the smaller things. A simple homework question would be something like visualizing word embeddings using PCA/t-SNE. It is good that you know how to do it, but if you blabber on your resume as if you created something on the scale of TensorFlow’s Embedding Projector, you are kidding yourself. It is good that you know how to do ICA (Independent Component Analysis), but you don’t have to write it up in a way that makes it look like you solved a source separation problem.
- Operation Generica: I’ll keep this straight. Please don’t list MNIST on your resume and brag that you got 99.2% accuracy, and please don’t list Coursera course assignments (capstones are fine though).
- The Metrician: These are the people who list metrics without any point of comparison. Their resume statements look something like “Achieved an accuracy of 92% on the test set”. Ummm… good for you, but what is the baseline, what is the state-of-the-art score for this dataset, and where do you land between them? In my opinion, it is better to write something like “Achieved a 4.6% improvement in accuracy compared to the state-of-the-art models”. On a side note, this is something I keep noticing from beginners in the field: “Achieved an accuracy of 46% on [some binary classification problem]”. Sure, but random guessing (about 50% on a balanced binary problem) gives me better odds than that.
- Let’s paraphrase it: This is something I saw in a few resumes. These are the people who force a rephrasing of their previous experience (as a software engineer or business analyst) to show that they did “data science” in their previous job. Their job descriptions read as if they had been doing data science all along under the guise of a software engineer, but the stretch is, most times, clearly noticeable.
- Not every “Data Scientist” posting is a data scientist job: This is a more sensible point about why you might be rejected for a data science position. Not everything that says “Apply for Data Scientist Position” is an actual data scientist position. Look at the job responsibilities before you apply. Not every data scientist has the same set of skills, and they shouldn’t. So when we review resumes, we want to find someone who is a good fit for our team and the projects we are working on. If you have amazing skills but we don’t know what to do with you, we are just going to reject you. So make sure your resume actually addresses the responsibilities the company lists in the job posting.
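
To make the Metrician point concrete, here is a minimal sketch of how a headline accuracy could be reported next to a baseline. Everything in it is an assumption for illustration: the helper names (`accuracy`, `majority_baseline_accuracy`, `report`) are hypothetical, the labels are made up, and "always predict the most common class" is just one reasonable choice of baseline, not the only one.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def majority_baseline_accuracy(y_true):
    """Accuracy of a 'model' that always predicts the most common label."""
    most_common_count = Counter(y_true).most_common(1)[0][1]
    return most_common_count / len(y_true)


def report(y_true, y_pred):
    """Model accuracy, baseline accuracy, and the gap in percentage points."""
    model = accuracy(y_true, y_pred)
    baseline = majority_baseline_accuracy(y_true)
    return {
        "model": model,
        "baseline": baseline,
        "lift_pp": (model - baseline) * 100,
    }


# Hypothetical imbalanced test set: 90 negatives, 10 positives.
y_true = [0] * 90 + [1] * 10
# Hypothetical model: every negative right, only 2 of the 10 positives right.
y_pred = [0] * 90 + [1] * 2 + [0] * 8

result = report(y_true, y_pred)
print(
    f"model {result['model']:.0%}, baseline {result['baseline']:.0%}, "
    f"lift {result['lift_pp']:.1f} points"
)
```

On this made-up data the model scores 92%, but always answering "negative" already scores 90%, so the honest resume line is a two-point improvement over a trivial baseline, not "92% accuracy".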