Tips for Making It Through Your First Technical Data Science Interview
Dr. Jennifer Prendki
Head of AI Data @ Google DeepMind | Data-Centric AI, Data Governance, Data Science, AI Infra, MLOps/DataPrepOps
I recently wrote an article on LinkedIn with some useful tips for people trying to get their very first job as a data scientist. Today, I would like to share tips that I am hoping can help the same people during their very first technical interview.
It sometimes takes a while for entry-level data scientists to hear back from potential employers, regardless of their background and level of qualification. This can be highly frustrating, especially when they have spent months improving their resumes and building practical experience. If you are one of those applicants, start preparing for your first interview rather than sitting in front of your computer waiting for some good news: you worked hard to get in front of a potential employer, so make it count and put your best foot forward.
One of the best-kept secrets is that nailing an interview goes beyond getting every single answer right: hiring managers frequently speak to candidates who give correct answers but still leave them underwhelmed, either because of their lack of communication skills (I once had a candidate who spent an hour writing down equations without ever addressing me!) or because of the drabness of their solution.
Below is a list of do’s and don’ts that should help you better understand what hiring managers are looking for, and hopefully help you get hired faster.
1. Don’t pretend to know it all
If you are serious about getting into the data science business, come to terms with the fact that you can’t, and never will, know everything. Data Science is a fast-evolving field, and no one will ever master every single algorithm, library and ML technique there is. The field is also maturing, and everything tends to indicate that, in the near future, what is now called Data Science will be split across multiple disciplines, such as Machine Learning Engineering, Data Engineering, Analytics, etc. As a newbie, coming to an interview and claiming that you have mastered every aspect of the job, from data streaming and MapReduce to Spiking Neural Networks, is nothing short of preposterous, and you are likely to infuriate more than one interviewer if you do so. In fact, showing that you know what you don’t know is usually viewed as proof of maturity and wisdom. Be transparent and honest, and be ready to admit your ignorance if your interviewer asks you about an algorithm you have never heard of.
2. Don’t be a one-trick pony
If you just used Gradient Boosting for your Kaggle challenge last week, you will be naturally biased and tempted to use it again: after all, it worked well for your use case, and it is a topic that you understand well. But you know how the saying goes: when all you have is a hammer, everything looks like a nail! So resist the temptation and don’t go there! Picking an algorithm arbitrarily is not only unlikely to work, it will also make you look like you keep using the same approach because it is literally the only algorithm you are comfortable with. When you pick an algorithm, justify why you believe it is the right one, and always strive to show versatility when giving your answer.
3. Prove that you can be autonomous
You might be lucky and know the answer to the question you’re asked (especially if your interviewer uses questions from a database). But let’s be honest: this is not only unlikely, but also undesirable, because you would be less likely to stand out with your solution (other candidates have probably prepared for the same questions). So the odds are that you will have to work your way through an answer. Talk through your thoughts, break the problem into smaller pieces (feature selection, feature engineering, algorithm development, validation, etc.), list possible solutions and explain what you like or don’t like about them. Every problem is different, but the overall process stays roughly the same, and showing that you understand that process is key to convincing your interviewer that you are ready to solve the company’s toughest problems. One skill that I personally look for in any junior candidate is the ability to generalize the learnings from past projects to a new problem, and this is actually a skill that very few entry-level data scientists demonstrate.
4. Show that you want to learn, and can learn fast
If you do get stuck, though, use it as an opportunity to show your willingness to learn and your ability to adapt. Explain precisely why you are struggling (“It is clear that the high dimensionality of the data makes it difficult to use algorithm X; however, I don’t know many algorithms that can deal with so many features”, or “I expect these features to be highly correlated, and I don’t think my approach will play nicely under these conditions”), and ask the interviewer to teach you something new. Then, show him/her that you are able to quickly take that input and run with it to design a solution. Some interviewers might be disappointed that they had to drop a hint; however, I have personally ended up hiring people not in spite of it, but because I was impressed by how quickly they grasped a brand-new concept that they knew nothing about just a few minutes before. Generally speaking, getting to a solution with some help is better than not getting to a solution at all, so be willing to ask for help.
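A suspicion like “I expect these features to be highly correlated” is also something you can offer to check on the spot. Here is a minimal sketch in plain Python, assuming a small in-memory dataset; the feature names, the toy values and the 0.9 threshold are all made up for illustration, and in practice you would more likely reach for a data analysis library.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = sqrt(var_x * var_y)
    # A constant feature has zero variance; treat it as uncorrelated here.
    return cov / denom if denom else 0.0


def correlated_pairs(features, threshold=0.9):
    """Return (name_a, name_b, r) for every feature pair with |r| >= threshold."""
    names = list(features)
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(features[a], features[b])
            if abs(r) >= threshold:
                pairs.append((a, b, r))
    return pairs


# Hypothetical toy data: the two height columns carry the same information.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "age": [34.0, 21.0, 45.0, 29.0, 52.0],
}

flagged = correlated_pairs(features)
for a, b, r in flagged:
    print(f"{a} and {b} look redundant (r = {r:.3f})")
```

On this toy data only the two height columns are flagged, which gives you something concrete to discuss: drop one of them, combine them, or pick an approach that tolerates correlated inputs.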
5. Be pragmatic
As Baron Schwartz, CEO of VividCortex, once tweeted, “when you’re fundraising, it’s AI; when you are hiring, it’s ML; when you are implementing it, it’s linear regression […]”. This is an amusing way of saying that even though most data scientists love to play with fancy Deep Learning algorithms, the solutions that companies need to implement usually require fairly low-tech approaches at first. Showing that you know about complicated algorithms might be tempting and might feel like the way to get hired, but going for the simplest solution will actually prove to your interviewer that you are results-oriented and efficient. Rather than showing that you can implement complex solutions, show that you can deliver a solution that solves your customer’s problem, and focus on proving that you pay attention to sneaky little details and corner cases.
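One way to put this into practice is to start with a trivial baseline and the simplest model that could work, and only reach for something heavier if the numbers say you need to. Below is a minimal sketch in plain Python, assuming a single numeric feature and a tiny hand-written train/test split; the data values and function names are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    return slope, mean_y - slope * mean_x


def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical data: a roughly linear relationship with a little noise.
train_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
train_y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
test_x = [7.0, 8.0]
test_y = [14.1, 15.9]

# Baseline: always predict the mean of the training targets.
mean_baseline = sum(train_y) / len(train_y)
baseline_mse = mse(test_y, [mean_baseline] * len(test_y))

# Simple model: a straight line fitted on the training data only.
slope, intercept = fit_line(train_x, train_y)
linear_mse = mse(test_y, [slope * x + intercept for x in test_x])

print(f"baseline MSE: {baseline_mse:.3f}")
print(f"linear MSE:   {linear_mse:.3f}")
```

If the simple model already beats the baseline by a wide margin, as it does on this toy data, that is a good argument for stopping there and spending the remaining interview time on edge cases rather than on a fancier algorithm.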
6. Communicate
If you are dying to tell your interviewer about the cool algorithms that you know about, you can still use that to your advantage. Instead of implementing them, offer a simple but powerful solution, and then elaborate on improvements and discuss more options with your interviewer. Saying “I believe that a solution using algorithm Y would help achieve better accuracy, even though I suspect it would require a much larger training sample” will show that you are aware of the benefits and limitations of the newer ML techniques, that you are curious about the field, but also that you are able to choose the simple solution when there is one readily available.
7. Show your critical thinking capabilities
Finally, one of the most important qualities of any data scientist is the ability to think critically. Show that you are never satisfied, and be ready to question and challenge your own solution: “I wonder if my solution might be biased. I don’t think my approach takes care of correlations very well”. Use this as an opportunity to show that you also know which effects are negligible, and offer ways to validate your suspicions. After all, don’t forget that a data scientist is a scientist first, and he/she must be willing to use the scientific method as part of his/her daily routine. I definitely would never trust a data scientist who delivers a solution claiming that he/she is done.