5 Topics you will definitely face during Natural Language Processing related interviews.

Gone are the days when you were expected to create an NLP model from scratch. Nowadays companies do not have the time, bandwidth, or resources to build anything from scratch, unless they are the likes of Facebook, Google, Nvidia, or Amazon. Most companies, especially mid-sized ones, would rather pay a subscription fee for a ready-made tool that does the necessary task. This is far more economical and lets them use reliable, state-of-the-art tools in their projects. I realized this after failing 8 interviews out of 10.

As we very well know, the vast majority of companies are mid-sized and generally prefer a ready-made tool over spending effort on research. That also means they are the major job providers. Hence a major part of your interview preparation should go toward learning the tools you can directly apply to day-to-day tasks.

In this article, we're going to discuss a very important dilemma: to get a job in an area like NLP, you need experience with language models such as BERT or ELMo, but you do not have the time or material to learn all of it. It would take at least 3 months to learn and implement a simple BERT-based solution. The bigger problem is that understanding every bit of BERT requires being well-versed in several other concepts: creating ML models in TensorFlow, working with data APIs, word embeddings, Transformers, attention models, and so on. So the smarter way is to get some hands-on experience first, before trying to understand all of it. Let's use muscle memory to aid us.

Now, let's look at where we can find these hands-on guided projects and what we can learn so that we are capable of answering questions on these topics, and also confident enough to add them to our resume to get it shortlisted in the first place. The good news is you won't have to spend months or thousands of dollars: these 5 topics can be learned in under 10 hours and for well below $49 by doing these 5 hands-on projects. I would also request you to share your ML interview experiences in the comment section.

Each project takes less than 2 hours and can save you months of learning these important and essential topics.

Topic 1: Use BERT models for text classification (Project)

In this project, you'll learn to:

  1. Build TensorFlow input pipelines for text data with the tf.data API. TensorFlow 2 is much easier to learn than TensorFlow 1.
  2. Tokenize and preprocess text for BERT.
  3. Fine-tune BERT for text classification with TensorFlow and TensorFlow Hub. If you're interested, you can also learn how to build text classification from scratch.
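
To make step 2 concrete, here is a toy sketch of how BERT's WordPiece tokenizer splits words into subword pieces. The tiny vocabulary below is made up for illustration; in the actual project the full ~30,000-entry vocabulary shipping with the pre-trained model, and a library tokenizer, do this for you.

```python
# A minimal sketch of WordPiece-style tokenization, the scheme BERT uses.
# VOCAB is a hypothetical toy vocabulary; "##" marks a continuation piece.
VOCAB = {"[CLS]", "[SEP]", "[UNK]", "play", "##ing", "##ed", "the", "game"}

def wordpiece(word, vocab=VOCAB):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub          # non-initial pieces carry the ## prefix
            if sub in vocab:
                cur = sub
                break
            end -= 1                      # shrink the candidate and retry
        if cur is None:
            return ["[UNK]"]              # no piece matched: unknown token
        pieces.append(cur)
        start = end
    return pieces

def tokenize(sentence):
    """BERT-style sequence: [CLS] + subword tokens + [SEP]."""
    tokens = ["[CLS]"]
    for word in sentence.lower().split():
        tokens.extend(wordpiece(word))
    tokens.append("[SEP]")
    return tokens

print(tokenize("playing the game"))
# → ['[CLS]', 'play', '##ing', 'the', 'game', '[SEP]']
```

Notice how "playing" becomes "play" + "##ing": that is how BERT keeps its vocabulary small while still handling rare words.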

This is a guided project on fine-tuning a BERT (Bidirectional Encoder Representations from Transformers) model for text classification with TensorFlow. In this 2.5-hour project, you will learn to preprocess and tokenize data for BERT classification, build TensorFlow input pipelines for text data with the tf.data API, and train and evaluate a fine-tuned BERT model for text classification with TensorFlow 2 and TensorFlow Hub. Why is this important in the current industry? Because an NLP model built from scratch not only takes a team of researchers, it also rarely matures enough to be used at production level. Hence companies would rather use a reliable model created by tech giants such as Google.

Topic 2: Deploy Chatbots using Django web framework (Project)

Knowing how to create a web app is extremely important for a professional data scientist. You may not be creating web apps yourself all the time, and most likely someone on the team will have more experience as a web developer, but you should at least know how they are created. It will give you an edge as a data scientist. Many hiring companies expect you to know how to deploy machine learning solutions to production so that they can cut the cost of hiring a web developer. If you add these skills to your resume and talk about your web apps in the interview, you have a higher chance of getting selected.

These skills will also give you an edge if tomorrow you want to create a small product of your own and make money from it. Chatbots and question-answering systems are widely used these days, and demand for them is only going to increase.

In this 2-hour long project-based course, you will learn how to create a Django web app. You will learn how to create forms, models, views, and templates in Django, and how to deploy a machine learning model on a Django app. You will use the Wikipedia API to search for topics.

I would also encourage you to learn how Django communicates with a database through model objects. You should know Object-Relational Mapping (ORM) for database access and how Django models implement this pattern, building on Python's object-oriented (OO) features. You will learn basic Structured Query Language (SQL) and database modeling, including one-to-many and many-to-many relationships and how they work in both SQL and Django models. You will also learn how to use the Django console and scripts to work with your application objects interactively.

Course on: How to create web apps using the Django web framework.

Topic 3: Sentiment Analysis with Deep Learning using BERT. (Project)

Whether it is knowing how a customer feels about a product, or finding out well in advance that a customer is about to leave a bad review or move to a competitor, in most of these cases a company would like to know the sentiment behind that decision. Companies worry more about negative sentiment than they celebrate positive sentiment. It is a very challenging problem in NLP. That's where pre-trained models such as Google's BERT come in handy: they have already been trained on vast amounts of data, including Wikipedia, which you and I cannot do single-handedly. We just have to take these pre-trained models and tweak them to our requirements, which is much more economical, easier to maintain, and comes with some assurance of quality.

In this 2-hour project, you will learn how to analyze a dataset for sentiment analysis, how to load a pre-trained BERT model in PyTorch and adjust its architecture for multi-class classification, and how to adjust the optimizer and scheduler for ideal training and performance. In fine-tuning this model, you will learn how to design train and evaluation loops to monitor model performance during training, including saving and loading models. Finally, you will build a sentiment analysis model that leverages BERT's large-scale language knowledge.
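
The train-and-evaluate loop pattern described above can be sketched like this. To keep it runnable anywhere, I use a tiny linear model on toy data instead of BERT; with a real pre-trained BERT only the model line changes, while the optimizer, scheduler, and loop structure stay the same.

```python
# Sketch of the PyTorch fine-tuning loop pattern: optimizer + scheduler +
# train loop + gradient-free evaluation. The linear model and toy data
# stand in for BERT and a real sentiment dataset.
import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(64, 3)                     # toy features
y = (X.sum(dim=1) > 0).long()              # toy 2-class labels

model = nn.Linear(3, 2)                    # stand-in for a BERT + classifier head
optimizer = torch.optim.AdamW(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
loss_fn = nn.CrossEntropyLoss()

def evaluate():
    model.eval()
    with torch.no_grad():                  # no gradients during evaluation
        return (model(X).argmax(dim=1) == y).float().mean().item()

for epoch in range(30):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                       # decay the learning rate on schedule

print(f"train accuracy: {evaluate():.2f}")
```

The eval()/train() switches and the no_grad() context are the details interviewers like to probe, because forgetting them silently breaks things like dropout and memory usage in real fine-tuning runs.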

It is also good to know one of the deep learning frameworks such as PyTorch, although it is not mandatory. Knowing any one of them is fine; if you're comfortable with TensorFlow, you won't have to become equally good at PyTorch.

Topic 4: Deep Learning NLP: Training GPT from scratch. (Project)

In this 1-hour long project-based course, you will explore Transformer-based Natural Language Processing. Specifically, you will be taking a look at re-training or fine-tuning GPT-2, which is an NLP machine learning model based on the Transformer architecture. You will learn the history of GPT-2 and its development, cover basics about the Transformer architecture, learn what type of training data to use and how to collect it, and finally, perform the fine-tuning process. In the final task, we will discuss use cases and what the future holds for transformer-based NLP. I would encourage learners to do further research and experimentation with the GPT-2 model, as well as other NLP models!

The Transformer architecture is becoming increasingly popular and increasingly effective at NLP tasks. Hence it is important to have solid hands-on experience with it before going into that machine learning interview.
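
If you want a feel for what sits at the heart of the Transformer behind GPT-2 and BERT, here is scaled dot-product attention in plain Python. The matrices are tiny hand-made examples of my own; real implementations run this as batched tensor operations across many heads.

```python
# A minimal sketch of scaled dot-product attention:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]   # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention(Q, K, V):
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [dot(q, k) / math.sqrt(d_k) for k in K]   # query-key match
        weights = softmax(scores)                          # rows sum to 1
        out.append([sum(w * v[i] for w, v in zip(weights, V))
                    for i in range(len(V[0]))])            # blend the values
    return out

Q = [[1.0, 0.0]]                           # one query vector
K = [[1.0, 0.0], [0.0, 1.0]]               # two key vectors
V = [[10.0, 0.0], [0.0, 10.0]]             # two value vectors
print(attention(Q, K, V))                  # a weighted blend of the two values
```

The query matches the first key more strongly, so the output leans toward the first value; that "soft lookup" is the whole trick behind attention.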

Topic 5: TensorFlow for NLP: Text Embedding and Classification. (Project)

In this 2-hour long project-based course, you will learn the fundamentals of text embedding and text classification. You will learn, hands-on, how to use text embeddings for a real-world classification task and how to create, train, and test a neural network on text with TensorFlow, plus a bonus deep learning exercise implemented in TensorFlow. By the end of this project, you will understand text embedding and will have built a TensorFlow neural network for text classification.

We all know NLP is cool, but what makes it cooler is how we represent words in such a way that a machine can learn from them.

A word embedding is a learned representation for text where words that have the same meaning have a similar representation. This approach to representing words and documents may be considered one of the key breakthroughs of deep learning on challenging natural language processing problems. In almost every interview you will be asked how to represent words, and that's when the focus turns to text embeddings. Basically, without knowing how to represent words, you technically can't do machine learning on text.
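
Here is what "similar words get similar vectors" means in practice. The 3-dimensional vectors below are made up purely for illustration; real learned embeddings (word2vec, GloVe, BERT) have hundreds of dimensions and come from the training the course walks you through.

```python
# Toy word embeddings (hypothetical values) plus cosine similarity,
# the standard way to measure how close two word vectors are.
import math

embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 = same direction, near 0.0 = unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

print(cosine(embeddings["king"], embeddings["queen"]))  # high: related words
print(cosine(embeddings["king"], embeddings["apple"]))  # low: unrelated words
```

This is exactly the kind of one-liner interviewers expect you to reason about when they ask how a model "knows" that two words mean similar things.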

In short, the 5 topics commonly asked about in interviews these days are sentiment analysis, pre-trained models such as BERT, GPT, or ELMo, text embeddings, chatbots, and attention models. You might not face all 5 in a single interview, but expect a mix of them.

I hope this article points you in the right direction while preparing for your next machine learning interview, especially if the focus is on NLP.

If you would like to read more of my articles, please follow: www.bikashdebnath.com