5 Topics you will definitely face during Natural Language Processing related interviews.

Gone are the days when you were expected to create an NLP model from scratch. Nowadays companies do not have the time, bandwidth, or resources to build anything from scratch, unless they are the likes of Facebook, Google, Nvidia, or Amazon. Most companies, especially mid-sized ones, would rather pay a subscription fee for a ready-made tool that does the necessary task. This is far more economical and lets them use reliable, state-of-the-art tools in their projects. I realized this after failing 8 interviews out of 10.

As we very well know, the vast majority of companies are mid-sized and generally prefer a ready-made tool over spending effort on research. That also means they are the major job providers. Hence a major part of your interview preparation should go toward learning the tools you can directly apply to day-to-day tasks.

In this article, we're going to discuss a very important dilemma: to get a job in an area like NLP, you need experience with language models such as BERT or ELMo, but you do not have the time or material to learn all of it. It would take at least 3 months to learn and implement a simple BERT-based solution. The bigger problem is that understanding every bit of BERT requires being well-versed in several other concepts: creating ML models in TensorFlow, working with data APIs, word embeddings, Transformers, attention models, and so on. So the smarter way is to get some hands-on experience first, before trying to understand all of it. Let's use muscle memory to aid us.

Now, let's look at where we can find these hands-on guided projects and what we can learn so that we are capable of answering questions on these topics, and also confident enough to add them to our resume to get it shortlisted in the first place. The good news is you won't have to spend months or thousands of dollars: these 5 topics can be learned in under 10 hours and for well below $49 by doing these 5 hands-on projects. I would also request you to share your ML interview experiences in the comment section.

Each project takes less than 2 hours and can save you months of learning these important and essential topics.

Topic 1: Use BERT models for text classification (Project)

In this project, you'll learn to:

  1. Build TensorFlow input pipelines for text data with the tf.data API. TensorFlow 2 is much easier to learn than TensorFlow 1.
  2. Tokenize and preprocess text for BERT.
  3. Fine-tune BERT for text classification with TensorFlow and TensorFlow Hub. If you're interested, you can also learn how to build text classification from scratch.
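
To make step 2 concrete, here is a toy sketch of how BERT's WordPiece tokenizer splits words into subword pieces. The tiny vocabulary below is made up for illustration; in the actual project the full ~30,000-entry vocabulary shipping with the pre-trained model, and a library tokenizer, do this for you.

```python
# A minimal sketch of WordPiece-style tokenization, the scheme BERT uses.
# VOCAB is a hypothetical toy vocabulary; "##" marks a continuation piece.
VOCAB = {"[CLS]", "[SEP]", "[UNK]", "play", "##ing", "##ed", "the", "game"}

def wordpiece(word, vocab=VOCAB):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, cur = len(word), None
        while start < end:
            sub = word[start:end]
            if start > 0:
                sub = "##" + sub          # non-initial pieces carry the ## prefix
            if sub in vocab:
                cur = sub
                break
            end -= 1                      # shrink the candidate and retry
        if cur is None:
            return ["[UNK]"]              # no piece matched: unknown token
        pieces.append(cur)
        start = end
    return pieces

def tokenize(sentence):
    """BERT-style sequence: [CLS] + subword tokens + [SEP]."""
    tokens = ["[CLS]"]
    for word in sentence.lower().split():
        tokens.extend(wordpiece(word))
    tokens.append("[SEP]")
    return tokens

print(tokenize("playing the game"))
# → ['[CLS]', 'play', '##ing', 'the', 'game', '[SEP]']
```

Notice how "playing" becomes "play" + "##ing": that is how BERT keeps its vocabulary small while still handling rare words.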

This is a guided project on fine-tuning a BERT (Bidirectional Encoder Representations from Transformers) model for text classification with TensorFlow. In this 2.5-hour project, you will learn to preprocess and tokenize data for BERT classification, build TensorFlow input pipelines for text data with the tf.data API, and train and evaluate a fine-tuned BERT model for text classification with TensorFlow 2 and TensorFlow Hub. Why is this important in the current industry? Because an NLP model built from scratch not only takes a team of researchers, it also rarely matures enough to be used at production level. Hence companies would rather use a reliable model created by tech giants such as Google.

Topic 2: Deploy Chatbots using Django web framework (Project)

Knowing how to create a web app is extremely important for a professional data scientist. You may not be creating web apps yourself all the time, and most likely someone on the team will have more experience as a web developer, but you should at least know how they are created. It will give you an edge as a data scientist. Many hiring companies expect you to know how to deploy machine learning solutions to production so that they can cut the cost of hiring a web developer. If you add these skills to your resume and talk about your web apps in the interview, you have a higher chance of getting selected.

These skills will also give you an edge if tomorrow you want to create a small product of your own and make money from it. Chatbots and question-answering systems are widely used these days, and demand for them is only going to increase.

In this 2-hour long project-based course, you will learn how to create a Django web app. You will learn how to create forms, models, views, and templates in Django, and how to deploy a machine learning model on a Django app. You will use the Wikipedia API to search for topics.

I would also encourage you to learn how Django communicates with a database through model objects. You should know Object-Relational Mapping (ORM) for database access and how Django models implement this pattern, building on Python's object-oriented (OO) features. You will learn basic Structured Query Language (SQL) and database modeling, including one-to-many and many-to-many relationships and how they work in both SQL and Django models. You will also learn how to use the Django console and scripts to work with your application objects interactively.

Course on: How to create web apps using the Django web framework.

Topic 3: Sentiment Analysis with Deep Learning using BERT. (Project)

Whether it is knowing how a customer feels about a product, or finding out well in advance that a customer is about to leave a bad review or move to a competitor, in most of these cases a company would like to know the sentiment behind that decision. Companies worry more about negative sentiment than they celebrate positive sentiment. It is a very challenging problem in NLP. That's where pre-trained models such as Google's BERT come in handy: they have already been trained on vast amounts of data, including Wikipedia, which you and I cannot do single-handedly. We just have to take these pre-trained models and tweak them to our requirements, which is much more economical, easier to maintain, and comes with some assurance of quality.

In this 2-hour project, you will learn how to analyze a dataset for sentiment analysis, how to load a pre-trained BERT model in PyTorch and adjust its architecture for multi-class classification, and how to adjust the optimizer and scheduler for ideal training and performance. In fine-tuning this model, you will learn how to design train and evaluation loops to monitor model performance during training, including saving and loading models. Finally, you will build a sentiment analysis model that leverages BERT's large-scale language knowledge.
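
The train-and-evaluate loop pattern described above can be sketched like this. To keep it runnable anywhere, I use a tiny linear model on toy data instead of BERT; with a real pre-trained BERT only the model line changes, while the optimizer, scheduler, and loop structure stay the same.

```python
# Sketch of the PyTorch fine-tuning loop pattern: optimizer + scheduler +
# train loop + gradient-free evaluation. The linear model and toy data
# stand in for BERT and a real sentiment dataset.
import torch
from torch import nn

torch.manual_seed(0)
X = torch.randn(64, 3)                     # toy features
y = (X.sum(dim=1) > 0).long()              # toy 2-class labels

model = nn.Linear(3, 2)                    # stand-in for a BERT + classifier head
optimizer = torch.optim.AdamW(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
loss_fn = nn.CrossEntropyLoss()

def evaluate():
    model.eval()
    with torch.no_grad():                  # no gradients during evaluation
        return (model(X).argmax(dim=1) == y).float().mean().item()

for epoch in range(30):
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    optimizer.step()
    scheduler.step()                       # decay the learning rate on schedule

print(f"train accuracy: {evaluate():.2f}")
```

The eval()/train() switches and the no_grad() context are the details interviewers like to probe, because forgetting them silently breaks things like dropout and memory usage in real fine-tuning runs.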

It is also good to know one of the deep learning frameworks such as PyTorch, although it is not mandatory. Knowing any one of them is fine; if you're comfortable with TensorFlow, you won't have to become equally good at PyTorch.

Topic 4: Deep Learning NLP: Training GPT from scratch. (Project)

In this 1-hour long project-based course, you will explore Transformer-based Natural Language Processing. Specifically, you will be taking a look at re-training or fine-tuning GPT-2, which is an NLP machine learning model based on the Transformer architecture. You will learn the history of GPT-2 and its development, cover basics about the Transformer architecture, learn what type of training data to use and how to collect it, and finally, perform the fine-tuning process. In the final task, we will discuss use cases and what the future holds for transformer-based NLP. I would encourage learners to do further research and experimentation with the GPT-2 model, as well as other NLP models!

The Transformer architecture is becoming increasingly popular and increasingly effective at NLP tasks. Hence it is important to have solid hands-on experience with it before going into that machine learning interview.
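
If you want a feel for what sits at the heart of the Transformer behind GPT-2 and BERT, here is scaled dot-product attention in plain Python. The matrices are tiny hand-made examples of my own; real implementations run this as batched tensor operations across many heads.

```python
# A minimal sketch of scaled dot-product attention:
#   Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]   # subtract max for numerical stability
    s = sum(exps)
    return [e / s for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attention(Q, K, V):
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [dot(q, k) / math.sqrt(d_k) for k in K]   # query-key match
        weights = softmax(scores)                          # rows sum to 1
        out.append([sum(w * v[i] for w, v in zip(weights, V))
                    for i in range(len(V[0]))])            # blend the values
    return out

Q = [[1.0, 0.0]]                           # one query vector
K = [[1.0, 0.0], [0.0, 1.0]]               # two key vectors
V = [[10.0, 0.0], [0.0, 10.0]]             # two value vectors
print(attention(Q, K, V))                  # a weighted blend of the two values
```

The query matches the first key more strongly, so the output leans toward the first value; that "soft lookup" is the whole trick behind attention.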

Topic 5: TensorFlow for NLP: Text Embedding and Classification. (Project)

In this 2-hour long project-based course, you will learn the fundamentals of text embedding and text classification. You will learn, hands-on, how to use text embeddings for a real-world classification task and how to create, train, and test a neural network on text with TensorFlow, plus a bonus deep learning exercise implemented in TensorFlow. By the end of this project, you will understand text embedding and will have built a TensorFlow neural network for text classification.

We all know NLP is cool, but what makes it cooler is how we represent words in such a way that a machine can learn from them.

A word embedding is a learned representation for text where words that have the same meaning have a similar representation. This approach to representing words and documents may be considered one of the key breakthroughs of deep learning on challenging natural language processing problems. In almost every interview you will be asked how to represent words, and that's when the focus turns to text embeddings. Basically, without knowing how to represent words, you technically can't do machine learning on text.
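
Here is what "similar words get similar vectors" means in practice. The 3-dimensional vectors below are made up purely for illustration; real learned embeddings (word2vec, GloVe, BERT) have hundreds of dimensions and come from the training the course walks you through.

```python
# Toy word embeddings (hypothetical values) plus cosine similarity,
# the standard way to measure how close two word vectors are.
import math

embeddings = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.1, 0.9],
}

def cosine(u, v):
    """Cosine similarity: 1.0 = same direction, near 0.0 = unrelated."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(a * a for a in v))
    return dot / (nu * nv)

print(cosine(embeddings["king"], embeddings["queen"]))  # high: related words
print(cosine(embeddings["king"], embeddings["apple"]))  # low: unrelated words
```

This is exactly the kind of one-liner interviewers expect you to reason about when they ask how a model "knows" that two words mean similar things.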

In short, the 5 topics commonly asked about in interviews these days are sentiment analysis, pre-trained models such as BERT, GPT, or ELMo, text embeddings, chatbots, and attention models. You might not face all 5 in a single interview, but expect a mix of them.

I hope this article points you in the right direction while preparing for your next machine learning interview, especially if the focus is on NLP.

If you would like to read more of my articles, please follow: www.bikashdebnath.com