Email Spam Detection using Pre-Trained BERT Model : Part 1 - Introduction and Tokenization

Recently I have been looking into transformer-based machine learning models for natural language tasks. The field of NLP has changed tremendously in the last few years, and I have been fascinated by the new architectures and tools that are coming out. The transformer is one such architecture.

As the frameworks and tools for building transformer models keep evolving, documentation often becomes stale and blog posts are often confusing. For any one topic, you may find multiple approaches, which can confuse beginners.

So as I learn these models, I plan to document the steps for a few of the important tasks in the simplest way possible. This should help beginners like me pick up transformer models.

In this two-part series, I will discuss how to train a simple model for email spam classification using the pre-trained transformer model BERT. This is the first post in the series, where I will discuss transformer models and prepare our data. You can read all the posts in the series here.

Transformer Models

The transformer is a neural network architecture first introduced by Google in 2017. This architecture has proven extremely effective at learning various tasks. Some of the popular models based on the transformer architecture are BERT, DistilBERT, GPT-3, and ChatGPT.

You can read more about transformer models at the link below:

https://huggingface.co/course/chapter1/4

Pre-Trained Language Model and Transfer Learning

A pre-trained language model is a transformer model that has been trained on a large amount of language data for specific tasks.

The idea behind using a pre-trained model is that the model already has a good understanding of language, which we can borrow as-is for our NLP task, so we only need to train the part that is unique to our task. This is called transfer learning. You can read more about transfer learning at the link below:

https://huggingface.co/course/chapter1/4#transfer-learning
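As a minimal sketch of this idea using the Hugging Face transformers library: we load a pre-trained BERT checkpoint (here `bert-base-uncased`, a common choice; our series may use a different one) and attach a fresh two-label classification head for spam vs. not spam. Only that head starts untrained; the language understanding comes for free.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the pre-trained tokenizer and encoder weights.
# "bert-base-uncased" is an illustrative choice of checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,  # spam vs. not spam
)

# The pre-trained tokenizer handles sub-word splitting and the
# special [CLS]/[SEP] tokens for us.
inputs = tokenizer("Win a free prize now!", return_tensors="pt")
```

During fine-tuning, the pre-trained encoder weights are only lightly adjusted while the new classification head learns the spam task, which is why this needs far less data than training from scratch.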

Google Colab

Google Colab is a hosted Jupyter Python notebook environment with access to a GPU runtime. As transformer models perform much better on a GPU, we are going to use Google Colab for our examples. You can use the free community version by signing in with your Google credentials.
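A quick way to confirm the GPU runtime is active is the check below, assuming PyTorch (which comes pre-installed on Colab); it picks the GPU when available and falls back to the CPU otherwise.

```python
import torch

# Select the GPU if Colab has assigned one, otherwise the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
```

If this prints `cpu` on Colab, switch the runtime via Runtime → Change runtime type → GPU.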


