NLP for Generative AI, Part 2: Neural Network Design
Darko Medin
Data Scientist and a Biostatistician. Developer of ML/AI models. Researcher in the fields of Biology and Clinical Research. Helping companies with Digital products, Artificial intelligence, Machine Learning.
In the previous part we saw how to preprocess and tokenize text data for language processing. This time you will learn how to use TensorFlow to design the neural network that processes that text. So let's start...
The first step is, as always, loading the required libraries. (Make sure you have installed scikit-learn and TensorFlow by running pip install scikit-learn and pip install tensorflow in the command prompt.)
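A minimal version of those imports could look like this (this is a sketch assuming pandas, scikit-learn and TensorFlow are the only libraries used throughout the tutorial):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical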
The second step is to load the Language Detection dataset (Language detection.csv), which can be found here: https://www.kaggle.com/datasets/basilb2s/language-detection.
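Loading it with pandas might look like this (the file name follows the dataset above; the 'Text' and 'Language' column names are my assumption about the CSV layout):

# Load the Kaggle Language Detection dataset (assumed to be saved locally
# as 'Language detection.csv' with 'Text' and 'Language' columns).
data = pd.read_csv('Language detection.csv')
texts = data['Text'].values
labels = data['Language'].values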
The first and second preprocessing steps are the same as in the previous tutorial: cleaning the text data and tokenizing it. If you want to go back and learn about tokenization of the text data, you may use this link: https://www.dhirubhai.net/pulse/developing-llms-generative-ai-tokenization-darko-medin. But in this case, since we are performing both training and testing, we also need to separate the data into training and testing partitions. We can use train_test_split() from sklearn, as sketched below.
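A sketch of those steps, continuing from the snippets above, could be as follows (the cleaning regex, the 80/20 split and the random seed are illustrative choices, not necessarily the ones used in the original code):

import re

def clean_text(text):
    # Remove punctuation and digits while keeping letters of any script,
    # then lowercase; an illustrative cleaning step only.
    return re.sub(r'[^\w\s]|\d', ' ', str(text)).lower()

texts_clean = [clean_text(t) for t in texts]

# Fit the tokenizer on the cleaned text and turn it into integer sequences
tokenizer = Tokenizer()
tokenizer.fit_on_texts(texts_clean)
sequences = tokenizer.texts_to_sequences(texts_clean)

# Encode the language names as integer class ids
label_encoder = LabelEncoder()
y = label_encoder.fit_transform(labels)

# Split into training and testing partitions
X_train, X_test, y_train, y_test = train_test_split(
    sequences, y, test_size=0.2, random_state=42)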
In this case I added another step, padding the data: the pad_sequences() function makes sure that all the sequences have the same length. That takes care of the input data. The labels are also converted to categorical values, and the data is then ready for training.
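Continuing the sketch, padding and label conversion might look like this (max_len = 100 and 'post' padding are assumptions made for illustration):

# Pad every sequence to the same length and one-hot encode the labels
max_len = 100
X_train_pad = pad_sequences(X_train, maxlen=max_len, padding='post')
X_test_pad = pad_sequences(X_test, maxlen=max_len, padding='post')

num_classes = len(label_encoder.classes_)   # 17 languages in this dataset
y_train_cat = to_categorical(y_train, num_classes=num_classes)
y_test_cat = to_categorical(y_test, num_classes=num_classes)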
Now for designing the artificial neural network. Instead of using a standard feedforward ANN, we will use an architecture that is good at processing sequential data. There are many layers for this purpose, such as RNNs (recurrent neural networks), LSTMs (Long Short-Term Memory networks), GRUs (Gated Recurrent Units) and others. In this tutorial, let's design a simple LSTM neural network.
As you can see, the Input layer feeds data into the Embedding layer; the data is then processed by two LSTM layers with 256 neurons each and passed to another 256-neuron Dense() layer before finally being sent to the output layer. Keep in mind that the output layer needs to have the same number of neurons as num_classes, since this layer's outputs are used for classification. In this case num_classes is 17.
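A sketch of that architecture in Keras could look like this (the embedding dimension of 128 is an assumption; the rest follows the description above):

# Embedding -> two LSTM layers with 256 units -> Dense(256) -> softmax output
vocab_size = len(tokenizer.word_index) + 1

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(max_len,)),
    tf.keras.layers.Embedding(input_dim=vocab_size, output_dim=128),
    tf.keras.layers.LSTM(256, return_sequences=True),  # first LSTM layer
    tf.keras.layers.LSTM(256),                          # second LSTM layer
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax')  # 17 classes here
])
model.summary()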
Let's compile the model and start the training process!
(Note: I am using with tf.device('/GPU') to run the training on my GPU. Training NLP models on a lot of text data can take some time, which is why GPUs are useful for speeding up the process. On weaker GPUs this may take hours; on better ones, minutes.)
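A sketch of the compile and training step, following the GPU note above (the optimizer, loss, batch size and validation split are illustrative choices; 130 epochs matches the run described below):

# Compile and train the model on the GPU if one is available
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

with tf.device('/GPU:0'):
    history = model.fit(X_train_pad, y_train_cat,
                        validation_split=0.1,
                        epochs=130,
                        batch_size=64)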
After 130 iterations the model achieved 99.75% train set accuracy and 95.12% validation set accuracy, which is quite a good result for a language detection model.
I will stop the training at 130 epochs, as this is for educational purposes, and print out the test accuracy using model.evaluate() as shown in the code above. The test set accuracy is 94.92%, which is very close to the validation accuracy; that is also a good indicator in this case...
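Evaluating on the held-out test partition, as described, might look like this:

# Evaluate the trained model on the test partition
test_loss, test_acc = model.evaluate(X_test_pad, y_test_cat, verbose=0)
print(f'Test accuracy: {test_acc:.4f}')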
Ideally I would want the validation accuracy to go over 98 or 99%, and in the next tutorial we will see how to use additional feature extraction to improve the accuracy to such levels.
Thanks for reading!
by Darko Medin
Here is the code on my GitHub: https://github.com/DarkoMedin/Designing-the-LM-Artificial-Neural-Network, and here is the link to the previous tutorial in this series: https://www.dhirubhai.net/pulse/developing-llms-generative-ai-tokenization-darko-medin