NLP chatbot for healthcare industry: Case Pharma
In this article we are going to build an intelligent chatbot that adapts its answers to the user's input. As an example, we will use a pharmaceutical company where I worked and where I had already started this project: "Laboratoires SALEM".
What is an NLP (Natural Language Processing) chatbot? It is an artificial intelligence technology that enables computers to understand, interpret, and respond to human language. An NLP chatbot uses advanced algorithms and machine learning techniques to analyze user input, identify the user's intent, and provide relevant responses in natural language.
1. Let's start building the bot
1.1 User intents
Before starting to build the bot, we first need an idea of the user intents. In the glucometer example, we can use social media data, or analyze search keywords related to the product with Google Keyword Planner and other tools, to find out what people are looking for.
In the Check 3 case we have:
We know from this data that we need to create answers to questions about the strips, the price, the bracelet supplied with certain glucometers, the product itself, the error codes, the reliability, and the manual!
You can also use the data you have from medical promotion teams, who generally collect this information from doctors and pharmacists!
If you don't have data, you can generalize the process and create intents related to:
According to the data I had from social media, the web and medical promotion, user intents can be classified as follows:
Each tag must include its "patterns" (user inputs) as well as its "responses". Then you must group everything in a JSON file. This is what the file would look like:
{"intents" : [
{"tag": "Greetings",
"patterns": ["hello","hi","salem alikom","hey","bonjour","bonsoir","greetings","salut","lu","slt"],
"responses": ["Hello,What can I do for you ?", "Hi, How can I help you ?", "Greetings, What can I do for you ?" ]
},
{"tag": "Goodbye",
"patterns": ["bye","take care","later","see you later","have a good day"],
"responses": ["Talk to you later !", "Happy to help you !", "Goodbye !" ]
},
{"tag": "Name",
"patterns": ["what is your name","your name","how should I call you ?","your name please","who are you ?"],
"responses": ["You can call me Ahmed, I am your virtual assistant", "Ahmed ! at your service !"]
},
{"tag": "Use",
"patterns": ["how can i use my glucometer","use of my glucometer","i don't know how to use my glucometer","how to use glucometer to check blood sugar","how to use glucometer strips","how to use glucometer in hospital","how to use glucometer check 3"],
"responses": ["We have a video explaining how to use it, you can check it on our Youtube channel or with the following link: https://www.youtube.com/watch?v=THj_ACc3myg&t=5s&ab_channel=LaboratoiresSalem in French and https:// www.youtube.com/watch?v=pNC23lIMuq8&t=92s&ab_channel=LaboratoriesSalem in Arabic", "You can find the videos the followink link: https://www.youtube.com/watch?v=THj_ACc3myg&t=5s&ab_channel=LaboratoiresSalem in French and https://www.youtube.com/watch?v=pNC23lIMuq8&t=92s&ab_channel=LaboratoiresSalem in Arabic"]
},
{"tag": "Know",
"patterns": ["who you are","who is salem laboratory","what is laboratoires SALEM","laboratoires salem"],
"responses": ["Laboratoires Salem is a family business born in 1994. It invests in the private pharmaceutical industry in Algeria, the company has several headquarters in Cheraga, Sétif and El Eulma", "We are a national pharmaceutical industry specialized in the production of medicine, food supplements and medical equipment, more information about ours: https://labosalem.dz/notre-histoire/ "]
},
{"tag": "Open",
"patterns": ["Are you open?","at what time do you open"],
"responses": ["We work from Sunday to Thursday from 8 a.m. to 4 p.m., during Ramadan we are open from 9 a.m. to 4 p.m.!", "We're open from 8 a.m. to 4 p.m. from Sunday through Thursday, and from 9 a.m. to 4 p.m. during Ramadan!"]
},
{"tag": "Job",
"patterns": ["I am looking for an internship","you recruit ?","do you have job offers?","I have a master's degree in chemistry and I am looking for a job","Are you looking for medical representatives?"],
"responses": ["You can find all our job offers on our website ! they are available at the following link: https://labosalem.dz/emploi/ ", "All of our employment openings are listed on our website. They may be found at the following link: https://labosalem.dz/emploi/ "]
},
{"tag": "Prob",
"patterns": ["My glucometer does not work", "I have problems with my glucometer", "My glucometer is faulty", "My glucometer screen does not work", "My glucometer does not read my test strips", "My glucometer is broken"],
"responses": ["In case of malfunction you can contact us directly on the following page: https://labosalem.dz/pharmacovigilance/","On the following page, you can contact us directly in the event of a malfunction: https://labosalem.dz/pharmacovigilance/ "]
},
{"tag": "Buy",
"patterns": ["I would like to buy a glucometer", "where can I find strips", "is your product available?"],
"responses": ["Yes, our product is available! you will find it in your nearest pharmacy","Sure, you may purchase our products! It may be found in your neighborhood drugstore."]
},
{"tag": "Price",
"patterns": ["The price of the strips","The price of the glucometer","The price of Check 3","I would like to know the price of the strips and the glucometer"],
"responses": ["Our Check3 glucometer is provided free of charge for diabetics in pharmacies, concerning the strips of 50, they are available at the price of 1500 Dzd, refundable for diabetics with a Chifa card"]
}
]}
1.2. Building the model
1.2.1. Dependencies
In this section we will start importing the necessary libraries so that we can build our model which will then be used by the NLP Chat bot.
We will use random to shuffle the training data later; randomizing the order of the data helps prevent the model from learning patterns that are specific to that order, which can lead to overfitting.
We will also import json for reading the JSON file, and numpy to transform the training data into a NumPy array.
Natural Language Toolkit (NLTK) provides a set of tools and resources for tasks such as tokenization, part-of-speech tagging, parsing, and sentiment analysis. We will use it for tokenization and lemmatization.
pickle will be used to save some lists (the vocabulary and the classes) that we will create later.
Finally, we will import tensorflow for building and training the model.
import random
import json
import pickle
import numpy as np
import nltk
from nltk.stem import WordNetLemmatizer
import tensorflow as tf
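Note: if this is the first time you use NLTK on your machine, you may also need to download the resources used by the tokenizer and the lemmatizer (a small optional step; these are the usual package names, but they may vary with your NLTK version):
nltk.download('punkt')    # tokenizer data used by nltk.word_tokenize
nltk.download('wordnet')  # WordNet data used by WordNetLemmatizer
nltk.download('omw-1.4')  # additional WordNet data required by some NLTK versions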
1.2.2. Tokenization and lemmatization
But wait, what are tokenization and lemmatization?
Tokenization
Tokenization is the process of breaking down a piece of text into smaller units, called tokens. These tokens can be words, phrases, or other meaningful units, depending on the specific task or application.
Lemmatization
Lemmatization is the process of reducing words to their base form, known as a lemma. This is done by analyzing the context in which the word appears in a sentence and identifying the root form of the word.
For example, the lemma of the words "running," "ran," and "runs" is "run."
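To make this concrete, here is a tiny sketch using NLTK (the sentence is just an illustrative example, and it assumes the imports shown above):
sentence = "My glucometer runs two tests"
tokens = nltk.word_tokenize(sentence)
# tokens: ['My', 'glucometer', 'runs', 'two', 'tests']
lemmas = [WordNetLemmatizer().lemmatize(t.lower()) for t in tokens]
# lemmas: e.g. ['my', 'glucometer', 'run', 'two', 'test']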
1.2.3. Preprocessing
We will start by creating a lemmatizer to lemmatize each word in intents.json, then import our data (the JSON file).
lemmatizer = WordNetLemmatizer()
intents = json.loads(open('intents.json').read())
Then we will create 3 empty lists:
words, the vocabulary, which is made up of all the tokens
classes, which contains the different classes (the tags)
documents, which combines each tokenized pattern with its class
We will also create a list of characters to ignore (punctuation that will not be considered).
words = []
classes = []
documents = []
ignoreLetters = ['?', '!', '.', ',']
We will start building our vocabulary by tokenizing each element in the patterns from the json file using nltk.word_tokenize().
We will save the result in wordList and append the resulting list of words to words.
A tuple of the tokenized words and the intent's tag is added to the documents list.
If the intent's tag is not already in the classes list, then it will be added.
for intent in intents['intents']:
    for pattern in intent['patterns']:
        wordList = nltk.word_tokenize(pattern)
        words.extend(wordList)
        documents.append((wordList, intent['tag']))
        if intent['tag'] not in classes:
            classes.append(intent['tag'])
This is what our documents list looks like: each tokenized pattern with its tag.
Now we will apply lemmatization to each word in the words list, then sort the list in alphabetical order and remove any duplicates.
words = [lemmatizer.lemmatize(word) for word in words if word not in ignoreLetters]
words = sorted(set(words))
We will save our words and classes lists for later using pickle:
pickle.dump(words, open('words.pkl', 'wb'))
pickle.dump(classes, open('classes.pkl', 'wb'))
1.2.4. BoW
For now we only have text, not numerical values, and in order to build a model we need to transform our text into numerical values. To solve this problem we will use the BoW (Bag of Words) technique.
What is BoW? The technique represents a piece of text as a bag (multiset) of its words, disregarding grammar and word order but keeping track of word frequency. To apply it, we build a vocabulary of all the words in the corpus and then represent each sentence as a vector over that vocabulary.
Let's take an example to understand! Suppose we have the following two sentences:
Sentence 1: "The tiger jumps over the fence"
Sentence 2: "The hog runs in the forest"
The BoW representation of these sentences (ignoring stop words) might look like this:
             tiger  jumps  over  fence  hog  runs  forest
Sentence 1:    1      1      1     1     0     0      0
Sentence 2:    0      0      0     0     1     1      1
In this representation, each row represents a sentence, and each column represents a word in the vocabulary. The value in each cell corresponds to the count of the corresponding word in the sentence.
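Here is a small sketch of this idea in Python (an illustrative snippet, separate from the bot's own code, using the example vocabulary above):
vocabulary = ["tiger", "jumps", "over", "fence", "hog", "runs", "forest"]

def bow_vector(sentence, vocabulary):
    # mark with 1 every vocabulary word that appears in the sentence, 0 otherwise
    tokens = sentence.lower().split()
    return [1 if word in tokens else 0 for word in vocabulary]

print(bow_vector("The tiger jumps over the fence", vocabulary))  # [1, 1, 1, 1, 0, 0, 0]
print(bow_vector("The hog runs in the forest", vocabulary))      # [0, 0, 0, 0, 1, 1, 1]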
Let's get back to our chatbot! Before building the bag of words, we will create two lists: one for the training data and an "empty output" list that contains as many zeros as there are classes, as shown below.
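These two lists are defined as follows (a short sketch; the variable names match the loop further down):
training = []                      # will hold one row (bag of words + one-hot tag) per document
outputEmpty = [0] * len(classes)   # template output: one zero per class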
We will iterate over the list of documents. For each document, an empty list called bag is created. The first element of the document (the list of words) is retrieved and stored in wordPatterns. Each word in wordPatterns is then lemmatized and converted to lowercase using the lemmatizer object, and the resulting list is assigned back to wordPatterns.
The second loop iterates over the words list (our vocabulary). For each word, the code checks if it appears in wordPatterns. If it does, a 1 is appended to bag; if it doesn't, a 0 is appended (the same idea as the two-sentence example). This creates a binary bag-of-words representation of the document, where each element in bag corresponds to a word in words and indicates whether that word appears in the document.
outputRow is a one-hot encoded representation of the current document's tag. (outputEmpty is a list of 0's with a length equal to the number of unique tags in the dataset, and classes is the list of all the unique tags. The code finds the index of the current document's tag in the classes list and sets the corresponding element of outputRow to 1.)
Then we will add the current bag of words and output row to the training data as a single element.
for document in documents:
    bag = []
    wordPatterns = document[0]
    wordPatterns = [lemmatizer.lemmatize(word.lower()) for word in wordPatterns]
    for word in words:
        bag.append(1 if word in wordPatterns else 0)
    outputRow = list(outputEmpty)
    outputRow[classes.index(document[1])] = 1
    training.append(bag + outputRow)
1.2.5. Shuffle the data and definition of X and y
In this section we will shuffle the training data and split it into the input and output variables train_x and train_y.
We start by shuffling the data. Randomizing the order of the data can help prevent the model from learning patterns in the data that are specific to the order of the data, which can lead to overfitting. By shuffling the data randomly, the model is forced to learn general patterns in the data, rather than relying on specific patterns that may be unique to the order of the data.
Before defining X and Y we need to transform the training list into a NumPy array, which makes it easier to index and manipulate the data.
X is made up of the first len(words) columns of each training row (the bag-of-words features) and Y of the remaining columns (the one-hot encoded labels).
random.shuffle(training)
training = np.array(training)
train_x = training[:, :len(words)]
train_y = training[:, len(words):]
1.2.6. Building the ANN
We will create a Sequential model, which allows us to build the network layer by layer. Then we add a Dense layer with 3000 neurons; the input shape is the length of a single train_x row (the size of the vocabulary), and the activation function is ReLU, a popular choice that helps the model learn nonlinear relationships in the data.
The Dropout layer randomly drops out a fraction of the neurons in the layer during training, which helps to prevent overfitting (0.5 means 50% of the neurons).
Then we create the second layer with 1500 neurons, same activation function and same Dropout parameter.
Finally, the last layer has a number of neurons equal to the number of output classes in the training data. Its activation function is softmax, which produces a probability distribution over the output classes.
Once the initial configuration is done, we compile the model: the loss function is categorical cross-entropy, commonly used for multiclass classification problems with more than two classes; the optimizer is Adam; and the metric is accuracy.
Now we will train the model and feed it with the X and Y values, for 50 epochs and a batch size of 5 (batch size refers to the number of training examples used in one forward/backward pass of the neural network during the training process).
Once the training is finalized, we will save the model in HDF5 format
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(3000, input_shape=(len(train_x[0]),), activation = 'relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(1500, activation = 'relu'))
model.add(tf.keras.layers.Dropout(0.5))
model.add(tf.keras.layers.Dense(len(train_y[0]), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(train_x, train_y, epochs=50, batch_size=5)
model.save('check3_model.h5')
1.3. Building the chatbot
1.3.1. Dependencies
In this section we will import the modules necessary for the creation of the chatbot :
import random
import json
import pickle
import numpy as np
import nltk
from nltk.stem import WordNetLemmatizer
import tensorflow as tf
from tensorflow.keras.models import load_model
1.3.2. Lemmatizer and Load the data
We will create a lemmatizer object, because we will then build a function that extracts the tokens from the user input, applies lemmatization, and transforms the result into a vector using BoW. Let's start by importing the data (the JSON file), the list of words (our vocabulary), the classes created before, and the model.
lemmatizer = WordNetLemmatizer()
intents = json.loads(open('intents.json').read())
words = pickle.load(open('words.pkl', 'rb'))
classes = pickle.load(open('classes.pkl', 'rb'))
model = load_model('check3_model.h5')
1.3.3. Cleaning function
The function tokenizes the user's message into individual words, lemmatizes them, and returns a list of the cleaned words.
def clean_sentence(sentence):
    sentence_words = nltk.word_tokenize(sentence)
    sentence_words = [lemmatizer.lemmatize(word) for word in sentence_words]
    return sentence_words
1.3.4. BoW function
The next function takes the cleaned sentence and creates a bag of words, which is a list of 0's and 1's representing the presence or absence of each word in the sentence compared with what is present in the vocabulary (words). This function creates a numpy array of this list.
def bag_of_words(sentence):
    sentence_words = clean_sentence(sentence)
    bag = [0] * len(words)  # a bag of 0's with the same length as the vocabulary (words list)
    for w in sentence_words:
        for i, word in enumerate(words):
            if word == w:
                bag[i] = 1
    return np.array(bag)
1.3.5. Predict function
The function takes the bag of words and predicts the class (or intent) of the user's message using the loaded model. The function returns a list of intents and their probabilities, sorted by probability.
The results list will contain only those predictions where the confidence score is above the specified threshold.
def predict_class(sentence):
    bow = bag_of_words(sentence)
    res = model.predict(np.array([bow]))[0]
    error_threshold = 0.20
    results = [[i, r] for i, r in enumerate(res) if r > error_threshold]
    # sort by probability (highest first) and return the matching intents with their scores
    results.sort(key=lambda x: x[1], reverse=True)
    return [{'intent': classes[r[0]], 'probability': str(r[1])} for r in results]
1.3.6. Response Function
The function takes the list of intents and the intents JSON file and generates a response based on the predicted intent. It chooses a random response from the list of responses associated with the predicted intent.
def get_response(intents_list, intents_json):
    tag = intents_list[0]["intent"]
    list_of_intents = intents_json['intents']
    for i in list_of_intents:
        if i['tag'] == tag:
            result = random.choice(i['responses'])
            break
    return result

while True:
    message = input('')
    ints = predict_class(message)
    res = get_response(ints, intents)
    print(res)
Finally, the code enters an infinite loop where it takes user input, predicts the intent using the predict_class function, generates a response using the get_response function, and prints the response to the console.
2. Testing the bot !
You can see that the bot works well, and you can always improve it with data from social media, medical reps, the website, and so on.
As for deployment, you can add it to your website, your social media platforms, your web app, your mobile app, your device...
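As an illustration of the web-app option, here is a minimal sketch of how the bot could be exposed as an HTTP endpoint with Flask (an assumption for illustration, not part of the original project; it reuses the predict_class and get_response functions defined above):
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/chat', methods=['POST'])
def chat():
    # expects a JSON payload such as {"message": "how can i use my glucometer"}
    message = request.get_json().get('message', '')
    ints = predict_class(message)
    res = get_response(ints, intents)
    return jsonify({'response': res})

if __name__ == '__main__':
    app.run(port=5000)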
This is the end of this article; we have explored in detail how to create an intelligent bot that answers for you and animates your page 24/7.
It allows you to always stay in contact with your customers and provide them with medical/pharmaceutical or marketing information at any time!
Are you bot oriented? Or do you always prefer to keep a real person for moderation? Or maybe a hybrid between the real and the automated?