Build Your Own Text Classification Model From Scratch

Hey there, tech explorers! Ever wondered how computers can pick up on our feelings from the words we write? Today we're diving into the cool world of Sentiment Analysis with TensorFlow - a fancy name for teaching a computer to tell whether a piece of text sounds positive or negative, just by reading it!

Setting the Stage: Let's Gather Our Tools

Okay, first things first. We need some tools to make the magic happen. We'll use the Python programming language and a machine learning library called TensorFlow. It's like giving our computer a superhero suit! Then, we grab a bunch of sentences - some happy, some not-so-happy - and label each one so our computer can learn the difference.

import tensorflow as tf
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

# Our happy and not-so-happy sentences
train_sentences = [
    "The new restaurant in town exceeded my expectations.",
    "I was disappointed with the service at the hotel.",
    "The concert last night was amazing!",
    "The traffic on the way to work this morning was unbearable.",
    "I love the atmosphere of this place.",
    "The customer service was excellent.",
    "Yesterday's weather was fantastic.",
    "The book I read last night was boring.",
    "The food at the cafe was delicious.",
    "The flight got delayed, and it was frustrating.",
    "The park is a beautiful place to relax.",
    "The smartphone's battery life is impressive.",
    "The company's customer support needs improvement.",
    "The play at the theater was captivating.",
    "I had a wonderful experience with the tech support team.",
    "The hiking trail offers breathtaking views.",
    "The traffic signal system in the city is inefficient.",
    "The museum exhibits were informative and interesting.",
    "The new software update caused my computer to crash.",
    "The beach vacation was incredibly relaxing."
]

# Labels: 1 for positive, 0 for negative

train_labels = [1, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1]        

Building Blocks: Turning Words into Numbers

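Computers don't understand words like we do, so we have to turn our sentences into numbers. We do this in two steps: tokenization, which gives every word its own number, and padding, which adds zeros so that every sentence ends up the same length. The "<OOV>" token stands in for any word the tokenizer hasn't seen before. It's like translating our sentences into a language computers understand.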

# Tokenizing and padding sequences for training
tokenizer = Tokenizer(oov_token="<OOV>")
tokenizer.fit_on_texts(train_sentences)
word_index = tokenizer.word_index
sequences = tokenizer.texts_to_sequences(train_sentences)
padded_sequences = pad_sequences(sequences)        
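
To make the translation less mysterious, here is a rough pure-Python sketch of what tokenization and padding do. It is a simplified stand-in, not the Keras implementation: the helper names (simple_tokenize, build_word_index, texts_to_padded) and the PUNCTUATION set are made up for illustration, and the punctuation handling only approximates the Tokenizer defaults.

```python
from collections import Counter

OOV_TOKEN = "<OOV>"  # same placeholder the tutorial passes to Tokenizer
PUNCTUATION = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~'  # assumed filter set for this sketch


def simple_tokenize(text):
    """Lowercase the text, replace punctuation with spaces, split on whitespace."""
    table = str.maketrans({ch: " " for ch in PUNCTUATION})
    return text.lower().translate(table).split()


def build_word_index(sentences):
    """Give each word a number: 1 is the OOV token, then words by frequency."""
    counts = Counter(word for s in sentences for word in simple_tokenize(s))
    word_index = {OOV_TOKEN: 1}
    for rank, (word, _) in enumerate(counts.most_common(), start=2):
        word_index[word] = rank
    return word_index


def texts_to_padded(sentences, word_index, maxlen=None):
    """Turn sentences into equal-length lists of numbers, padding with 0 on the left."""
    oov = word_index[OOV_TOKEN]
    sequences = [[word_index.get(w, oov) for w in simple_tokenize(s)] for s in sentences]
    if maxlen is None:
        maxlen = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        seq = seq[-maxlen:]  # too long: keep only the last maxlen words
        padded.append([0] * (maxlen - len(seq)) + seq)
    return padded


word_index = build_word_index(["The food was good.", "The service was bad!"])
padded = texts_to_padded(["Good food", "The service was awful"], word_index)
print(padded)  # [[0, 0, 5, 4], [2, 6, 3, 1]]
```

Notice that "awful" was never seen while building the word index, so it becomes 1, the number reserved for the OOV token, and the shorter sentence gets zeros in front.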

Neural Network: Our Computer Brain

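Now, we create a simple computer brain, kind of like a tiny robot, to learn from all those numbers. Our robot brain has layers: an embedding layer that turns each word number into a list of 16 values, a flatten layer that lines those values up in one row, a small dense layer of 8 units to do the thinking, and a final one-unit layer that outputs a score between 0 (sad) and 1 (happy).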

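# Simple Neural Network for Text Classification
model = tf.keras.Sequential([
    # Declare the input length here; the Embedding argument input_length is deprecated in recent Keras versions
    tf.keras.Input(shape=(padded_sequences.shape[1],)),
    # Map each word index to a 16-dimensional vector (+1 because index 0 is reserved for padding)
    tf.keras.layers.Embedding(input_dim=len(word_index) + 1, output_dim=16),
    # Concatenate the word vectors into one long vector
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(8, activation='relu'),
    # A single sigmoid unit outputs the probability that the sentence is positive
    tf.keras.layers.Dense(1, activation='sigmoid')
])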
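
To get a feel for how big this tiny brain is, the arithmetic behind its layer sizes can be sketched in plain Python. The function name count_parameters and the example numbers (a vocabulary of 100 words, sentences padded to 10 words) are assumptions for illustration only; calling model.summary() on the real model prints the actual counts.

```python
def count_parameters(vocab_size, sequence_length, embedding_dim=16, hidden_units=8):
    """Count trainable weights for Embedding -> Flatten -> Dense -> Dense(1)."""
    # One vector of embedding_dim values per word in the vocabulary
    embedding = vocab_size * embedding_dim
    # Flatten has no weights; it only reshapes to sequence_length * embedding_dim values
    flattened = sequence_length * embedding_dim
    # Dense layer: one weight per (input, unit) pair plus one bias per unit
    hidden = flattened * hidden_units + hidden_units
    # Output layer: one weight per hidden unit plus a single bias
    output = hidden_units * 1 + 1
    return {
        "embedding": embedding,
        "hidden": hidden,
        "output": output,
        "total": embedding + hidden + output,
    }


# Made-up sizes, not the tutorial's real vocabulary or sentence length
sizes = count_parameters(vocab_size=100, sequence_length=10)
print(sizes)  # {'embedding': 1600, 'hidden': 1288, 'output': 9, 'total': 2897}
```

Most of the weights sit in the embedding and the first dense layer, which is why longer sentences or bigger vocabularies make the model grow quickly.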

Let's Train Our Robot: Learning Time!

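Just like teaching a pet a new trick, we show our computer brain all the sentences and tell it if they're happy or not. We repeat this 10 times (each full pass is called an epoch), and the model nudges its weights after each batch to shrink its mistakes. Keep in mind that with only 20 sentences our robot will mostly memorize its examples, so treat this as a demo rather than a reliable feelings detector.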

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Convert labels to NumPy array
train_labels = np.array(train_labels)

# Train the model
model.fit(padded_sequences, train_labels, epochs=10)        
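
The loss named in model.compile, binary cross-entropy, is the score the robot tries to shrink during training. As a minimal sketch of the idea (a hand-written version for illustration, not the Keras code; the function name and the eps clipping value are assumptions), it can be computed like this:

```python
import math


def binary_crossentropy(labels, probabilities, eps=1e-7):
    """Average penalty for predicted probabilities against 0/1 labels."""
    total = 0.0
    for y, p in zip(labels, probabilities):
        # Clip to avoid log(0) when a prediction is exactly 0 or 1
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)


# Made-up predictions for one positive and one negative sentence
confident_and_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_and_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(round(confident_and_right, 4))  # 0.1054
print(round(confident_and_wrong, 4))  # 2.3026
```

Being confidently wrong costs far more than being confidently right, which is what pushes the model toward better guesses with each epoch.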

Saving Our Robot's Knowledge: For Later!

We don't want to lose all the hard work, so we save our computer brain's knowledge in a file. It's like putting our robot on pause and telling it, "Remember everything you learned!"

# Save the model
model.save('text_classification_model.h5')        
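
One thing to watch out for: the saved model file holds the network, but the tokenizer is a separate Python object, so its word-to-number mapping is not stored with it. A minimal sketch of keeping that mapping in a JSON file follows. The helper names, the file name word_index.json, and the tiny example dictionary are all hypothetical; in the tutorial you would pass tokenizer.word_index instead.

```python
import json
import os
import tempfile


def save_word_index(word_index, path):
    """Write the word-to-number mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(word_index, f, ensure_ascii=False)


def load_word_index(path):
    """Read the word-to-number mapping back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Tiny stand-in for tokenizer.word_index
example_word_index = {"<OOV>": 1, "the": 2, "was": 3}

# Hypothetical file location, placed in the temp directory for this demo
path = os.path.join(tempfile.gettempdir(), "word_index.json")
save_word_index(example_word_index, path)
restored = load_word_index(path)
print(restored == example_word_index)  # True
```

It is also worth writing down the padded sentence length used during training, since new sentences must be padded to the same length before prediction.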

Testing Time: Let's See How Our Robot Does!

Now, the fun part! We give our robot a few sentences - three new ones, plus one it already saw during training - and ask it, "Hey, are these happy or not?" It gives us its best guess.

# Sample comments for prediction
predict_sentences = [
    "I love this product!",
    "The movie was fantastic.",
    "I had a terrible experience with customer service.",
    "The book I read last night was boring."
]

# Tokenize and pad the input sentences for prediction
predict_sequences = tokenizer.texts_to_sequences(predict_sentences)
predict_padded_sequences = pad_sequences(predict_sequences, maxlen=padded_sequences.shape[1])

# Make predictions using the trained model
predictions = model.predict(predict_padded_sequences)

# Convert probability predictions to binary labels (1 for positive, 0 for negative)
binary_predictions = np.round(predictions).astype(int)

# Display the results
for sentence, prediction in zip(predict_sentences, binary_predictions):
    # Each prediction is a one-element array, so read the label with [0]
    sentiment = "Positive" if prediction[0] == 1 else "Negative"
    print(f"Sentence: '{sentence}' - Predicted Sentiment: {sentiment}")
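
The rounding step is really just a cutoff at 0.5: scores above it count as positive, everything else as negative. Here is a small sketch of that idea with an adjustable cutoff. The function name to_labels and the example probabilities are made up for illustration; real scores would come from model.predict.

```python
def to_labels(probabilities, threshold=0.5):
    """Label each probability as Positive if it is above the threshold."""
    return ["Positive" if p > threshold else "Negative" for p in probabilities]


# Made-up scores, not actual model output
example_probabilities = [0.93, 0.48, 0.07]

default_labels = to_labels(example_probabilities)
lenient_labels = to_labels(example_probabilities, threshold=0.4)
print(default_labels)  # ['Positive', 'Negative', 'Negative']
print(lenient_labels)  # ['Positive', 'Positive', 'Negative']
```

Scores near 0.5 mean the model is unsure, so printing the raw probability next to the label can be more informative than the label alone.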

Output

[Figure: Output of the Text Classification Model]

Conclusion: Cheers to Understanding Feelings!

And there you have it! We just took our computer on a journey to understand feelings through words. Imagine all the cool things we can do with this - like making sure customers are happy or helping machines chat with us better. The world of tech is full of wonders, and we've just scratched the surface. Keep exploring, and who knows what amazing things you might create!


