How to Build Affordable & Scalable AI Chatbots on WhatsApp

The AI space has been inundated with a wave of new no-code tools. While some of these tools have proven useful, many are quite pricey, especially if you plan to handle large volumes through your app.

So, what is the best alternative?

Well, it’s a hard truth, and you might not like it.

Are you ready?

Remove the word ‘no’ from ‘no-code,’ and you get your answer.

Not a surprise for most people, I bet. Creating a custom-coded solution can be the most scalable, flexible, and cost-effective way to go (in my experience, anyway).

To show you just how powerful a custom-coded solution can be, here’s an example of a finished chatbot I made that handles inquiries from driving school students within seconds, in a friendly and conversational way.

With the right guidance and knowledge, it’s not much harder than using a no-code solution, especially with the help of GPT-4o or the recently released ‘Codestral’ by Mistral AI.

Want to know the best part? Creating our own solution also comes with many positive side effects, such as:

  • Easily swapping between LLMs (especially if using LangChain or LlamaIndex) without being restricted by the options given by the no-code platform.
  • Choosing what providers to use for cloud hosting, messaging, etc. You can even host the app yourself!
  • Full control over additional integrations (e.g., I developed a full admin dashboard integrated with WhatsApp to manage all users).
  • Learning more about building the solution itself.
  • And, of course, lower costs.

Without further ado, let’s get started! This is my first article ever, so I hope I’m not doing too badly :)

By the way, if you need help with this or other AI solutions and automations, or if you just want to chat, feel free to shoot me a DM.

First of all, you need to have a clear understanding of how the solution actually works. It can be divided into three parts:

  1. Input → Have a clear idea of what the inputs will look like (both in terms of content and format: text, audio, video, image).
  2. Output → Have a clear idea of what you want the output to look like, with some examples ready (these will come in handy later).
  3. Framework → The most crucial part, which includes the prompt and the data/knowledge base.

Now, in order to build it, we need to cover:

  • LLM
  • Knowledge base
  • Interface
  • Hosting
  • Database (optional)


LLM

Choosing the right large language model (LLM) depends on your specific needs. This choice is unique to the solution you’re planning to build, the expected inputs, and the desired outputs. To simplify the decision, here’s a matrix that may help you:

Can’t be bothered thinking about which LLM to use?

Use GPT-4o.

Given the costs associated with using WhatsApp as an interface, I’d choose GPT-4o for 99% of cases to ensure each message is worth it. If you are feeling a bit edgy today, you may even consider using the recently released Claude 3.5 Sonnet.

You can check out the different pricing options here:

You just need to create an account wherever you prefer and get your API key ready. If you don’t know how, check this out.


Knowledge Base (KB)

For our example, we will be using Qdrant. Their free tier is honestly quite good, but Pinecone is also a great option.

Getting the right data for the knowledge base is crucial. It must be high quality, and you should avoid repeating information. Here’s what you need to do:

  1. Create an account on Qdrant and get your API keys.
  2. Gather the data.
  3. Split the data into chunks with a set overlap.
  4. Upload it to the KB.
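
Before we jump into the full script: the “set overlap” in step 3 just means consecutive chunks share some text at their boundaries, so context isn’t cut mid-thought. Here’s a minimal, illustrative sketch of the idea (the real code below uses LangChain’s CharacterTextSplitter instead, and the sizes here are made up for readability):

```python
def chunk_with_overlap(text, chunk_size=10, overlap=3):
    """Split text into fixed-size chunks where each chunk repeats
    the last `overlap` characters of the previous chunk."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_with_overlap("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=3)
print(chunks)
# ['abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz']
```

Notice how each chunk starts with the last three characters of the previous one — that shared window is what keeps retrieval from losing the thread at chunk boundaries.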

Creating an effective Knowledge Base is a significant part of developing a good solution, and there are many nuances involved. I plan to write a full article solely about creating a good KB based on my experience.

Here’s the full code for uploading your data to your Qdrant vector store:

import os

import openai
import qdrant_client
from dotenv import load_dotenv
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Qdrant

# Get the absolute path to the directory where the current script resides
# and change the current working directory to it
script_directory = os.path.abspath(os.path.dirname(__file__))
os.chdir(script_directory)

# Load environment variables from .env file
load_dotenv()

# Set OpenAI API key
openai.api_key = os.getenv("OPENAI_API_KEY")
embeddings = OpenAIEmbeddings()

# Initialize Qdrant client
client = qdrant_client.QdrantClient(
    os.getenv("QDRANT_HOST"),
    api_key=os.getenv("QDRANT_API_KEY")
)

collection_name = os.getenv("QDRANT_COLLECTION_NAME")
print('Collection: ', collection_name)

# Prompt user to (re)create the collection
recreate = input("Do you want to create a new collection? (Y/N) ").lower()

if recreate == "y":
    print("Okay! Creating new collection...")
    client.recreate_collection(
        collection_name=collection_name,
        vectors_config=qdrant_client.http.models.VectorParams(
            size=1536,  # dimension of OpenAI's text embeddings
            distance=qdrant_client.http.models.Distance.COSINE
        )
    )
else:
    print("Alrightymighty, using the existing collection.")

vector_store = Qdrant(
    client=client,
    collection_name=collection_name,
    embeddings=embeddings
)

# Function to get chunks of text with a set overlap
def get_chunks(text, chunk_size=777, chunk_overlap=120):
    text_splitter = CharacterTextSplitter(
        separator="\n",
        chunk_size=chunk_size,
        chunk_overlap=chunk_overlap,
        length_function=len
    )
    return text_splitter.split_text(text)

# Read and chunk the text file
with open("transcriptions.txt", encoding="utf-8") as f:
    raw_text = f.read()

texts = get_chunks(raw_text)

# Upload chunks to the vector store
for text in texts:
    metadata = {"Topic 1": "XYZ"}
    vector_store.add_texts([text], metadatas=[metadata])

print("All transcripts have been uploaded to the vector store!")

You just need to:

  1. Name the data file transcriptions.txt and place it in the same directory as the .py file.
  2. Set up your environment variables with your OpenAI & Qdrant API keys, along with the collection name for the knowledge base. For this step, create a file in the same directory as your .py file. Name it .env (that’s the filename load_dotenv() looks for by default), and paste the following information into it:

QDRANT_HOST=enter_host_url
QDRANT_API_KEY=enter_qdrant_api
QDRANT_COLLECTION_NAME=Test

OPENAI_API_KEY=enter_openai_api

and that’s it!


Interface

Since we will be using WhatsApp, our best two options are:

  1. Using Twilio: Easier to set up, but charges a small fee per message.
  2. Using the official WhatsApp Business API: Requires getting your business verified.

Understanding Pricing

WhatsApp Business API charges based on the number of conversations. Each conversation is a 24-hour window during which both the user and business can exchange messages. Conversation prices vary by region and conversation type (e.g., utility, marketing, verification).

With both options you get 1000 free conversations per month.
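
To get a feel for what conversation-based pricing means for your budget, here’s a tiny cost estimator. The 3-cent rate is a made-up placeholder; plug in the actual rate for your region and conversation type:

```python
def monthly_cost_cents(conversations, rate_cents_per_conversation, free_tier=1000):
    """Rough WhatsApp cost estimate: the first `free_tier` conversations
    each month are free; the rest are billed per conversation."""
    billable = max(0, conversations - free_tier)
    return billable * rate_cents_per_conversation

# e.g. 5,000 conversations at a hypothetical 3 cents each:
cost = monthly_cost_cents(5000, 3)
print(f"${cost / 100:.2f}")  # $120.00
```

Working in cents keeps the arithmetic exact; divide by 100 only when displaying.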

For our example we’ll be using Twilio.

You can check out their pricing here:

And click here for an in-depth explanation of how the pricing model works.

As for the setup process itself, you’ll need to create your Twilio account and top up $20 in order to be able to use the APIs.

After doing so, you’ll need to get the phone number you want to use as a WhatsApp sender set up. You may choose to either purchase a number from Twilio or use one of your own phone numbers.

And here is yet another guide on how to do so (I thoroughly apologize for all the guide-dumping that I’m doing on here but, if I didn’t, this article would be waaay too long).


Hosting

This is simply where your app lives. For this example we’ll be using Heroku, but you can use others like AWS, Google Cloud, Azure, or Vercel.

To complete this step go to heroku.com → create an account.

Then add your payment information and create a new app.

And that’s it!

In case you want a guide that’s a bit more detailed, or you’d like to do it through the Heroku CLI, check out this article.


Database (optional)

This is only needed if you want to save message logs or add custom functionality that requires a DB, such as allowing only certain users to message the bot. Otherwise, you can skip it.

Again, it’s a matter of preference. In my personal case, I used the PostgreSQL add-on on Heroku, but you can use Airtable, MongoDB, or whichever other option you prefer.
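
If you do add a DB, the message-log table can be very simple. Here’s an illustrative schema using sqlite3 as a stand-in for Postgres (the column layout is my assumption, not something the app requires):

```python
import sqlite3

# In-memory sqlite3 DB as a stand-in for the Heroku Postgres add-on
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE message_log (
        id         INTEGER PRIMARY KEY,
        sender     TEXT NOT NULL,                         -- e.g. 'whatsapp:+1415...'
        body       TEXT NOT NULL,                         -- the message text
        direction  TEXT CHECK (direction IN ('in', 'out')),
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.execute(
    "INSERT INTO message_log (sender, body, direction) VALUES (?, ?, ?)",
    ("whatsapp:+14155551234", "Hi!", "in"),
)
count = conn.execute("SELECT COUNT(*) FROM message_log").fetchone()[0]
print(count)  # 1
```

The same CREATE TABLE translates almost verbatim to Postgres (swap INTEGER PRIMARY KEY for SERIAL).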


Now… Putting it all together

Update your .env file to include:

QDRANT_HOST=enter_host_url
QDRANT_API_KEY=enter_qdrant_api
QDRANT_COLLECTION_NAME=Test

OPENAI_API_KEY=enter_openai_api

T_SID=enter_twilio_SID
T_AUTH=enter_twilio_AUTH

Now we can use some sample code such as:

import os
import openai
from dotenv import load_dotenv
from flask import Flask, request, Response
from twilio.rest import Client
from langchain.vectorstores import Qdrant
from langchain.embeddings.openai import OpenAIEmbeddings
import qdrant_client

# Load environment variables
load_dotenv()

# Set OpenAI API key
openai.api_key = os.getenv("OPENAI_API_KEY")
embeddings = OpenAIEmbeddings(openai_api_key=openai.api_key)

# Initialize Qdrant client
client = qdrant_client.QdrantClient(
    url=os.getenv("QDRANT_HOST"),
    api_key=os.getenv("QDRANT_API_KEY")
)

vector_store = Qdrant(
    client=client,
    collection_name=os.getenv("QDRANT_COLLECTION_NAME"),
    embeddings=embeddings,
)

# Initialize Twilio client
twilio_client = Client(os.getenv("T_SID"), os.getenv("T_AUTH"))

app = Flask(__name__)

def chath(query):
    docs = vector_store.similarity_search_with_score(query=query, k=4)
    contexts = ['"{}"'.format(doc.page_content) for doc, _ in docs]
    context_str = ', '.join(contexts)

    full_response = openai.chat.completions.create(
        model='gpt-4o',
        temperature=0,
        messages=[
            {"role": "system", "content": "I'm an expert in XYZ with a knack for solving ABC problems."},
            {"role": "user", "content": f"Q: {query}"},
            {"role": "assistant", "content": "Here is some additional context that might help with my answer: " + context_str},
            {"role": "assistant", "content": "A: "}
        ]
    )

    response = full_response.choices[0].message.content
    return response

@app.route("/webhook", methods=['POST'])
def webhook():
    sender_number = request.form['From']
    message_body = request.form['Body']

    response_message = chath(message_body)

    send_twilio_response(sender_number, response_message)
    return Response(status=200)

def send_twilio_response(to_number, message):
    max_length = 1600
    message_chunks = [message[i:i + max_length] for i in range(0, len(message), max_length)]
    for chunk in message_chunks:
        twilio_client.messages.create(
            body=chunk,
            from_='whatsapp:+43609586712',  # replace with your own Twilio WhatsApp-enabled number
            to=to_number
        )

if __name__ == "__main__":
    app.run(debug=True)        

How the code works:

  1. We set up our environment, initialize our clients, set up our vector store & Flask app.
  2. When we receive a message, it’s passed on from Twilio to our Heroku endpoint (in this case called ‘/webhook’).
  3. We then process the message by:

  • Getting the 4 most relevant vectors to use as context.
  • Sending the user query along with the context to GPT-4o to craft our response.
  • Sending the response back to the Twilio API.
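
If you’re curious what Twilio actually sends to the ‘/webhook’ endpoint, it’s a form-encoded POST, which is why the Flask handler reads request.form['From'] and request.form['Body']. A quick sketch of decoding such a payload (the field values here are made up):

```python
from urllib.parse import parse_qs

# A simplified example of the form-encoded body Twilio POSTs to the webhook
# (values invented for illustration)
payload = "From=whatsapp%3A%2B14155551234&Body=What+are+your+opening+hours%3F"

fields = {key: values[0] for key, values in parse_qs(payload).items()}
print(fields["From"])  # whatsapp:+14155551234
print(fields["Body"])  # What are your opening hours?
```

Flask does this decoding for us behind the scenes and exposes the result as request.form.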

Full disclosure, this is just some simple functionality to get you started.

But now we are almost done!

After adding the previous code to a file you may call ‘main.py’, create a ‘requirements.txt’ in that same directory. In there, we need to include all the libraries that our app needs, something like:

async-timeout==4.0.3
openai==1.10.0
python-dotenv==1.0.0
qdrant-client==1.6.0
Flask==3.0.1
twilio==8.10.1
gunicorn
langchain==0.0.335
tiktoken==0.5.1
datetime        
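
One file the requirements alone don’t cover: Heroku needs a Procfile (no file extension, capital ‘P’) in the same directory to know how to start the app. Assuming your file is named main.py and you’re using gunicorn, it’s a single line:

```
web: gunicorn main:app
```

Here ‘main:app’ means “the app object inside main.py” — adjust it if you named your file or Flask instance differently.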

Now, inside your Heroku app, click on ‘Open app’ in the top right, and copy the URL. Then head over to Twilio, go to ‘WhatsApp senders’ or click here. Click on your WhatsApp number at the bottom, select ‘Use webhooks’, and inside the ‘Webhook URL for incoming messages’ paste your app’s URL + /webhook. It should look something like https://your-app-0do12y32qr17.herokuapp.com/webhook.

We’ve just connected Twilio with our app!

Now we just need to upload our app to Heroku. For this purpose, we need to have:

  • A GitHub account & Git installed locally.
  • Heroku CLI installed locally.

Then we can just open up our console and paste:

git add main.py requirements.txt
git commit -m "First upload"
git push origin master

# Link the Heroku app to this repo first (one-time setup), then deploy:
heroku git:remote -a your-app-name
git push heroku master        

And, if everything goes according to plan, your app should get uploaded to your GitHub account and then pushed to your Heroku account, so it now works 24/7.

After a bit of tinkering, you can get something like what was shown in the video example at the beginning of the article.


And that’s all!

I hope you learned something from this article and found the explanation (somewhat) clear and helpful.

To be fair, it’s quite a complex process, and summing it up is no easy task either.

But hey, if you enjoyed this article and want to discuss more about AI solutions and automation, or if you’ve got any questions…

Feel free to reach out on LinkedIn.

Hope you have an amazing rest of your day :)
