Bridging the Gap: Introducing ZapGPT
ZapGPT is our solution for bridging the gap between the familiar world of WhatsApp and the capabilities of ChatGPT. But how did the idea come about? Let's look at the problem that started our journey.
The Problem
Brazilians have a sincere passion for smartphones. We literally have more smartphones than people in Brazil. We use smartphones for a lot of things (including calling). And among the thousands of mobile applications available, one has a special place in the hearts of Brazilians: WhatsApp.
When we talk about WhatsApp we immediately remember messages and groups, and when we remember groups a figure that is already part of Brazilian pop culture comes to mind: the 'Tia do Zap' (Zap's Aunt)! Yes, she (or he) herself! Our dear relative who is always ready to send a good morning message, a prayer, a chain letter, a scam alert or news of dubious origin in the family group.
Although Zap's Aunt is well versed in the features of WhatsApp, she (or he) often finds it difficult to use other applications or services available on the web.
That is where we started asking ourselves: is Zap's Aunt using the new generative models and ChatGPT?
ChatGPT onboarding, for example, is not the easiest. There is some friction, and even understanding how to use the chat can be a barrier for some people. Our hypothesis is that this makes these technologies hard to use for people who are less comfortable in the digital world, and so keeps Zap's Aunt away from generative models and ChatGPT.
Well, there we have our problem: how do we connect Zap's Aunt to LLMs?
Our Solution
ZapGPT was one of the projects developed for the Novatics Hackathon. It aims to solve the problem described above. The idea is to create an application that allows the user to communicate with generative models using WhatsApp.
The objective is to have a very simple and straightforward user flow, where the user only needs to add the ZapGPT number on WhatsApp to start using the possibilities offered by generative models such as ChatGPT.
To do this, we built an application that integrates Twilio with the OpenAI API so that users can interact with generative models through WhatsApp, in this case ChatGPT and DALL-E.
The project was part of a hackathon, so our solution had to be lean. We decided to focus on two features: image generation and question answering (the standard ChatGPT behavior).
How It Works
To implement the two prioritized features, we decided to create a structure of extensions that adds functionality to the generative models and allows the user to obtain the expected results from simple text messages, like the ones we normally send on WhatsApp.
Our goal is to avoid as much as possible the need for structured messages that require the user to see a set of options and respond with the option they want to execute, for example, "Reply with the desired option number: (1) generate image, (2) obtain information, etc…". We think an experience like this is close to a standard web interaction (a dropdown, for example) and our goal was to test how to make ZapGPT work using a completely free text-based interaction.
We defined four extensions: (1) Image generator, (2) GPT default, (3) Fake news check, and (4) Travel itinerary. For the first MVP we implemented the first two, which cover the prioritized features.
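The extension structure could be modelled roughly as follows. This is a minimal sketch, not ZapGPT's actual code: the `Extension` interface, the `ExtensionId` values, and the `register` and `resolve` helpers are hypothetical names, and the handlers return placeholder replies instead of calling any model.

```typescript
// Hypothetical ids for the four extensions named in the article.
type ExtensionId =
  | "image_generator"
  | "gpt_default"
  | "fake_news_check"
  | "travel_itinerary";

// A reply is either plain text or a link to a generated image.
type Reply = { kind: "text"; text: string } | { kind: "image"; url: string };

// Assumed shape of an extension: it receives the free-text message
// and produces the reply that goes back to the user.
interface Extension {
  id: ExtensionId;
  description: string;
  handle(message: string): Promise<Reply>;
}

// Registry of implemented extensions, keyed by id.
const registry: Partial<Record<ExtensionId, Extension>> = {};

function register(extension: Extension): void {
  registry[extension.id] = extension;
}

// Only the two MVP extensions are registered; both handlers are placeholders.
register({
  id: "gpt_default",
  description: "Answer any question as a general assistant",
  handle: async (message) => ({ kind: "text", text: `(answer to: ${message})` }),
});

register({
  id: "image_generator",
  description: "Create an image from a textual description",
  handle: async (message) => ({
    kind: "image",
    url: `https://example.invalid/image?prompt=${encodeURIComponent(message)}`,
  }),
});

// Resolve an extension, falling back to the default one when the
// requested extension has not been implemented yet.
function resolve(id: ExtensionId): Extension {
  const extension = registry[id] ?? registry["gpt_default"];
  if (!extension) {
    throw new Error("gpt_default extension is not registered");
  }
  return extension;
}
```

With this shape, adding the fake news check or the travel itinerary later would only require one more `register` call; until then, `resolve("fake_news_check")` falls back to the default extension.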
Basically ZapGPT works like this:
In the next section we explain in more detail how we developed and deployed this solution.
How We Developed It
To develop the solution, three pieces had to be built: (1) an integration with WhatsApp, (2) a platform to run our API, and (3) an integration with the OpenAI API.
Communication with WhatsApp can be done directly through WhatsApp's own API. However, that requires setting up a business account, which takes some time. To reduce this development friction, we chose Twilio, which already supports integration with WhatsApp and provides a free trial account.
To develop our API we used a simple stack with Node.js, Express and TypeScript. As we had to manipulate images, we also used the Jimp library.
We ended up with the following architecture:
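As a rough illustration of the Twilio side, the sketch below reads the fields Twilio posts to a webhook for an incoming WhatsApp message and builds a TwiML reply. `From` and `Body` are standard Twilio webhook parameters and `<Response>`, `<Message>`, `<Body>` and `<Media>` are standard TwiML elements; the function names and the overall structure are assumptions made for this example, not ZapGPT's code.

```typescript
// Subset of the form fields Twilio posts for an incoming WhatsApp message.
interface TwilioWebhookFields {
  From?: string; // e.g. "whatsapp:+5511999999999"
  Body?: string; // the text the user typed
}

interface IncomingMessage {
  sender: string;
  text: string;
}

// Normalize the webhook fields into the sender's number and the message text.
function parseIncoming(fields: TwilioWebhookFields): IncomingMessage {
  return {
    sender: (fields.From ?? "").replace(/^whatsapp:/, ""),
    text: (fields.Body ?? "").trim(),
  };
}

// Escape characters that are not allowed in XML text content.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build the TwiML document returned to Twilio: a text reply,
// optionally with an image attached through <Media>.
function buildTwiml(text: string, mediaUrl?: string): string {
  const media = mediaUrl ? `<Media>${escapeXml(mediaUrl)}</Media>` : "";
  return (
    `<?xml version="1.0" encoding="UTF-8"?>` +
    `<Response><Message><Body>${escapeXml(text)}</Body>${media}</Message></Response>`
  );
}
```

In an Express route that uses the `express.urlencoded()` middleware, `req.body` could be passed to `parseIncoming`, and the string from `buildTwiml` returned with the `text/xml` content type. Twilio's Node helper library also ships a `MessagingResponse` class that generates the same XML.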
It works as follows:
We deployed the ZapGPT API to an AWS Lambda function. For this we used the serverless-http library, which adapts an Express server to Lambda's serverless format. For a production application with a high volume of requests this may not be the most recommended setup, but for the hackathon it was good enough.
The gpt-3.5-turbo-1106 model is used to identify which extension will be activated to generate the response for the user.
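One way this routing step could look is sketched below. The model name comes from the article and the request shape follows OpenAI's Chat Completions endpoint, but the prompt wording, the extension ids, and both helper functions are assumptions made for illustration; the actual ZapGPT prompt is not published here.

```typescript
// Hypothetical ids for the two extensions implemented in the MVP.
type ExtensionId = "image_generator" | "gpt_default";

const EXTENSION_IDS: ExtensionId[] = ["image_generator", "gpt_default"];

// Assumed one-line descriptions shown to the model.
const DESCRIPTIONS: Record<ExtensionId, string> = {
  image_generator: "the user asks for a picture, drawing or image to be created",
  gpt_default: "any other question or request",
};

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

interface ChatCompletionRequest {
  model: string;
  temperature: number;
  messages: ChatMessage[];
}

// Build the JSON body for POST https://api.openai.com/v1/chat/completions.
function buildRoutingRequest(userMessage: string): ChatCompletionRequest {
  const options = EXTENSION_IDS.map((id) => `- ${id}: ${DESCRIPTIONS[id]}`).join("\n");
  return {
    model: "gpt-3.5-turbo-1106",
    temperature: 0, // keep the classification as stable as possible
    messages: [
      {
        role: "system",
        content:
          "Classify the user's message. Reply with exactly one id " +
          `from this list and nothing else:\n${options}`,
      },
      { role: "user", content: userMessage },
    ],
  };
}

// Map the model's reply (choices[0].message.content in the response)
// to a known id, falling back to gpt_default for anything unexpected.
function parseRoutingReply(content: string | null | undefined): ExtensionId {
  const normalized = (content ?? "").toLowerCase().replace(/[^a-z_]/g, "");
  return EXTENSION_IDS.indexOf(normalized as ExtensionId) >= 0
    ? (normalized as ExtensionId)
    : "gpt_default";
}
```

Falling back to `gpt_default` means that an unexpected or verbose answer from the model still produces a normal chat reply instead of an error, which suits a free-text interface.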
To respond to messages from users who request image generation, we also used the OpenAI API, more specifically the dall-e-3 model. This choice also follows the MVP approach: we had little time to achieve a result, so we chose a quick solution with a good cost-benefit ratio.
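The image request could be sketched as follows. The endpoint, the `dall-e-3` model name, and the `data[0].url` response field follow OpenAI's image generation API; the helper names, the `PostJson` transport type, and the chosen image size are assumptions for this example rather than details taken from the project.

```typescript
interface HttpRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Subset of the response returned by the image generation endpoint.
interface ImageResponse {
  data?: Array<{ url?: string; revised_prompt?: string }>;
}

// Build the HTTP request for a single DALL-E 3 image.
function buildImageRequest(prompt: string, apiKey: string): HttpRequest {
  return {
    url: "https://api.openai.com/v1/images/generations",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    // dall-e-3 generates one image per request; the size is an assumed default.
    body: JSON.stringify({ model: "dall-e-3", prompt, n: 1, size: "1024x1024" }),
  };
}

// Pull the URL of the generated image out of the response.
function extractImageUrl(response: ImageResponse): string {
  const url = response.data?.[0]?.url;
  if (!url) {
    throw new Error("image response did not contain a URL");
  }
  return url;
}

// Hypothetical transport: any function that POSTs the request
// and resolves with the parsed JSON response.
type PostJson = (request: HttpRequest) => Promise<unknown>;

async function generateImage(
  prompt: string,
  apiKey: string,
  post: PostJson
): Promise<string> {
  const response = (await post(buildImageRequest(prompt, apiKey))) as ImageResponse;
  return extractImageUrl(response);
}
```

Passing the transport in as a parameter keeps the sketch independent of any particular HTTP client. The returned URL could then be sent back to the user as a WhatsApp media message through Twilio.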
MVP Results
We tested the two features that were prioritized for the MVP: image generation and question answering (the standard ChatGPT behavior).
In general, the results were good, but some difficulties were encountered when generating images, mainly:
Next Steps
The next step is to test the ZapGPT MVP with a broader group of users to get feedback and define a roadmap of features to be included.
Some other potential technical improvements we can make are: