Exploring Timbot
If you're in Australia's media and marketing world, there's a good chance you're subscribed to TIM BURROWES ' Unmade newsletter. And if so, you may be aware of Timbot - the chatbot we built for Tim at Move 37.
Timbot is a web-based chat interface that leans on two primary sources: Tim's book Media Unmade, and The Unmade newsletter. The goal of Timbot is to utilise this wealth of information to provide users with detailed responses to their questions about Australian media and marketing.
Unlike many AI chatbots, Timbot solely relies on these two resources and does not draw from broader knowledge that's in the GPT training data. So while it gets things right a lot of the time, there's also things it just doesn't know. (For the nerds, we're using OpenAI's GPT-3.5 and GPT-4 models, along with Langchain to do the heavy lifting)
The Numbers
Tokenomics
We use GPT-3.5 for the initial question responses (including book and newsletter reference material) and reasoning, while GPT-4 delivers the final answer. To breakdown the token usage:
All that results in a cost of about $0.004 for GPT-3.5 and $0.03 for GPT-4 per question, so about 3.5c per question.
Like most chatbots, Timbot holds on to message history in a session, allowing for context in follow-on questions and enabling a seamless conversation flow. However, to avoid reaching token limits for context windows we only keep the visible chat interface content and discard the background work leading to the final answer.
The Challenges
Large Language Modelization
One of the main challenges we faced was producing output that reflected an Australian context. Given that the bulk of the training data for LLMs like GPT comprises of U.S English, the spelling, grammar, and overall style tend to tilt in that direction. While Australian (and British) English does exist in the training set, it doesn't explicitly stand out amidst the overwhelming amount of U.S English. Of course when "Australian English" is called out online, it's often as a US-centric caricature of how we speak (or write).
This became evident when our early prompt sets referred to Tim Burrowes as an "Australian media and marketing journalist." This inadvertently triggered the LLM to initiate every response with "G'day" and pepper the conversation with "mates". We managed to correct this by tweaking the prompts, but we still had to include "Australian media and marketing" in prompts in order to make the reasoning step work. In the end there was one final stray "Australian" in the prompts which, when removed, has made Tim almost never say "G'day".
Maintaining Tim's Writing Style
To successfully impersonate Tim's writing style, we had to stop Timbot from resorting to excessive use of exclamation marks, verbosity, or deviating from the first person. These nuances can be hard to overcome, but they're critical when the bot has to recreate a well-known writing style. People always love to "hack" chatbots too, so while it's not instructed to avoid talking like a caveman or singing like a pirate, it has been known to go off track.
Identity Crisis
Interestingly, there were moments when Timbot would misidentify itself as Nigel Marsh instead of Tim Burrowes, despite repeated prompts highlighting that Timbot is designed to mimic Tim Burrowes. This unusual quirk has led us to include one of the most specific and unique prompts we've had yet: "You are not Nigel Marsh".
No Steak Knives
The task of promoting the source materials - the book and the newsletter - required careful balancing. Early on, when prompted to mention the book or the newsletter, Timbot would go into full Glengarry Glen Ross mode. Again it was careful prompting that hit a balance where the bot mentions the sources without over-promoting them.
New models, new Tim
When OpenAI released an updated model of GPT-4, it significantly altered the style of Timbot's output. We had spent a lot of time pushing GPT-4 to write in Tim's style, but suddenly the new model shifted from struggling to capture Tim's tone to exaggerating it. This suggests that the latest training data possibly contained substantial content from Tim (potentially in the form of podcast transcriptions). Again with a bit of prompt tweaking we got Timbot back in line, and it now drops a "toodle pip" at a more reasonable frequency.
Managing Response Times
One technical aspect that presented a challenge was balancing the speed of responses. Users don't like waiting too long for a response, so the reasoning step, which tries to obtain a coherent answer from both the book and the newsletter, was simplified and optimised to use a single API call. This does sometimes result in confusion (particularly when book and newsletter content varies), but cut down response time by half.
For slightly more nerdery, we use JSON data structures for all responses. Despite GPT's impressive abilities, it doesn't always generate well-formatted JSON. When this happens, we have a system in place to check JSON and, if it fails, we actually use GPT-3 to correct the JSON.
Time Travelling
Another challenge was dealing with the concept of time. LLM models generally struggle with understanding the progression of time, and it becomes particularly challenging when providing contextual information from a book and newsletter which doesn't come with timestamps. Without building complex knowledge graphs, it's almost impossible to provide this chronological information from books. As a result, while the real Tim can give you a list of News Corp CEO's over the last two decades without skipping a beat, it's something Timbot struggles with.
The Uncanny Voice Valley
Finally, we decided to experiment with a voice version of Timbot, which has now launched. Voice was always going to be a challenge because latency makes it feel like the world's worst Zoom call. But we got it working fairly well, and hearing Tim's voice instead of reading the text responses is quite a different experience - it suddenly flicks a switch in your head and the authority of what you're hearing suddenly takes on a different gravity. With voice becoming a core feature of the ChatGPT app recently, it
Strategic Digital Transformation Consultant & Program Manager @ Coles Group
1 年Interesting and educational read - thanks Nic!
We're Hiring! - Director at BizWisdom Agency. Co-host of the Brandwidth? podcast.
1 年Awesome article Nic.
Tailored Marketing Strategies, Transparent Results
1 年Excellent work Nic! Thanks for sharing. It is indeed fun! I've been building Jarboot for personal use. I've built in a leadership and motivational coach and I dictate my day. It will create emails, social post if requested etc I'm using JSON files to define me and writing styles, favourite authors styles, philosophies, quotes etc I'm using Whisper so I can just hit a record button and start talking. Using ffmpeg to keep the file size low. I use it everyday.
Host of The Five Of My Life
1 年“You are not Nigel Marsh” ????
Partner at Strat
1 年Thank you Nic Hodges & Tim Burrowes. I’ve enjoyed coming along for this ride