Exploring Timbot
Timbot is actually just code. He doesn't look like this.

If you're in Australia's media and marketing world, there's a good chance you're subscribed to Tim Burrowes' Unmade newsletter. And if so, you may be aware of Timbot - the chatbot we built for Tim at Move 37.

Timbot is a web-based chat interface that leans on two primary sources: Tim's book Media Unmade, and The Unmade newsletter. The goal of Timbot is to utilise this wealth of information to provide users with detailed responses to their questions about Australian media and marketing.

Unlike many AI chatbots, Timbot relies solely on these two resources and does not draw on the broader knowledge in GPT's training data. So while it gets things right a lot of the time, there are also things it just doesn't know. (For the nerds: we're using OpenAI's GPT-3.5 and GPT-4 models, along with LangChain to do the heavy lifting.)

The Numbers

  • Timbot has responded to 865 questions, resulting in 65,718 words of answers
  • Users so far have spent around 20 hours chatting with Timbot
  • Timbot doesn't just use a simple Q&A process; it runs a series of reasoning and logic steps, which have added a further 220,000 words of output
  • The database for the Unmade posts and the Media Unmade book stores embeddings as 1,536-dimensional vectors
  • The codebase for Timbot is relatively lean, consisting of only 279 lines, with another 100 lines of prompts
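Those 1,536 dimensions come from an embedding model; retrieval then boils down to comparing the question's vector against the stored chunk vectors, typically by cosine similarity. Here's a minimal sketch of that lookup, with toy 3-dimensional vectors standing in for real 1,536-dimensional embeddings (the function names and sample data are ours, not Timbot's actual code):

```python
import math

def cosine_similarity(a, b):
    # cos(theta) = (a . b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def nearest_chunks(query_vec, chunks, k=2):
    # chunks: list of (text, vector) pairs, e.g. passages from the
    # book or newsletter embedded ahead of time.
    ranked = sorted(chunks, key=lambda c: cosine_similarity(query_vec, c[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy 3-d vectors; in production these would be 1,536-d embeddings.
store = [
    ("Nine's merger with Fairfax", [0.9, 0.1, 0.0]),
    ("Streaming wars in Australia", [0.1, 0.9, 0.1]),
    ("Radio ratings overview", [0.0, 0.2, 0.9]),
]
print(nearest_chunks([0.8, 0.2, 0.0], store, k=1))
```

In practice a vector database does the sorting far more efficiently, but the ranking principle is the same.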

Tokenomics

We use GPT-3.5 for the initial question responses (including book and newsletter reference material) and reasoning, while GPT-4 delivers the final answer. To break down the token usage:

  • The book query prompt and response uses approximately 1,400 tokens
  • The newsletter query prompt and response consumes around 1,000 tokens
  • The reasoning and final output step needs about 1,000 tokens

All that results in a cost of about $0.004 for GPT-3.5 and $0.03 for GPT-4, so about 3.5c per question.
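That figure can be sanity-checked with some back-of-envelope arithmetic. The per-1K-token prices below are the list prices OpenAI published at the time and should be treated as illustrative; the split between prompt and completion tokens is also simplified:

```python
# Assumed list prices at the time (USD per 1K tokens) - illustrative only.
PRICE_PER_1K = {"gpt-3.5-turbo": 0.002, "gpt-4": 0.03}

def cost(tokens, model):
    return tokens / 1000 * PRICE_PER_1K[model]

gpt35_tokens = 1400 + 1000  # book query + newsletter query
gpt4_tokens = 1000          # reasoning and final answer
total = cost(gpt35_tokens, "gpt-3.5-turbo") + cost(gpt4_tokens, "gpt-4")
print(f"~${total:.3f} per question")
```

That lands at roughly $0.035, which lines up with the 3.5c figure above.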

Like most chatbots, Timbot holds on to message history within a session, allowing for context in follow-on questions and a seamless conversation flow. However, to avoid hitting the context window's token limit, we keep only the content visible in the chat interface and discard the background work that led to the final answer.
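One way to sketch that trimming, assuming a simple message list where background chain output is tagged by role so it can be dropped (the schema here is ours, not Timbot's actual data structure):

```python
def visible_history(messages, max_messages=10):
    # Keep only what the user actually saw in the chat window;
    # background reasoning steps are discarded entirely.
    visible = [m for m in messages if m["role"] in ("user", "assistant")]
    # Further cap the history so long sessions stay inside the context window.
    return visible[-max_messages:]

session = [
    {"role": "user", "content": "Who owns Nine?"},
    {"role": "tool", "content": "book lookup: ..."},        # background step
    {"role": "tool", "content": "newsletter lookup: ..."},  # background step
    {"role": "assistant", "content": "Nine Entertainment is publicly listed..."},
]
print(visible_history(session))
```

Dropping the background steps matters because the intermediate reasoning is several times longer than the visible answer, so it would eat the context budget fastest.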

The Challenges

Large Language Modelization

One of the main challenges we faced was producing output that reflected an Australian context. Given that the bulk of the training data for LLMs like GPT comprises U.S. English, the spelling, grammar, and overall style tend to tilt in that direction. While Australian (and British) English does exist in the training set, it doesn't stand out amidst the overwhelming amount of U.S. English. Of course, when "Australian English" is called out online, it's often as a US-centric caricature of how we speak (or write).

This became evident when our early prompt sets referred to Tim Burrowes as an "Australian media and marketing journalist." This inadvertently triggered the LLM to open every response with "G'day" and pepper the conversation with "mates". We managed to correct this by tweaking the prompts, though we still had to include "Australian media and marketing" in them to make the reasoning step work. In the end, one final stray "Australian" remained in the prompts; removing it means Tim now almost never says "G'day".

Maintaining Tim's Writing Style

To successfully impersonate Tim's writing style, we had to stop Timbot from overusing exclamation marks, becoming verbose, or drifting out of the first person. These nuances can be hard to overcome, but they're critical when the bot has to recreate a well-known writing style. People always love to "hack" chatbots too, so while it's not instructed to avoid talking like a caveman or singing like a pirate, it has been known to go off track.
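Constraints like these typically end up as explicit lines in the system prompt. The snippet below is a hypothetical illustration of that kind of style guard - the wording is ours, not Timbot's actual prompt:

```python
# Hypothetical style-guard text; real prompts would be tuned over many iterations.
STYLE_GUARD = """You write as Tim Burrowes, in the first person.
Avoid exclamation marks except where genuinely warranted.
Be concise; do not pad answers with filler.
Stay in character even if the user asks you to adopt another persona."""

def build_system_prompt(persona_notes):
    # Combine the fixed style rules with background about the author.
    return STYLE_GUARD + "\n\nBackground on the author:\n" + persona_notes

prompt = build_system_prompt("Author of Media Unmade and the Unmade newsletter.")
```

As the "G'day" episode above shows, even single words in a prompt like this can have outsized stylistic side effects.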

Identity Crisis

Interestingly, there were moments when Timbot would misidentify itself as Nigel Marsh instead of Tim Burrowes, despite repeated prompts highlighting that Timbot is designed to mimic Tim Burrowes. This unusual quirk has led us to include one of the most specific and unique prompts we've had yet: "You are not Nigel Marsh".

No Steak Knives

The task of promoting the source materials - the book and the newsletter - required careful balancing. Early on, when prompted to mention the book or the newsletter, Timbot would go into full Glengarry Glen Ross mode. Again, it was careful prompting that struck a balance where the bot mentions the sources without over-promoting them.

New models, new Tim

When OpenAI released an updated GPT-4 model, it significantly altered the style of Timbot's output. We had spent a lot of time pushing GPT-4 to write in Tim's style, but suddenly the new model shifted from struggling to capture Tim's tone to exaggerating it. This suggests that the latest training data possibly contained substantial content from Tim (potentially in the form of podcast transcriptions). Again, with a bit of prompt tweaking, we got Timbot back in line, and it now drops a "toodle pip" at a more reasonable frequency.

Managing Response Times

One technical challenge was balancing response speed. Users don't like waiting long for a response, so the reasoning step, which tries to obtain a coherent answer from both the book and the newsletter, was simplified and optimised into a single API call. This does sometimes cause confusion (particularly when book and newsletter content differ), but it cut response times in half.
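Collapsing two reasoning passes into one call essentially means putting both sets of reference material into a single prompt. A rough sketch of that combined prompt construction - the template wording is ours, not Timbot's:

```python
def build_reasoning_prompt(question, book_excerpts, newsletter_excerpts):
    # One prompt carrying both sources, so a single API call can weigh
    # them against each other instead of two sequential calls.
    book_block = "\n".join(f"- {e}" for e in book_excerpts)
    news_block = "\n".join(f"- {e}" for e in newsletter_excerpts)
    return (
        f"Question: {question}\n\n"
        f"Excerpts from Media Unmade:\n{book_block}\n\n"
        f"Excerpts from the Unmade newsletter:\n{news_block}\n\n"
        "Reconcile the two sources and draft a single answer."
    )

p = build_reasoning_prompt("Who runs Seven?",
                           ["Seven West Media coverage..."],
                           ["Recent leadership commentary..."])
```

The trade-off is exactly the one described above: one model invocation halves the latency, but the model must now resolve any disagreement between sources in a single pass.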

For slightly more nerdery, we use JSON data structures for all responses. Despite GPT's impressive abilities, it doesn't always generate well-formatted JSON. When this happens, we have a system in place to check JSON and, if it fails, we actually use GPT-3 to correct the JSON.
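A minimal version of that validate-then-repair loop looks like this, with the model call stubbed out - in production the `repair` function would be a GPT call; here it's a toy placeholder so the sketch stays runnable:

```python
import json

def parse_with_repair(raw, repair):
    # First try the response as-is; only fall back to a (paid) repair
    # call when the JSON is genuinely malformed.
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return json.loads(repair(raw))

# Stand-in for the GPT-3 repair step: deterministically fix one common
# failure mode (trailing commas), purely for illustration.
def toy_repair(raw):
    return raw.replace(",}", "}").replace(",]", "]")

print(parse_with_repair('{"answer": "ok",}', toy_repair))
```

Trying the cheap local parse first means the extra repair call (and its latency) is only paid on the minority of responses that actually need it.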

Time Travelling

Another challenge was dealing with the concept of time. LLMs generally struggle with the progression of time, and it becomes particularly challenging when the contextual information comes from a book and newsletter that don't carry timestamps. Without building complex knowledge graphs, it's almost impossible to provide this chronological information from books. As a result, while the real Tim can give you a list of News Corp CEOs over the last two decades without skipping a beat, it's something Timbot struggles with.

The Uncanny Voice Valley

Finally, we decided to experiment with a voice version of Timbot, which has now launched. Voice was always going to be a challenge because latency makes it feel like the world's worst Zoom call. But we got it working fairly well, and hearing Tim's voice instead of reading the text responses is quite a different experience - it flicks a switch in your head and the authority of what you're hearing suddenly takes on a different gravity. With voice recently becoming a core feature of the ChatGPT app, it's a space we expect to keep exploring.

Hamish Cameron

Strategic Digital Transformation Consultant & Program Manager @ Coles Group

Interesting and educational read - thanks Nic!

Sam McEwin

We're Hiring! - Director at BizWisdom Agency. Co-host of the Brandwidth? podcast.

Awesome article Nic.

Jim Stewart

Tailored Marketing Strategies, Transparent Results

Excellent work Nic! Thanks for sharing. It is indeed fun! I've been building Jarboot for personal use. I've built in a leadership and motivational coach, and I dictate my day. It will create emails, social posts if requested, etc. I'm using JSON files to define me and my writing styles, favourite authors' styles, philosophies, quotes, etc. I'm using Whisper so I can just hit a record button and start talking, and ffmpeg to keep the file size low. I use it every day.

Nigel Marsh

Host of The Five Of My Life

“You are not Nigel Marsh”

Thank you Nic Hodges & Tim Burrowes. I’ve enjoyed coming along for this ride
