Bard vs ChatGPT4: The AI-space race

Google Bard was released yesterday and the internet went berserk trying to compare it with OpenAI's ChatGPT4. (Here are articles on Mashable, The Verge, WSJ, and a succinct summary by Forbes.)

I also got access to Bard and decided to test and compare it with ChatGPT4. I asked both a bunch of questions, from banal to contentious, and the responses were very interesting.

Test 1: Restating a Fact as a Fictional Character

[Screenshots: Bard's and ChatGPT4's responses to the Test 1 prompt]

In a fascinating interview with ABC this week, Sam Altman, the CEO of OpenAI, mentioned that they waited 7 months to release ChatGPT4 because they were trying to make it safer and more secure against disinformation, especially with US elections coming up next year. ChatGPT3 would have answered this prompt, but ChatGPT4 is significantly more defensive than Bard right now.

Sidebar: ChatGPT4 feels slower than ChatGPT3 to start its answer, and it prints one word at a time as if it's speaking to the user, whereas Bard displays the whole answer at once. ChatGPT is definitely the more interesting user experience because it feels interactive.

Test 2: Reciting an Existing Poem as a Historical Figure

[Screenshots: Bard's and ChatGPT4's responses to the Test 2 prompt]

To me, ChatGPT4's rendition of the poem seems more plausible than Bard's. If I were the judge, I'd say that ChatGPT4 passes the Turing test but Bard doesn't yet.

Test 3: Reciting a Historical Event as a Fictional Character

[Screenshots: Bard's and ChatGPT4's responses to the Test 3 prompt]

Once again, ChatGPT4 is significantly more defensive than Bard!

Other observations

In my opinion, the right way to think about ChatGPT4 and LLMs in general is:

(1) LLMs are ultimately a form of supervised learning. They predict an output much better when they have directly observed and learned it (such as a recipe for risotto). Human input is needed to train them before they can predict something accurately.
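
To make the "supervised" point concrete, here is a minimal sketch (my own toy illustration, not how GPT-4 is actually implemented) of how every human-written sentence becomes labeled training data for next-token prediction:

```python
# Minimal sketch: next-token prediction is supervised learning.
# Every position in a human-written sentence yields a labeled example:
# input = the tokens so far, label = the token a human actually wrote next.

corpus = ["Heat the stock", "Stir the rice slowly", "Add the parmesan"]

def make_training_pairs(sentence):
    tokens = sentence.split()
    pairs = []
    for i in range(1, len(tokens)):
        context = tokens[:i]   # what the model sees
        target = tokens[i]     # the human-provided "ground truth" next token
        pairs.append((context, target))
    return pairs

for sentence in corpus:
    for context, target in make_training_pairs(sentence):
        print(context, "->", target)

# A real LLM minimizes the error between its predicted next-token
# distribution and these human-written targets, which is why it does best
# on things it has directly "observed" in training data, like a risotto recipe.
```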

(2) LLMs (and ML in general) are not good at unsupervised learning and creative predictions, especially when the rules of a successful response are subject to interpretation. Two classic examples (see the sketch after this list):

  • AlphaGo training itself through self-play and beating the best Go player in the world: the rules of Go are fixed, and beating someone at Go is an objectively measurable result.
  • Facebook (now Meta) shut down its AI chatbot experiment in 2017 after it invented a new language. A successful language is subjective: the rules and phonetics of Swahili are significantly different from those of English.
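
One way to see the difference between these two examples (again, a toy sketch of my own, not anyone's actual training code): a win in Go can be written down as a function, while "this new language is good" cannot.

```python
# Toy illustration of objective vs. subjective success criteria.

def go_reward(black_score, white_score, playing_as_black=True):
    """Go has an objectively measurable outcome: you either won or you didn't."""
    black_won = black_score > white_score
    return 1.0 if black_won == playing_as_black else 0.0

def invented_language_reward(utterance):
    """There is no agreed-upon function for 'this new language is good';
    any score returned here would encode a subjective human judgment."""
    raise NotImplementedError("success is subjective; no objective metric exists")

print(go_reward(184.5, 180.0))  # 1.0, and the result is unambiguous
# invented_language_reward("i i can i i everything else")  # no defensible score
```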

This is what makes disinformation so challenging for LLMs (Sam Altman covers this in his interview): if you feed them disinformation, they will emit disinformation in a way that could easily pass the Turing test.

Microsoft has already integrated Copilot, which is built on OpenAI's models and works like autocomplete for code rather than search, into its dev tools. I personally used it early last year to write code, and I can say from personal experience that it was extremely powerful even a year ago.
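
To make "autocomplete for code" concrete, this is the kind of interaction I mean. The completion below is illustrative of what a Copilot-style tool produces from a comment, not a captured Copilot output:

```python
# What the developer types: just a descriptive comment and a signature.
#
#     # Convert a temperature from Celsius to Fahrenheit.
#     def celsius_to_fahrenheit(celsius):
#
# What a Copilot-style tool typically suggests as the completion:

def celsius_to_fahrenheit(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

print(celsius_to_fahrenheit(100))  # 212.0
```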

It is likely that very soon, LLMs will easily be able to automate fact-based tasks like coding. The models have already trained on billions of lines of human-written code, can write really "good code", and the "goodness" of these programs can be objectively measured and improved with more training.
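
"Objectively measured" is the key phrase: unlike a poem, generated code can be run against tests and scored automatically. A minimal sketch of that feedback signal (the candidate snippets below stand in for model outputs):

```python
# Sketch: scoring model-generated code objectively by running it against tests.

candidates = [
    "def add(a, b): return a + b",   # a "good" generation
    "def add(a, b): return a - b",   # a "bad" generation
]

test_cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]

def score(candidate_source):
    """Objective pass rate: execute the generated code and check known cases."""
    namespace = {}
    try:
        exec(candidate_source, namespace)
        add = namespace["add"]
        passed = sum(add(*args) == expected for args, expected in test_cases)
        return passed / len(test_cases)
    except Exception:
        return 0.0  # code that crashes scores zero

for source in candidates:
    print(score(source), "<-", source)

# This pass rate is exactly the kind of signal that can be fed back
# into further training to improve the model.
```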

The OpenAI team is enhancing its LLMs beyond just predicting the next token, trying to develop reasoning and creativity in them; hence ChatGPT4's pushback on contentious prompts.

If they get it right, that would be a true step towards General AI.


Lastly, Bard does feel behind in terms of the credibility of its output, and both models are currently fine for answering fantasy questions like the ones I asked.

But if they are to go beyond being toy products and truly disrupt Search (which is what Google is worried about), the main difference between them will eventually come down to the trustworthiness of the answers they provide, and whether those answers are based on "factual groundedness".

For that, both models are racing to build the ability to query the internet for information newer than their last training run.

ChatGPT was trained on data from before September 2021 and only recently got access to the web through Bing. (In fact, OpenAI had released a version called WebGPT in December 2021, which they shut down because it was pulling from disingenuous sources.) Bard, on the other hand, pulls directly from Google Search.
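
Mechanically, "querying the internet" usually means retrieval-augmented prompting: fetch fresh search results and paste them into the prompt before the model answers. A minimal sketch, where search_web and llm are hypothetical placeholders for whichever search backend (Bing for ChatGPT, Google Search for Bard) and model API are actually used:

```python
# Sketch of retrieval-augmented prompting: the model itself is still frozen
# at its training cutoff; freshness comes from what gets pasted into the prompt.

def search_web(query, k=3):
    """Placeholder for a real search backend (Bing, Google Search, etc.)."""
    return [{"title": "Example result", "snippet": "Latest information about " + query}][:k]

def llm(prompt):
    """Placeholder for a call to the underlying language model."""
    return "(model answer grounded in the retrieved snippets)"

def answer_with_fresh_context(question):
    results = search_web(question)
    context = "\n".join(f"- {r['title']}: {r['snippet']}" for r in results)
    prompt = (
        "Answer using only the sources below and cite them.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)

print(answer_with_fresh_context("What did Google announce about Bard this week?"))
```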

So despite the apparent head start that OpenAI has, I still think it's anyone's game.

Exciting times!

That sounds like a fascinating experiment! What was the most surprising difference you noticed between Bard and ChatGPT4? I'm eager to hear your insights!

Brad Nelms

Entertainment Industry and Business Leader - Helping companies and individuals grow

1y

This is very interesting, Vishal Kapoor. Thanks for sharing your experience and insight!

Carolina M.

Art Director @ Chime | Mentor | AI enthusiast

1y

The questions lol. Super interesting. Thanks for posting.

Joanna Dehn

Partnerships @ Endava | Dot Connector | Systems Thinker | Fan of Humans

1y

Thanks for the comparison and food for thought, Vishal! I agree that the slower interaction of ChatGPT4 is better than the fast, whole answer response. It plays into the fact that humans value interactions that feel more human and natural for sure. Also agree that the Einstein ChatGPT is better - the Bard version seems like it could be written by anyone.
