Can You Trust ChatGPT?
Photo by Markus Spiske on Unsplash

Do you trust the algorithm? It's an intriguing question as we consider the possibilities of machine learning and artificial intelligence. Some might even say it's a question of faith, as we don't fully understand some of the models and their results.

Background

For the past few weeks, I've been experimenting with ChatGPT, the chatbot from OpenAI that is built on a family of large language models. I'm left with one big question: in its current form, can we trust the algorithm?

It's been difficult to avoid the wave of ChatGPT prompts and learnings. It's being discussed everywhere, and it looks like it may soon be integrated into Microsoft Bing.

ChatGPT and generative art algorithms like DALL-E and Midjourney represent new tools that we should all become more comfortable using. There is no doubt that additional use cases will continue to emerge, as will new models.

However, many people forget that ChatGPT isn't necessarily creating new thoughts; it's all generative and trained on human-created content. I wonder, what will happen if it continues to train itself against algorithmically created content?

The Opportunity

There is no shortage of prompts and examples of ChatGPT doing phenomenal things, but I wondered just how accurate the algorithm would be with a pretty benign data set.

As a baseball fan, I hopped back into the world of collectibles before the pandemic. Reliving my youth, I turned to the booming card industry. I collect The Topps Company Living Set, which is an ever-growing collection of hand-painted baseball cards.

As a goal and legacy, I'm trying to have every living player sign their card. Being the geek that I am, I've also organized the collection and player data into a Google Sheet.

Testing the Accuracy of ChatGPT

As part of the Google Sheet, I've slowly been updating personal information about each player to track whom I may need to prioritize based on age.

What would take me hours to aggregate manually or to develop a script for looked like an excellent use case for ChatGPT.

I started with a relatively easy prompt, "can you give me the birth date and death date of the following baseball players in table format?"

I was initially surprised with how quickly the information was generated, and the spot checks I was doing seemed to indicate the data was correct. But I was unwilling to stop there.

While the algorithm seemed to save me a ton of time collecting this information, I saw two errors that made me question the results. Initially, ChatGPT indicated that Hall of Fame inductees Jack Morris and Fergie Jenkins were dead.

I knew that wasn't true, but it sent me to Google to confirm my suspicions. Upon discovering the issue, I noticed that many of the younger players on the list had suspect birthdays as well.

Specifically, I dug into the birthday of Bobby Witt Jr. and, even after correcting the model, still received incorrect information. Further, when I tried to understand where the information came from, ChatGPT couldn't provide an answer.

The Results

There are currently 580 players on my checklist, and I prompted ChatGPT to provide the birth date and death date for each of them.

I then cross-referenced every player manually with their entry on Baseball Reference. I was shocked by my findings.
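The manual cross-check could also be scripted. What follows is a minimal Python sketch, not the process used for this article. It assumes both the ChatGPT output and the reference data have been saved as CSV with hypothetical columns `player`, `birth_date`, and `death_date`. The helper names and the sample rows are placeholders made up for illustration, not data from my checklist.

```python
import csv
import io


def load_rows(csv_text):
    """Parse CSV text into a dict keyed by player name."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["player"]: row for row in reader}


def find_mismatches(generated, reference, fields=("birth_date", "death_date")):
    """List (player, field, generated_value, reference_value) for each disagreement."""
    mismatches = []
    for player, ref_row in reference.items():
        gen_row = generated.get(player)
        if gen_row is None:
            # The generated table skipped this player entirely.
            mismatches.append((player, "missing", None, None))
            continue
        for field in fields:
            gen_value = gen_row[field].strip()
            ref_value = ref_row[field].strip()
            if gen_value != ref_value:
                mismatches.append((player, field, gen_value, ref_value))
    return mismatches


# Placeholder rows (hypothetical players and dates, for illustration only).
# An empty death_date means the player is living.
CHATGPT_CSV = """player,birth_date,death_date
Player A,1955-05-16,2021-03-02
Player B,2000-06-14,
Player C,1947-12-26,
"""

REFERENCE_CSV = """player,birth_date,death_date
Player A,1955-05-16,
Player B,2001-06-14,
Player C,1947-12-26,
"""

mismatches = find_mismatches(load_rows(CHATGPT_CSV), load_rows(REFERENCE_CSV))
for player, field, generated_value, reference_value in mismatches:
    print(f"{player}: {field} was {generated_value!r}, reference says {reference_value!r}")
```

A comparison like this only flags disagreements. Deciding which source is right still takes a trusted reference and, for anything surprising, a human look.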

  • 32%, or 188, of the players had incorrect information
  • 24%, or 139, of the players had birth dates that were off by months or years
  • 4%, or 25, of the players had a birth date that was exactly off by one year
  • 3%, or 18, of the players had a birth date that was off by a matter of days
  • Five players were incorrectly reported as having passed including Jack Morris, Fergie Jenkins, Rod Carew, Carlton Fisk, and David Ortiz
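The birth date buckets above can be expressed as a small classification rule. This is a rough Python sketch under stated assumptions: "a matter of days" is taken to mean fewer than 31 days, "exactly off by one year" means the same month and day with the year shifted by one, and everything else counts as months or years. The function name, the threshold, and the sample dates are all hypothetical. I did the actual tally by hand.

```python
from collections import Counter
from datetime import date

# Assumed cutoff for "a matter of days"; the article does not define one.
DAYS_THRESHOLD = 31


def classify_birth_date_error(generated, reference):
    """Place one generated birth date into a bucket relative to the reference date."""
    if generated == reference:
        return "correct"
    same_month_and_day = (generated.month, generated.day) == (reference.month, reference.day)
    if same_month_and_day and abs(generated.year - reference.year) == 1:
        return "off by exactly one year"
    if abs((generated - reference).days) < DAYS_THRESHOLD:
        return "off by days"
    return "off by months or years"


# Placeholder (generated, reference) pairs, not real player data.
sample_pairs = [
    (date(1990, 4, 10), date(1990, 4, 10)),
    (date(2000, 6, 14), date(2001, 6, 14)),
    (date(1985, 3, 2), date(1985, 3, 9)),
    (date(1970, 1, 15), date(1972, 8, 30)),
]

counts = Counter(classify_birth_date_error(gen, ref) for gen, ref in sample_pairs)
for bucket, count in counts.items():
    share = 100 * count / len(sample_pairs)
    print(f"{bucket}: {count} ({share:.0f}%)")
```

Checking the one-year case first keeps it separate from the broader "months or years" bucket, which matches how the list reports the two groups.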

It was eye-opening to see nearly a third of the list being misrepresented. While many people have rightly celebrated the abilities of ChatGPT, its accuracy on factual data is quite concerning.

Further, it would be helpful if we could understand the source of the information or have ChatGPT provide a confidence score with fact-based results.

Perhaps that feature will be coming soon. If not, I'd recommend taking great care with similar prompts.

It isn't premature to evangelize such tools or consider their implications; this just illustrates how much growth these tools and this space still need.

The question remains, can we trust the algorithms that could, or do, rule our world? It has many parallels to the faith people put into religion and I'm curious how much we'll just accept the answers provided vs. understand where they are coming from.

Proceed with care, and caution.

Mark Dean Garner

Wicked Problem Solver | How to Use Communication for Social Impact

2 years ago

Well, if it relies on Google then it's bound to be suspect. However, I don't think it will be long before some quality control is introduced. E.g., you could tell it to only use peer-reviewed material, or only material from respected or authorized institutions, or to ignore all Facebook and social media sources.

Steve Buss, PMP, CSM

Senior Project Manager

2 years ago

Interesting article. Thank you for sharing. On the other side, I read an article from a programmer who wondered if ChatGPT could write a piece of malicious code, which it did in seconds.

This is really helpful insight Dennis. I've been playing around with it too and think it's a nice additional tool to have in the toolbox but, to your point, not one to solely rely on. I'd be interested to see if any of the incorrect data you caught gets corrected as ChatGPT learns more. Might be worth it to run the same ask by it in the coming months.

Nora Femenia, Ph.D.

FIU Family Mediation Instructor; Conflict Coach; Author;

2 years ago

Thanks for your careful work checking the results provided by ChatGPT. This is an essential task that can help us break the fascination with AI....
