Can You Trust ChatGPT?
Do you trust the algorithm? It's an intriguing question as we consider the possibilities of machine learning and artificial intelligence. Some might even say it's a question of faith, as we don't fully understand some of the models and their results.
Background
For the past few weeks, I've been experimenting with ChatGPT, the chatbot from OpenAI that is built on a family of large language models. I'm left with one big question: in its current form, can we trust the algorithm?
It's been difficult to avoid the wave of ChatGPT prompts and learnings. It's being discussed everywhere, and it looks like it may soon be integrated into Microsoft Bing.
ChatGPT and generative art algorithms like Dall-E and Midjourney represent new tools that we should all become more comfortable using. There is no doubt additional use cases will continue to emerge, as will new models.
However, many people forget that ChatGPT isn't necessarily creating new thoughts; it's all generative and trained on human-created content. I wonder what will happen if it continues to train itself against algorithmically created content.
The Opportunity
There is no shortage of prompts and examples from ChatGPT doing phenomenal things, but I wondered just how accurate the algorithm was with a pretty benign data set.
As a baseball fan, I hopped back into the world of collectibles before the pandemic. Reliving my youth, I turned to the booming card industry. I collect The Topps Company Living Set, which is an ever-growing collection of hand-painted baseball cards.
As a goal and legacy, I'm trying to have every living player sign their card. Being the geek that I am, I've also organized the collection and player data into a Google Sheet.
Testing the Accuracy of ChatGPT
As part of the Google Sheet, I've slowly been updating personal information about each player to track whom I may need to prioritize based on age.
What would take me hours to aggregate manually or to develop a script for looked like an excellent use case for ChatGPT.
I started with a relatively easy prompt: "Can you give me the birth date and death date of the following baseball players in table format?"
I was initially surprised with how quickly the information was generated, and the spot checks I was doing seemed to indicate the data was correct. But I was unwilling to stop there.
While the algorithm seemed to save me a ton of time collecting this information, I saw two errors that made me question the results. Initially, ChatGPT indicated that Hall of Fame inductees Jack Morris and Fergie Jenkins were dead.
I knew that wasn't true, but it sent me to Google to confirm my suspicions. Upon discovering the issue, I noticed that many of the younger players on the list had suspect birthdays as well.
Specifically, I dug into the birthday of Bobby Witt Jr. and, even after correcting the model, still received incorrect information. Further, when I tried to understand where the information came from, ChatGPT couldn't provide an answer.
The Results
There are currently 580 players on my checklist, and I prompted ChatGPT to provide the birth date and death date for each of them.
I then cross-referenced every player manually with their entry on Baseball Reference. I was shocked by my findings.
It was eye-opening to see more than a third of the list misrepresented. Many people have rightly celebrated the abilities of ChatGPT, but its accuracy on factual data is quite concerning.
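A cross-check like this can also be scripted once both lists are in hand. The following is a minimal Python sketch, assuming the generated answers and the reference dates have already been loaded into dictionaries keyed by player name. The names `PlayerDates` and `find_mismatches` are hypothetical, and the rows are placeholders for illustration, not the actual checklist data or results.

```python
from datetime import date
from typing import Dict, List, NamedTuple, Optional


class PlayerDates(NamedTuple):
    born: Optional[date]
    died: Optional[date]  # None means the player is living


def find_mismatches(
    generated: Dict[str, PlayerDates],
    reference: Dict[str, PlayerDates],
) -> List[str]:
    """Return the names whose generated dates differ from the reference.

    A player missing from the generated data also counts as a mismatch.
    """
    return [
        name
        for name, ref_dates in reference.items()
        if generated.get(name) != ref_dates
    ]


# Placeholder rows (hypothetical players and dates, for illustration only).
reference = {
    "Player A": PlayerDates(date(1950, 1, 1), None),
    "Player B": PlayerDates(date(1990, 6, 15), None),
    "Player C": PlayerDates(date(1930, 3, 3), date(2010, 4, 4)),
}
generated = {
    # Wrongly marked as deceased.
    "Player A": PlayerDates(date(1950, 1, 1), date(2015, 2, 2)),
    # Birthday off by one day.
    "Player B": PlayerDates(date(1990, 6, 14), None),
    # Matches the reference.
    "Player C": PlayerDates(date(1930, 3, 3), date(2010, 4, 4)),
}

wrong = find_mismatches(generated, reference)
error_rate = len(wrong) / len(reference)
print(wrong, f"{error_rate:.0%}")
```

Comparing the whole record means that a living player marked as deceased is flagged just like a wrong birthday, which covers both kinds of error described above. The reference data still has to come from a source you trust, so this only speeds up the comparison, not the verification itself.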
Further, it would be helpful if we could understand the source of the information or have ChatGPT provide a confidence score with fact-based results.
Perhaps that feature will be coming soon. If not, I'd recommend taking great care with similar prompts.
It isn't premature to evangelize such tools or consider their implications; this just illustrates the growth that these tools and this space still need.
The question remains: can we trust the algorithms that could, or do, rule our world? It has many parallels to the faith people put into religion, and I'm curious how much we'll just accept the answers provided vs. understand where they are coming from.
Proceed with care, and caution.
Wicked Problem Solver, How to Use Communication for Social Impact
(2 years ago) Well, if it relies on Google then it's bound to be suspect. However, I don't think it will be long before some quality control is introduced. E.g., you could tell it to only use peer-reviewed material, or only material from respected or authorized institutions, or to ignore all Facebook and social media sources.
Senior Project Manager
(2 years ago) Interesting article. Thank you for sharing. On the other side, I read an article from a programmer who wondered if ChatGPT could write a piece of malicious code, which it did in seconds.
This is really helpful insight, Dennis. I've been playing around with it too and think it's a nice additional tool to have in the toolbox but, to your point, not one to solely rely on. I'd be interested to see if any of the incorrect data you caught gets corrected as ChatGPT learns more. Might be worth it to run the same ask by it in the coming months.
FIU Family Mediation Instructor; Conflict Coach; Author;
(2 years ago) Thanks for your careful task of checking results provided by ChatGPT. This is an essential task that can help us break the fascination with AI....