I asked ChatGPT about Data Errors

There has been a lot of talk lately about OpenAI's conversational AI beta, known as ChatGPT. While not a truly "intelligent" agent, it distills a vast quantity of information with uncanny fluency and offers some meaningful insights. As I am regularly looking for ways to communicate the value of testing and test automation to executives, I decided to ask ChatGPT some basic questions:

Give me some examples of software errors that had large commercial impact

There have been many examples of software errors that had large commercial impact. Some examples include:

  • In 2012, Knight Capital Group lost $460 million due to a software error that caused the company's automated trading systems to malfunction.
  • In 2018, British Airways was hit with a $230 million fine for a data breach caused by a software error that exposed the personal data of half a million customers.
  • In 2020, the Mariner East 2 natural gas pipeline in Pennsylvania was shut down for several months due to a software error that caused incorrect pressure readings, resulting in significant financial losses for the company.
  • In 2021, a software error in the payment processing system at Target caused a nationwide outage, leading to lost sales and customer frustration.
  • In 2022, a software error in the voting systems used in the U.S. presidential election led to widespread confusion and allegations of voter fraud, resulting in significant political and financial fallout.

These are just a few examples of the potential impact that software errors can have on businesses and other organizations.

Provide some additional examples focusing on data errors

Here are some additional examples of software errors that had large commercial impact, specifically focusing on data errors:

  • In 2013, a software error at the Nasdaq stock exchange caused a three-hour shutdown and led to the loss of millions of dollars in trading revenues.
  • In 2014, a software error at JPMorgan Chase led to the loss of $6 billion in value due to incorrect data in the bank's risk model.
  • In 2016, a software error at the U.S. Internal Revenue Service resulted in the agency incorrectly reporting the incomes of hundreds of thousands of taxpayers, leading to incorrect tax payments and refunds.
  • In 2017, a software error at Equifax exposed the personal data of 143 million customers, leading to significant financial losses and damage to the company's reputation.
  • In 2019, a software error at Capital One resulted in the personal data of more than 100 million customers being accessed by a hacker, leading to a fine of $80 million.

These examples show the potential impact that data errors can have on businesses and other organizations, both in terms of financial losses and damage to reputation.

Provide some more examples

Here are some more examples of software errors that had large commercial impact:

  • In 2015, a software error at the U.S. National Weather Service caused incorrect weather data to be broadcast, leading to inaccurate weather forecasts and potentially dangerous situations for the public.
  • In 2017, a software error at the U.S. Federal Aviation Administration caused several hundred flights to be grounded, leading to significant delays and disruptions for travelers.
  • In 2019, a software error at the Australian Securities and Investments Commission resulted in the incorrect reporting of company names and financial information, leading to confusion and potential financial losses for investors.
  • In 2020, a software error at the U.S. Department of Defense led to the loss of sensitive military data, potentially compromising national security.
  • In 2021, a software error at the U.S. Centers for Disease Control and Prevention caused incorrect COVID-19 data to be reported, leading to confusion and potential harm to public health.

These examples show the potential impact that software errors can have in a wide range of industries and organizations, from finance and technology to government and public health.


Just to be clear, in no way did I prompt the references to COVID or the 2022 elections!


What are some effective ways to prevent such things from occurring in the future?

There are several effective ways to prevent commercially expensive data errors from occurring in the future. These include:

  1. Implement thorough testing procedures: Thorough testing is essential to identify and fix errors before they are released to the public. This should include both manual and automated testing, with a focus on testing the most critical and error-prone parts of the software.
  2. Conduct regular code reviews: Code reviews provide an opportunity for other team members to identify and correct errors in the code before it is released. This can help prevent errors from slipping through the testing process and being introduced into the live software.
  3. Use error-handling and exception-handling techniques: Error-handling and exception-handling techniques can help prevent software from crashing or behaving unexpectedly when errors are encountered. This can help prevent data errors from occurring and minimize the impact of errors that do occur.
  4. Implement strong data validation and data cleansing procedures: Data validation and data cleansing techniques can help ensure that the data entered into the software is accurate and consistent. This can help prevent errors from occurring due to incorrect or incomplete data.
  5. Invest in robust backup and recovery systems: Having robust backup and recovery systems in place can help ensure that any data errors that do occur can be quickly and easily rectified. This can minimize the impact of data errors on the business and its customers.

Overall, the key to preventing commercially expensive data errors is to have a strong and comprehensive approach to quality assurance and error prevention. This should involve a combination of thorough testing, code reviews, error-handling techniques, data validation, and robust backup and recovery systems.
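
As a rough illustration of points 3 and 4 in the list above (exception handling and data validation), here is a minimal sketch in Python. It is not taken from ChatGPT's answer or from any particular product. The record layout (`invoice_id`, `amount`), the function name `validate_invoices`, and the rejection rules are all assumptions chosen for illustration. The idea is simply that bad rows are set aside with a reason instead of crashing the load or flowing silently into downstream systems.

```python
from dataclasses import dataclass
from decimal import Decimal, InvalidOperation


@dataclass
class ValidationResult:
    valid: list      # cleaned records that passed every check
    rejected: list   # (original_row, reason) pairs for later review


def validate_invoices(rows):
    """Split raw invoice rows (hypothetical layout) into clean records and rejects."""
    valid, rejected = [], []
    seen_ids = set()
    for row in rows:
        invoice_id = str(row.get("invoice_id") or "").strip()
        if not invoice_id:
            rejected.append((row, "missing invoice_id"))
            continue
        if invoice_id in seen_ids:
            rejected.append((row, "duplicate invoice_id"))
            continue
        try:
            # Decimal avoids binary floating-point rounding on money values.
            amount = Decimal(str(row.get("amount", "")).strip())
        except InvalidOperation:
            # Handle the parse error instead of letting it abort the whole load.
            rejected.append((row, "amount is not a number"))
            continue
        if not amount.is_finite():
            # "NaN" and "Infinity" parse successfully but are not usable amounts.
            rejected.append((row, "amount is not finite"))
            continue
        if amount < 0:
            rejected.append((row, "negative amount"))
            continue
        seen_ids.add(invoice_id)
        valid.append({"invoice_id": invoice_id, "amount": amount})
    return ValidationResult(valid, rejected)


if __name__ == "__main__":
    sample = [
        {"invoice_id": "A1", "amount": "100.50"},
        {"invoice_id": "A1", "amount": "20"},
        {"invoice_id": "B2", "amount": "abc"},
    ]
    result = validate_invoices(sample)
    print(len(result.valid), "valid;", len(result.rejected), "rejected")
    for row, reason in result.rejected:
        print("rejected:", row, "because", reason)
```

Keeping the rejected rows together with their reasons, rather than dropping them, is a deliberate choice in this sketch: it gives testers and data owners something concrete to review, which is where an automated test suite would typically hook in.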

Are there commercially available tools to help in preventing expensive data fiascos?

Yes, there are many commercially available tools that can help prevent expensive data fiascos. These tools can be used to automate and streamline various aspects of the software development and testing process, including code reviews, error-handling, data validation, and backup and recovery. Some examples of commercially available tools that can help prevent expensive data fiascos include:

  • Code review tools: These tools can automate the code review process and provide alerts and suggestions for improving the code. Examples include CodeClimate and Reviewable.
  • Error-handling and exception-handling tools: These tools can automatically detect and handle errors and exceptions in the software, preventing it from crashing or behaving unexpectedly. Examples include Sentry and Exceptionless.
  • Data validation and cleansing tools: These tools can help ensure that the data entered into the software is accurate and consistent. Examples include Talend and Informatica.
  • Backup and recovery tools: These tools can automatically create backups of the software and its data, and provide tools for quickly and easily restoring the data in the event of an error. Examples include Veeam and Rubrik.

Overall, commercially available tools can help prevent expensive data fiascos by automating and streamlining various aspects of the software development and testing process. By using these tools, businesses can ensure that their software is of high quality and free of errors that could potentially cause significant financial losses.


These are some amazing examples of the real-world implications and financial impact that software and data errors can have. Yet as brilliant as ChatGPT might appear, you should notice that it provided no suggestion for a commercially available tool to address its #1 solution for preventing these issues: "thorough testing procedures". Luckily we humans still have some value to add. And since ChatGPT can only suggest solutions that are broadly discussed, it's lucky you read this far, because Tricentis Data Integrity testing is the solution to this key challenge.

Here is a great 5-minute video providing an overview of Data Integrity testing from Tricentis.

Let me know what you think of ChatGPT's suggestions, the future of this intriguing technology, and whether it made any mistakes!

Shubhodaye Hiremath

A happy Software Tester on this planet earth!

2 years ago

ChatGPT is my favourite pastime :)! I generally ask ChatGPT questions such as "When are you going to take over from Homo sapiens?" and "Do you love anyone?", and recently I asked it to write a C program for a doubly circular linked list. It wrote neat code with good indentation and the comments intact. (I have yet to verify that the code works.) I like the way we humans are curious about ChatGPT. I foresee that it is a revolution in progress... I mean that the way we search and interact with search engines is going to go through a paradigm shift... maybe.

Speaking of data errors… In the post, ChatGPT's third story is about the 2020 incident in which the Mariner East 2 natural gas pipeline in Pennsylvania was shut down for several months due to a software error. I was unable to find anything in a regular Google search or a Google News search to suggest that anything of the sort ever happened. In other words: ChatGPT seems to have made that "fact" up. This isn’t the first time it has done something similar; it misrepresents consistently. People seem to be awfully blasé about machinery that produces plausible-sounding untruths at scale.

Mike White

Data Specialist | Analytics Extraordinaire | Software Evangelist

2 years ago

I quite enjoyed this insightful chat. :) Great post Tom!
