Ensuring your data isn’t garbage in the ChatGPT era
2023 was undoubtedly the year in which OpenAI's ChatGPT came into its own. The technology turned more than one industry on its head as many companies tried to grapple with what it means and how to best use it.
ChatGPT's adoption by individuals and organisations alike was startling: less than a year after its launch, the ChatGPT website was attracting almost 1.5 billion visitors per month.
While many individuals turned to it to generate creative content, corporate users found value in using the AI to enhance various business processes. These ran the gamut from addressing customer service enquiries and drafting emails, to doing personal assistant tasks, and creating presentations.
Along with scores of eager users, ChatGPT also brought some serious concerns. As noted by Forbes, along with security and ethical concerns, a core issue is whether AI can be trusted to be right all the time. If people and companies are turning to AI tools to provide answers, which they are, they need to be able to trust that those answers are accurate and genuinely useful.
With AI, as with data analytics, the old saying of “garbage in equals garbage out” still holds true, and the viability of any output is going to depend on the quality of data that the tool is drawing from.
As my colleague Willem Conradie, the CTO of PBT Group, stresses, addressing such data quality concerns is paramount for several reasons. If ChatGPT is allowed to draw answers from poor-quality data, then the results can range from biased outputs and inconsistent answers to answers that are devoid of empathy. Worse yet, it can even create more security issues to contend with.
Exacerbating this, as the year progressed, AI tools were found to simply make things up, to the extent that the propensity for AI to concoct answers was given a name: hallucinating. No one wants an important business decision to be based on a hallucination.
This doesn't mean that ChatGPT and other AI tools should be dismissed or ignored. But it does mean that we need to take care when using the technology and pay particular attention to data integrity.
To navigate the pitfalls of AI's veracity, or rather the lack thereof, many are touting the importance of Responsible AI. This, Willem elaborates, entails ensuring that AI is applied with fair, inclusive, secure, transparent, accountable, and ethical intent. Responsible AI is also intended to address the real concern of ChatGPT dishing out fabricated, or just incorrect, information.
But Responsible AI on its own may not be enough. We also need to hold ChatGPT, and the data it relies on, to a set of standards.
These include ensuring that the data ChatGPT is given is relevant. Willem explains that this requires the data used for model training to align with the business context in which ChatGPT operates.
Secondly, timeliness of data is essential, as outdated data could easily result in inaccurate information. Thirdly, any data that ChatGPT draws from must be complete. Datasets must not have missing values, duplicates, or irrelevant entries, as drawing from incomplete data can easily produce incorrect responses and actions.
Finally, Willem stresses that it is critical that data models are continuously improved through reinforcement learning, and by incorporating user feedback into model retraining cycles.
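To make the completeness and timeliness standards above a little more concrete, here is a minimal sketch in Python of how a dataset might be screened before an AI tool is allowed to draw on it. It is an illustration only, not a method Willem or PBT Group prescribes: the function name `audit_records`, the field names, and the one-year freshness threshold are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def audit_records(records, required_fields, timestamp_field, max_age, now):
    """Count basic data-quality problems in a list of dict records.

    Hypothetical helper: the name, arguments, and thresholds are
    assumptions chosen for illustration.
    """
    report = {"missing": 0, "duplicates": 0, "stale": 0}
    seen = set()
    for record in records:
        # Completeness: every required field must be present and non-empty.
        if any(record.get(field) in (None, "") for field in required_fields):
            report["missing"] += 1
        # Duplicates: same values across all required fields as an earlier record.
        key = tuple(record.get(field) for field in required_fields)
        if key in seen:
            report["duplicates"] += 1
        seen.add(key)
        # Timeliness: flag records older than the allowed age.
        timestamp = record.get(timestamp_field)
        if timestamp is not None and now - timestamp > max_age:
            report["stale"] += 1
    return report


# Made-up sample data: one duplicate, one empty answer, one outdated entry.
records = [
    {"id": 1, "answer": "Reset via settings", "updated": datetime(2024, 1, 10)},
    {"id": 1, "answer": "Reset via settings", "updated": datetime(2024, 1, 10)},
    {"id": 2, "answer": "", "updated": datetime(2024, 1, 12)},
    {"id": 3, "answer": "Call support", "updated": datetime(2021, 5, 1)},
]

report = audit_records(
    records,
    required_fields=["id", "answer"],
    timestamp_field="updated",
    max_age=timedelta(days=365),
    now=datetime(2024, 2, 1),
)
print(report)
```

A check like this does not judge relevance to the business context, which still needs human review, but it can catch the obvious gaps before they reach the model.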
Following these guidelines will help ensure that ChatGPT in particular, and conversational AI models in general, are able to learn from their interactions, adapt, and enhance their response quality over time.
Then ChatGPT can become an AI that we can come to trust, with certainty that the responses it generates are consistently valid and usable.