GPT-3 + Data = Nonsense

Everyone is talking about ChatGPT and how it will change the way people work, and the real estate industry is no exception. To recap: ChatGPT is a tool created by OpenAI that can converse in English to generate text or answer a variety of questions. However, when it comes to creating data-heavy content, ChatGPT (and GPT-3, the platform on which it is built) does not work very well. Let me explain. In its own words, GPT-3 “is trained to understand and generate human language. It is not specifically designed to work with data in the sense of numerical or statistical analysis.”

As a result, the technology can’t be used out of the box by content creators in data-intensive industries like real estate. For example, if a creator’s goal is to draft a piece of long-form content detailing the latest rental trends for a given zip code, ChatGPT will return a response similar to this:

[Image: ChatGPT’s response to the rental-trends prompt]

As impressive as that looks at first glance, the numeric data cited in this output is nonsensical, which makes the content unusable by any serious real estate professional. Asking ChatGPT where it found those figures will return a response like this: “The information I provided on rental trends in Chicago's 60642 zip code was based on my general knowledge and understanding of the real estate market, but it is not specific numbers from any source.”

[Image: ChatGPT’s reply when asked where its figures came from]


In a world where publications like the New York Times have led readers to expect data embedded within all content, what GPT provides out of the box simply doesn’t cut it for data-heavy use cases like real estate. So the problem becomes: how can we extend GPT’s impressive language skills with accurate data and real insights from multiple live sources?

This is exactly the problem we are solving here at Dataherald. The solution works in the following steps:

  1. We extract the intent of the prompt entered by the user. We fine-tune our own model on top of GPT to do this efficiently against the data in our warehouses.
  2. We query our data warehouses to get accurate numbers for the user’s prompt.
  3. We prompt GPT using the accurate data, and embed an interactive data visualization to accompany the final text.
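
The three steps above can be sketched roughly as follows. This is a minimal illustration, not Dataherald's actual implementation: a regular expression stands in for the fine-tuned intent model, an in-memory dictionary with made-up placeholder values stands in for the data warehouse, and the final call to GPT and the interactive visualization are left out. All function names, field names, and values here are hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Intent:
    metric: str
    zip_code: str


# Stand-in for a data warehouse. The values are made-up placeholders,
# NOT real market data.
FAKE_WAREHOUSE = {
    ("median_rent", "60642"): {"current": 1000.0, "year_ago": 800.0},
}


def extract_intent(prompt: str) -> Intent:
    """Step 1: pull the metric and zip code out of the user's prompt.

    A regex stands in for the fine-tuned model described above.
    """
    match = re.search(r"\b(\d{5})\b", prompt)
    if match is None:
        raise ValueError("no 5-digit zip code found in prompt")
    metric = "median_rent" if "rent" in prompt.lower() else "unknown"
    return Intent(metric=metric, zip_code=match.group(1))


def query_warehouse(intent: Intent) -> dict:
    """Step 2: look up the numbers that match the extracted intent."""
    key = (intent.metric, intent.zip_code)
    if key not in FAKE_WAREHOUSE:
        raise KeyError(f"no data for {key}")
    return FAKE_WAREHOUSE[key]


def build_grounded_prompt(user_prompt: str, intent: Intent, data: dict) -> str:
    """Step 3: build the prompt that would be sent to the language model."""
    change_pct = (data["current"] - data["year_ago"]) / data["year_ago"] * 100
    facts = (
        f"- zip code: {intent.zip_code}\n"
        f"- {intent.metric} now: {data['current']:.0f}\n"
        f"- {intent.metric} a year ago: {data['year_ago']:.0f}\n"
        f"- year-over-year change: {change_pct:.1f}%"
    )
    return (
        "Use only the figures listed below; do not invent numbers.\n"
        f"{facts}\n\n"
        f"Task: {user_prompt}"
    )


if __name__ == "__main__":
    user_prompt = "Write about the latest rental trends in zip code 60642"
    intent = extract_intent(user_prompt)
    data = query_warehouse(intent)
    print(build_grounded_prompt(user_prompt, intent, data))
```

The point of the design is that every number reaches the language model through the prompt, so the model is only asked to write around figures it has been given rather than to recall them.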

Here is the solution in action:

[Image: the Dataherald solution in action]

Let’s talk: https://www.dataherald.com/contact-us
