ChatGPT-4 Versus Decision Analyst’s Deep Learning Model
By John V. Colias, Ph.D.
With all the hype and hoopla over generative AI, we decided to do some experimental work to see how well ChatGPT-4 performed versus Decision Analyst’s Deep Learning Model, a multi-layer neural network classification model.
ChatGPT evolved out of deep learning methods that first appeared in the early 2010s. Since then, deep learning models have achieved prominence in image classification, speech recognition, and improved web search results, and, more recently, have gained the ability to understand text questions and answer in natural language (the “prompt” and the “response”).
Our investigation addresses an important question: how should generative AI models like ChatGPT be used in the marketing research industry? More specifically, how well do ChatGPT responses align with human-produced responses, and is additional modeling needed to bring the two into closer alignment?
In survey-based research, open-ended questions are frequently included, and the responses to these questions are typically coded by hand. That is, a real, live person (an analyst) reads each answer and assigns a numeric code (sometimes called a label) to each unique idea in the text. This process is also called Content Analysis, a widely used analytical method favored by intelligence agencies around the world to extract a deeper understanding of content published by rival countries. Open-ended survey questions are used in marketing research, social research, and political research, and all the answers must be coded (or labeled) by a thinking, intelligent human being.
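To make the coding task concrete, here is a minimal sketch of what a hand-coded dataset might look like. The code numbers, labels, and answers are hypothetical, invented purely for illustration; they are not the code book used in this study.

```python
# Hypothetical code book: each numeric code stands for one idea.
code_book = {
    1: "Inflation / cost of living",
    2: "Unemployment / job market",
    3: "Housing affordability",
    4: "Government debt / spending",
}

# Each open-ended answer can carry several codes, one per unique idea mentioned.
coded_answers = [
    {"respondent": "R001",
     "text": "Prices keep rising and rents are out of control.",
     "codes": [1, 3]},
    {"respondent": "R002",
     "text": "Too many people can't find steady work.",
     "codes": [2]},
]

for row in coded_answers:
    labels = [code_book[c] for c in row["codes"]]
    print(row["respondent"], "->", labels)
```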
Human coding of open-ended responses is labor-intensive and very expensive, so we decided to see whether ChatGPT-4 could accurately assign codes (or labels) to the answers to open-ended questions. As a point of comparison, we used a Decision Analyst Deep Learning Model to code the same dataset.
We asked Nuance (Decision Analyst's coding and text analytics subsidiary) to assign human codes (or labels) to 2,000 answers for the open-ended question: “In your opinion, what are the major economic problems in your country? Please give as much detail as possible.” The data included responses from the US, Canada, India, UK, Australia, New Zealand, and the Netherlands. The responses were all provided in English.
Then we supplied ChatGPT-4 with the text of the human-produced codes (as the prompt) and asked it to assign codes or labels to the same 2,000 answers. Next, we trained our Deep Learning Model on a random sample of 500 human-coded answers and then randomly selected 1,000 human-coded answers from the 2,000-answer dataset (excluding the 500 records chosen for training). Our Deep Learning Model then coded the answers in those 1,000 records. ChatGPT-4 and Decision Analyst’s Deep Learning Model yielded the following results for the same 1,000 answers. The percentages in the chart below treat the human-coded results as the “Gold Standard” (that is, 100% correct).
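For readers who think in code, the sketch below illustrates this study design. The prompt wording, variable names, and random seed are our own assumptions for illustration, not the exact prompt or sampling procedure used in the study.

```python
import random

def build_prompt(code_book: dict, answer: str) -> str:
    """Assemble a coding prompt that lists the human-produced codes (hypothetical wording)."""
    code_lines = "\n".join(f"{num}: {label}" for num, label in code_book.items())
    return (
        "Assign one or more of the following codes to the survey answer.\n"
        f"Codes:\n{code_lines}\n\n"
        f"Answer: {answer}\n"
        "Return only the code numbers that apply."
    )

random.seed(0)  # assumption: any fixed seed, for reproducibility

all_ids = list(range(2000))            # indices of the 2,000 human-coded answers
random.shuffle(all_ids)
train_ids = set(all_ids[:500])         # training sample for the deep learning model
holdout_ids = set(random.sample(       # 1,000-answer holdout scored against the human codes
    [i for i in all_ids if i not in train_ids], 1000))
```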
Clearly, ChatGPT-4 performed better overall than Decision Analyst’s Deep Learning Model. However, the performance gap narrowed for codes or labels with higher incidence: the sensitivity comparison shifted from a significant "win" for ChatGPT-4 (51% versus 19%) to Decision Analyst's Deep Learning Model outperforming ChatGPT-4 by 3 percentage points (75% versus 72%) for codes with at least 10% incidence.
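As a rough illustration of the metrics behind this comparison, the sketch below computes per-code incidence and sensitivity (recall against the human "gold standard"). The data structures and variable names are hypothetical, not taken from the study.

```python
def per_code_metrics(human, model, code):
    """Incidence and sensitivity for one code.

    human, model: lists of code sets, one set per answer.
    Sensitivity = share of answers the humans gave this code
    that the model also gave this code.
    """
    n_answers = len(human)
    has_code = [i for i in range(n_answers) if code in human[i]]
    incidence = len(has_code) / n_answers
    if not has_code:
        return incidence, None
    hits = sum(1 for i in has_code if code in model[i])
    return incidence, hits / len(has_code)

# Example (hypothetical inputs): restrict the comparison to codes with >= 10% incidence.
# high_incidence_codes = [c for c in all_codes
#                         if per_code_metrics(human_codes, model_codes, c)[0] >= 0.10]
```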
We expected ChatGPT-4 to outperform Decision Analyst’s Deep Learning Model, since it was trained on a massive corpus of text, teaching it to understand both the text responses and the meaning of the codes or labels. Indeed, the ChatGPT-4 embedding vectors (the numeric representation of the meaning of the text) provided most of the advantage. To demonstrate this point, we trained another Decision Analyst deep learning model using ChatGPT-3.5 embedding vectors as predictors. This second deep learning model performed admirably and significantly outperformed ChatGPT-4 in sensitivity for higher-incidence codes.
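A minimal sketch of this second approach is shown below, assuming an OpenAI embedding model (text-embedding-ada-002 here) as a stand-in for the ChatGPT-3.5 embedding vectors and a small Keras multi-label classifier. The layer sizes and training settings are illustrative assumptions, not Decision Analyst's actual configuration.

```python
from openai import OpenAI   # assumption: OpenAI Python SDK v1.x
import numpy as np
import tensorflow as tf

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(texts):
    """Fetch embedding vectors for a batch of open-ended answers."""
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([item.embedding for item in resp.data])

def build_classifier(n_codes, dim=1536):
    """Multi-layer neural network with one sigmoid output per code (multi-label)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(dim,)),      # ada-002 embeddings are 1,536-dimensional
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dropout(0.3),
        tf.keras.layers.Dense(n_codes, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy")
    return model

# Usage (hypothetical): X_train = embed(train_texts); y_train is a 0/1 matrix of
# human code assignments; clf = build_classifier(y_train.shape[1])
# clf.fit(X_train, y_train, epochs=20, batch_size=32)
```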
None of the AI and deep learning systems is perfectly accurate: neither system produces results that perfectly align with the human-produced results.
We should point out that the human-produced code assignments (based on a code book that was also human-produced) were assumed to be “truth.” One might argue that human coders are subject to error and bias. Then again, one might suspect the same flaws in ChatGPT-3.5 and ChatGPT-4.
About the Author
John Colias ([email protected]) is a Senior VP Research & Development at Decision Analyst. He may be reached at 1-800-262-5974 or 1-817-640-6166.