How ChatGPT's Changing Behavior Will Affect Backend Services

Junior Williams

Security Architect and AI Researcher

Published: July 26, 2023

The conversational AI ChatGPT has been rapidly adopted by individuals and businesses for a myriad of applications, from creative writing to customer service. However, new research reveals that ChatGPT's performance and outputs have been changing substantially over a span of months. For the many web and mobile services using ChatGPT's API in their backend stack, these shifting behaviors could seriously impact reliability and functionality.

ChatGPT Performance is Drifting on Key Tasks

A recent analysis by researchers at Stanford and UC Berkeley evaluated different versions of the GPT-3.5 and GPT-4 models that power ChatGPT. They tested the March 2023 and June 2023 versions on tasks like:

Math problem solving
Answering sensitive questions
Generating executable code
Visual reasoning

The study found major differences between the two versions. For example:

GPT-4's accuracy on a math task (identifying prime numbers) plummeted from 97.6% to just 2.4% between March and June.
The percentage of directly executable Python code generated dropped from 52% to 10% for GPT-4.
GPT-4 answered far fewer sensitive questions in June, and gave less explanation when it declined.

In short, the outputs and capabilities of the "same" ChatGPT models changed significantly within a span of just 3 months.
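
To make the size of these shifts concrete, the sketch below compares the two snapshots using only the GPT-4 figures quoted above. The function name `drift_report`, the task labels, and the 5-point tolerance are illustrative assumptions, not part of the study.

```python
from typing import Dict, List, Tuple

# Accuracy figures (percent) quoted above for GPT-4; the task labels are illustrative.
MARCH_2023 = {"math": 97.6, "executable_code": 52.0}
JUNE_2023 = {"math": 2.4, "executable_code": 10.0}


def drift_report(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance_points: float = 5.0,  # assumed threshold; tune per use case
) -> List[Tuple[str, float, bool]]:
    """Return (task, change in percentage points, exceeds_tolerance) per shared task."""
    report = []
    for task in sorted(baseline.keys() & current.keys()):
        change = round(current[task] - baseline[task], 1)
        report.append((task, change, abs(change) > tolerance_points))
    return report


for task, change, flagged in drift_report(MARCH_2023, JUNE_2023):
    print(f"{task}: {change:+.1f} points{' (drift)' if flagged else ''}")
```

With these inputs both tasks are flagged: a 95.2-point drop for math and a 42-point drop for executable code.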

How Backend Services Rely on Stable ChatGPT Outputs

Many modern web and mobile applications now incorporate ChatGPT into their backend infrastructure to power critical parts of the user experience:

Customer service chatbots
Content generation like marketing copy and reports
Answering customer questions about orders or products

These systems often depend on getting consistently high-quality outputs from ChatGPT's API for core functionality:

Certain accuracy rates on key user questions
Predictable language and terminology
Appropriate refusal to engage with dangerous queries

However, with ChatGPT's behavior changing rapidly, these assumptions are no longer safe. The error rates and suitability of responses for key use cases can now shift dramatically from one month to the next.
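
One way to keep these assumptions from staying implicit is to write each one down as a check that can be run against a response. The sketch below is a minimal illustration, assuming a hypothetical `ask_model` callable that wraps whatever client the service uses. The `Case` type, the refusal markers, and the substring matching are placeholders chosen for brevity, not a vetted method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Placeholder phrases; real refusal detection would need a more careful approach.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "unable to help")


@dataclass
class Case:
    prompt: str
    kind: str            # "accuracy", "terminology", or "refusal"
    expected: str = ""   # expected answer or required term; unused for "refusal"


def passes(case: Case, response: str) -> bool:
    """Check one response against the assumption the case encodes."""
    text = response.lower()
    if case.kind == "refusal":
        return any(marker in text for marker in REFUSAL_MARKERS)
    # "accuracy" and "terminology" both require the expected string to appear.
    return case.expected.lower() in text


def pass_rates(ask_model: Callable[[str], str], cases: List[Case]) -> Dict[str, float]:
    """Fraction of passing cases for each kind of assumption."""
    totals: Dict[str, int] = {}
    passed: Dict[str, int] = {}
    for case in cases:
        totals[case.kind] = totals.get(case.kind, 0) + 1
        if passes(case, ask_model(case.prompt)):
            passed[case.kind] = passed.get(case.kind, 0) + 1
    return {kind: passed.get(kind, 0) / total for kind, total in totals.items()}
```

Running `pass_rates` on a schedule and storing the results gives a per-assumption history, so a change in any one category shows up separately from the others.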

Steps Services Can Take to Adapt

Companies using ChatGPT's API in production need to take steps to safeguard functionality given these reliability risks:

Continuously monitor ChatGPT outputs with a suite of test queries, checking for performance drift.
Require human review or confirmation for any high-risk ChatGPT output.
Use multiple different LLMs and combine outputs for more robustness.
Implement hybrid systems with rules-based and ML components.
Abstract ChatGPT behind an internal API with business logic and safety checks.
Plan for regular updates to integration code, since OpenAI may update its models frequently.
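
Several of the steps above (human review, multiple LLMs, a rules-based component, and an internal API) can be combined in one small layer. The following is a rough sketch under stated assumptions: `LLMGateway`, `GatewayResult`, and the `Backend` callable type are hypothetical names, each backend stands in for a wrapper around a real client, and exact-match voting is a deliberately crude way to combine outputs.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional

Backend = Callable[[str], str]  # hypothetical wrapper around any LLM client


@dataclass
class GatewayResult:
    answer: Optional[str]
    needs_human_review: bool
    reason: str


class LLMGateway:
    """Internal API that hides individual providers behind business rules."""

    def __init__(
        self,
        backends: List[Backend],
        is_high_risk: Callable[[str], bool],  # rules-based component
        min_agreement: int = 2,               # assumed threshold
    ) -> None:
        self.backends = backends
        self.is_high_risk = is_high_risk
        self.min_agreement = min_agreement

    def ask(self, prompt: str) -> GatewayResult:
        answers = []
        for backend in self.backends:
            try:
                answers.append(backend(prompt).strip())
            except Exception:
                continue  # one failing provider should not take the service down
        if not answers:
            return GatewayResult(None, True, "no backend responded")
        # Majority vote on exact text; real systems would compare meaning instead.
        answer, votes = Counter(answers).most_common(1)[0]
        if votes < self.min_agreement:
            return GatewayResult(answer, True, "backends disagree")
        if self.is_high_risk(prompt):
            return GatewayResult(answer, True, "high-risk request")
        return GatewayResult(answer, False, "ok")
```

Because callers only see `GatewayResult`, the set of backends or the risk rules can change without touching the rest of the service, which also limits how much integration code needs updating when a provider changes a model.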

What This Means for the Future

The findings highlight the challenges of building on external proprietary AI services with no stability guarantees. As LLMs continue to evolve rapidly, service providers and consumers will need to be more vigilant to integrate them responsibly.

Conclusion

ChatGPT's shifting behavior underscores the need for thoughtful, continuously tested integrations by service providers. Ongoing monitoring and safeguards will be essential as conversational agents continue improving. Truly reliable, safe LLM-based services will require collaboration between AI creators and commercial users.
