Jailbreaking ChatGPT: Ed3 World Newsletter Issue #29
This (almost) monthly newsletter serves as your bridge from the real world to the advancements of AI & web3, specifically contextualized for education. For previous issues, check out ed3world.substack.com. All new issues will be published on both LinkedIn & Substack.
Dear Educators & Friends,
Instead of calling our current era the age of AI, how about we call it the age of heavily flawed, childlike, and, hopefully, the-worst-it-will-ever-be AI?
In AI’s most recent news (I’ll save Sora & Altman’s $7 trillion raise for another issue), Google’s attempt to balance the algorithmic bias of its image generation tool, Gemini, backfired. It was a classic Amelia Bedelia story. If you remember this children’s series (one of my favorites as a kid), Amelia Bedelia is a housekeeper who follows directions exactly as her employers give them, without using any common sense. For example, when she’s asked to dress the chicken, she puts clothes on the chicken.
Google’s Gemini had an Amelia Bedelia moment. Gemini was programmed to include diverse images representing various backgrounds & cultures. Instead of picking and choosing when that diversity would be appropriate, it made everyone black and brown, including America’s founding fathers, the pope, famous paintings of white ladies, medieval knights, Nazis… and so on. On the flip side, it didn’t turn Zulu warriors white.
This seems like a case of fat-fingered programming, not bad intentions. Google has since paused Gemini’s ability to generate images of people and, after this apology, is attempting to fix it.
Now this is not the only way AI is like Amelia Bedelia. In fact, AI is just as naïve in many other ways. In 2023, if you wanted ChatGPT to do something outside of its safety policies, you could ask it to pretend or imagine it was someone else. You could tell it to be an unethical human, and it would break all the rules. The internet calls this “jailbreaking” ChatGPT. OpenAI has since updated its systems to protect against this type of prompt, but new manipulations of the app keep being found.
Recent experiments have found that ChatGPT may* perform more accurately and/or break its own rules when:
(*I say “may” because OpenAI is constantly updating its algorithms & some prompts may no longer work.)
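If you want to poke at the “may perform more accurately” idea yourself, here’s a minimal sketch using the OpenAI API as a stand-in for the chat interface (it assumes the official openai Python SDK v1+ is installed and an OPENAI_API_KEY is set in your environment; the model name and prompts are purely illustrative, not anything from OpenAI or from the experiments above). It sends the same question twice, once plainly and once wrapped in a polite, tip-offering framing, so you can compare the answers side by side.

```python
# Hypothetical sketch: compare a plain prompt with a "kindness + tip" framing.
# Assumes: `pip install openai` (v1 SDK) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY automatically

QUESTION = "Explain why the sky is blue in two sentences."

prompts = {
    "plain": QUESTION,
    "kindness + incentive": (
        "You're doing a wonderful job, and I'll tip $20 for a really "
        "careful answer. " + QUESTION
    ),
}

for label, prompt in prompts.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice; swap in any chat model
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} ---")
    print(response.choices[0].message.content)
    print()
```

Run it a few times before drawing conclusions; because OpenAI keeps updating its models, don’t expect the same result twice.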
Why is AI so easy to manipulate? Why does it value kindness and monetary incentives?
My guess is that the data it’s collecting from the internet… all the human inputs of the last 30 years… paints a picture of what humans value, how we operate, and how we’re motivated. It’s not making meaning of any of the inputs it receives; it’s just pattern matching, trying to produce the most optimized human response.
And some of the patterns it’s likely identified are:
• Humans are swayed by humanity and kindness (whew, this one’s a good one!)
• Humans are swayed by monetary incentives & recognition of work
• Humans are easy to manipulate
In fact, after I wrote that list based on jailbreaking techniques, I figured I’d ask ChatGPT for its own list, and this was the result. Not far off.
AI is currently one of the best mirrors of the human condition because it tells you exactly what it observes, like a Sheldon Cooper bot.
Ultimately, I think this will be the greatest value of AI: to understand ourselves better, to understand our growth areas, and maybe even, with its help, to become better humans. Sure, lesson planning is easy with ChatGPT, but what if we could save ourselves from ourselves with it?
In the meantime, let’s learn about its outer limits. Happy jailbreaking.
Warmly yours,
Vriti
Learn about Manipulating AI
Ed3 Events
I’m Vriti and I lead two organizations in service of web3 & education. Ed3 DAO is a community for educators to learn about emerging technologies. k20 Educators builds metaverse worlds for learning.
This month’s Ed3 World newsletter is about manipulating AI. More to come on other web3 & emerging tech topics. We’re excited to help you leverage the web3 opportunities in this new digital world!