A Weird Blend of English and Code: GPT-4o JSON Mode
Steven Reubenstone
Software Engineer, Solutions Engineer, Dev Advocate, Educator, Product Manager, Prompt Engineer
Some background info
My team and I have been working on a new app, called UpskillHero. The app connects people located nearby to solve fun and provocative puzzles together. The app also makes personalized upskilling recommendations based on how users participate in puzzles. A lot of the inspiration for this project came from the learning philosophy I speak about in my educational/fantasy novel, The Nestomir, where a young boy meets an alien warrior who introduces him to a new way of learning by doing.
Part of what makes the app exciting is the use of large language models (LLMs) like GPT-4o. Integrating LLMs has transformed how we generate content in discussion threads and has given us sophisticated natural language processing (NLP) capabilities through OpenAI’s endpoints. We’ve spent a lot of time iterating on prompts (the instructions we give the AI) to teach it about our users, allowing it to generate content for them in highly customized ways.
Our LLM integration also brings characters from The Nestomir to life as coaches in our puzzle-solving game. Characters like Dendro, Zena, Zimmer, and Jake each have unique personalities and ways of engaging with users. For example, Zena responds like a Socratic tutor, Dendro suggests practical learning projects, and Zimmer offers unconventional (and sometimes comical) career advice.
The Wonders of JSON Mode
What I wanted to showcase in slightly more detail today is the fascinating blend that emerges when we mix code and English: specifically, the use of "JSON mode" in GPT-4o. To be clear, JSON (JavaScript Object Notation) is a way of structuring data so that it can be easily read and used by computer programs.
The JSON mode feature guarantees that the AI's responses come back as syntactically valid JSON.
Before JSON mode, we could get the LLM to respond with JSON by providing clear instructions in our prompts, like “PLEASE RESPOND IN THIS SPECIFIC JSON FORMAT WITH THESE KEYS.” But sometimes the LLM would still go off track.
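Now the format guarantee is just a request parameter. Here's a minimal sketch of what that looks like with the official openai Python client; the model name is real, but the prompts and keys below are placeholder examples for illustration, not our production prompts:

import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# JSON mode is switched on via response_format={"type": "json_object"}.
# Note: the prompt still has to mention JSON somewhere, or the API refuses the request.
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": (
                "You are a coach in a puzzle-solving app. "
                "Respond with a JSON object containing the key 'User Notes'."
            ),
        },
        {"role": "user", "content": "Summarize what you know about this user."},
    ],
)

# With JSON mode on, this string is guaranteed to parse as valid JSON.
notes = json.loads(response.choices[0].message.content)
print(notes)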
With JSON mode, the responses come back consistently as JSON objects. However, an interesting twist happens when JSON mode is on: any substructure, like headers or subsections, often gets interpreted as nested JSON. For example, if I give an instruction like “Please create sub-sections and delimit them with ** (plain-text English formatting),” the AI might turn those sub-sections into JSON keys, even though that’s not what I wanted.
This behavior can seem bizarre, but it becomes fascinating once you realize these sub-headers are only for backend use. The user never sees them; they just get fed back into another prompt. Essentially, the model can read the JSON as if it were English, blurring the line between the two. For instance, if I ask for user notes with subheadings, formatted purely for easy reading, the AI might still convert the headings into JSON keys. But when another prompt reads those notes, the model understands the structure and content perfectly well, as if they were plain English.
To illustrate this, imagine you have a JSON object with a key "User Notes" containing text meant for internal processing. Instead of:
{
  "User Notes": "**About This User**: Steven is a developer. **Users That Have Helped This User**: @John, @Alice."
}
The AI might generate:
{
  "User Notes": {
    "About This User": "Steven is a developer.",
    "Users That Have Helped This User": "@John, @Alice."
  }
}
While this wasn’t the explicit instruction, it doesn’t affect the end result since it’s processed internally and never seen by the user. The AI can interpret this structured data just as it would plain English, making the formatting inconsequential.
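To make that concrete, here's a rough sketch of the kind of glue code involved; the helper name and prompt text are hypothetical, not our actual backend. Whether the notes arrive as a flat string or a nested object, we just serialize them into the next prompt and let the model read them either way:

import json

def build_followup_prompt(user_notes) -> str:
    # user_notes may be a plain string or a nested dict; the model reads
    # either form, so we don't bother normalizing it.
    if isinstance(user_notes, str):
        notes_text = user_notes
    else:
        # Nested JSON: serialize it as-is; the model treats the keys
        # ("About This User", etc.) just like English sub-headers.
        notes_text = json.dumps(user_notes, indent=2)

    return (
        "Here are the internal notes on this user:\n"
        f"{notes_text}\n\n"
        "Using these notes, suggest the next puzzle for this user."
    )

# Works the same whether the earlier call returned a string or a dict.
flat = "**About This User**: Steven is a developer."
nested = {"About This User": "Steven is a developer."}
print(build_followup_prompt(flat))
print(build_followup_prompt(nested))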
This seamless mixing of code (JSON) and English is a fascinating aspect of working with LLMs, especially in our startup. It creates a unique art form in which the boundaries between language and code blur, enabling more dynamic and adaptable interactions, and it points to innovative possibilities at the intersection of NLP and software development, paving the way for more intuitive and personalized educational tools.
In conclusion, working with LLMs like GPT-4o isn’t just about leveraging cutting-edge technology; it’s about embracing the artistry of blending language and code to create engaging and effective user experiences. I’m also very curious to see how this blend of English and code plays out in the future!
Steve
Get on my Substack!