Demystifying ChatGPT

Demystifying ChatGPT

Reverse engineering the OpenAI chat demo to understand the next-generation chatbot's intrinsic behavior better.

Everyone not living under a rock should be well aware of all the buzz related to OpenAI just released ChatGPT application. Its near-human response accuracy and conversation capabilities are astonishing and opened a wide range of possible applications.?

One of the critical features of ChatGPT is its ability to generate relevant and coherent responses within a given conversation. This is achieved through combining techniques, including deep learning, unsupervised learning, and fine-tuning large amounts of conversational data.

In addition to its ability to generate responses, ChatGPT also understands and interprets a conversation's context. This allows it to create appropriate responses for the situation, making it more effective at maintaining a natural and flowing conversation.

No alt text provided for this image

Overall, ChatGPT is a powerful and versatile tool that has the potential to revolutionize the way we interact with machines. Combining deep learning, unsupervised learning, and fine-tuning large amounts of data can generate human-like responses and help understand the context of a conversation. This makes it a valuable tool for many applications, from customer service to language translation.

Unfortunately, no API is available.

Almost every developer is looking for APIs to interact and integrate chatGPT into their application. Unfortunately, OpenAPI does not offer a public API to be used through their SDK or a simple HTTP interface, as the model explains upon request.

No alt text provided for this image

Using chatGPT through an SDK or directly calling remote APIs would open a whole set of applications, enabling developers to integrate its astonishing capabilities into every app.

In this article, reverse engineering of the private APIs offers a better understanding of chatGPT behavior and some insights into its functioning under the hood.

Reverse engineering the demo?app

Disclaimer: this article is for the sole purpose of providing a better understanding of chatGPT intrinsics while we wait for OpenAI to release full documentation and support through their official SDKs with relative billing. Please be careful because APIs could change without notice because they are not public, and/or your account could be banned for improper service usage. If you use the results of this analysis, do it at your own risk.

ChatGPT demo app is a web application released in HTML, CSS, and Javascript. Its code is fully minified and chunked using webpack, but with Chrome inspector is relatively easy to track remote network calls and identity some exciting facts.

No alt text provided for this image

After the login, three endpoints are invoked for every action a user does and provide support for conversation management.

Invoking these endpoints from any REST client allows interaction with the chat model, like the demo app. Much work is done through HTTP headers, which pass parameters to the stateless backend, allowing to recover session and user state. To ensure chat services are accessed through a web app, OpenAI requires to specify a user-agent header, which we have to set appropriately in our HTTP client to a valid value; otherwise, our requests will be rejected. In our case, we used Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7), AppleWebKit/537.36 (KHTML, like Gecko), Chrome/108.0.0.0 Safari/537.36 to emulate a Chrome user agent on a Mac OSX. This header must be set in any request.

Obtaining a session?token

Once the username is known after login, the first step is to obtain a session object from its relative endpoint (https://chat.openai.com/api/auth/session). This is achieved through a call that returns the following payload:

No alt text provided for this image

An access token is also returned with meaningful user information lasting one month. The token is a standard JWT token that can be decoded to extract user information, the identity provider (basically Google through Auth0), and authorized scopes for user data.

No alt text provided for this image

Please refer to the official documentation for more information about the JWT standard and its usage in the OAuth2 workflow. For our purposes, it is enough to get the token we will use in our Authorization HTTP header for the following calls.

2. Start the conversation

Using the obtained token, we can invoke the conversation endpoint (https://chat.openai.com/backend-api/conversation) with the following parameters. An interesting aspect is that the first message requires setting the action attribute as "variant" and then an array of messages.

No alt text provided for this image
No alt text provided for this image

The response to this payload is a data stream containing the whole set of text tokens building the response. Streams are sent incrementally and terminated by a [DONE] sequence.

To provide an example, the first response in the stream is?

No alt text provided for this image

While the last chunk of the stream is

No alt text provided for this image

Every chunk is sent back as a stringified JSON containing text data and a conversation_id, which helps handle the conversation follow-up.

3. Continuing conversation

Invoking the same endpoint with the "next" action and two follow-up attributes ensures the conversation context is maintained between different calls. The two fundamental parameters are conversation_id and parent_message_id. The first ensures that all the messages belong to the same conversation, while the latter provides support for message ordering.

No alt text provided for this image

One of the most exciting attributes of the payload is a model attribute that points to text-davinci-002-render, suggesting it is using the OpenAI davinci-002 model under the hood. This model has been fine-tuned to provide chatGPT-specific information and moderate results.

Moderation monitoring is also achieved through the moderation (https://chat.openai.com/backend-api/moderations) endpoint, which receives the whole chat text every time a new sentence is appended (either by the model or the user) and returns feedback about whether the conversation contains sensitive information or not.

Where to go from?here?

The release of the ChatGPT demo app gained unprecedented interest from the vast user community. Many people started discussing the power of domain-free conversation enabled by such models. However, even if no API had been released, the developer community started imagining how a possible integration could work. In such a context, we could reasonably expect the rise of conversational agents into several real-world applications shortly. In the meantime, some proofs-of-concept can be developed leveraging existing API, as discussed in this article.

要查看或添加评论,请登录

Luca Bianchi的更多文章

社区洞察

其他会员也浏览了