Build A Simple Web Application That Converts Natural Language Into Source Code In 10 Minutes
1. Welcome to the Transformers' Era
Transformers may deeply reshape the way applications are developed in the coming years. In the near future, most of the low-value code produced by a developer might be generated by AI and Transformers.
A Transformer is a deep learning model that relies on a self-attention mechanism to compute representations without using sequence-aligned Recurrent Neural Networks (RNNs).
Recently, OpenAI released the largest model of its Generative Pre-trained Transformer family, called GPT-3, with a huge capacity of around 175 billion machine learning parameters. It has a remarkable capability to leverage deep learning and generate human-readable text such as stories and poems, but also source code.
One of the main benefits of this model is that it can be used with Few Shot or Zero Shot Learning: a handful of examples in the request, or none at all, is enough to get a result. This considerably shortens and simplifies things, since there are no more long data preparation and training phases.
This article is mostly focused on source code generation with some bonus tracks showing great text generation capabilities. It comes along with the entire Python source code used to build this playground web app.
As of today, OpenAI GPT-3 DaVinci API is still in a beta version so the code generation capabilities might make some progress in the coming months.
2. Overview of the web application capabilities
With this article, you will be able to build and run a simple web application implementing powerful code generation capabilities based on GPT-3, with just a few lines of Python code. Basically, the application is a playground allowing the user to type a simple request in natural language, then see the generated source code. The application is integrated with the GPT-3 API, which handles the code generation or the text completion.
It may be every developer's dream to be able to ask the following and get the generated code:
"Please generate a Python code that predicts the salary with criteria like age, position, experience, using Random Forest algorithm"
or the following: "Please generate a Java class named VIPPerson used to represent a person with name, age and gender attributes".
A preset list of typical requests has been prepared in order to assist the user. Each one can be adapted; the user then clicks on a button to generate the code and see the results.
The technology stack used is the following:
Here are some capabilities of the application you will be able to build in the following sections.
The use case is very simple: The user selects a Preset of a Request Type from the dropdown menu. A text area field is then filled with a standard Request written in natural language, which the user can adapt. Once done, the user clicks on the Generate button and the corresponding source code is generated and displayed.
Code generation: Generate an SQL query
Below is a simple request in natural language describing the fields and the tables, as well as the data selection. The result is an SQL Query which complies with the PostgreSQL syntax.
The user can change the proposed request in the text area field to adapt it to their own needs.
Code generation: Create an AWS CloudWatch Alarm with a specific threshold
As you may know, AWS Cloud can be entirely configured with code. An SDK called Boto is used here to generate a CloudWatch Alarm that is triggered when the CPU exceeds 70%.
Code generation: Predict the salary with criteria like age, position, experience, using Random Forest algorithm
This request generates complete code that predicts a salary from several criteria. The result is Python code that uses RandomForestRegressor to achieve this. A graph is also plotted using the Pyplot library.
Code generation: Generate a Java Class that represents a Person with name, age and gender attributes
This is a very basic Java Class generated with a few attributes. The fact that the description starts with a Java comment and a Class name helps the GPT-3 engine to complete the code with better accuracy.
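A minimal sketch of that priming idea: the prompt ends with a Java comment followed by the opening of the class declaration, so the model is nudged to continue with the class body. The helper name `build_java_prompt` and the exact wording of the comment are assumptions made for illustration; they are not taken from the application's code.

```python
def build_java_prompt(class_name: str, description: str) -> str:
    """Return a prompt that ends with an open Java class declaration.

    Hypothetical helper: the completion is expected to continue the text
    after the opening brace, which steers the model towards Java code.
    """
    comment = f"/* {description.strip()} */"
    declaration = f"public class {class_name} {{"
    return f"{comment}\n{declaration}\n"


prompt = build_java_prompt(
    "VIPPerson",
    "Java class used to represent a person with name, age and gender attributes",
)
print(prompt)
```

The generated completion would then be appended to this prompt to obtain the full class.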
Code generation: Developer Task List
The request can take the form of a task list. The default generated language is Python, but another common language can be specified in the request instead.
Text processing: Summarize a text
The next two presets demonstrate some powerful text generation capabilities. This one summarizes a text. Some parameters in the code can be adjusted to determine how short the result is.
Text processing: Rephrase a text for a 10-year-old kid
Another powerful text generation feature. This request asks the GPT-3 engine to rephrase a text so a 10-year-old kid can understand it.
3. Quick Start
The following will allow you to edit and run the code present on my GitLab. Before doing that, you'll need to register at OpenAI to get your API key.
The application should look like the following:
4. Understand the code
4.1. GPT-3 Initialization
The first section of the code installs the OpenAI GPT-3 dependencies. Replace the API key with the one you received when you signed up on the OpenAI website.
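If you would rather not paste the key into the notebook, one option is to read it from an environment variable. The sketch below assumes the variable is named `OPENAI_API_KEY`; the helper `load_api_key` is hypothetical and is not part of the application's code.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in env_var, or raise a clear error."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Typical use with the legacy (pre-1.0) openai package, left commented out
# so that this sketch runs without the package or a real key:
# import openai
# openai.api_key = load_api_key()
```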
4.2. The Presets
This application comes with 10 presets of requests in natural language: 8 for source code generation and 2 for text completion.
Now let's describe the first preset, which generates an SQL Query. Basically, the function below receives a request in natural language, calls the GPT-3 API with specific parameters, then returns the result (here an SQL Query), which will be displayed on the screen. The other presets are similar.
This preset uses the following parameters:
The response (generated code or completed text) is returned to the caller.
Each preset has its own parameter settings. This code was duplicated for each preset to keep each one readable and to decouple their parameters.
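As an illustration of what such a preset function can look like, here is a minimal sketch. It assumes the legacy (pre-1.0) `openai` package and its `Completion.create` call; the engine name, the parameter values and the prompt wording are illustrative assumptions, not the settings used in the application. The completion call is passed in as an argument so the function can be tried without a key.

```python
from typing import Any, Callable, Dict

# Illustrative values only (assumptions), not the application's settings.
SQL_PRESET_PARAMS: Dict[str, Any] = {
    "engine": "davinci",
    "temperature": 0.0,
    "max_tokens": 150,
    "top_p": 1.0,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "stop": ["#", ";"],
}


def call_gpt3(prompt: str, **params: Any) -> str:
    """Send the prompt to the legacy Completion endpoint (openai<1.0)."""
    import openai  # imported lazily so the rest of the sketch runs offline

    response = openai.Completion.create(prompt=prompt, **params)
    return response.choices[0].text


def preset_sql(request: str, complete: Callable[..., str] = call_gpt3) -> str:
    """Turn a natural-language request into an SQL query string."""
    # Ending the prompt with SELECT primes the model to continue a query.
    prompt = f"### PostgreSQL query\n# {request.strip()}\nSELECT"
    completion = complete(prompt, **SQL_PRESET_PARAMS)
    # The prompt already ends with SELECT, so prepend it to the completion.
    return "SELECT" + completion


def fake_complete(prompt: str, **params: Any) -> str:
    """Offline stand-in for the API call, used only to try the function."""
    return " name FROM employees"


print(preset_sql("List the names of all employees", complete=fake_complete))
```

Keeping the parameters in one dictionary per preset preserves the decoupling described above while making each preset's settings easy to compare.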
4.3. The Playground Web Application
This web application is based on the Dash framework, which allows you to build and run web applications in Python. More specifically, the jupyter-dash library has the advantage of running web apps directly from your Jupyter or Colab notebook.
See the Dash documentation for more details.
4.3.1. Dependencies import
This section imports all the Dash dependencies. Please note that version 2.0.0 of dash is pinned because of a bug in the latest version at the time of writing. It might have been resolved since, so don't hesitate to remove the reference to version 2.0.0 in the future.
Dash provides 2 types of components:
Input, Output and State dependencies will allow us to manage the callback triggered by events on the displayed fields.
4.3.2. The Playground Web Page
In this section, a simple web page layout is built, containing mainly a title and 4 elements:
4.3.3. Preset Dropdown Callback
This part of the code is called when the user selects a preset from the dropdown field. Basically, the function receives the Preset ID from the dropdown-preset field as an Input and returns a pre-formatted request to the textarea-query field as an Output.
A second function returns a field called query, which is a pre-formatted request depending on the selected Preset ID.
Again, the intent is to simplify the code as much as possible, so the reader can have a better understanding of it, but I'm sure you will be able to apply some code best practices later, once you get the logic.
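The lookup behind this callback can be sketched as a plain function. The preset IDs below are hypothetical, and the two request texts are the examples quoted earlier in this article; the real application defines its own list.

```python
from typing import Optional

# Hypothetical preset IDs mapped to pre-formatted requests.
PRESET_QUERIES = {
    1: ("Please generate a Python code that predicts the salary with criteria "
        "like age, position, experience, using Random Forest algorithm"),
    2: ("Please generate a Java class named VIPPerson used to represent a "
        "person with name, age and gender attributes"),
}


def preset_query(preset_id: Optional[int]) -> str:
    """Return the pre-formatted request for a preset, or '' if none matches."""
    if preset_id is None:
        return ""  # nothing selected in the dropdown yet
    return PRESET_QUERIES.get(int(preset_id), "")


print(preset_query(2))
```

In Dash, a function like this would typically be wrapped in a callback taking the `value` of dropdown-preset as its Input and writing the `value` of textarea-query as its Output.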
4.3.4. The 'Generate' Button Callback
This part of the code is called when the user pushes the 'Generate' button. Basically, the name of the Preset function to be called is built dynamically from the Preset ID or Preset number. Then, the GPT-3 API is called with the request that comes from the text area field and the specific parameters set for this preset. Finally, the generated code or the completed text is displayed in a specific output field (div-output-results2).
Please note that there are also 2 additional State dependencies in this callback. This means that the callback is not triggered when the textarea-query or the dropdown-preset value changes, but their current values are still passed to the callback when the button is clicked.
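The dispatch logic can be sketched without Dash as follows. Note the swap: this sketch uses an explicit dictionary of functions instead of building the function name as a string, and the two preset functions are stand-ins that do not call GPT-3. The button id `button-generate` in the commented wiring is an assumed name.

```python
from typing import Callable, Dict, Optional


def preset_1(request: str) -> str:
    """Stand-in for a preset that would call GPT-3 with SQL settings."""
    return f"-- SQL for: {request}"


def preset_2(request: str) -> str:
    """Stand-in for a preset that would call GPT-3 with summary settings."""
    return f"Summary of: {request}"


# Explicit registry, used here instead of building the function name as a string.
PRESETS: Dict[int, Callable[[str], str]] = {1: preset_1, 2: preset_2}


def generate(n_clicks: Optional[int], preset_id: Optional[int],
             request: Optional[str]) -> str:
    """Body of a 'Generate' callback: only the click triggers it, while the
    preset and the request text are read as State at that moment."""
    if not n_clicks or not request:
        return ""  # nothing clicked yet, or empty text area
    handler = PRESETS.get(preset_id)
    if handler is None:
        return f"Unknown preset: {preset_id}"
    return handler(request)


# Hypothetical wiring in Dash ('button-generate' is an assumed id):
# @app.callback(
#     Output("div-output-results2", "children"),
#     Input("button-generate", "n_clicks"),
#     State("dropdown-preset", "value"),
#     State("textarea-query", "value"),
# )
# def on_generate(n_clicks, preset_id, request):
#     return generate(n_clicks, preset_id, request)

print(generate(1, 2, "a long text"))
```

Because the dropdown and the text area are declared as State rather than Input, editing them does not re-run the function; only a click on the button does.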
5. The Final Word
Thank you for reading my article to the end, hope you enjoyed it!
As you can see, GPT-3 is very promising and can be integrated easily with a few lines of Python code. Dash might help to reach the 10-minute goal.
You can find the entire source code on my GitLab: GPT-3-Dash-Playground