Build A Simple Web Application That Converts Natural Language Into Source Code In 10 Minutes
Source: iStockphoto.com

1. Welcome to the Transformers' Era

Transformers may deeply reshape the way applications are developed in the coming years. In the near future, most of the low-value code a developer writes today might be generated by AI and Transformers.

A Transformer is a deep learning model that relies on a self-attention mechanism to compute representations without using sequence-aligned Recurrent Neural Networks (RNNs).
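To give an idea of what self-attention computes, here is a minimal sketch of scaled dot-product attention in plain Python. It is deliberately simplified: a real Transformer first derives the queries, keys and values from the input through learned weight matrices and uses several attention heads in parallel. The function names are illustrative only.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


# Two identical keys receive equal weight, so the output is the
# plain average of the two value vectors.
result = self_attention(
    queries=[[1.0, 0.0]],
    keys=[[1.0, 1.0], [1.0, 1.0]],
    values=[[0.0, 2.0], [4.0, 6.0]],
)
```

Every position attends to every other position in a single step, which is why no recurrence is needed.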

Recently, OpenAI released the largest model of its Generative Pre-trained Transformer family, called GPT-3, with a capacity of around 175 billion machine learning parameters. It has a remarkable ability to leverage deep learning and generate human-readable text such as stories and poems, but also source code.

One of the main benefits of this model is that it can be used with few-shot or zero-shot learning, which considerably shortens and simplifies the work by removing the long data preparation and training phases.
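To make the distinction concrete, here is a minimal sketch of how a zero-shot and a few-shot prompt could be assembled as plain strings. The helper name `build_prompt`, the `###` request marker and the example pair are illustrative assumptions, not taken from the playground code.

```python
def build_prompt(request, examples=()):
    """Assemble a completion prompt.

    With no examples this is a zero-shot prompt; with a handful of
    (request, answer) pairs it becomes a few-shot prompt.
    """
    parts = []
    for example_request, example_answer in examples:
        parts.append(f"### {example_request}\n{example_answer}\n")
    # The final request is left unanswered so the model completes it.
    parts.append(f"### {request}\n")
    return "\n".join(parts)


zero_shot = build_prompt("List all employees older than 30 as an SQL query")

few_shot = build_prompt(
    "List all employees older than 30 as an SQL query",
    examples=[
        ("Count the rows of the orders table as an SQL query",
         "SELECT COUNT(*) FROM orders"),
    ],
)
```

In both cases the model weights stay untouched; the only "training" is the text placed in front of the request.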

This article is mostly focused on source code generation with some bonus tracks showing great text generation capabilities. It comes along with the entire Python source code used to build this playground web app.

As of today, the OpenAI GPT-3 Davinci API is still in beta, so its code generation capabilities may improve in the coming months.


2. Overview of the web application capabilities

With this article, you will be able to build and run a simple web application implementing powerful code generation capabilities based on GPT-3, with just a few lines of Python code. Basically, the application is a playground where the user types a simple request in natural language, then sees the generated source code. The application is integrated with the GPT-3 API, which handles the code generation or the text completion.

It may be every developer's dream to ask for the following and get the generated code:

"Please generate a Python code that predicts the salary with criteria like age, position, experience, using Random Forest algorithm"

or the following: "Please generate a Java class named VIPPerson used to represent a person with name, age and gender attributes".

A preset list of typical requests has been prepared to assist the user. Each one can be adapted; the user then clicks a button to generate the code and see the result.

The technology stack used is the following:

  • Python
  • Dash (for the Python web application)
  • GPT-3 API (OpenAI)

Here are some capabilities of the application you will be able to build in the following sections.

The use case is very simple: the user selects a preset of a request type from the dropdown menu. A text area is then filled with a standard request written in natural language, which the user can adapt. Once done, the user clicks the Generate button, and the corresponding source code is generated and displayed.

Code generation: Generate an SQL query

This preset takes a simple request in natural language describing the fields and the tables, as well as the data selection. The result is an SQL query that complies with the PostgreSQL syntax.

The user can change the proposed request in the text area to adapt it to their own needs.



Code generation: Create an AWS CloudWatch Alarm with a specific threshold

As you may know, the AWS Cloud can be entirely configured with code. An SDK called Boto is used here to generate a CloudWatch alarm that is triggered when CPU utilization exceeds 70%.



Code generation: Predict the salary with criteria like age, position, experience, using Random Forest algorithm

This request generates complete code to predict a salary from several criteria. The result is Python code that uses RandomForestRegressor to achieve this. A graph is also plotted using the Pyplot library.



Code generation: Generate a Java Class that represents a Person with name, age and gender attributes

This is a very basic Java class generated with a few attributes. Starting the description with a Java comment and a class name helps the GPT-3 engine complete the code with better accuracy.
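As an illustration of that priming technique, the sketch below builds a prompt that opens with a Java comment and the class declaration, so the most natural continuation is the class body. The exact wording of the real preset is not shown in this article; `java_class_prompt` and its output format are assumptions.

```python
def java_class_prompt(class_name, description):
    """Build a prompt that ends exactly where the class body should begin."""
    return (
        f"/* {description} */\n"
        f"public class {class_name} {{"
    )


prompt = java_class_prompt(
    "VIPPerson",
    "Java class used to represent a person with name, age and gender attributes",
)
print(prompt)
```

Because the prompt already fixes the language and the class name, the engine has less to guess.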



Code generation: Developer Task List

The request can take the form of a task list. The default generated language is Python, but any other common language can be specified in the request instead.



Text processing: Summarize a text

The next two presets demonstrate some powerful text generation capabilities. This one summarizes a text. Some parameters in the code can be adjusted to determine how short the result is.



Text processing: Rephrase a text for a 10-year-old kid

Another powerful text generation feature. This request asks the GPT-3 engine to rephrase a text so that a 10-year-old kid can understand it.




3. Quick Start

The following will allow you to edit and run the code present on my GitLab. Before doing that, you'll need to register at OpenAI to get your API key.

  • Go to the OpenAI website and sign up to get your API Key
  • Go to my GitLab project named GPT-3-Dash-Playground
  • Edit the Jupyter Python notebook with Google Colab
  • Replace the openai.api_key value in the code with your API key
  • Select 'Run all' in the 'Runtime' menu
  • A web app should start in a few seconds. Click on the URL generated at the bottom of the page to display it

The application should look like the following:

[Screenshot of the application]


4. Understand the code

4.1. GPT-3 Initialization

The first section of the code installs the OpenAI GPT-3 dependencies. Replace the API key with the one you received when you signed up on the OpenAI website.
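The original code cell is not reproduced here, so the following is only a sketch of what such an initialization could look like. It assumes the legacy (pre-1.0) `openai` Python package, where the key is assigned to `openai.api_key`, and it reads the key from the `OPENAI_API_KEY` environment variable instead of hard-coding it. The helper names `load_api_key` and `configure_openai` are hypothetical.

```python
# In a Colab cell, the package would first be installed with:
#   !pip install openai
import os


def load_api_key(env=None):
    """Return the API key from the environment, failing loudly if absent."""
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("Set OPENAI_API_KEY before running the notebook.")
    return key


def configure_openai(env=None):
    """Assign the key to the module attribute named in the Quick Start."""
    import openai  # imported here so load_api_key works without the package

    openai.api_key = load_api_key(env)
    return openai
```

Keeping the key out of the notebook avoids publishing it by accident when the notebook is shared.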


4.2. The Presets

This application comes with 10 presets of requests in natural language: eight for source code generation and two for text completion.

Now let's describe the first preset, which generates an SQL query. Basically, the preset function receives a request in natural language, calls the GPT-3 API with specific parameters, then returns the result (here an SQL query), which will be displayed on the screen. The other presets are similar.

This preset uses the following parameters:

  • engine: The GPT-3 engine, or model, which will generate the completion. Some engines are suited to natural language tasks, others specialize in code. code-davinci-001 means that the most advanced code generation engine will be used. Additional information is available in the OpenAI documentation.
  • prompt: The query to be completed, in natural language, e.g. ### generate an SQL query that lists the employees of a company that are between 20 and 30 years old. (This is a very simple one; the preset itself is much more complex.)
  • temperature: The temperature controls the randomness of the answer. 0.0 is the most deterministic and repetitive value, which remains the best choice for code generation.
  • max_tokens: The maximum number of tokens to generate
  • top_p: Controls diversity via nucleus sampling. 0.5 means half of all likelihood-weighted options are considered. Setting it to 1.0 might give better results for code generation.
  • frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. This parameter is set to 0.0 for our needs.
  • presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. This parameter is set to 0.0 for our needs.
  • stop: Up to 4 sequences where the API will stop generating further tokens. The returned text will not contain the stop sequence. In this preset, stop sequences are set to "#", ";".
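Put together, the parameters above could be passed to the API roughly as follows. This is a sketch, not the author's exact function: it assumes the legacy (pre-1.0) `openai` package and its `openai.Completion.create` call, the names `sql_preset_params` and `preset_sql` are hypothetical, and the `max_tokens` and `top_p` values are guesses because the article does not state them.

```python
def sql_preset_params(query):
    """Request parameters for the SQL preset, following the list above."""
    return {
        "engine": "code-davinci-001",
        "prompt": query,
        "temperature": 0.0,        # most deterministic setting
        "max_tokens": 256,         # assumed value, not given in the article
        "top_p": 1.0,              # assumed value, suggested for code
        "frequency_penalty": 0.0,
        "presence_penalty": 0.0,
        "stop": ["#", ";"],
    }


def preset_sql(query, create=None):
    """Send the request and return the generated text.

    `create` defaults to the legacy openai.Completion.create; passing a
    stand-in makes the function usable without network access.
    """
    if create is None:
        import openai
        create = openai.Completion.create
    response = create(**sql_preset_params(query))
    return response["choices"][0]["text"]
```

Since the stop sequence is excluded from the returned text, the generated query comes back without its trailing semicolon.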



The response (generated code or completed text) is returned to the caller.

Each preset has its own parameter settings. The code was deliberately duplicated for each preset to keep it clear and to keep the parameters decoupled.


4.3. The Playground Web Application

This web application is based on the Dash framework, which allows you to build and run web applications in Python. More specifically, the jupyter-dash library has the advantage of running web apps directly from your Jupyter or Colab notebook.

The Dash documentation is available on the Plotly website.

4.3.1. Dependencies import

This section imports all the Dash dependencies. Please note that version 2.0.0 of Dash is pinned because of a bug in the latest version. It might have been resolved since, so don't hesitate to remove the reference to version 2.0.0 in the future.

Dash provides 2 types of components:

  • Core (built-in) components
  • HTML components

Input, Output and State dependencies will allow us to manage the callback triggered by events on the displayed fields.



4.3.2. The Playground Web Page

In this section, a simple web page layout is built, containing mainly a title and four elements:

  • dropdown-preset: A Dropdown field with the 10 presets list
  • textarea-query: A Text Area field that will be set by default with the selected preset request. Once the preset is selected, the user will be able to change the proposed request.
  • button-generate: An HTML Button that will trigger the call to the GPT-3 API and get the response
  • div-output-result2: A pre-formatted HTML output field to display the result, which can be generated code or completed text.
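A layout with these four components could be sketched as follows, assuming the Dash 2.x import style (`from dash import dcc, html`) and the `jupyter_dash` package. The page title, the preset labels and the function names are placeholders, since the original code is not reproduced in this article. The Dash imports are placed inside the functions only so that the constants can be loaded without Dash installed.

```python
# Component ids named in the list above.
DROPDOWN_ID = "dropdown-preset"
TEXTAREA_ID = "textarea-query"
BUTTON_ID = "button-generate"
OUTPUT_ID = "div-output-result2"

# Placeholder labels: the article does not list all ten preset names.
PRESET_OPTIONS = [
    {"label": f"Preset {i}", "value": i} for i in range(1, 11)
]


def build_layout():
    """Return the page layout: a title plus the four components."""
    from dash import dcc, html  # Dash 2.x import style

    return html.Div([
        html.H1("GPT-3 Playground"),  # assumed title
        dcc.Dropdown(id=DROPDOWN_ID, options=PRESET_OPTIONS, value=1),
        dcc.Textarea(id=TEXTAREA_ID, value="",
                     style={"width": "100%", "height": 150}),
        html.Button("Generate", id=BUTTON_ID, n_clicks=0),
        html.Pre(id=OUTPUT_ID),  # pre-formatted output keeps code indentation
    ])


def create_app():
    """Create the notebook app and attach the layout."""
    from jupyter_dash import JupyterDash

    app = JupyterDash(__name__)
    app.layout = build_layout()
    return app
```

In a notebook, something like `create_app().run_server(mode="external")` would then start the server and print a URL, assuming the jupyter-dash API of that period.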



4.3.3. Preset Dropdown Callback

This part of the code is called when the user selects a preset from the dropdown field. Basically, the function receives the Preset ID from the dropdown-preset field as an Input and returns a pre-formatted request to the textarea-query field as an Output.

A second function returns a field called query, which is a pre-formatted request depending on the selected Preset ID.
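The two functions described above could look roughly like this. It is a sketch under assumptions: the request texts are shortened stand-ins for the real presets, and `get_query` and `register_preset_callback` are hypothetical names. Only `Input`, `Output` and the `app.callback` decorator come from Dash itself.

```python
# Hypothetical request texts keyed by Preset ID; the real ones are longer.
PRESET_QUERIES = {
    1: "### Generate an SQL query that lists the employees of a company "
       "that are between 20 and 30 years old",
    2: "# Create an AWS CloudWatch alarm triggered when CPU exceeds 70%",
}


def get_query(preset_id):
    """Return the pre-formatted request for a preset, or '' if unknown."""
    return PRESET_QUERIES.get(preset_id, "")


def register_preset_callback(app):
    """Wire the dropdown value (Input) to the text area value (Output)."""
    from dash.dependencies import Input, Output

    @app.callback(
        Output("textarea-query", "value"),
        Input("dropdown-preset", "value"),
    )
    def update_query(preset_id):
        # Runs every time the user picks another preset.
        return get_query(preset_id)
```

Keeping `get_query` separate from the callback means the preset texts can be checked without starting the web app.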

Again, the intent is to simplify the code as much as possible, so the reader can have a better understanding of it, but I'm sure you will be able to apply some code best practices later, once you get the logic.



4.3.4. The 'Generate' Button Callback

This part of the code is called when the user pushes the 'Generate' button. Basically, the name of the preset function to call is built dynamically from the Preset ID (the preset number). Then, the GPT-3 API is called with the request that comes from the text area and the specific parameters set for this preset. Finally, the generated code or the completed text is displayed in a specific output field (div-output-results2).

Please note that there are also two additional State arguments in this callback. This means the callback is not triggered when the textarea-query or dropdown-preset value changes, but their values are still passed to it.
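Here is one way such a callback could be written. How the original resolves the function name is not shown, so this sketch uses a dictionary keyed by the dynamically built name; `preset_1`, `PRESET_FUNCTIONS`, `generate` and `register_generate_callback` are hypothetical. The article spells the output id in two ways; `div-output-result2` is assumed here.

```python
def preset_1(query):
    """Stand-in for a preset function; the real one calls the GPT-3 API."""
    return f"-- generated for: {query}"


# Lookup table from function name to function.
PRESET_FUNCTIONS = {"preset_1": preset_1}


def generate(n_clicks, query, preset_id, registry=None):
    """Callback body: build the preset function name and call it."""
    if not n_clicks:
        return ""  # nothing to show before the first click
    registry = PRESET_FUNCTIONS if registry is None else registry
    preset_function = registry.get(f"preset_{preset_id}")
    if preset_function is None:
        return f"Unknown preset: {preset_id}"
    return preset_function(query)


def register_generate_callback(app):
    """Only the button click triggers the callback; State values ride along."""
    from dash.dependencies import Input, Output, State

    app.callback(
        Output("div-output-result2", "children"),
        Input("button-generate", "n_clicks"),
        State("textarea-query", "value"),
        State("dropdown-preset", "value"),
    )(generate)
```

Dash passes the arguments in declaration order, Input first and then each State, which is why `generate` receives the click count, the query and the Preset ID in that order.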



5. The Final Word

Thank you for reading my article to the end. I hope you enjoyed it!

As you can see, GPT-3 is very promising and can be put to work easily with a few lines of Python code. Dash helps to reach the 10-minute goal.

You can find the entire source code on my GitLab: GPT-3-Dash-Playground



More articles by Offer SADEY
