AWS DeepRacer Models For Beginners

Introduction

In the previous part, we presented a brief description of cloud computing, machine learning, and reinforcement learning with AWS DeepRacer. We also covered the specifications and theoretical background needed to build a model, select the hyperparameters, and write a reward function using the environment parameters.

This part presents a step-by-step guide on how to create, train, and evaluate a model on AWS DeepRacer. We will also explain how to read and interpret the performance of a model during the training and evaluation phases.

Create a model

The first step to start with reinforcement learning on DeepRacer is creating a model. To start, we go to the AWS Console and type "DeepRacer" in the search bar.

From the DeepRacer console, select “Create model”.

Another option is to use the side menu in the DeepRacer console: select "Your models", and then select "Create model".

Step 1: Specifying the model name and environment

On the "Create model" page, we have to enter a name for the model under the training details. We can also add training job descriptions, but it is optional. Visit the?Tagging page?to learn more about tags.

The next step is to select a racing track as the training environment. A training environment specifies the conditions the agent will be trained under. Tracks come in many shapes and vary in complexity. As beginners, we can start with a simple track consisting of basic shapes and smooth turns, and gradually increase the track complexity as we become more familiar with DeepRacer.

Step 2: Choosing a race type and training algorithm

After naming the model, we need to select the race type that we will train the model on. AWS offers three different types of races. "Time trial" is the easiest race type, where the only objective is to complete the track in the least amount of time. In the "Object avoidance" race, we aim to complete the track while avoiding random static objects placed along it. Finally, "Head-to-head" racing is the most challenging: we face moving objects along the race, which are other players racing on the same track.

We will consider the time trial race as the selected option for the remaining instructions. Once the race type is selected, we need to choose the training algorithm. DeepRacer provides two training algorithms: Proximal Policy Optimization (PPO) and Soft Actor-Critic (SAC). PPO works with both continuous and discrete action spaces, while SAC supports only continuous action spaces. You can learn more about the training algorithms from the DeepRacer documentation.

Besides the training algorithm, we need to select the hyperparameters (see this link for more details about hyperparameters). You can watch this video to learn more about how to choose proper hyperparameters for our model.

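For reference, here is a sketch of the hyperparameters the console exposes for PPO, with commonly cited default values. The exact names and defaults here are assumptions, so verify them in the console before relying on them.

# Hyperparameters exposed by the DeepRacer console for PPO, with
# commonly cited default values (assumed; verify in the console).
hyperparameters = {
    "batch_size": 64,              # gradient descent batch size
    "num_epochs": 10,              # passes over the experience data per update
    "learning_rate": 0.0003,       # gradient descent step size
    "entropy": 0.01,               # encourages exploration
    "discount_factor": 0.999,      # how strongly future rewards count
    "loss_type": "huber",          # "huber" or "mean squared error"
    "episodes_per_iteration": 20,  # experience episodes between policy updates
}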

Step 3: Defining the action space

The action space specifies the actions the agent can take inside the environment. For DeepRacer, this means the range of steering angles and speeds of the vehicle. There are two types of action spaces: discrete and continuous. A discrete action space lets us select a fixed set of steering angles and speeds the car can choose from at a single state. The final action space contains every combination of steering angle and speed based on the values specified earlier; for example, it might contain six possible actions the agent can take at a single state, as in the sketch below.

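To make this concrete, here is a hypothetical six-action discrete action space written as Python data; the angles and speeds are illustrative values, not console defaults.

# A hypothetical discrete action space: each action pairs a steering
# angle (degrees, left positive) with a speed (metres per second).
discrete_action_space = [
    {"steering_angle": -30.0, "speed": 1.0},  # sharp right, slow
    {"steering_angle": -15.0, "speed": 2.0},  # gentle right
    {"steering_angle":   0.0, "speed": 3.0},  # straight, fast
    {"steering_angle":   0.0, "speed": 1.5},  # straight, slow
    {"steering_angle":  15.0, "speed": 2.0},  # gentle left
    {"steering_angle":  30.0, "speed": 1.0},  # sharp left, slow
]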

Unlike a discrete action space, a continuous action space does not have a fixed number of actions. We only specify the minimum and maximum values for both the steering angle and the speed, and the agent can take any action within those ranges, as in the sketch below.

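By contrast, a continuous action space is defined only by its bounds; the ranges below are again illustrative.

# A hypothetical continuous action space: at each step the agent may
# choose any steering angle and speed within these bounds.
continuous_action_space = {
    "steering_angle": {"low": -30.0, "high": 30.0},  # degrees
    "speed": {"low": 0.5, "high": 4.0},              # metres per second
}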

Step 4: Choosing a vehicle

Here we specify the vehicle shell and sensor configuration that will run on the track. DeepRacer provides two agents by default with different specifications. We can also create our own custom agents and set the configuration, such as the camera type, or add a LIDAR sensor. We selected the original DeepRacer vehicle for this model.

Step 5: Customising the reward function

The last step in creating a model is to choose a reward function and the training time. AWS provides some simple example reward functions. We can either select one of the examples and modify it, or create our own reward function from scratch.

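As a starting point, the sketch below is modelled on AWS's follow-the-centre-line example: it rewards the agent for staying close to the centre of the track, using the track_width and distance_from_center entries of the params dictionary that DeepRacer passes to every reward function.

def reward_function(params):
    # Reward staying close to the centre line (based on the AWS example).
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three markers at increasing fractions of the track width.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    # The closer the car stays to the centre line, the higher the reward.
    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # almost off track

    return float(reward)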

Finally, depending on the track and reward function, we need to choose an appropriate training time.

We also select the maximum time we want our model to train; setting a stopping time helps control the cost of training. Usually, the more a model is trained, the better it performs. However, too much training can cause overfitting, where the model learns its training environment so closely that it cannot generalise to testing environments. Therefore, selecting the proper training time is crucial for the model's performance. You can learn more about overfitting and underfitting in this link.

Finally, before we press "Create model", there is an option to automatically submit the model to the DeepRacer league after training completes. After confirming the choices, we can click "Create model". AWS will start building the model and begin the training process for the selected amount of time.

Training Analysis

AWS DeepRacer utilises SageMaker to train the model behind the scenes and leverages RoboMaker to simulate the agent's interaction with the environment.

Once you've submitted your training job, wait for it to be initialised and then executed. The initialisation process takes roughly 6 minutes to change the status from "Initialising" to "In progress".

To track the progress of your training job, look at the Reward graph and the Simulation video stream.

You can refresh the Reward graph using the refresh button next to it until the training job is completed. The generated chart shows three lines:

  • Average reward
  • Average percentage completion (Training)
  • Average percentage completion (Evaluating)

Each of these metrics is an indicator of how the model is performing during the training phase. You can learn more about the reward graph in this video.
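
If you export these metrics yourself, you can reproduce a similar chart offline. The sketch below assumes a hypothetical metrics.csv file with iteration, avg_reward, and avg_completion columns; this is not a format DeepRacer produces directly.

# Plot average reward and track completion from a hypothetical CSV
# export (columns: iteration, avg_reward, avg_completion).
import csv
import matplotlib.pyplot as plt

iterations, avg_reward, avg_completion = [], [], []
with open("metrics.csv") as f:
    for row in csv.DictReader(f):
        iterations.append(int(row["iteration"]))
        avg_reward.append(float(row["avg_reward"]))
        avg_completion.append(float(row["avg_completion"]))

fig, ax1 = plt.subplots()
ax1.plot(iterations, avg_reward, color="tab:blue", label="Average reward")
ax1.set_xlabel("Iteration")
ax1.set_ylabel("Average reward")

ax2 = ax1.twinx()  # completion percentage on a second y-axis
ax2.plot(iterations, avg_completion, color="tab:orange",
         label="Average completion (%)")
ax2.set_ylabel("Track completion (%)")

fig.legend(loc="upper left")
plt.show()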

The following video shows a training demo of how the model learns over time through trial and error in each run.

Evaluation

Once training is completed, you can evaluate the model. When the training job finishes, the model enters the Ready state; even if training stops early, the model can still be Ready, trained up to the point of failure.

Step 1: Click on "Start evaluation".

Step 2: Choose a track under Evaluation criteria on the Evaluate model page. You can evaluate your model on any track; however, selecting the track used in training, or the one most similar to it, will yield the best results.

Step 3: Choose the race type you used to train the model under Race type on the Evaluate model page.

Step 4: For your initial model, turn off the "Submit model after evaluation" option under Virtual Race Submission on the Evaluate model page. Later, leave this option enabled if you want to participate in a racing event. Then choose "Start evaluation" to begin creating and initialising the evaluation job. The startup process takes roughly 3 minutes.

Step 5: The evaluation outcome, including the trial time and track completion rate, is displayed under "Evaluation" after each trial as the evaluation advances. You can watch how the agent performs on the chosen track in the simulation video stream window. In the evaluation shown here, each trial ended with the car going off track, which means we need a better-trained model to be able to complete the race.

The following video shows a demo of the evaluation process. As you can see, all three trials finish with an "Off track" status, which means the agent didn't finish the race. This indicates that we need a better reward function and a longer training time for the agent to learn.

Recap

In this part of the series, we have followed a step-by-step guide to create, train, and evaluate a model on AWS DeepRacer. We have learned how to select the most appropriate options while creating, training, and evaluating the model.

What is next?

In the following part of the series, we will explain four different reward functions in detail. We will show how we logically build a model and select the hyperparameters.

Acknowledgement

This article is prepared by students at AWS Academy@Western Sydney University.
