AWS DeepRacer Tips and Tricks: How to build a powerful rewards function with AWS Lambda and Photoshop
Wong Chun Yin,Cyrus (黃俊彥)
AWS ML Hero + Microsoft Azure AI MVP + Google Developer Experts - GCP & AI/ML(GenAI)
The IVE DeepRacer team is a group of Year 1 students from the Higher Diploma in Cloud and Data Centre Administration. The team won Champion (Peter), 1st Runner-up (Eddie), 5th place (Bo Zuo Li), and the 8th and 10th places. I am sure no one expected them to win any prize, because all of them are just Year 1 non-degree students!
One of the most painful parts of training a DeepRacer model is re-implementing everything that could be done easily with a Python library, e.g. geometry, because you cannot freely import Python libraries in the reward function.
It is undifferentiated heavy lifting, so don't reinvent the wheel!
Now, let us tell you our secret weapon from the AWS Hong Kong DeepRacer League!
Our trick is to move the reward function logic to another application, where we can use all the powerful Python libraries!
Deepracerrewardfunctionapi is an AWS SAM application with API Gateway and a Lambda function. With AWS Lambda, you can do nearly anything you like inside your reward function. The reward function in the DeepRacer console just works as a proxy that sends the data to the Lambda function. A Lambda layer contains powerful Python libraries, including NumPy, sympy, sklearn, and Pillow.
Before training, we use Photoshop to draw the target path and add the image to the Lambda function source folder. Here is our reward function in the DeepRacer console.
import urllib.request
import urllib.parse
import json

def reward_function(params):
    url = 'https://XXXXX.execute-api.us-east-1.amazonaws.com/Prod/reward/'
    query_string = urllib.parse.urlencode({"json": json.dumps(params)})
    url = url + "?" + query_string
    with urllib.request.urlopen(url) as response:
        response_text = response.read().decode('utf-8')
        result = json.loads(response_text)
        return float(result["reward"])
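On the receiving side, the Lambda function has to unpack the `json` query parameter and answer with a body containing `reward`. The following is only a minimal sketch, assuming an API Gateway Lambda proxy integration; `compute_reward` is a hypothetical placeholder and not the team's actual logic.

```python
import json


def compute_reward(params):
    # Hypothetical placeholder: the real logic (Pillow, sklearn, sympy)
    # would live here. This stub only checks one input parameter.
    return 1.0 if params.get("all_wheels_on_track") else 1e-3


def lambda_handler(event, context):
    # With a Lambda proxy integration, query parameters arrive in
    # "queryStringParameters" (None when the request has no query string).
    query = event.get("queryStringParameters") or {}
    params = json.loads(query.get("json", "{}"))
    reward = compute_reward(params)
    # The proxy reward function reads result["reward"] from this body.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"reward": reward}),
    }
```

Keeping the response shape to a single `reward` key means the console-side proxy never has to change when the Lambda logic does.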
The Most POWERFUL Reward Function
First, we use Pillow to load the path map image.
Convert the point from the simulation world coordinate system to the image coordinate system. In the image above, the black dot represents the position of the DeepRacer, and it is not next to waypoint 15.
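The conversion from simulation coordinates to image pixels could look roughly like the sketch below. The scale (`pixels_per_metre`), the origin offsets, and the function name are all assumptions for illustration; the real values depend on how the track image was drawn.

```python
def world_to_pixel(x, y, pixels_per_metre, image_height,
                   origin_x=0.0, origin_y=0.0):
    """Map a simulation point (metres) to image pixel coordinates.

    Assumes the image's bottom-left corner corresponds to
    (origin_x, origin_y) in the simulation world.
    """
    px = (x - origin_x) * pixels_per_metre
    # Image rows grow downwards, while the simulation y-axis grows upwards,
    # so the y coordinate has to be flipped.
    py = image_height - (y - origin_y) * pixels_per_metre
    return int(round(px)), int(round(py))
```

For example, with an assumed scale of 100 pixels per metre and a 600-pixel-high image, the sample position `x=7, y=1` lands at pixel `(700, 500)`.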
params = {
    'all_wheels_on_track': True,
    'x': 7,
    'y': 1,
    'distance_from_center': 0,
    'heading': 60,
    'progress': 0,
    'steps': 1,
    'speed': 0.5,
    'steering_angle': 6,
    'track_width': 0.2,
    'waypoints': waypoints,
    'closest_waypoints': [0, 1],
    'is_left_of_center': True,
    'is_reversed': False,
}
In the map, crop a circle around the DeepRacer and rotate the image so that the x-axis becomes the heading direction.
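The article does this step on the image itself with Pillow. To show the underlying geometry without an image, the sketch below applies the same idea to a list of path points instead: keep the points inside a circle around the car, then rotate them so the heading points along +x. The function name and the y-up coordinate convention are assumptions.

```python
import math


def to_car_frame(points, car_xy, heading_deg, radius):
    """Keep points within `radius` of the car and rotate them by -heading."""
    cx, cy = car_xy
    theta = math.radians(heading_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    local = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        if math.hypot(dx, dy) > radius:
            continue  # outside the circular crop
        # Rotation by -heading: the heading direction maps onto the +x axis.
        local.append((dx * cos_t + dy * sin_t, -dx * sin_t + dy * cos_t))
    return local
```

After this step, a point directly ahead of the car has a positive x and a y close to zero, which is what makes the later regression slope easy to interpret.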
Extract the coloured pixels and return the failure reward if the number of coloured points is below a threshold. Otherwise, use linear regression from sklearn to fit a regression line through all the coloured points.
The heading is the x-axis, the black line is the regression line, and the orange line represents the current steering angle. Convert the slope of the regression line into degrees; this is the target direction.
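The article uses sklearn for the fit. As a dependency-free illustration of the same idea, the sketch below computes an ordinary least-squares line by hand and converts its slope to degrees; the function names are assumptions and this is not the team's code.

```python
import math


def fit_line(points):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    if sxx == 0:
        raise ValueError("all points share the same x; slope is undefined")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slope_to_degrees(slope):
    # atan returns radians in (-pi/2, pi/2); convert to degrees.
    return math.degrees(math.atan(slope))
```

Because the points are already expressed relative to the heading, a slope of 1 corresponds to a target direction of about 45 degrees away from the current heading.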
Now convert all the data into sympy Geometry objects, and all of the calculations become very simple. Get the angle between the target direction and the current steering angle line, that is, the angle between the black line and the orange line. A positive value means the car needs to turn left, and a negative value means it needs to turn right.
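The sign convention can be illustrated with plain arithmetic instead of sympy. This is a minimal sketch, assuming both angles are measured in degrees from the heading axis with positive values to the left; `turn_needed` is a hypothetical name.

```python
def turn_needed(target_deg, steering_deg):
    """Signed angle from the steering line to the target line, in degrees.

    The result is normalised to [-180, 180): positive means turn left,
    negative means turn right.
    """
    return (target_deg - steering_deg + 180.0) % 360.0 - 180.0
```

With the sample `steering_angle` of 6, a target direction of 20 degrees gives +14 (turn further left), while a target of -10 degrees gives -16 (turn right).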
Reward consists of 3 components:
- Distance Reward - it uses NumPy and sympy to calculate the perpendicular distance of the DeepRacer from the regression line, and reduces the value with a Gaussian function for smoothing.
- Speed Reward - speed / max_speed * 100
- Track Reward
The track reward consists of 3 colour rules: green favours driving straight ahead, blue penalizes right turns, and red promotes left turns.
The final reward gives additional marks according to progress.
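How the components fit together might look like the sketch below. The article does not say how they are weighted or combined, so the Gaussian width `sigma`, the 0 to 100 scale of the distance reward, the plain sum, and adding `progress` directly are all assumptions made for illustration.

```python
import math


def distance_reward(distance, sigma=0.1):
    # Gaussian falloff: 100 on the line, decaying smoothly with distance.
    # sigma (metres) is an assumed smoothing width.
    return 100.0 * math.exp(-(distance ** 2) / (2 * sigma ** 2))


def speed_reward(speed, max_speed):
    # Formula given in the article: speed / max_speed * 100.
    return speed / max_speed * 100


def total_reward(distance, speed, max_speed, track_reward, progress,
                 sigma=0.1):
    # Assumed combination: a simple sum plus the progress as bonus marks.
    base = (distance_reward(distance, sigma)
            + speed_reward(speed, max_speed)
            + track_reward)
    return base + progress
```

The Gaussian keeps the reward smooth: small deviations from the line cost little, while larger ones are penalized progressively rather than with a hard cutoff.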
Please note that this is not our actual reward function; we just want to share the trick, not the model!
Degree or not doesn't matter: you can be strong in AWS from any background!
Everyone can work hard and be creative, and then become a winner!
Remark:
Beware that the Lambda deployment package size limit is 250 MB (unzipped, including layers). If you need to add more libraries, you can dockerize your code and wrap it in a Fargate web application with CDK.
About IVE DeepRacer Team