The Best Path a DeepRacer Can Learn
The following article has been reposted by request from the DeepRacer community and updated to include the code snippets missing from the original.
If my hypothesis was correct, teaching DeepRacer to aim for the ideal racing line could give us the edge we needed to win the 2019 championship
I joined Myplanet in October 2019. Shortly after I joined, we had a team compete in (and win!) the Canadian edition of the 2019 DeepRacer league. I didn’t know much about DeepRacer at the time, but I’m deeply challenge-motivated, and hearing what they accomplished was all the fuel I needed to stoke my competitive fires. I immediately joined my colleagues in their quest to grapple with this new and dynamic machine learning opportunity.
I learned as much as possible as quickly as possible, figuring out how DeepRacer works, getting a handle on the strategies teams have used worldwide to shave milliseconds off their times, and wrapping my head around the various models out there for improving results.
And of course, part of my research looked at how our own team had attacked the challenge. Among the many successful strategies and approaches they had employed, one thing they hadn’t considered was an ideal racing line for training their winning models, something I thought could have a big impact.
To briefly explain what I mean, let’s look at the re:Invent 2019 championship track. The total length following the centreline is 23.12 meters, but our current best approximation to the ideal line (in red) is only 20.94 meters long, about 9.4% shorter. That means if we taught DeepRacer to drive the red path instead of the centreline, we should be completing laps roughly 10% faster at the same speed. In addition, our ideal racing line requires less steering than the centreline, which means faster driving. In theory, 10% is not the most, but the least we could improve.
The Ideal Racing Line
In simple terms, the ideal racing line is the fastest possible path through a circuit. There are many well-known guidelines for defining an ideal line, but in general, they aim to balance four (often conflicting) goals:
Beyond these broad guidelines, defining an ideal line is more art than science, and racing it effectively depends on everything from the load balancing of the car, to the nature of the circuit, to the reaction time and state of mind of the driver. There are entire books dedicated to the topic, and it takes professional drivers years of relentless practice to master all the techniques.
Knowing the ins and outs of an ideal racing line is great, but teaching a self-driving car to race it is a whole other issue. There are some formulaic solutions to fitting an ideal racing line through a circuit (Euler spirals are perhaps the most popular) and, as this Engineering Master’s thesis suggests, the problem is even deserving of its own separate machine learning approach. But is there an easier way? Can we (pardon the pun) cut any corners?
Finding DeepRacer’s Ideal Line
As the Technical Lead of Myplanet Ventures, I often have to define the boundaries of technical feasibility, given our time constraints, and look for shortcuts or alternatives that can help us reach our goals quickly. But training DeepRacer takes a long time, and even after finding the ideal racing line we still need to translate it into a sensible reward function, so trying to come up with a formula may not be the best use of our time. Besides, tracks are provided in the form of waypoints, not functions. So I hypothesized that, if cleverly massaged and without too much extra work, these waypoints could well be turned into a sufficient approximation of an ideal racing line. Let’s get to it.
Low-Pass Filtering
One of the easiest things to notice about ideal racing lines is that they tend to minimize steering. This makes sense since steering typically requires braking, which costs time. In other words, translating the path of a circuit (as represented by its centreline) into an ideal racing line requires smoothing out the turns, which, in the case of slight turns, can even result in a straight path right through a curve.
There is a similar context involving the filtering out of sharp changes elsewhere in Engineering: signal processing. Filtering is heavily used in signal processing to remove unwanted noise. For example, one of the most common uses of filtering in audio processing is to remove high frequencies from an audio stream to prevent damaging larger speakers (e.g., sub-woofers), which cannot mechanically sustain the loads involved in displacing air at high speeds. This low-pass filtering is so common that there are some simple and well-known algorithm implementations of it, which can be easily applied to data series. The most common of these implementations is perhaps the first-order RC filter, which only needs the stream and a constant parameter (the RC constant) to do its magic:
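As a minimal sketch of such a filter, assuming the textbook discrete first-order RC formulation (the function name and the `dt` parameter are illustrative, not taken from the original snippet), it might look like this in Python:

```python
def low_pass_filter(samples, rc, dt=1.0):
    """Apply a first-order RC low-pass filter to a sequence of numbers.

    rc is the filter's RC constant and dt the spacing between samples;
    both are assumptions of this sketch (dt=1 means one unit per waypoint).
    """
    if len(samples) == 0:
        return []
    alpha = dt / (rc + dt)  # smoothing factor, between 0 and 1
    filtered = [samples[0]]  # the output starts at the first input value
    for value in samples[1:]:
        previous = filtered[-1]
        # Move a fraction alpha of the way from the previous output to the input.
        filtered.append(previous + alpha * (value - previous))
    return filtered
```

A larger `rc` gives a smaller `alpha` and therefore heavier smoothing; with `dt=1`, an RC constant of 10 corresponds to `alpha = 1/11`.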
Back to re:Invent 2019
Using our re:Invent 2019 championship track example, I first converted the track from (x,y) coordinates to directions:
I know this trace doesn’t look much like a track, but interestingly enough, it is the track, just projected onto a different space (i.e., the direction space). Together with the origin (x,y) coordinate and the distance between waypoints (which happens to be constant at exactly 0.151 meters), this plot contains ALL the information we need to rebuild the track, yet it uses half the storage space. Ain’t that neat?
This conversion to the direction space is necessary to turn the track into a stream that can be processed with the code snippet above. Filtering directly in the track space would need a more complex algorithm, since every waypoint there is represented by two numbers (x,y) instead of just one.
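To make the round trip concrete, here is a small sketch of the conversion in both directions. It assumes the waypoints are a list of (x, y) tuples with constant spacing; the function names are hypothetical, and the angle unwrapping is my own addition to keep the stream free of artificial jumps of a full turn.

```python
import math

def waypoints_to_directions(waypoints):
    """Return the heading (in radians) of each segment between waypoints."""
    directions = []
    for (x0, y0), (x1, y1) in zip(waypoints, waypoints[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)
        if directions:
            # Unwrap: add whole turns so the heading stays continuous
            # with the previous segment instead of jumping by 2*pi.
            previous = directions[-1]
            angle += 2 * math.pi * round((previous - angle) / (2 * math.pi))
        directions.append(angle)
    return directions

def directions_to_waypoints(origin, directions, step=0.151):
    """Rebuild (x, y) waypoints from an origin, headings and a constant step."""
    x, y = origin
    points = [(x, y)]
    for angle in directions:
        x += step * math.cos(angle)
        y += step * math.sin(angle)
        points.append((x, y))
    return points
```

For example, `directions_to_waypoints(waypoints[0], waypoints_to_directions(waypoints))` should reproduce the original waypoints as long as the spacing really is constant and equal to `step`.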
And finally, if we’re going to low-pass filter the track, we don’t want to filter the track’s directions per se, but rather their rate of change (how fast the directions change as the path progresses). This requires a derivative, which really just means “subtract every point in the sequence from the one immediately after it”. Lo and behold, here you have the directional change plots for the re:Invent 2019 championship track:
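In code, that derivative and its inverse (a running sum, needed later to get back to directions) can be sketched as follows. The function names are hypothetical, and the directions are assumed to be a plain list of angles in radians.

```python
def direction_changes(directions):
    """Numerical derivative: each direction minus the one before it."""
    return [after - before for before, after in zip(directions, directions[1:])]

def integrate_changes(initial_direction, changes):
    """Inverse of direction_changes: a running sum from the first direction."""
    directions = [initial_direction]
    for change in changes:
        directions.append(directions[-1] + change)
    return directions
```

Note that the derivative is one element shorter than its input, which is why the first direction has to be kept aside to undo it.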
Let’s look at this for a minute. On the left, we have the raw directional change plot, and on the right, we have the same trace but filtered using an RC constant of 10. Some cool things are happening here: For one, the number of sudden directional changes has been significantly reduced. For example, the three small peaks at waypoints 100, 106, and 109 on the left have been completely replaced by a pulse that transitions smoothly from lower to higher values. Another interesting outcome is that the magnitude of the changes is significantly reduced, especially for negative values (right turns), which is consistent with the fact that the circuit requires more pronounced left than right turning. But reading this in the directional rate of change space is rather awkward. How does this look when projected back onto the original track?
Oops! This is obviously not good. Why is the filtered path off-track? And why is the end not meeting the origin? Well, numerical derivatives and filters can often introduce artifacts in the form of offset constants. Also, the filter does not know the input signal (the track’s direction changes) is supposed to loop around, making the last waypoint the same as the first. So we need to nudge the whole path a bit counter-clockwise, and make sure the ends meet:
Ok, this is better, but it is still far from an ideal racing line. For example, it seems like waypoint 40 should be a lot closer to the inside of the track, and the segment between 40 and 80 should just be a straight line rather than such a wide curve. However, not all is lost. Interestingly enough, our filtered path savagely cuts across the first chicane (waypoints 10 to 40) with an almost straight line. Something similar happens for the second chicane (waypoints 110 to 140). So even though this filtering didn’t get us all the way to the ideal line, it is definitely making it easier to identify the track segments where the car should or shouldn’t change direction, and with that sorted out, the rest should be a lot easier.
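One possible way to apply this kind of correction (not necessarily the exact procedure used for the plots here) is sketched below, under two assumptions of mine: the filtered direction changes should add up to the same total turning as the unfiltered ones, and any remaining gap between the last and first waypoint can be spread evenly along the path. Both function names are hypothetical.

```python
def restore_total_turning(filtered_changes, original_changes):
    """Add a constant offset so the filtered changes sum to the original total."""
    count = len(filtered_changes)
    if count == 0:
        return []
    offset = (sum(original_changes) - sum(filtered_changes)) / count
    return [change + offset for change in filtered_changes]

def close_loop(points):
    """Spread the gap between the last and first point linearly along the path."""
    last_index = len(points) - 1
    if last_index < 1:
        return list(points)
    gap_x = points[-1][0] - points[0][0]
    gap_y = points[-1][1] - points[0][1]
    # Point i is pulled back by i/last_index of the gap, so the first point
    # stays put and the last point lands exactly on the first.
    return [
        (x - gap_x * i / last_index, y - gap_y * i / last_index)
        for i, (x, y) in enumerate(points)
    ]
```

The first function removes the constant offset introduced by filtering, and the second guarantees the ends meet at the cost of a slight distortion distributed over the whole lap.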
If this article were a movie, this would be the point for a montage of the lead character trying (and failing) over and over again, as I sought different ways to massage the filtered path above into something a real driver would follow. Luckily, this isn’t a movie, so we can fast-forward the pain and jump straight to what I discovered. There are a few very important things that will help you as you embark on your own quest for the ideal racing line:
You may also find the code below quite helpful, since it will allow you to comply easily with point (4) above:
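A minimal sketch of a helper of this kind follows, assuming it works by overwriting a range of waypoints in the direction plot with either a constant direction or a linear direction change. The name and signature are hypothetical.

```python
def set_direction_segment(directions, start, end, start_value, end_value=None):
    """Return a copy of directions with indices start..end (inclusive) replaced.

    With only start_value given, the segment becomes a constant direction
    (a straight). With end_value as well, the direction changes linearly
    from start_value to end_value (a steady turn).
    """
    if end_value is None:
        end_value = start_value
    if not 0 <= start <= end < len(directions):
        raise ValueError("segment indices out of range")
    result = list(directions)
    span = end - start
    for index in range(start, end + 1):
        fraction = (index - start) / span if span else 0.0
        result[index] = start_value + fraction * (end_value - start_value)
    return result
```

For instance, `set_direction_segment(directions, 40, 80, directions[40])` would flatten waypoints 40 to 80 into a straight heading in whatever direction the path has at waypoint 40.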
The plots below show my results to date. On the right, we have the ideal track direction plot. You can see the effect of using the function above, having replaced track segments with either a constant direction value or a linear direction change. On the left, we have the ideal path that results from projecting the edited direction plot back onto the track space.
As you can see, my editing of the direction space has resulted in 4 straight (S1 to S4) and 4 curved (C1 to C4) paths. The most important advantage, other than shortening the path, is that the new path has straight sections, which allow the car to run at top speed, whereas the original track had none. So if we find a way to teach our DeepRacer to race this new path, we should be getting even faster lap times.
So What Next?
If you are a DeepRacer league participant, you probably came here hoping for insight on how to improve your reward function. I hope that, by showing you a simple way to massage the original waypoints of a DeepRacer track into a shorter path that should also allow for higher top speeds, I have given you a tool to improve any of your reward algorithms that focus on following a path. However, this is just the beginning. Racing an ideal line requires knowing not only where on the track’s width to be at any given waypoint, but also how to get there, and one of the best ways to make sure our DeepRacer will follow the ideal line is by helping it learn how to brake and accelerate into and out of a curve, which we haven’t covered (yet).
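As a starting point only, here is one hedged sketch of a path-following reward function built around a precomputed racing line. The `RACING_LINE` points are placeholders to be replaced with your own path, the helper name is hypothetical, and the scoring rule (linear falloff with distance, up to half the track width) is an assumption of mine rather than a tuned strategy. It uses the standard `x`, `y`, `track_width` and `all_wheels_on_track` entries of the DeepRacer `params` dictionary.

```python
import math

# Placeholder racing line: replace with your own (x, y) points in meters.
RACING_LINE = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.5), (3.0, 1.5)]

def distance_to_racing_line(x, y, line=RACING_LINE):
    """Distance from (x, y) to the nearest racing-line point (an approximation)."""
    return min(math.hypot(x - px, y - py) for px, py in line)

def reward_function(params):
    """Reward the car for staying close to the precomputed racing line."""
    if not params["all_wheels_on_track"]:
        return 1e-3  # near-zero reward when off track
    distance = distance_to_racing_line(params["x"], params["y"])
    max_distance = 0.5 * params["track_width"]
    # 1.0 on the line, falling linearly to 0.0 at half a track width away.
    closeness = max(0.0, 1.0 - distance / max_distance)
    return float(1e-3 + closeness)
```

Measuring distance to the nearest point rather than to the nearest segment is a simplification that works better the more densely the line is sampled. This sketch says nothing about speed, braking or acceleration.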
Furthermore, although this exercise was a good start, it is not close to optimized. For example, C3 could still be broken up into two short curves and a straight path so it can be raced even faster. S2 could also be lengthened, and so on.
And last, but not least, the linear direction changes we have used to find our path through a turn will lead to geometric racing paths, which are also not quite ideal. Rest assured, I will be fine-tuning this approach throughout the 2020 championship, sharing what I learn along the way. Stay tuned!
(And please like, share, and above all comment if you’re working on a DeepRacer model of your own and have other tactical approaches to optimizing DeepRacer lap times.)