Baffled by autonomous vehicles: Here is a 101 on vehicle perception and situational awareness
Walid Negm
Engineering amazing things | Nothing ventured, nothing gained - GenAI, Automotive Software, Cloud-Native & Open Source
For experts in the room, please forgive me...
First, consumer understanding of driver assistance needs to get better
The end-user (me in particular) still remains a bit confused about the driving assistance available in any particular vehicle: why we need it, how to turn it on, how to turn it off, who is in control at any point in time, and what it means to re-engage when necessary.
With that said, when it comes to driving assistance the automotive industry has made leaps and bounds. For example, vehicles do a reliable job of keeping themselves in the center of the lane, alerting the driver to an impending forward collision and obediently following a leading car. When you let go of the steering wheel, say in adaptive cruise control, such a feature is expected to work close to 100% of the time... in the same way we expect of a brake pedal. And when we look at advanced autonomous shuttles or robo-taxis, we hear of steady progress in handling situations of doubt or uncertainty.
A vehicle's perception (the interpretation of its surroundings) plays a fundamental role in making these dynamic driving tasks happen. If a car can figure out its surroundings, then it can make informed decisions about steering, acceleration and braking. Of course, when such a system encounters something it does not recognize, things get tricky. In these situations the idea is for the vehicle to hand control over to the human driver. That's when autonomy is partial. At higher levels of autonomy, the goal is driving that needs no intervention from a human driver.
Vehicle perception and so-called sensor fusion are among the many subsystems that make up an automated driving system. Here are some basics learned over the last couple of years.
Situational awareness means better decision making
The real world as we perceive it is made up of information, and our ability to process and make sense of it leads to better decision making. Our eyes are the windows to the world, literally, and automakers and suppliers have taken that same approach to make vehicles perceive their surroundings and solve the dynamic driving task. That task spans motion planning, trajectory planning and, overall, a coherent and safe driving strategy.
The more we (and of course a car) can see and correctly interpret our surroundings, the better we can anticipate what is happening. It would be even better if we could also anticipate the intentions of other drivers, and we do try, by observing the erratic behavior of an uncooperative driver. We make our decisions through observation, orientation and analysis. Then we take a course of action.
"Decision theory is the analysis of the behavior of an individual facing nonstrategic uncertainty—that is, uncertainty that is due to what we term nature”
Today's cars are, from the standpoint of sensing the environment (not ecology), like humans. They see, they hear and they process the "sensations": in the vehicle's case, data such as images from a camera, but also signals from radar and LIDAR, which respectively sense the distance of objects and capture the geometry of the environment.
Sophisticated sensors are the tip of the iceberg for an automatic driving system, which is constructed by assembling algorithms that rely on the intricacies of decision theory, including:
So that's the theory.
Achieving situational awareness means using decision tools in conjunction with several different signals (frequencies) to "see" and "hear", not just a stereo camera taking images. Sensor fusion is the science of tracking the environment, be that cars, barriers or pedestrians walking across the road, in a single view of the world built from LIDAR, camera and RADAR sensor data. RADAR helps in blind spot warning and collision avoidance because it measures the change in frequency of the reflected radio waves, which yields relative velocity. LIDAR (light detection and ranging) uses a different wavelength, scanning the environment with a focused laser beam. The laser scan cannot measure velocity, but it can measure range.
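To make the fusion idea concrete, here is a minimal sketch (a toy of my own, not any production stack) of a one-dimensional Kalman filter that fuses lidar range with radar range-rate into one track of a leading car:

```python
import numpy as np

# Toy 1D fusion: lidar measures range, radar measures range-rate (Doppler).
# State x = [range, range_rate]; constant-velocity motion model.
dt = 0.1                                  # sensor cycle time [s]
F = np.array([[1.0, dt], [0.0, 1.0]])     # state transition
Q = np.diag([0.01, 0.04])                 # process noise covariance

H_lidar = np.array([[1.0, 0.0]])          # lidar observes range only
H_radar = np.array([[0.0, 1.0]])          # radar observes range-rate only
R_lidar = np.array([[0.05]])              # lidar measurement noise
R_radar = np.array([[0.10]])              # radar measurement noise

def predict(x, P):
    return F @ x, F @ P @ F.T + Q

def update(x, P, z, H, R):
    y = z - H @ x                         # innovation
    S = H @ P @ H.T + R                   # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    return x + K @ y, (np.eye(2) - K @ H) @ P

x = np.array([[20.0], [0.0]])             # initial guess: 20 m ahead, static
P = np.eye(2)                             # initial uncertainty

# Interleave measurements of a car pulling away at ~1.5 m/s.
for t in range(50):
    x, P = predict(x, P)
    x, P = update(x, P, np.array([[20.0 + 1.5 * t * dt]]), H_lidar, R_lidar)
    x, P = update(x, P, np.array([[1.5]]), H_radar, R_radar)

print(f"fused range: {x[0, 0]:.2f} m, range-rate: {x[1, 0]:.2f} m/s")
```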
Reconstructing a vehicle's real world environment
So perception, but also additional and sufficient information about the environment (and context), is necessary, and at high fidelity. It is important for a vehicle to be able to detect and track obstacles, be that pedestrians or lane markings. The vehicle then takes the facts of its world to warn of a forward collision or of a vehicle about to swerve into its lane.
For example, the images below show parts of the process of labeling a lane: whether it is in fact a lane, and whether the marking is a dashed or a solid line:
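For a flavor of what the classic, pre-deep-learning pipelines did here, the sketch below uses OpenCV edge detection plus a Hough transform; the file name, region of interest and thresholds are placeholders I picked for illustration:

```python
import cv2
import numpy as np

# Classic lane-line finder: edges + Hough transform.
# "road.jpg" stands in for a forward-facing camera frame.
frame = cv2.imread("road.jpg")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
edges = cv2.Canny(blur, 50, 150)

# Keep only a trapezoid in front of the ego vehicle (the likely road area).
h, w = edges.shape
roi = np.array([[(0, h), (w // 2 - 50, h // 2 + 40),
                 (w // 2 + 50, h // 2 + 40), (w, h)]], dtype=np.int32)
mask = np.zeros_like(edges)
cv2.fillPoly(mask, roi, 255)
edges = cv2.bitwise_and(edges, mask)

# Probabilistic Hough: many short segments hint at a dashed marking,
# one long continuous segment hints at a solid line.
lines = cv2.HoughLinesP(edges, rho=2, theta=np.pi / 180, threshold=50,
                        minLineLength=30, maxLineGap=20)
for x1, y1, x2, y2 in (lines.reshape(-1, 4) if lines is not None else []):
    cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 3)

cv2.imwrite("road_lanes.jpg", frame)
```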
Here are some more algorithms that can help detect free space. Free space around the ego-vehicle (your car) is segmented in local coordinates for any given timestamp:
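A toy version of the idea, assuming all we want is a binary occupied/free grid from a single lidar scan in ego coordinates (the resolution and extent are numbers I picked for illustration):

```python
import numpy as np

# Free-space grid in ego-local coordinates for one timestamp.
# Cells start out "free"; any cell containing a lidar return becomes "occupied".
RES = 0.2                      # cell size [m]
EXTENT = 20.0                  # grid covers +/- 20 m around the ego vehicle
N = int(2 * EXTENT / RES)      # cells per axis

def free_space_grid(points_xy: np.ndarray) -> np.ndarray:
    """points_xy: (M, 2) lidar hits in the ego frame; returns 0=free, 1=occupied."""
    grid = np.zeros((N, N), dtype=np.uint8)
    ij = ((points_xy + EXTENT) / RES).astype(int)   # metric coords -> cell index
    inside = ((ij >= 0) & (ij < N)).all(axis=1)     # drop hits outside the grid
    grid[ij[inside, 0], ij[inside, 1]] = 1
    return grid

# Fake scan: a wall 5 m ahead of the ego vehicle.
scan = np.column_stack([np.full(100, 5.0), np.linspace(-3.0, 3.0, 100)])
grid = free_space_grid(scan)
print(f"occupied cells: {int(grid.sum())} of {N * N}")
```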
Here is a classic video that brings several algorithms together so a vehicle can "see", with all objects, including cars and pedestrians, placed into what's called a 3D bounding box.
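For reference, a 3D bounding box usually boils down to a handful of numbers per object; this sketch follows the common center/size/heading parameterization found in datasets like KITTI and nuScenes, not any specific system's format:

```python
from dataclasses import dataclass

@dataclass
class Box3D:
    """One detected object as a 3D bounding box."""
    x: float; y: float; z: float                 # box center in ego coordinates [m]
    length: float; width: float; height: float   # box dimensions [m]
    yaw: float                                   # heading around the vertical axis [rad]
    label: str                                   # e.g. "car", "pedestrian"
    score: float                                 # detector confidence in [0, 1]

pedestrian = Box3D(x=8.2, y=-1.5, z=0.9, length=0.6, width=0.6,
                   height=1.7, yaw=0.0, label="pedestrian", score=0.91)
print(pedestrian)
```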
The mathematics behind perception and sensor fusion
The "brain" behind computer vision and sensor fusion (as well as other driving functions) is a series of algorithms, be they rules, multi-physical models, state machines or machine learning models, each handling one event or a family of features/events:
None of those items is an entire automatic driving system on its own; they are small subsystems. There are many subsystems, including perception, sensor fusion and localization (where am I?), path and route planning, navigation, and motion control, as caricatured in the sketch below.
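Here is that chain as stub code; every function is an illustrative placeholder of my own, not a real API:

```python
from dataclasses import dataclass

@dataclass
class Track:
    """One fused, tracked object."""
    x: float; y: float; vx: float; vy: float; label: str

def perceive(camera, lidar, radar):        # detect objects per sensor
    return [Track(x=12.0, y=0.5, vx=-1.0, vy=0.0, label="car")]

def fuse(detections):                      # merge per-sensor detections into tracks
    return detections

def localize(gnss, imu):                   # ego pose (here: trivially the GNSS fix)
    return gnss

def plan(tracks, pose):                    # pick a target speed for the next cycle
    ahead = [t for t in tracks if t.x > pose[0]]
    return 0.0 if any(t.x - pose[0] < 10.0 for t in ahead) else 8.0

def control(target_speed, current_speed):  # crude P-controller on speed
    return 0.5 * (target_speed - current_speed)

# One tick of the loop:
tracks = fuse(perceive(camera=None, lidar=None, radar=None))
pose = localize(gnss=(0.0, 0.0), imu=None)
accel = control(plan(tracks, pose), current_speed=6.0)
print(f"commanded acceleration: {accel:+.1f} m/s^2")
```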
Seeing around corners
Traffic lights are blinking, there is construction, and there are unknown objects on the roadway. What to do? Is there sufficient situational awareness from a single vehicle's sensors and internal perception systems? There are uncooperative drivers at the intersection. Does that make things even more uncertain?
Adding another sensor to the vehicle itself is not, on its own, the answer to unpredictable drivers or hazardous situations. High definition maps help. But it is necessary to keep thinking of ways to enhance a car's situational awareness to answer some tough questions:
When the road and the car start to cooperate, you can introduce new levels of situational awareness. For example, with vehicle-to-vehicle or vehicle-to-infrastructure communications the following is possible:
When you can imagine messages being shared between vehicles, then you can imagine sharing an entire vehicle's perception. Everything that one car sees, the other vehicles can take on as additional situational awareness. Other vehicles are essentially able to evaluate (see, feel, smell) their neighbors' sensor data. This has to be reduced to specific data, because only so much can be transmitted over the air.
A project called "IMAGinE" attempted to demonstrate how vehicles can cooperate by sharing driving maneuvers in order to optimize traffic flows, paving the way for cooperative driving. The image below shows how a vehicle that is about to merge onto the highway can exchange information about its planned maneuver, creating a proactive opportunity to open a gap in traffic for the merging car. Misinterpretations of cooperative intentions can be prevented and critical situations avoided.
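The project's actual wire format isn't reproduced here, but a maneuver-sharing message might look something like this sketch; every field name is my own invention for illustration:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class ManeuverMessage:
    """Hypothetical maneuver-sharing payload for the merging example above."""
    sender_id: str
    maneuver: str                # e.g. "merge_right"
    lane_from: int
    lane_to: int
    start_time_s: float          # when the maneuver begins
    waypoints: list              # planned (x, y) points, already reduced in size

msg = ManeuverMessage(sender_id="veh-042", maneuver="merge_right",
                      lane_from=0, lane_to=1, start_time_s=3.5,
                      waypoints=[(0.0, 0.0), (30.0, 3.5), (60.0, 3.5)])

payload = json.dumps(asdict(msg))   # what actually goes over the air
print(f"{len(payload)} bytes instead of raw sensor streams")
```

The point is in the last line: a few hundred bytes of intent rather than megabytes of raw camera or lidar data.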
Wishful thinking in cloud robotics
What about having the cloud (or the network itself) lend a helping hand to a car's situational awareness, or to more complex decision making? When can we allow a computer vision model, coded up as a deep neural network, to fall back when in doubt to a "cloud server", maybe due to adverse weather conditions? How do you instantaneously share the states of all neighboring vehicles (position, speed and camera perception) to improve object classification accuracy?
Daunting questions, but maybe sometime in the next decade we can imagine a scenario where some assistance is provided by the cloud to core driving functions. It is unrealistic for safety reasons today, but maybe one day there will be a case where trajectory creation and selection decisions are offloaded to edge devices.
A key challenge is latency. The hop to a macro cell alone is 10-20 ms. Then comes the processing in the cloud. Then you have to ship the data back, so add another 10-20 ms. By that time the car might have switched cells, so you need to track the car and reroute the message. That means the latency is too big to offload anything that will happen in the next 2 s. Cloud computers also have no real-time requirements: there is no determinism and no assurance of outcome.
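The back-of-the-envelope budget looks like this; the compute and handover numbers are assumptions of mine, only the 10-20 ms network figures come from above:

```python
# Rough latency budget for one cloud round trip.
uplink_ms   = 15      # car -> macro cell -> cloud (10-20 ms, midpoint)
compute_ms  = 20      # assumed cloud inference time (a guess, not measured)
downlink_ms = 15      # cloud -> car (another 10-20 ms)
handover_ms = 30      # assumed penalty if the car switches cells mid-request

round_trip_ms = uplink_ms + compute_ms + downlink_ms + handover_ms
print(f"round trip: ~{round_trip_ms} ms")                 # ~80 ms

# At highway speed that is a lot of blind travel:
speed_mps = 33.3                                          # ~120 km/h
print(f"distance covered meanwhile: {speed_mps * round_trip_ms / 1000:.1f} m")
```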
Regardless of how far-fetched, we wanted to explore this option, where a vehicle is somewhat dependent on the cloud. Maybe an autonomous shuttle operates in an airport, a venue or some closed-circuit environment. The 5G wireless latency is guaranteed because of the network design. The data center is at the edge of the network, not halfway around the world. Within these constraints, maybe then we can imagine shipping sensor data from the car to a cloud for assistance.
In the image below, we developed such an "augmented autonomous driving system" where the car's speed is very slow (limited to 5-10 km/h), though in other tests we could reach 20 km/h. Going faster was not possible due to the low quality of the network connection and the kind of cameras we installed.
Here are some practical examples of "cloud assisted" driving:
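One recurring pattern is "try the edge, fall back on board". A minimal sketch follows; the endpoint URL, timeout and response format are placeholders, not a real service:

```python
import requests

EDGE_URL = "http://edge.example.local/infer"   # hypothetical edge data-center endpoint
TIMEOUT_S = 0.05                               # give the network 50 ms, no more

def local_model(frame_jpeg: bytes) -> dict:
    """Placeholder for the vehicle's own (smaller) on-board network."""
    return {"label": "unknown", "source": "on-board"}

def classify(frame_jpeg: bytes) -> dict:
    """Try the edge server first; fall back to the on-board model on any failure."""
    try:
        resp = requests.post(EDGE_URL, data=frame_jpeg, timeout=TIMEOUT_S)
        resp.raise_for_status()
        return resp.json()                     # richer result from the big model
    except requests.RequestException:
        return local_model(frame_jpeg)         # degraded but always available

print(classify(b"fake-jpeg-bytes"))            # with no server up, prints the fallback
```

The design intent is that the cloud only ever upgrades a decision the vehicle could already make on its own.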
Thank you to:
APPENDIX: There are levels of autonomy
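For reference, SAE J3016 defines six levels, roughly:
- Level 0: no automation; the human does all the driving.
- Level 1: driver assistance; steering or speed, not both (e.g., adaptive cruise control).
- Level 2: partial automation; steering and speed together, with the driver supervising at all times.
- Level 3: conditional automation; the system drives in defined conditions, but the human must take over on request.
- Level 4: high automation; no human fallback needed within the system's operational design domain (e.g., a geofenced robo-taxi).
- Level 5: full automation; the system drives anywhere a human could.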