Baffled by autonomous vehicles? Here is a 101 on vehicle perception and situational awareness

For experts in the room, please forgive me...

First, consumer understanding of driver assistance needs to get better

The end-user (me in particular) still remains a bit confused about what driving assistance is available in any particular vehicle, why we need it, how to turn it on and off, who is in control at any point in time, and what it means to re-engage when necessary.

With that said, the automotive industry has made leaps and bounds in driving assistance. For example, vehicles do a reliable job of keeping themselves in the center of the lane, alerting the driver to an impending forward collision and obediently following a leading car. When you let go of the steering wheel, say with adaptive cruise control engaged, the feature is expected to work close to 100% of the time, in the same way we expect a brake pedal to work. And when we look at advanced autonomous shuttles or robo-taxis, we hear of steady progress in handling situations of doubt or uncertainty.

A vehicle's perception (the interpretation of its surroundings) plays a fundamental role in making these dynamic driving tasks happen. If a car can figure out its surroundings, it can make an informed decision about steering, acceleration and braking. Of course, when such a system encounters something it does not recognize, things get tricky. In these situations the idea is for the vehicle to hand control back to the human driver; that is partial autonomy. At higher levels of autonomy, the goal is driving that needs no intervention from a human driver.

Vehicle perception and so-called sensor fusion make up one of many subsystems in an automatic driving system. Here are some basics learned over the last couple of years.

Situational awareness means better decision making

The real world as we perceive it is made up of information, and our ability to process and make sense of it leads to better decision making. Our eyes are, quite literally, our windows to the world, and automakers and suppliers have taken the same approach to make vehicles perceive their surroundings and solve the dynamic driving task. That task spans motion planning, trajectory planning and, overall, a coherent and safe driving strategy.

The more we (and of course a car) can see and correctly interpret our surroundings, the better we can anticipate what is happening. It would be even better if we could also anticipate the intentions of other drivers, and we do try, for example by observing the erratic behavior of an uncooperative driver. We make our decisions through observation, orientation and analysis, and then we take a course of action.

"Decision theory is the analysis of the behavior of an individual facing nonstrategic uncertainty—that is, uncertainty that is due to what we term nature”

Today, from the standpoint of their environment (not ecology), cars are like humans. They see, they hear and they process the "sensations", which in the vehicle's case means data: images from a camera, but also signals from radar and LIDAR, which respectively sense the distance of objects and give a perception of the environment's geometry.

Sophisticated sensors are the tip of the iceberg. The automatic driving system behind them is constructed by assembling algorithms that rely on the intricacies of decision theory, including:

  • Deductive reasoning with heuristics: The process of establishing that a conclusion follows validly from premises, i.e., that it must be true given that the premises are true. Heuristics are "rules of thumb" that focus on narrow issues or a particular fact; they are cognitively cheap, fast and frugal. They denote a tendency to make a choice that may be inaccurate because it is not based on the full information set.
  • Deductive reasoning with rules: Uses "if-then-else" rule statements, but the more possibilities that reasoners have to envisage to draw an inference, the more difficult the inference.
  • Inductive reasoning with probabilistic models: The process of deriving plausible conclusions from premises. Decision making often requires considering probabilities (a minimal sketch follows this list).
  • Inductive reasoning with a learning model: The process of deriving plausible conclusions from premises, improving decision making and explanation over time by using experience and study to gain new knowledge.
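
To make the probabilistic item concrete, here is a minimal, illustrative Python sketch (not taken from any production stack) of a Bayesian update: a prior belief that the object ahead is a pedestrian is revised after a few noisy "pedestrian-like" detections. The prior and likelihood values are assumptions for illustration only.

```python
# Minimal Bayesian update: belief that the object ahead is a pedestrian.
# Illustrative numbers only; real perception stacks use far richer models.

def bayes_update(prior, p_detect_given_ped, p_detect_given_not_ped):
    """Return P(pedestrian | detection) from a prior and sensor likelihoods."""
    p_detection = (p_detect_given_ped * prior
                   + p_detect_given_not_ped * (1.0 - prior))
    return p_detect_given_ped * prior / p_detection

belief = 0.10  # prior: assume 10% of objects in this context are pedestrians
for _ in range(3):  # three consecutive "pedestrian-like" camera detections
    belief = bayes_update(belief, p_detect_given_ped=0.9, p_detect_given_not_ped=0.2)
    print(f"updated belief: {belief:.2f}")
```

After three consistent detections the belief climbs from 10% to roughly 90%, which is the intuition behind accumulating evidence over time rather than trusting a single frame.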

So that's the theory.

Achieving situational awareness means using these decision tools in conjunction with several different signals (frequencies) to "see" and "hear", not just a stereo camera to take images. Sensor fusion is the science of tracking the environment, be it cars, barriers or pedestrians walking across the road, in a single view of the world built from LIDAR, camera and RADAR sensor data. RADAR helps with blind spot warning and collision avoidance because it measures the change in frequency of the reflected radio waves, which gives relative velocity. LIDAR (light detection and ranging) uses a different wavelength, scanning the environment with a focused laser beam; the laser scan cannot measure velocity directly, but it can measure range precisely.
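
As a toy illustration of the fusion idea, the sketch below combines a radar range estimate and a lidar range estimate of the same object by weighting each with the inverse of its assumed measurement variance. The sensor noise figures are made up for the example; real systems typically run Kalman-style filters over many more states.

```python
import numpy as np

# Toy sensor fusion: combine a radar and a lidar range estimate of the same
# object by weighting each with the inverse of its measurement variance.
# The noise values below are assumptions for illustration, not real specs.

def fuse(measurements, variances):
    """Inverse-variance weighted average of independent estimates."""
    m = np.asarray(measurements, dtype=float)
    v = np.asarray(variances, dtype=float)
    weights = 1.0 / v
    fused = np.sum(weights * m) / np.sum(weights)
    fused_var = 1.0 / np.sum(weights)
    return fused, fused_var

radar_range, radar_var = 42.7, 1.0 ** 2   # radar: coarser range, but gives velocity
lidar_range, lidar_var = 42.1, 0.1 ** 2   # lidar: precise range, no velocity

range_est, range_var = fuse([radar_range, lidar_range], [radar_var, lidar_var])
print(f"fused range: {range_est:.2f} m (sigma ~ {range_var ** 0.5:.2f} m)")
# In this toy example, velocity would come from the radar's Doppler measurement alone.
```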

Reconstructing a vehicle's real world environment

So perception is necessary, but so is additional and sufficient information about the environment (and context), at high fidelity. It is important for a vehicle to be able to detect and track obstacles, be they pedestrians or lane markings. The vehicle then takes these facts about its world to warn of forward collisions or of a vehicle that is about to swerve into its lane.

For example, the images below show parts of the process of labeling a lane: whether it is in fact a lane, and whether the marking is a dashed or a solid line:

[Image: steps in detecting and labeling lane markings]
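
The exact pipeline behind those images is not shown here, but a classical starting point for lane-marking detection is edge detection plus a Hough transform. The sketch below uses OpenCV on a hypothetical dashcam frame ("road.png") to find candidate line segments that a later step could group and label as solid or dashed.

```python
import cv2
import numpy as np

# Classical lane-marking candidate detection: Canny edges + probabilistic
# Hough transform. "road.png" is a hypothetical dashcam frame.
frame = cv2.imread("road.png")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
edges = cv2.Canny(gray, 50, 150)

# Keep only the lower half of the image, where lane markings usually are.
mask = np.zeros_like(edges)
mask[edges.shape[0] // 2:, :] = 255
edges = cv2.bitwise_and(edges, mask)

segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                           minLineLength=40, maxLineGap=20)

# A later step could group collinear segments and label them as solid
# (long, few gaps) or dashed (short, regularly spaced) markings.
for x1, y1, x2, y2 in (segments.reshape(-1, 4) if segments is not None else []):
    cv2.line(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
cv2.imwrite("road_lanes.png", frame)
```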

Other algorithms can help detect free space. Free space around the ego-vehicle (your car) is segmented in local coordinates for any given time stamp:

[Image: free space segmentation around the ego-vehicle]
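
As a rough illustration of what "free space in local coordinates" can mean, the sketch below bins lidar returns into a 2D occupancy grid centered on the ego-vehicle for a single time stamp; cells without returns are treated as candidate free space. The grid size and resolution are arbitrary choices, and a real system would also handle ground removal and occlusion.

```python
import numpy as np

# Toy occupancy grid: mark cells around the ego-vehicle that contain lidar
# returns; everything else is candidate free space for this time stamp.
GRID_SIZE_M = 40.0      # 40 m x 40 m window centered on the ego-vehicle
RESOLUTION_M = 0.5      # 0.5 m cells
CELLS = int(GRID_SIZE_M / RESOLUTION_M)

def occupancy_grid(points_xy):
    """points_xy: (N, 2) lidar returns in ego coordinates (meters)."""
    grid = np.zeros((CELLS, CELLS), dtype=bool)
    # Shift so the ego-vehicle sits at the center of the grid.
    idx = np.floor((points_xy + GRID_SIZE_M / 2) / RESOLUTION_M).astype(int)
    valid = np.all((idx >= 0) & (idx < CELLS), axis=1)
    grid[idx[valid, 0], idx[valid, 1]] = True  # mark occupied cells
    return grid

scan = np.random.uniform(-20, 20, size=(5000, 2))  # stand-in for a real scan
occupied = occupancy_grid(scan)
print(f"candidate free space: {1.0 - occupied.mean():.0%} of cells")
```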

Here is a classic video that brings several algorithms together so a vehicle can "see": all objects, including cars and pedestrians, are placed into what is called a 3D bounding box.

The mathematics behind perception and sensor fusion

The "brain" behind computer vision and sensor fusion (as well as other driving functions) is a series of algorithms, be they rules, multi-physics models, state machines or machine-learning models, each handling one event or a family of features/events:

  1. Physics/rule-based algorithms: These encompass image processing algorithms, point cloud processing algorithms, radar ping processing, prediction and filtering algorithms, noise cancelling and noise filtration algorithms, and so on. The advantage here is that they have deterministic and predictable behavior and always behave the same way in a given scenario, so one can easily find the boundaries within which these algorithms will perform as requested. One big drawback is that they have to be fully tested and will perform badly in new or unrecognized situations. They output images, numbers, ranges of numbers and many more data types.
  2. State machines: These basically handle black-and-white situations. They have the advantage of making yes-or-no decisions such as should I overtake a vehicle or not, or should I merge into a lane or not. They share the advantages of rule-based algorithms, but each one handles a single driving-domain feature, and only that feature, which makes them difficult and cumbersome to maintain. Why? Because one feature can have many properties (standstill, speed, acceleration, different momentums, body roll, etc.) and the state machine has to cover them all: for example, in which speed range or acceleration range is a maneuver safe to execute? Overtaking takes acceleration, but merging into a lane can go either way, accelerating or slowing down. More complicated still, speed and acceleration can combine so that certain combinations are considered dangerous and should not be executed. The output can be yes or no, but also numbers or requests for the driver to take control. A minimal sketch follows this list.
  3. Machine-learning models: With these statistics-based decision models, the more data we feed them and the more iterations of adaptation we make, the more reliable they become. The slight problem is that it is impossible to cover all possible scenarios and driving situations that exist; one can only define the probability of encountering a given driving situation per 100,000 km. This makes them great for really complex problems but also vulnerable to unpredicted situations. They output probabilities, which is a problem, and a big one, because probabilities are difficult to interpret: nobody knows what a 96% assurance probability means for a computer AI algorithm.

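Here is the minimal state machine sketch promised above, for a lane-merge decision. The states, thresholds and inputs are illustrative assumptions, not a real ADAS calibration.

```python
from enum import Enum, auto

# Minimal sketch of a merge-decision state machine. Thresholds are assumed
# values for illustration; a production calibration would be far richer.

class MergeState(Enum):
    LANE_KEEP = auto()
    PREPARE = auto()
    MERGE = auto()
    ABORT = auto()

MIN_GAP_M = 25.0          # assumed minimum gap to the nearest vehicle in the target lane
MAX_CLOSING_SPEED = 3.0   # m/s; faster closing speeds are treated as unsafe

def next_state(state, gap_m, closing_speed, merge_requested):
    if state == MergeState.LANE_KEEP:
        return MergeState.PREPARE if merge_requested else state
    if state == MergeState.PREPARE:
        if gap_m >= MIN_GAP_M and closing_speed <= MAX_CLOSING_SPEED:
            return MergeState.MERGE
        return state
    if state == MergeState.MERGE:
        # Abort if the gap collapses mid-maneuver.
        if gap_m < MIN_GAP_M / 2:
            return MergeState.ABORT
        return state
    return MergeState.LANE_KEEP  # ABORT falls back to lane keeping

state = MergeState.LANE_KEEP
for gap, closing in [(40, 1.0), (30, 2.0), (12, 4.0)]:
    state = next_state(state, gap, closing, merge_requested=True)
    print(gap, closing, state.name)
```

Even this toy version shows why the article calls state machines cumbersome: every new property (speed, gap, closing rate) multiplies the transitions that must be specified and tested.
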
None of these items is an entire automatic driving system on its own; they are small subsystems. There are many subsystems, including perception, sensor fusion and localization (where am I?), path and route planning, navigation, and motion control.

Seeing around corners

Traffic lights are blinking, there is construction, and there are unknown objects on the roadway. What to do? Is there sufficient situational awareness from a single vehicle's sensors and internal perception systems? There are uncooperative drivers at the intersection. Does that make things even more uncertain?

Adding another sensor to the vehicle itself is not, on its own, the answer to unpredictable drivers or hazardous situations. High definition maps help. But it is necessary to keep thinking of ways to enhance a car's situational awareness to answer some tough questions:

  • What happens when there are no lane markings?
  • How does a car know when it needs to give back control?
  • What happens to the driver engagement/attention when a car wants to give back control?
  • How do you deal with scenarios which a vehicle has not seen before?

When the road and the car start to cooperate, you can introduce new levels of situational awareness. For example, with vehicle-to-vehicle or vehicle-to-infrastructure communications, the following is possible:

  • A pedestrian is about to cross at a pedestrian crossing (e.g. a traffic light) that is equipped with a camera.
  • The traffic light detects the pedestrian and issues awareness/collision warning messages.
  • The connected car(s) nearby receive the warning messages from the network and take corrective action, such as emergency braking (a message-handling sketch follows).
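
Here is the message-handling sketch referenced above. The message fields, distances and thresholds are invented for illustration and do not follow any particular V2X standard (such as ETSI DENM/CPM).

```python
from dataclasses import dataclass
import math

# Toy V2X-style hazard message; field names are invented for illustration.

@dataclass
class HazardMessage:
    hazard_type: str      # e.g. "pedestrian_crossing"
    lat: float
    lon: float
    timestamp_s: float

def distance_m(lat1, lon1, lat2, lon2):
    """Small-distance flat-earth approximation; fine for one intersection."""
    dlat = (lat2 - lat1) * 111_320
    dlon = (lon2 - lon1) * 111_320 * math.cos(math.radians(lat1))
    return math.hypot(dlat, dlon)

def should_brake(msg, ego_lat, ego_lon, ego_speed_mps, now_s):
    """Brake if the hazard is fresh, nearby, and inside our stopping envelope."""
    if now_s - msg.timestamp_s > 1.0:          # stale message, ignore
        return False
    d = distance_m(msg.lat, msg.lon, ego_lat, ego_lon)
    stopping_distance = ego_speed_mps ** 2 / (2 * 4.0)  # assume 4 m/s^2 braking
    return d < stopping_distance + 15.0        # assumed 15 m safety margin

msg = HazardMessage("pedestrian_crossing", 48.13710, 11.57540, timestamp_s=100.0)
print(should_brake(msg, 48.13715, 11.57520, ego_speed_mps=12.0, now_s=100.2))
```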

[Image: infrastructure camera warning connected vehicles of a crossing pedestrian]

Once you can imagine messages being shared between vehicles, you can imagine sharing an entire vehicle's perception. Everything that one car sees, other vehicles can take on as additional situational awareness; other vehicles are essentially able to evaluate (see, feel, smell) their neighbors' sensor data. In practice this has to be reduced to specific data, because the amount that can be transmitted is limited.

A project called "IMAGinE" set out to demonstrate how vehicles can cooperate by sharing planned driving maneuvers in order to optimize traffic flows, paving the way for cooperative driving. The image below shows how a vehicle already on the highway can receive information about a merging car's planned maneuver, creating a proactive opportunity to open a gap in traffic for the merging car. Misinterpretations of cooperative intentions can be prevented and critical situations avoided.

[Image: cooperative merging maneuver from the IMAGinE project]


Wishful thinking: cloud robotics

What about having the cloud (or the network itself) give a helping hand to a car's situational awareness, or to more complex decision making? When can we allow a computer vision model coded up as a deep neural network to fall back, when in doubt, to a "cloud server", perhaps because of adverse weather conditions? How do you instantaneously share the states of all neighboring vehicles (position, speed and camera perception) to improve object classification accuracy?

Daunting questions, but maybe sometime in the next decade we can imagine a scenario where some assistance is provided by the cloud to core driving functions. It is unrealistic today for safety reasons, but maybe one day trajectory creation and selection decisions will be offloaded to edge devices.

A key challenge is latency: reaching a macro cell takes 10 to 20 ms. Then comes the processing in the cloud. Then you have to ship the data back, so add another 10 to 20 ms. By that time the car might have switched cells, so you need to track the car and reroute the message. That means the latency is too big to offload anything that will happen in the next two seconds. Cloud computers have no real-time requirements, no determinism and no guarantee of outcome.
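
Putting rough numbers on that budget, using the figures quoted above plus an assumed cloud processing time (these are ballpark assumptions, not measurements):

```python
# Rough round-trip budget for offloading a decision to a macro-cell cloud.
# Numbers are the ballpark figures discussed above plus an assumed
# processing time; they are not measurements.
uplink_ms = 15           # radio + backhaul to the cloud (10-20 ms quoted above)
processing_ms = 30       # assumed cloud-side inference/planning time
downlink_ms = 15         # response back to the vehicle
handover_margin_ms = 20  # re-routing if the car switches cells mid-request

round_trip_ms = uplink_ms + processing_ms + downlink_ms + handover_margin_ms
print(f"round trip: ~{round_trip_ms} ms")

# At 30 m/s (108 km/h) the car covers this much road before the answer arrives:
speed_mps = 30
print(f"distance travelled while waiting: ~{speed_mps * round_trip_ms / 1000:.1f} m")
```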

However far-fetched, we wanted to explore this option where a vehicle is somewhat dependent on the cloud. Maybe an autonomous shuttle operates in an airport, a venue or some other closed-circuit environment. The 5G wireless latency is guaranteed because of the network design, and the data center is at the edge of the network, not halfway around the world. Within these constraints, maybe we can then imagine shipping sensor data from the car to a cloud for assistance.

In the image below, we developed such an "augmented autonomous driving system" where the car speed is very slow (limited to 5-10 km/h), although in other tests we could reach 20 km/h. Going faster was not possible due to the low quality of the network connection and the kind of cameras we installed.

[Image: test vehicle for the augmented autonomous driving system]

Here are some practical examples of "cloud-assisted" driving:

  • One idea is to offload the calculation of trajectories at very high definition and send the most beneficial ones back to the vehicle as suggestions. The data is calculated over a longer trajectory horizon.
  • If the object classification level of confidence is very poor and an object cannot be definitively classified, intelligence shared by other vehicles can be used to improve the embedded autopilot's confidence.
  • If the object classification level of confidence is very poor or the data fusion is incoherent, the vehicle can send all the sensor data to the cloud, with its better computing capability, to process the data and return the details of obstacles (a confidence-gated fallback sketch follows this list).
  • If one of the sensors has failed, intelligence shared by other vehicles can be used to detect obstacles.
  • To increase the safety of the trip, an extended autopilot in the cloud could bring additional services (additional modes such as off-road management, path planning, decision making, lane choice, failure handling and emergency stop).
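
And here is the confidence-gated fallback sketch referenced in the list above. The function names, timeout and the 0.6 threshold are illustrative placeholders, not APIs from the project described earlier.

```python
# Sketch of a confidence-gated fallback: trust the onboard classifier unless
# its confidence is low, then ask a cloud service for a second opinion.
# classify_onboard(), ask_cloud() and the 0.6 threshold are illustrative
# placeholders supplied by the caller, not real APIs.

CONFIDENCE_THRESHOLD = 0.6

def classify_with_fallback(sensor_frame, classify_onboard, ask_cloud):
    label, confidence = classify_onboard(sensor_frame)
    if confidence >= CONFIDENCE_THRESHOLD:
        return label, confidence, "onboard"
    try:
        # The cloud call must be time-bounded; if it is late, keep the
        # onboard result and let the planner behave conservatively.
        cloud_label, cloud_conf = ask_cloud(sensor_frame, timeout_s=0.1)
        if cloud_conf > confidence:
            return cloud_label, cloud_conf, "cloud"
    except TimeoutError:
        pass
    return label, confidence, "onboard-low-confidence"
```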

Thank you to:

  • Marco BENICK
  • Miquel MATEU ESTARELLAS
  • Hitesh PANDYA
  • Miguel ARJONA VILLANUEVA
  • Dr Coralie DOUCET-JUNG
  • Hussein SROUR
  • Oussama BEN MOUSSA

APPENDIX: There are levels of autonomy

  • Level 0: No Automation. The driver is completely responsible for controlling the vehicle.
  • Level 1: Driver Assistance. At this level, automated systems start to assist with control of the vehicle in specific situations, but do not take over; the driver is still fully responsible.
  • Level 2/Level 2+: Advanced/Partial Automation. Level 2+ is not an officially recognized SAE level; some manufacturers use it as a marketing feature. At this level, the vehicle can perform more complex functions that pair steering (lateral control) with acceleration and braking (longitudinal control), thanks to a greater awareness of its surroundings. The driver is still fully responsible for the vehicle's actions. This is crucial: for example, Tesla markets its driving-assist feature as if it were a fully autonomous system, although it is not.
  • Level 3: Conditional Automation. At Level 3, drivers can disengage from the act of driving, but not completely. If a condition impacts the vehicle in such a way that no decision is possible, the driver has to take over control. Because drivers can apply their focus to some other task, such as looking at a phone or newspaper, this is generally considered the initial entry point into autonomous driving. Nevertheless, the driver is expected to take over when the system requests it, and the vehicle has to be able to come to a safe stop if the driver does not.
  • Level 4: High Automation. At this level, the vehicle's autonomous driving system is fully capable of monitoring the driving environment and handling all driving functions for routine routes and conditions defined within its operational design domain (ODD). Please keep this point in mind; it is very important. The vehicle may alert the driver that it is reaching its operational limits, for example an environmental condition that requires a human in control, such as heavy snow in which the sensors cannot recognize objects on the road. If the driver does not respond, the vehicle will ensure it either stops or pulls to the side of the road automatically.
  • Level 5: Full Automation. This is what most people consider autonomous driving. At this level the vehicles are fully autonomous; no driver is required behind the wheel at all. In fact, Level 5 vehicles might not even have a steering wheel or gas/brake pedals. Level 5 vehicles could have "smart cabins" so that passengers can issue voice commands, or maybe type commands, to choose a destination or set cabin conditions such as temperature or choice of media.
