State Estimation
Robotics inherently deals with things that move in the world. We live in an era of rovers on Mars, drones surveying the Earth, and, soon, self-driving cars. And although specific robots have their subtleties, there are also some common issues we must face in all applications, particularly state estimation and control.
The state of a robot is a set of quantities, such as position, orientation, and velocity, that, if known, fully describe that robot's motion over time. Here we focus entirely on the problem of estimating the state of a robot, putting aside the notion of control. Yes, control is essential, as we would like to make our robots behave in a certain way. But the first step in doing so is often the process of determining the state. Moreover, the difficulty of state estimation is often underestimated for real-world problems, and thus it is important to put it on an equal footing with control.
Mobile robot localization plays an important role in realizing autonomous behaviour: to accomplish its task, a robot must continuously determine its position within a given map as it moves.
State Estimation with GPS
State estimation systems have become smaller and more ubiquitous, and have integrated new sensors. Much of this was driven by GPS. Before GPS, state estimation systems were often built around large, expensive sensors, and found application in large, expensive vehicles, such as aircraft and spacecraft. GPS provided a small and cheap system that enabled localization almost anywhere on the planet. This opened the door for use in new markets such as automobiles, personal navigation, and robotics. Along with the proliferation of small consumer electronics, namely smartphones, this has driven the miniaturization of other sensor technologies as well, such as inertial measurement units (IMUs) used to estimate orientation in addition to position.
GPS does not solve every navigation problem, though. As new capabilities developed in these markets, it became apparent that while GPS had helped to enable them, it was also limiting them. For instance, GPS does not work well where there is no clear view of the sky, and GPS service can be denied by sophisticated signal-blocking techniques. This has driven a surge in GPS-denied navigation research, which focuses on fusing information from many sources, including sensors such as cameras that had not traditionally been widely applied in navigation.
How is State Estimation Done in GPS-Denied Environments?
In GPS-denied environments, by contrast, we need to incorporate other sensors to help observe position. In some instances, this can be done by simply modifying the operational environment to include radio beacons or fiducials (visual reference points at known locations) and measuring position relative to them. More generally, robots can carry a suite of different sensors, including cameras, 3D cameras, and LIDARs, to support navigation.
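As a concrete illustration of the beacon idea, here is a minimal sketch that estimates a 2D position from range measurements to beacons at known locations using nonlinear least squares. The beacon coordinates, measured ranges, and measurement model are illustrative assumptions, not taken from any particular system.

```python
# Hypothetical sketch: estimating a 2D position from range measurements to
# radio beacons at known locations, using nonlinear least squares.
# The beacon positions and ranges below are synthetic, illustrative values.
import numpy as np
from scipy.optimize import least_squares

beacons = np.array([[0.0, 0.0],      # known beacon positions (x, y) in metres
                    [10.0, 0.0],
                    [0.0, 10.0]])
ranges = np.array([7.07, 7.07, 7.07])  # measured distances to each beacon

def residuals(p):
    """Difference between predicted and measured beacon ranges at position p."""
    predicted = np.linalg.norm(beacons - p, axis=1)
    return predicted - ranges

# Solve for the position that best explains the range measurements.
solution = least_squares(residuals, x0=np.array([1.0, 1.0]))
print("Estimated position:", solution.x)   # ~ (5, 5) for this synthetic data
```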
The processing required to extract useful information for navigation is different for each sensor but is conceptually similar. We can understand the basic idea by considering visual navigation, which entails the use of cameras for state estimation. Imagine you are driving on the highway, looking out the window. If you didn't already know what direction you were travelling, you could look at the scenery flying by and figure it out. In visual navigation, we do this by pointing a digital camera out the window to collect video while driving. We then analyze two successive images from the video to observe how the scenery has changed, which tells us something about how the camera has moved in the time between the two images. For instance, if the scenery moved slightly to the right, this implies that the camera, and therefore your vehicle, has moved to the left in the camera's reference frame. This is the basic idea behind visual navigation.
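A minimal sketch of that first step, measuring how the scenery shifts between two successive frames, might use sparse optical flow in OpenCV. The frame file names below are placeholders, and the average pixel displacement is only a rough indicator of apparent image motion.

```python
# Hypothetical sketch: measuring the apparent shift of the scenery between two
# successive video frames with sparse optical flow (OpenCV).
import cv2
import numpy as np

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Pick well-textured corners in the first frame and track them into the second.
prev_pts = cv2.goodFeaturesToTrack(prev, maxCorners=500,
                                   qualityLevel=0.01, minDistance=7)
curr_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev, curr, prev_pts, None)

good_prev = prev_pts[status.flatten() == 1].reshape(-1, 2)
good_curr = curr_pts[status.flatten() == 1].reshape(-1, 2)

# Average displacement of the tracked points: scenery drifting right suggests
# the camera translated left (or rotated), in the camera's reference frame.
flow = good_curr - good_prev
print("Mean image motion (pixels):", flow.mean(axis=0))
```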
Other variables would also need to be taken into account. In particular, the above example considered only translational motion, the motion by which an object shifts from one point in space to another. The same shift in scenery between the two images could also be created by rotational motion if the camera were sitting still and rotating to its left. Therefore, visual navigation algorithms must differentiate between rotational and translational motion, and must also handle difficulties such as motion blur and poor lighting conditions.
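One standard way to separate the two kinds of motion is to match features between the frames, estimate the essential matrix, and decompose it into a rotation and a translation direction. The sketch below does this with OpenCV; the frame paths and the camera intrinsic matrix K are placeholder assumptions.

```python
# Hypothetical sketch: separating rotation from translation between two frames.
# ORB features are matched, the essential matrix is estimated with RANSAC, and
# it is decomposed into a rotation R and a unit translation direction t.
import cv2
import numpy as np

prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)  # placeholder paths
curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

K = np.array([[700.0,   0.0, 320.0],   # assumed pinhole camera intrinsics
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Detect and match ORB features between the two frames.
orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(prev, None)
kp2, des2 = orb.detectAndCompute(curr, None)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)

pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

# Estimate the essential matrix with RANSAC and decompose it.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                               prob=0.999, threshold=1.0)
_, R, t, _ = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

print("Rotation between frames:\n", R)        # 3x3 rotation matrix
print("Translation direction:", t.ravel())    # unit vector; scale is unobservable
```

Note that with a single camera only the direction of translation is recoverable; the scale must come from another sensor or from known scene geometry.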
If we look at each pair of successive images in our video, we can add up all of the motion estimates between them to get an estimate of the current position. However, each of these estimates has some small error. The more estimates we add (i.e. the longer the video), the larger the error gets. We refer to this concept as position estimation drift.
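The effect is easy to see numerically. The toy sketch below chains many slightly noisy incremental motion estimates; the step length and noise level are arbitrary assumptions chosen only to show how the error grows.

```python
# Minimal illustration of position estimation drift: chaining many small,
# slightly noisy frame-to-frame motion estimates.
import numpy as np

rng = np.random.default_rng(0)
true_step = np.array([1.0, 0.0])        # true motion per frame pair (metres)
noise_std = 0.05                        # per-estimate error (metres)

estimate = np.zeros(2)
truth = np.zeros(2)
for k in range(1, 1001):
    truth += true_step
    estimate += true_step + rng.normal(0.0, noise_std, size=2)
    if k % 250 == 0:
        drift = np.linalg.norm(estimate - truth)
        print(f"after {k:4d} steps, accumulated drift = {drift:.2f} m")
# The drift grows (roughly with the square root of the number of steps),
# even though each individual estimate is nearly correct.
```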
In order to mitigate the problem of position estimation drift, we use a class of algorithms called simultaneous localization and mapping (SLAM). As the name suggests, SLAM algorithms work by extracting structure from each image and adding that structure to a map. This lets us compare the current image against earlier parts of the map and measure the error between the current image and the oldest image that contains some of the same structure. This is particularly powerful when we revisit a part of the map that was built many images ago: we can effectively eliminate all of the position estimation drift incurred since that part of the map was created. This allows us to achieve GPS-like accuracy even when we don't have GPS.
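To give a feel for how a loop-closure measurement removes accumulated drift, here is a deliberately simplified 1D pose-graph example rather than the full SLAM machinery: noisy odometry constraints link successive poses, and a single loop-closure constraint states that the final pose coincides with the start. All values are synthetic assumptions.

```python
# Toy sketch of the loop-closure idea behind SLAM, on a 1D out-and-back path:
# the robot drives forward 25 m and returns to its start. Noisy odometry links
# successive poses; one loop-closure constraint says the final pose is the start.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
true_steps = np.concatenate([np.full(25, 1.0), np.full(25, -1.0)])   # out, then back
odometry = true_steps + rng.normal(0.0, 0.05, size=true_steps.size)  # noisy measurements

def residuals(x):
    res = [x[0]]                                   # anchor the first pose at 0
    res += list((x[1:] - x[:-1]) - odometry)       # odometry (frame-to-frame) constraints
    res.append(x[-1] - x[0])                       # loop closure: we are back at the start
    return np.array(res)

# Dead reckoning (summing odometry alone) drifts; the loop closure pulls it back.
dead_reckoned = np.concatenate([[0.0], np.cumsum(odometry)])
optimized = least_squares(residuals, dead_reckoned).x

print("end-of-loop drift, odometry only :", abs(dead_reckoned[-1]))
print("end-of-loop drift, with closure  :", abs(optimized[-1]))
```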
#mobilerobots #autonomousvehicles #selfdrivingcars #localisation #stateestimation #slam
#computervision #ai #sensorfusion #mapping #autonomousmobilerobots