Extracting Road and Canal Networks from Aerial Imagery
Reconstruction of road networks from aerial images is a classic computer vision problem, which remains actively studied to this day. By contrast, the reconstruction of drainage canals has so far received little attention from the vision community, even though it is crucial for hydrologic analysis, whose importance is growing at a time of rapid climate change. Because canals share the network-like structure of roads, they are amenable to reconstruction by the same algorithms, and we address the two problems jointly.
Most existing approaches rely on convolutional networks to extract binary masks from images, denoting which pixels belong to roads and which do not. Unfortunately, they do not guarantee that the connectivity of the produced masks matches that of the real road network.
This is because these methods are trained to minimize losses, such as cross-entropy and mean squared error, that do not explicitly enforce topological consistency. When the annotations do not perfectly coincide with the imaged structures, which is always the case for satellite image annotations, networks trained with per-pixel losses produce binary masks plagued by topological errors, such as road interruptions, missed junctions, and false-positive connections.
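The weak coupling between per-pixel losses and topology can be seen on a toy example. The sketch below is only an illustration under simple assumptions (a 9x9 grid, a single straight road, hand-picked probabilities of 0.95 and 0.05); the helper names `binary_cross_entropy`, `count_components`, and `binarize` are hypothetical and not taken from any library. It compares a correct prediction with one that misses a single road pixel.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy between two 2D lists."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def count_components(mask):
    """Count 4-connected foreground components in a binary 2D list."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                components += 1
                stack = [(r, c)]
                seen[r][c] = True
                while stack:  # flood fill one component
                    y, x = stack.pop()
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
    return components


def binarize(pred, threshold=0.5):
    """Threshold a probability map into a binary mask."""
    return [[1 if p > threshold else 0 for p in row] for row in pred]


# Toy ground truth: one horizontal road across the middle row (assumed layout).
SIZE = 9
target = [[1 if r == SIZE // 2 else 0 for _ in range(SIZE)] for r in range(SIZE)]

# Prediction A: confident and correct at every pixel.
pred_ok = [[0.95 if t else 0.05 for t in row] for row in target]

# Prediction B: identical, except that one road pixel in the middle is missed.
pred_gap = [row[:] for row in pred_ok]
pred_gap[SIZE // 2][SIZE // 2] = 0.05

loss_ok = binary_cross_entropy(pred_ok, target)
loss_gap = binary_cross_entropy(pred_gap, target)
parts_ok = count_components(binarize(pred_ok))
parts_gap = count_components(binarize(pred_gap))

print(f"loss without gap: {loss_ok:.4f}, components: {parts_ok}")
print(f"loss with gap:    {loss_gap:.4f}, components: {parts_gap}")
```

On this toy input the thresholded mask goes from one connected component to two, while the mean cross-entropy rises by only a few hundredths: the loss registers one wrong pixel out of 81, not a severed road.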
In recent literature, this problem has been addressed by combining a convolutional encoder with a decoder that represents the road network as a graph rather than a binary mask. At inference time, the graph is grown iteratively: at each step, the neural network adds a new node to the graph, taking image features and the current state of the graph into account. In contrast to mask-based approaches, these graph-based methods make it easy to avoid excessively penalizing predicted roads that deviate slightly from the ground truth, and to account for existing connectivity when growing the graph.
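The iterative growth loop can be sketched in a few lines. In the toy version below, the learned decoder is replaced by a hand-written greedy rule, so it should be read as an illustration of the control flow only, not as any published method. The function name `grow_graph`, the 5x5 score map, and the 0.5 threshold are all assumptions made for the example.

```python
def grow_graph(score, seed, threshold=0.5, max_steps=100):
    """Grow a graph one node per step over a 2D score map.

    The selection rule is a hypothetical stand-in for a learned decoder:
    it looks at the image evidence (`score`) and at the current graph
    state (`nodes`, `visited`) and picks the best unvisited neighbor.
    """
    h, w = len(score), len(score[0])
    nodes = [seed]
    edges = []
    visited = {seed}
    for _ in range(max_steps):
        best = None  # (score, parent, child)
        for (y, x) in nodes:
            for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                if not (0 <= ny < h and 0 <= nx < w):
                    continue
                if (ny, nx) in visited or score[ny][nx] <= threshold:
                    continue
                if best is None or score[ny][nx] > best[0]:
                    best = (score[ny][nx], (y, x), (ny, nx))
        if best is None:
            break  # nothing left to add: stop growing
        # Discrete insertion step: a node and an edge are appended to the graph.
        _, parent, child = best
        nodes.append(child)
        edges.append((parent, child))
        visited.add(child)
    return nodes, edges


# Toy score map with a T-shaped road: row 2, plus column 2 below it.
SIZE = 5
score = [[0.1] * SIZE for _ in range(SIZE)]
for c in range(SIZE):
    score[2][c] = 0.9
for r in range(2, SIZE):
    score[r][2] = 0.9

nodes, edges = grow_graph(score, seed=(2, 0))
print(len(nodes), "nodes,", len(edges), "edges")
```

Because each new node is attached to a node already in the graph, the result is connected by construction. The price is that every step is a discrete choice rather than a smooth function of the image.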
However, the non-differentiability of the node insertion operation makes training these networks more difficult and brittle than training convnets. In a paper to appear in PAMI, we show that the connectivity of road and drainage canal networks can be enforced directly in a convolutional neural network, in a fully differentiable manner, and without the need to represent the graph explicitly. This allows end-to-end training and results in increased performance. Our algorithm has been used by our Stanford colleagues to model drainage canals across Southeast Asian peatlands and to reveal widespread hydrologic disturbances, as reported in a paper to appear in AGU Advances.
Our approach relaxes the usual requirement that annotated and predicted foreground pixels coincide. Instead, we require that predictions contain uninterrupted sequences of foreground pixels that can deviate by a few pixels from the ground-truth annotations. This enforces connectivity while dealing with possibly imprecise annotations.
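The relaxed criterion can be made concrete with a small, non-differentiable check. The sketch below is not the loss used in the paper; it only illustrates the requirement stated above, under assumptions chosen for the example: a tolerance of 2 pixels measured as Chebyshev distance, 8-connectivity, and hypothetical helper names (`pixel_overlap`, `tolerance_band`, `connected_within_tolerance`).

```python
from collections import deque


def pixel_overlap(pred, annotation):
    """Strict criterion: number of pixels that are foreground in both masks."""
    return sum(p and a for pr, ar in zip(pred, annotation) for p, a in zip(pr, ar))


def tolerance_band(annotation, tol):
    """Mark every pixel within `tol` (Chebyshev distance) of an annotated pixel."""
    h, w = len(annotation), len(annotation[0])
    band = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if annotation[r][c]:
                for y in range(max(0, r - tol), min(h, r + tol + 1)):
                    for x in range(max(0, c - tol), min(w, c + tol + 1)):
                        band[y][x] = True
    return band


def connected_within_tolerance(pred, annotation, start, end, tol=2):
    """Relaxed criterion: is there an uninterrupted predicted path, staying
    inside the tolerance band, from near `start` to near `end`?"""
    h, w = len(pred), len(pred[0])
    band = tolerance_band(annotation, tol)

    def allowed(y, x):
        return bool(pred[y][x]) and band[y][x]

    def near(p, q):
        return max(abs(p[0] - q[0]), abs(p[1] - q[1])) <= tol

    queue = deque((y, x) for y in range(h) for x in range(w)
                  if allowed(y, x) and near((y, x), start))
    seen = set(queue)
    while queue:  # breadth-first search over 8-connected predicted pixels
        y, x = queue.popleft()
        if near((y, x), end):
            return True
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in seen and allowed(ny, nx):
                    seen.add((ny, nx))
                    queue.append((ny, nx))
    return False


H, W, ROAD_ROW = 9, 12, 4
annotation = [[1 if r == ROAD_ROW else 0 for _ in range(W)] for r in range(H)]

# Prediction shifted down by one pixel everywhere, but uninterrupted.
shifted = [[1 if r == ROAD_ROW + 1 else 0 for _ in range(W)] for r in range(H)]

# Prediction aligned with the annotation, but with one missing pixel.
broken = [row[:] for row in annotation]
broken[ROAD_ROW][6] = 0

start, end = (ROAD_ROW, 0), (ROAD_ROW, W - 1)
shifted_overlap = pixel_overlap(shifted, annotation)
broken_overlap = pixel_overlap(broken, annotation)
shifted_connected = connected_within_tolerance(shifted, annotation, start, end)
broken_connected = connected_within_tolerance(broken, annotation, start, end)
print("shifted:", shifted_overlap, shifted_connected)
print("broken: ", broken_overlap, broken_connected)
```

The two criteria rank the predictions in opposite order. The shifted road overlaps no annotated pixel yet satisfies the relaxed requirement, whereas the interrupted road overlaps 11 of 12 annotated pixels and fails it.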