Focusing on the inherent structure of sensor data
How to develop efficient algorithms that exploit this structure?
Sensor data has structure. Paying attention to that structure helps us identify domain-specific priors that can be imparted to novel data-driven architectures. Some examples include shift and distortion invariance in images, shift invariance in sensor data, and spectral stability in both images and time-series data. Models need to preserve the symmetry, invariance, and equivariance of the data being studied.
Here are some examples of structure in different types of data:
To understand this better, let’s divide the underlying structure into the following two categories:
1. Euclidean structure:
Euclidean structure is characterized by linear spaces or grid-like structures that adhere to the principles of Euclidean geometry. Examples of such data include audio, images, 1-D sensor data, text, etc.
2. Non-Euclidean structure:
Data with non-Euclidean structure live in spaces that do not conform to the axioms of Euclidean geometry and have features such as curved geometry, the lack of a global coordinate system, and variable distance measures. Graphs and manifolds are two primary examples of data with non-Euclidean geometry.
We will look at three specific examples of how the structure (symmetry, invariance, and equivariance) of data has been used to design network architectures:
1. CNNs:
“Convolutional Networks combine three architectural ideas to ensure some degree of shift, scale, and distortion invariance: local receptive fields, shared weights (or weight replication), and spatial or temporal subsampling.
An interesting property of convolutional layers is that if the input image is shifted, the feature map output will be shifted by the same amount, but will be left unchanged otherwise. This property is at the basis of the robustness of convolutional networks to shifts and distortions of the input.
Once a feature has been detected, its exact location becomes less important. Only its approximate position relative to other features is relevant. Not only is the precise position of each of those features irrelevant for identifying the pattern, it is potentially harmful because the positions are likely to vary for different instances of the character. A simple way to reduce the precision with which the position of distinctive features is encoded in a feature map is to reduce the spatial resolution of the feature map. This can be achieved with so-called subsampling layers, which perform a local averaging and a subsampling, reducing the resolution of the feature map and reducing the sensitivity of the output to shifts and distortions.” (LeCun et al., “Gradient-Based Learning Applied to Document Recognition”)
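To make these ideas concrete, here is a minimal NumPy sketch (the signal and the filter are made up for illustration, not learned): shifting the input shifts the convolutional feature map by the same amount, and local averaging plus subsampling keeps only the approximate position of the detected feature.

```python
import numpy as np

# Illustrative 1-D signal with a single "feature" and a hand-picked filter
# (a real CNN would learn the filter weights).
signal = np.zeros(32)
signal[10:14] = 1.0
kernel = np.array([1.0, -1.0, 1.0])

def conv1d(x, w):
    """'Valid' cross-correlation: the basic operation inside a conv layer."""
    n = len(x) - len(w) + 1
    return np.array([np.dot(x[i:i + len(w)], w) for i in range(n)])

def avg_pool(x, size=4):
    """Local averaging + subsampling, as in the subsampling layers above."""
    return x[:len(x) // size * size].reshape(-1, size).mean(axis=1)

# Shift equivariance: shifting the input shifts the feature map by the
# same amount (the feature sits well away from the borders here).
shifted = np.roll(signal, 5)
assert np.allclose(np.roll(conv1d(signal, kernel), 5),
                   conv1d(shifted, kernel))

# Subsampling reduces the resolution of the feature map: after pooling,
# only the approximate (bin-level) position of the feature is retained,
# and a small input shift moves activity around within the same coarse bins.
print(avg_pool(conv1d(signal, kernel)))              # feature in bins 2-3
print(avg_pool(conv1d(np.roll(signal, 2), kernel)))  # still in bins 2-3
```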
2. Defining invariant operations on graphs and manifolds:
Operations such as convolutions and translation don’t directly apply to non-Euclidean data. Let’s consider the following two examples:
A heat kernel changes its shape as its position changes on non-Euclidean domains (such as the graphs and manifolds shown below), indicating a lack of shift invariance. This means that operations like convolution are not readily applicable on non-Euclidean domains (a small numerical sketch follows these two examples).
Similarly, we need to define deformation-invariant convolutional filters for geometric CNNs. As can be seen in the figure below, a Euclidean CNN filter is not invariant to distortion: directly projecting it onto a curved surface changes its support from 2x2 per quadrant to 2x3 per quadrant.
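As a small numerical sketch of the first example (using a hand-built toy graph rather than the domains in the figures), the heat kernel on an irregular graph can be computed from the Laplacian eigendecomposition, and its profile genuinely depends on which node it is centered at:

```python
import numpy as np

# Hand-built 5-node irregular graph (a triangle with a two-node tail);
# purely illustrative, not taken from the article's figures.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 1, 0, 0],
              [1, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A               # graph Laplacian
lam, Phi = np.linalg.eigh(L)                 # eigenvalues / eigenvectors

t = 0.5
H = Phi @ np.diag(np.exp(-t * lam)) @ Phi.T  # heat kernel exp(-t L)

# Column i is the heat kernel "centered" at node i. The sorted profiles
# differ from node to node: the kernels are not shifted copies of one
# template, which is the lack of shift-invariance described above.
for i in range(A.shape[0]):
    print(i, np.round(np.sort(H[:, i])[::-1], 3))
```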
The convolution theorem, readily applicable in the Euclidean domain, needs to be modified for non-Euclidean domains because of this lack of shift invariance.
For Euclidean domain:
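In standard notation (with the hat denoting the Fourier transform), the convolution theorem states that convolution in the spatial domain becomes a pointwise product in the Fourier domain:

```latex
(f \star g)(x) = \int f(x')\, g(x - x')\, dx'
\qquad \Longleftrightarrow \qquad
\widehat{f \star g}(\omega) = \hat{f}(\omega)\, \hat{g}(\omega)
```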
For non-Euclidean domain:
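A common spectral generalization from the geometric deep learning literature replaces the Fourier basis with the matrix Φ of Laplacian eigenvectors, with ⊙ denoting the elementwise product of spectral coefficients:

```latex
f \star g \;=\; \Phi\,\big( (\Phi^{\top} g) \odot (\Phi^{\top} f) \big)
```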
Here are some notes highlighting how the key operators differ between the Euclidean and non-Euclidean settings. Go here for more details.
In Euclidean space, convolution is defined through the Fourier basis.
In non-Euclidean spaces, the eigenfunctions of the Laplacian provide a convolution-like operation via spectral analysis.
The Laplacian eigenfunctions generalize the classical Fourier basis, allowing spectral analysis to be performed on manifolds and graphs.
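As a concrete illustration of these notes, here is a small NumPy sketch of spectral filtering on a toy graph (the graph, the node signal, and the filter are made up for the example): the Laplacian eigenvectors play the role of the Fourier basis, and filtering is a pointwise product of spectral coefficients.

```python
import numpy as np

# Hand-built 5-node path graph and a made-up node signal and filter.
A = np.array([[0, 1, 0, 0, 0],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [0, 0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
lam, Phi = np.linalg.eigh(L)       # Laplacian eigenbasis ("Fourier" basis)

f = np.array([1.0, 0.0, 0.0, 0.0, 0.0])   # a signal living on the nodes
f_hat = Phi.T @ f                          # graph Fourier transform of f

g_hat = np.exp(-lam)                       # a low-pass filter, defined spectrally
filtered = Phi @ (g_hat * f_hat)           # pointwise product, then inverse transform

print(np.round(filtered, 3))               # a smoothed version of f
```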
3. Tangent kernels for manifold learning:
Data on manifolds naturally arise in different fields. Some examples include:
3.1 Hyperspheres: model directional data in molecular and protein biology
3.2 Hyperbolic spaces: impedance density estimation
3.3 Symmetric positive definite matrices: Diffusion tensor imaging, functional MRI, ASR
3.4 Lie groups: modeling articulated objects like the human spine
3.5 Stiefel manifolds: processing video action data
3.6 Grassmann manifolds: video-based face recognition and shape recognition in computer vision
3.7 Landmark spaces: Biological shapes
The following diagram shows the mapping between a manifold and its tangent space. You can read more here.
There are three important steps in manifold learning: map the data from the manifold to the tangent space at a chosen base point (via the logarithm map), perform the usual Euclidean computations in that linear tangent space, and map the results back onto the manifold (via the exponential map).
These translations between the manifold and its tangent space help linearize the problem, so that tools from Euclidean machine learning can then be used to analyze data with manifold-like structure.
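Here is a minimal sketch of that round trip for one concrete manifold, the unit sphere (the closed-form log/exp maps below are specific to the sphere; other manifolds use their own expressions):

```python
import numpy as np

def log_map(p, q):
    """Map a point q on the unit sphere to the tangent space at p."""
    theta = np.arccos(np.clip(p @ q, -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return np.zeros_like(p)
    v = q - np.cos(theta) * p
    return theta * v / np.linalg.norm(v)

def exp_map(p, v):
    """Map a tangent vector v at p back onto the unit sphere."""
    norm_v = np.linalg.norm(v)
    if np.isclose(norm_v, 0.0):
        return p.copy()
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

p = np.array([0.0, 0.0, 1.0])                    # base point (north pole)
q = np.array([1.0, 0.0, 1.0]) / np.sqrt(2.0)     # another point on the sphere

v = log_map(p, q)             # 1) lift to the linear tangent space at p
v_scaled = 0.5 * v            # 2) do Euclidean work there (here: just scale)
q_new = exp_map(p, v_scaled)  # 3) map the result back to the manifold

print(np.round(v, 3), np.round(q_new, 3),
      np.isclose(np.linalg.norm(q_new), 1.0))    # q_new stays on the sphere
```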
An easy-to-visualize example of structure in data is the Takens (delay) embedding:
As can be seen above, the structure of a gravitational wave is clearly visible in its Takens embedding, whereas no such structure is observed for pure noise.
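Here is a small sketch of the construction on synthetic signals (a chirp stands in for the structured signal; this is not the gravitational-wave data from the figure): delayed copies of the series are stacked into embedding vectors, and the structured signal stays near a low-dimensional curve while noise does not.

```python
import numpy as np

def takens_embedding(x, dim=3, delay=5):
    """Stack delayed copies of x into embedding vectors of length `dim`."""
    n = len(x) - (dim - 1) * delay
    return np.column_stack([x[i * delay:i * delay + n] for i in range(dim)])

t = np.linspace(0, 10, 2000)
chirp = np.sin(2 * np.pi * (0.5 + 0.2 * t) * t)      # frequency increases with time
noise = np.random.default_rng(0).standard_normal(len(t))

emb_chirp = takens_embedding(chirp)
emb_noise = takens_embedding(noise)

# The chirp embedding is strongly anisotropic (it hugs a low-dimensional
# curve), while the noise embedding has roughly equal spread in every
# direction, as the singular values show.
print(np.linalg.svd(emb_chirp - emb_chirp.mean(0), compute_uv=False))
print(np.linalg.svd(emb_noise - emb_noise.mean(0), compute_uv=False))
```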
From the above examples, we can see that once we have sufficient insight into the structure of the data, it becomes easier to design architectures that preserve properties such as symmetry, invariance, and equivariance. A good example of this is how the network architecture of AlphaFold 2 incorporated the physics of the structure, such as bond angles and energies, building on AlphaFold 1. A good approach for solving hard scientific-discovery problems with deep learning is to keep incorporating such priors into novel architectures.