from FFT to SVD
Look into what SVD can do for us through the lens of FFT.
FFT, Fast Fourier Transform, so famous that needless me to explain further here.
SVD, Singular Value Decomposition. In short definition, it’s a factorization technique from linear algebra world, used extensively in data science.
We are not going to dive into the mathematics of SVD in this post, there are many good materials around detailing that. Given my DSP root, I will try to look into what SVD can do for us through the lens of FFT, hopefully it would be more fun and intuitive.
First, we need a real example and we will use something very familiar with, speech. Pack lots of speech data into 64 samples long frames and feed them into a SVD calculation tool, we will get 3 matrixes back, [U, S, V]. U is the one we want to focus here, it contains fundamental components that when combined, it can reconstruct any of the fed in speech frames. Sounds familiar? Following diagram illustrates this example application of SVD, in comparison with FFT.
?
As illustrated above, the transform process is done by convolution between the input signal and vectors in U. This convolution process is like the butterfly algorithm for calculating FFT.
Now let’s zoom into U. It is a 64x64 matrix, consists of 64 column vectors (convolution filters) where each is 64 samples long. Each of these 64 vectors represents a fundamental component in the new domain, just like a frequency bin in FFT. More interestingly, these vectors are arranged in a prioritized manner, with the most important one at the front. How handy is that!
We all familiar with frequency bin, energy and phase of a single frequency. But what those U vectors look like? Here are the first 5 of them.
It turns out not that different from frequency bin, they are all very tidy sine waves. This makes things a lot more interesting, because we could use FFT to make sense of the U vectors and here it is, the spectrum of those first 5 U vectors.
领英推荐
All of them are basically DC offset + strong single frequency. Some of you might already notice the phase information. Take the yellow and purple line for example, both have pretty much the same 500Hz frequency, the difference is in the DC offset amount and phase angle. This is very different to FFT, which has 0Hz bin dedicated to represent DC offset and give each frequency bin complete freedom in phase angle.
Side topic, the FFT phase information are often untouched in practice due to it’s too difficult to use. The reason being easily corrupted by noise and the wrapping doesn’t help either. On the other hand, SVD treats the phase angle (and DC offset) just like frequency, specifying a fixed value for each component (U vector). Well, it has to, since it uses single value to represent the weight of each component (convolution filter output is a single value). My intuition tells me, this way of encoding the phase might make it more usable.
Back to dissecting U vectors, are all U vectors like the first 5 we just seen? Not exactly, here are the 34th and 35th vector, clearly consist of more than one frequency, a higher frequency signal modulated by a lower frequency signal.
Although these U vectors look very free-style, there is one important characteristic shared by both frequency bins and the U vectors, it’s they are orthogonal. For instance, the dot product of the 34th and 35th vectors plotted above is -2.8e-16, close enough to zero. Even I pick two vectors with strong DC offset (green and purple lines of earlier plot), their dot product is 1.47e-16. Don’t ask me how, it’s just magic ???
Ok, enough of numbers and magic, back to plain English. What this example tells us about what SVD does:
SVD decompose signals into non-correlated components and also rank these components in the order of importance.
A tool that can tell us what the important fundamental components are from a sea of data, no doubt it’s used extensively in data science then.
This powerful tool is not without its unique cost though. We know how to read the output of 128 points FFT, each data point carries fixed meaning. In contrast, the output of transform using a U matrix means very little without knowledge of that particular U matrix. Like what we’ve done earlier, understand the amount of DC offset, frequency points and phase angles represented by each U vector. These knowledge then allow us to correspond the transform output to more meaningful elements. All these are because SVD is a data driven technique, it calculates the transform matrix (U) based on input dataset.
So is that the conclusion, a powerful tool but not easy to use. Well, what if we (human) are not the one to read the output…