Finite Impulse Response Filters Part 1: Convolution and HDL
Simon Southwell
Semi-retired logic, software and systems designer. Technical writer, mentor, educator and presenter.
Introduction
In these articles on convolution and finite impulse response filters I want to move into the world of digital signal processing. Strictly speaking, my articles on data compression, including image compression with JPEG, fall under this category but, I think, this article on finite impulse response filters is what most people would consider to be squarely in the realm of DSP.
Finite impulse response filters (or FIRs) are based around the process of ‘convolution’, and we will have to look at some equations to explain what’s going on, but I promise that there is nothing particularly taxing about this. When we get to a practical solution, we will be down to multiply-and-accumulate and nothing more complicated than that. We will also look at impulse responses—that is, the output you get from a system when an impulse (or delta) is input—which tend to be infinite (assuming unlimited precision). It turns out that if you convolve the impulse response of the desired filter with your signal, it will filter that signal as per the filter characteristics. In fact, convolving in the time domain is the same as multiplying in the frequency domain (and vice versa). So, if we wanted, say, a low pass filter, in the frequency domain this means multiplying all frequencies from 0 to fc by 1, and all frequencies above fc by 0. This is the ideal frequency response, and the diagram below shows this for a low pass filter.
Even more handy, multiplying everything by 1 in the frequency domain is equivalent to convolving with a ‘delta function’ in the time domain—i.e., an impulse. Thus, if we know the impulse response of the filter, we can convolve this with our signal and it will filter it, just as if we had taken the signal into the frequency domain, multiplied it by our frequency response (by 1 for low frequencies, and by 0 for high frequencies), and put it back in the time domain again. In fact, this is another way of filtering, which is outside the scope of this article, but we might visit it again when looking at discrete Fourier transforms and the fast Fourier transform algorithm in the future. So, we ‘simply’ need to work out the impulse response values of our desired filter and convolve these with our signal. We will look at this in this article, and a handy open-source program is provided to help us do just this.
In practical solutions, dealing with infinity is somewhat of a problem, but we can truncate things as values tend towards zero, making things finite—and this is where the ‘finite’ part of FIR comes from, with the ‘IR’ from the impulse response. The truncation of impulse response values effectively gives a ‘window’ on that response, with everything outside of that window set to zero. Now, simply truncating the values causes unwanted artifacts, and things can be improved if the window on the impulse response is shaped with a gentler roll-off. Many types of window function have been discovered and developed over time to reduce the artifacts of truncating an infinite impulse response, and we shall be looking at some of these and at their trade-offs. The program I mentioned implements many of them (though we shall restrict ourselves to just a few) and so the different performances of the window functions can be explored.
Another source of undesired artifacts comes from the fact that we can’t have infinite precision in the values we use for our impulse response but must ‘quantise’ these to finite precision, with a set number of bits, such as 8- or 16-bit values. As we shall see, this tends to limit the amount of attenuation that can be achieved in the ‘stop-band’, and the number of bits chosen needs to match the requirements of the filter being implemented.
Finally, the impulse responses of ideal filters are made up of ‘sinc’ functions, or combinations thereof. A sinc function has the general form sin(x)/x, which is not too complicated. The diagram below shows a plot of a generic sinc function.
As can be seen, the sine component provides the oscillations and the 1/x component gives the decay, with the plot centred on 0, so negative x is on the left and positive x on the right. The mathematicians amongst you may have spotted that if x is 0 the function is undefined (a division by zero), but that is not what’s shown. As x tends to 0 the function approaches 1, and so this limit is taken as the value at 0. (A bit of a mathematical cheat, I know, but I’m not mathematician enough to know why this is okay.)
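To see that limit handled explicitly, here is a tiny Python sketch (my own illustrative code, not part of the winfilter program):

```python
import math

def sinc(x):
    """Generic sinc: sin(x)/x, with the limit value 1.0 substituted at x == 0."""
    if x == 0.0:
        return 1.0  # lim of sin(x)/x as x -> 0 is 1
    return math.sin(x) / x

# The sine part oscillates; the 1/x part decays away from the centre.
print(sinc(0.0))      # 1.0, by the limit
print(sinc(math.pi))  # sin(pi)/pi, effectively 0
```

Library implementations do the same substitution; they just hide it from you.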
Using these functions, as we shall see in the second article, gives the infinite impulse responses of our ideal filters, and multiplying these by a window function to make them finite gives us our finite impulse response. Thus, these are sometimes called windowed-sinc filters.
Now, don’t worry if this introduction has introduced a lot of concepts in a short space of time, as we will revisit these in more detail in the rest of the articles. And, as mentioned before, when we get to an implementation in HDL, we will only be doing multiplication and addition. All the messing about with trigonometric functions can be done off-line to calculate the values we will need in the logic, and the provided program will do this for us. We could use this ‘turn-key’, though I’d very much like you to understand what’s going on.
The article is divided into two parts. In this first article we will introduce convolution for digital signals and move straight to discussing an HDL implementation. By the end we will have a solution that can be used to construct an FIR but won’t know what values to configure it with to do the filtering that we desire. That will have to wait for the next article, where we will look at sinc functions and their use in constructing impulse responses for our filters and at window functions to modify the infinite impulse responses to a finite set of values.
Before diving straight into convolution, I want to talk briefly about the general characteristics of FIRs and their advantages and disadvantages.
FIR Characteristics
There is not space in this article to give a full treatment to the alternative to FIRs, which is the infinite impulse response (IIR) filter. Without going into too much detail, IIRs are characterised by the current output being a function of both the current input and previous output values (and inputs). This gives them a potentially infinite impulse response, which has many advantages. Compared to FIRs, fewer coefficients are required, and thus smaller memory, to get similar characteristics in terms of cut-off frequency transition and stop-band attenuation. Because there are fewer coefficients, the latency is smaller than for FIRs. Also, IIRs can be constructed in such a way as to be equivalent to analogue filters, in terms of mapping between the s (analogue) and z (digital) planes (for those that know what this means—which we can’t cover here). This is handy if digitising a previous analogue system. There is no such equivalent mapping for FIRs.
So, if I were a salesman selling FIRs, I’ve just convinced you to go down the IIR route, right?
Well, unlike FIRs, IIRs have non-linear phase characteristics, and in some applications, such as processing audio or biometric data, this can be quite undesirable. Because IIRs have feedback paths, using previous output values, they can become unstable, whereas FIRs can’t become unstable for any input signal, since they are a function of input values and impulse response only. Although these articles will limit which filter responses we investigate, FIRs can nonetheless have an arbitrary frequency response. FIRs must handle numerical overflow to some extent, but this is easier than with IIRs, where overflow makes the design more complex and the effects of quantisation more severe.
So which one is used for any given application will depend on the requirements, resources, complexity, and all the normal engineering trade-offs in choosing between competing solutions. For these articles, we will explore FIRs.
Convolution
I mentioned in the introduction that convolution is at the heart of FIRs, and so we need to understand this. However, since we will be implementing a logic-based convolution solution, I want to move straight on to that implementation afterwards, in a break from the traditional text ordering, because, once we have this solution, making a desired filter from it is simply a matter of setting the ‘right numbers’, and the next article will deal with this. Thus, if you are only interested in a filter implementation, and are happy to get the right numbers from the provided winfilter program, then you can skip the next article—though I genuinely hope you don’t.
What is Convolution?
The general form of the convolution of two signals (e.g., an input signal x and an impulse response signal h) is given as:

o[i] = Σ h[j] × x[i−j], summing over j from 0 to M−1
If the input signal has N points and the impulse response has M points, then the resultant output (o) will be an N+M−1 point signal. What does this actually mean? Well, each output o[i] is calculated by multiplying the impulse response values from 0 to M−1 with the signal values from i down to i−(M−1) and adding all these up. This is summarised in the equation below.
In other words, if we think of our impulse response as an M point array of values, and our signal as an N point array of values, then for each output we multiply the M array elements of the impulse response with the last M elements of the signal, starting from i and running backwards, and add all the M multiplication results together. If the signal is finite in length (N), then any signal value indexed outside its valid range is assumed to be 0, so the multiplication is 0 and has no effect on the output value. Let’s try and picture this. The diagram below has an 8 point signal to be convolved with a 4 point impulse response. This should yield an 11 point output.
Each box is an element in an array for the signal (green), impulse response (blue) and output (orange). Each output element is calculated as the sum of the overlapping signal and impulse response elements multiplied together. I’ve tried to relate this to the previous equation, showing the specific summation of multiplications on the right by each output element. So, we’ve ‘swept’ the impulse response across the signal for each output calculation, multiplying and accumulating the overlapping elements. Note that the indexes for the response and the signal run in opposite directions, and this is important for the general case of convolution. It is likely that a signal will be asymmetric, and the secondary signal (an impulse response for our purposes, but maybe not for other uses of convolution) may also be asymmetric. Thus, to convolve, one or the other signal must be reversed. It doesn’t matter which one, as we could treat the impulse response as our signal, and the signal as the values to be convolved. As I said, the two sets of values need not be a signal and an impulse response for purposes other than FIRs. As we shall see later, the filter impulse responses will be symmetrical, so it won’t matter, but an implementation should do this reversal so that it can be used for any convolution purpose.
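To make the sweep concrete in code, here is a direct Python sketch of the double loop (illustrative only, not the HDL we will build later), showing both the N+M−1 output length and the reversed indexing of the signal:

```python
def convolve(x, h):
    """Direct convolution: o[i] = sum over j of h[j] * x[i - j].

    x is treated as zero outside its valid range, so the output
    has len(x) + len(h) - 1 points.
    """
    n, m = len(x), len(h)
    out = []
    for i in range(n + m - 1):
        acc = 0
        for j in range(m):
            k = i - j              # signal index runs backwards as j runs forwards
            if 0 <= k < n:         # out-of-range signal values are taken as 0
                acc += h[j] * x[k]
        out.append(acc)
    return out

# An 8 point signal convolved with a 4 point response gives 11 points:
sig = [1, 2, 3, 4, 4, 3, 2, 1]
imp = [1, 0, 0, 0]               # a delta: the output reproduces the signal
print(len(convolve(sig, imp)))   # 11
```

The inner `if` is where the “values outside the valid range are 0” rule from the text lands in code.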
What About Arbitrarily Long Signals?
The example in the diagram assumed a finite set of signal values, but in real DSP usage the signal will probably be an unending set of samples to be processed. Notice from the diagram that the impulse response only ever overlaps the signal values over a limited range—basically, the length of the impulse response. So, as long as the values to be convolved with the signal are of finite length, then, if we remember the last M signal values, we can keep producing output values indefinitely.
Now this is beginning to suggest a solution in logic. A multi-bit shift register is good for remembering the last M values of an input. For our purposes, the impulse response we want to convolve with our signal is fixed, and so we just need a look-up-table of values. Then, to convolve the impulse response with the signal for a new output value, we simply multiply and accumulate the elements of the LUT and the shift register. When a new input arrives, and the shift register is shifted, with the oldest value disposed of and the new value added, the system is ready to generate a new output. It’s that simple!
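That shift-register-plus-LUT arrangement can be sketched behaviourally in Python (names like StreamingFIR are my own, not from the firfilter repository):

```python
from collections import deque

class StreamingFIR:
    """Behavioural model: an M-deep shift register of samples and a
    fixed LUT of tap values, multiply-accumulated for every new input."""

    def __init__(self, taps):
        self.taps = list(taps)  # the 'LUT' of impulse response values
        # Shift register of the last M samples, newest first, initially zero.
        self.smpls = deque([0] * len(taps), maxlen=len(taps))

    def step(self, sample):
        # Shift in the new sample; the oldest drops off the far end.
        self.smpls.appendleft(sample)
        # One multiply-accumulate per tap. smpls[0] is the newest sample,
        # which lines up with taps[0]: this is the reversed indexing of
        # convolution falling out of the shift-register structure.
        return sum(t * s for t, s in zip(self.taps, self.smpls))

fir = StreamingFIR([1, 0, 0, 0])         # delta response: an all-pass
print([fir.step(v) for v in [5, 7, 9]])  # [5, 7, 9]
```

Each call to `step` is one ‘new input arrives’ event from the text: shift, then multiply and accumulate across all taps.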
There are some practical considerations to take into account. The above diagram suggests that there are as many multipliers as there are entries in the LUT (known as ‘taps’). A practical impulse response might need to have as many as 256 taps or more, and the accumulator would have to be able to add that many multiplication results at once. If the rate of new input samples is low relative to the maximum clock frequency that, say, the multiplier logic can run at, then the implementation can do a multiply and accumulate once per cycle for M cycles and, if that period is less than the input sample period, only one multiplier and one adder is required. In between, if M cycles is too long, two circuits running in parallel, each doing half the taps (bottom/top or odd/even, for example), can be employed. This takes two multipliers and two adders, plus an additional adder to add the two final results. Hopefully you can see that this bifurcation can be extended for quarter, eighth, sixteenth etc. parallel calculations, trading off speed against logic resources, depending on the requirements. In the example implementation we will discuss later, the clock rate is sufficiently fast that we can use a single multiply-accumulate logic block to process all the taps, with time to spare for the target functionality.
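The odd/even split is easy to convince yourself of numerically. The sketch below (made-up values) shows two half-length multiply-accumulates plus one final add giving the same answer as the single full-length one:

```python
# One output of the 'convolution sweep': full MAC versus two parallel half-MACs.
taps  = [3, 1, 4, 1, 5, 9, 2, 6]
smpls = [2, 7, 1, 8, 2, 8, 1, 8]   # the last M samples, newest first

full = sum(t * s for t, s in zip(taps, smpls))

# Split the taps between two MAC units, even indices and odd indices,
# each taking roughly half the cycles, then a final adder combines them.
even = sum(taps[i] * smpls[i] for i in range(0, len(taps), 2))
odd  = sum(taps[i] * smpls[i] for i in range(1, len(taps), 2))

assert even + odd == full          # half the cycle count, same answer
print(full)
```

The same regrouping argument extends to four, eight or more lanes: addition is associative, so how the partial sums are grouped doesn’t change the result.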
Another consideration is the quantisation Q of the signal and ‘tap’ values. The choice here is dependent on the requirements and could range from double precision floating point (64 bits) to 8-bit integers or fixed-point values. Having chosen the number of Q bits, it must also be noted that multiplying two numbers Q bits wide gives a result 2Q bits wide, and the adder must be able to take this size of input. Also, when adding numbers, the result can be one bit larger than the inputs (let’s assume they’re both the same width). Thus, potentially, to avoid overflow (or underflow), the accumulator should allow an additional bit for each addition—i.e., an additional M bits in total. In practice, this becomes impractical, particularly if M is configurable (as our example will be, via a parameter). Various ways of dealing with this are available. Firstly, do nothing and let the accumulator roll over. This may seem ‘lazy’, but it might be that upstream processing makes certain range guarantees which mean the accumulation can’t over- or under-flow, and additional logic to deal with it would be redundant. The next level is to detect under- or overflow and flag an error. This might come in the form of detecting that an addition would change the sign of the result when the input signs match. I.e., adding two positive numbers should not result in a negative number and, similarly, adding two negative numbers should not result in a positive number. Having detected over/underflow, one could then take a further step and ‘saturate’ the result, so that the overflowed or underflowed result is discarded and replaced with the maximum positive or negative number as appropriate.
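The sign-based detection and saturation described above can be modelled like this (an illustrative Python model; the widths and names are my own):

```python
def sat_add(a, b, bits):
    """Signed add with saturation: if a and b have the same sign and the
    raw sum falls outside the two's-complement range for 'bits' bits,
    clamp to the maximum positive or negative value as appropriate."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    s = a + b
    if a >= 0 and b >= 0 and s > hi:   # two positives must not go negative
        return hi
    if a < 0 and b < 0 and s < lo:     # two negatives must not go positive
        return lo
    return s

print(sat_add(100, 50, 8))    # 127: clamped, since 150 overflows 8 bits
print(sat_add(-100, -50, 8))  # -128: clamped on the negative side
print(sat_add(100, -50, 8))   # 50: mixed signs can never overflow
```

In hardware the same check falls out of comparing the operand sign bits with the raw result’s sign bit.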
Having dealt with these problems, we have a result that is 2Q bits wide but, for many applications, we want the result to be Q bits wide to match the input. We could just take the top Q bits of the result, but we can do better than this, and introduce less quantisation noise, if we add the most significant bit of the truncated bits to the rescaled value to round up. This then picks the nearest rescaled value to the original.
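A quick Python model of this round-to-nearest rescaling (mirroring the idea rather than the exact Verilog bit-slicing):

```python
def rescale(acc, q):
    """Rescale a 2Q-bit accumulator result back to Q bits by dropping the
    bottom Q bits, adding in the most significant dropped bit so that we
    round to the nearest representable value instead of truncating down."""
    round_bit = (acc >> (q - 1)) & 1   # MSB of the bits about to be discarded
    return (acc >> q) + round_bit

Q = 8
print(rescale(0x1280, Q))   # 0x1280 >> 8 is 0x12; dropped MSB is 1, so 0x13
print(rescale(0x1220, Q))   # dropped MSB is 0, so this truncates to 0x12
```

The dropped MSB is worth exactly half of one output LSB, so adding it in implements conventional round-half-up.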
Earlier in the article I made the bold claim that convolving a signal with the delta function (an impulse) in the time domain results in multiplying all frequencies by 1 in the frequency domain—in other words, an all-pass filter, which doesn’t alter the signal. Our solution gives us an insight into why that might be. Imagine our impulse response is a delta function, with just one of the taps set to 1.0 (or the equivalent in binary). Now, as we sweep this over our signal, the single tap with the 1 in it will pick out only one value from the signal, and will pick each one in turn as it sweeps through. All other values will be masked to 0, being multiplied by 0 from the impulse response. The output, then, being the sum of all the results, all but one of which are zero, will simply reproduce the input value unaltered. This is true for whatever input signal we choose to put into the convolution. Thus, this acts as an all-pass filter.
Conversely, if we make our signal a delta function (impulse), when we convolve it with our impulse response tap values it will reproduce the impulse response at the output, as it picks one value at a time as it sweeps through. So here we see, in our solution, why convolving in the time domain is the same as multiplying in the frequency domain, and why an impulse in the time domain is the same as an all-pass filter.
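Both identities are easy to check numerically. This Python sketch (illustrative only) uses a direct double-loop convolution as described earlier in the article:

```python
def convolve(x, h):
    # Direct convolution as in the text: o[i] = sum over j of h[j] * x[i-j]
    n, m = len(x), len(h)
    return [sum(h[j] * x[i - j] for j in range(m) if 0 <= i - j < n)
            for i in range(n + m - 1)]

sig   = [4, 8, 15, 16, 23, 42]
resp  = [3, 2, 1]
delta = [1, 0, 0]

# Convolving the signal with a delta reproduces the signal (an all-pass),
# padded with zeros where the response has swept past the end:
assert convolve(sig, delta)[:len(sig)] == sig

# Convolving a delta with the response reproduces the impulse response:
assert convolve(delta, resp)[:len(resp)] == resp
print("identities hold")
```

Either way round, the single 1 sweeps through and picks out one value at a time, exactly as described above.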
HDL Implementation
I have put together a configurable Verilog solution, firfilter, on GitHub. Firstly, let’s have a look at its ports. The diagram below shows the fir module block diagram with its signals and parameters.
The module has a clock (clk) and power-on reset (nreset, active low). It also has a synchronous external reset (reset, active high) so that external logic can reset the module outside of POR. There is also a simple subordinate memory mapped bus port which allows the tap values to be programmed from an external source.
The other two ports consist of the input samples (sample) with an accompanying valid (smplvalid), and the output values (out) with its own valid signal (opvalid). The number of bits for the input and output values is determined by the SMPL_BITS parameter (with a default value of 12). Needless to say, these values are going to be treated as signed numbers. The TAPS parameter (with a default of 127) determines the number of logical taps the impulse response LUT will have. It will also determine the width of the address (addr) on the memory mapped port to be able to index all the tap entries.
Actually, as this implementation is targeting an FIR, and because the impulse responses will be symmetrical about the centre value, an optimisation is made to store only half the data, plus the centre value. Thus, the actual storage defined is TAPS/2+1 tap entries, and logic is used to recreate the full impulse response. If the size of the LUT is not a limiting factor then this need not be done, saving a little bit of logic and making the implementation a more generic convolution solution. However, I wanted to illustrate some optimisation considerations that might be required when looking at the logic.
One other deviation in this implementation from the generic diagram seen earlier is that the taps are actually SMPL_BITS+1 wide. This allows the equivalent of 1.0 to be stored; otherwise the maximum value is equivalent to 0.111111111111… (in binary), which gives a slight scaling of the output (an attenuation), the degree of which is dependent on the magnitude of SMPL_BITS (the larger, the smaller the error). This may not be an issue for some applications, but it makes it easier to correlate results against what we’ve discussed, without introducing any scaling errors.
Indexing the Tap Values
Before jumping into the convolution logic, we need some logic to extract values from the tap LUT (taps—an array of signed values). These will be programmed with the impulse response values from index 0 to TAPS/2. So, for the default TAPS value of 127, this is 0 to 63. When a new input value arrives (smplvalid is 1 for a cycle), a register count is set to TAPS and counts down to zero, decrementing each clock cycle. For the first half we want to index the LUT up from 0 to TAPS/2, and then count back down from TAPS/2−1 to 0. A combinatorial bit of logic does this calculation, assigning a wire tapidx:
wire [$clog2(TAPS)-1:0] tapidx = (countdlyd > (TAPS+1)/2-1) ?
                                     ~countdlyd :
                                     count[$clog2(TAPS)-2:0];
Note that countdlyd is the count value delayed by one cycle, which is equivalent to count+1 when the count is active, saving an adder, aligning the value, and easing timing.
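The up-then-down indexing over a half-sized LUT can be modelled in Python (an illustrative model of the idea, not a cycle-accurate copy of the RTL):

```python
TAPS = 127
HALF = (TAPS + 1) // 2          # 64 stored entries, indices 0..63

def tapidx(i):
    """Map a full-response index 0..TAPS-1 onto the half-sized LUT by
    mirroring about the centre tap: the stored half is reused in reverse
    for the second half of the symmetric impulse response."""
    return i if i < HALF else (TAPS - 1) - i

# Sweeping the full response indexes the LUT up and then back down:
seq = [tapidx(i) for i in range(TAPS)]
print(seq[:4], seq[62:66], seq[-4:])   # rises from 0, peaks at 63, falls to 0
```

Because the sweep itself is a palindrome, the reconstructed response is symmetric by construction.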
Multiply and Accumulate
The heart of the convolution is a multiply and accumulate function. Many FPGAs include ‘DSP’ blocks; the Intel Cyclone V or the AMD Zynq 7000 series, for example, both fairly low cost parts. These DSP blocks are basically multiply and accumulate blocks (though they can be used as just multipliers)—just what we need. One could instantiate these blocks as modules, but the synthesis tools can infer the functionality and construct the blocks we need for us. It does help, though, if we construct the logic to look like a multiply and accumulate equation. In the fir module, the synchronous process has the code:
if (countdlyd)
begin
  accum <= (tapval * validsample) + accum;
end
The accum register is just what you’d expect, accumulating the values over the convolution. The tapval is the registered extracted LUT value (taps[tapidx]). The validsample signal is a registered version of the shift register (smpls[]) values, as indexed by count, but also validated as containing a value. After POR, the shift register has no valid values and could contain anything. POR could reset them all, but this could be a lot of bits (TAPS × SMPL_BITS), so, in this implementation, a register, smplsvalid, TAPS bits wide, is cleared on reset and used as a single-bit shift register, filling with 1s as input samples are shifted in. Thus, validsample is the indexed shift register value if the equivalent bit of smplsvalid is set, otherwise it’s 0. The update of accum happens whilst the (delayed) count value is non-zero. It gets zeroed each time a new input sample arrives.
Rescaling
Lastly, in the last cycle, the output is updated with the rescaled accum value. The code fragment below shows this:
if (lastcycle)
begin
  out     <= accum[SMPL_BITS*2-1:SMPL_BITS-1] + accum[SMPL_BITS-2];
  opvalid <= 1'b1;
end
Note that the opvalid is set to a default value of 1'b0 at the beginning of the synchronous process so that, when set here, it pulses for just one cycle, being cleared on the next. And that’s all there is to it. We now have the basis of a finite impulse response filter, if only we knew what to put in the tap values. That will be the subject of the next article.
With this design, having a single multiply and accumulate function, a new output takes TAPS + 3 clock cycles to calculate. The extra three cycles allow for registering at both the input and output, and around the multiply-and-accumulate function. Thus, for a 100MHz clock this would be, for the default TAPS size of 127, 1.3μs or a rate of 769KHz. This 1.3μs is also the latency through the filter.
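That arithmetic can be sanity-checked in a few lines of Python:

```python
TAPS   = 127
F_CLK  = 100e6                   # 100MHz clock
CYCLES = TAPS + 3                # taps plus input/output/MAC registering

period_us = CYCLES / F_CLK * 1e6
rate_khz  = F_CLK / CYCLES / 1e3
print(round(period_us, 1), "us per output")        # 1.3us, also the latency
print(round(rate_khz), "KHz maximum sample rate")  # about 769KHz
```

The same two lines make it easy to explore other TAPS or clock values when sizing a design.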
Conclusions
In this first of two articles, digital convolution was introduced showing how two signals can be convolved in the time domain and what this means in the frequency domain. By making one of the signals a finite impulse response we can use convolution to filter a signal.
A generic solution was explored, where there were no restrictions on resources such as memory, multipliers, or the number of terms in an addition. This was used to develop an intuitive understanding of why convolution in the time domain is equivalent to multiplication in the frequency domain, and why a delta function (or impulse) in the time domain is an ‘all-pass’ filter—i.e., it multiplies everything in the frequency domain by 1.
From this idealised solution some practical constraints were formalised, and a Verilog-based implementation was discussed. The solution requires only multiply and accumulate functionality over and above normal logic and is, I hope, simple and easy to understand. We left the HDL implemented but with no knowledge of how to configure the logic to act as a filter. We still need the numbers.
In the next article, I want to concentrate on generating the tap values required to turn our HDL implementation into a filter. It will cover generating impulse responses using sinc functions, ‘windowing’ the sinc function to make it finite, designing these to meet our requirements, and analysing them to validate the responses. To do this we will take a specification from digital audio processing as an example: filtering 48KHz samples with a 4× oversampled low-pass filter, a cut-off at 20KHz and a stop-band attenuation target of −60dB. We will use my winfilter program to help us do this but will discuss the calculations behind it all for a fuller understanding.