I'm trying to use a IIR LP filter in C#. It is a 5th order Butterworth filter.
The code works in 64Bit mode but breaks in 32Bit mode. Debugging showed, that the parameters are slightly different and the output raises to infinity/NAN.
I am using doubles for the calculation and storing.
The correct parameters a[i],b[i] are:
-5 -4,9792522401964
10 9,91722403267282
-10 -9,87615728025693
5 4,91765142871949
-1 -0,979465940928259
The 32Bit calculation gets these:
-5 -4,97925281524658
10 9,91722583770752
-10 -9,87615966796875
5 4,91765308380127
-1 -0,979466378688812
Code for filtering:
public void FilterBuffer(float[] srcBuf, long srcPos, float[] dstBuf, long dstPos, long nLen)
{
const double kDenormal = 0.000000000000001;
double denormal = m_invertDenormal ? -kDenormal : kDenormal;
m_invertDenormal = !m_invertDenormal;
for (int sampleIdx = 0; sampleIdx < nLen; sampleIdx++)
{
double sum = 0.0f;
m_inHistory[m_histIdx] = srcBuf[srcPos + sampleIdx] + denormal;
for (int idx = 0; idx < m_aCoeff.Length; idx++)
sum += m_aCoeff[idx] * m_inHistory[(m_histIdx - idx) & kHistMask];
for (int idx = 1; idx < m_bCoeff.Length; idx++)
sum -= m_bCoeff[idx] * m_outHistory[(m_histIdx - idx) & kHistMask];
m_outHistory[m_histIdx] = sum;
m_histIdx = (m_histIdx + 1) & kHistMask;
dstBuf[dstPos + sampleIdx] = (float)sum;
}
}
The history is 32 entries, hence a histMask of "31" to avoid modulo/comparison...
Any ideas why this does not work and how to get it stable?
IIR filters are notorious for being sensitive to numerical precision, and especially so as the number of terms in the recurrence equation increases. Fortunately, in this case it is possible to obtain a different filter topology which is a bit less sensitive by implementing the filter as a cascade of smaller 1st and 2nd order sections. Note that while the code you provided may be used directly to implement the filter sections, you may realize that as an added bonus those fixed-order building blocks can be optimized much more effectively.
Before deriving the coefficients for the filter sections, we take a step back to get the filter design parameters. Since you mentioned the filter to be a 5th order low-pass Butterworth filter, I'd assume you chose to leave out a[0]
and b[0]
which are both 1. Working back from the information you've provided, it looks like the filter was generated from an analog filter with a cutoff frequency specification of 45Hz, and mapped to a digital filter operating at a sampling rate of 44100Hz using the bilinear transform . As a sanity check, filter coefficients can be confirmed from MATLAB or this applet:
a b
1, +1
5, -4.9792522401964
10, +9.91722403267282
10, -9.87615728025693
5, +4.91765142871949
1, -0.97946594092826
The first step required to obtain an equivalent filter, is to obtain the poles and zeros. The poles and zeros of this filter can be obtained by either:
- factorizing the polynomials with
a
coefficients (for the zeros) and b
coefficients (for the poles) given above, or
- by applying the bilinear transform on the terms (s-sk), where sk are the poles in the s-plane which can be obtained from for example wikipedia (substituting wc=2pi*(45/44100) radians, given your filter design specifications).
The resulting zeros are at z=-1 (all 5 of them). Similarly the resulting poles in the z-plane are:
0.998002182897612 +/- 0.00608551812433764 j
0.994819411183486 +/- 0.00374906249111718 j
0.993609052034203
The 2nd order filter sections can then be obtained by matching complex-conjugate pairs of poles, and combining with an equivalent number of zeros.
Thus combining a pole at z = x + y j with its complex conjugate and with 2 zeros (at z=-1), the coefficients for the filter section are:
1, 1
2, -2x
1, x*x+y*y
Similarly, the 1st order filter section can be obtained for a real pole at z = x, together with a zero (at z=-1):
1, 1
1, -x
The 3 filter sections of the 5th order filter provided are thus:
// 1st section
// using poles at 0.998002182897612 +/- 0.00608551812433764j, and 2 zeros at -1
1, 1
2, -1.996004365795224
1, 0.996045390599240
// 2nd section
// using poles at 0.994819411183486 +/- 0.00374906249111718j, and 2 zeros at -1
1, 1
2, -1.989638822366971
1, 0.989679716337019
// 3rd section
// using pole at 0.993609052034203 and a zero at -1
1, 1
1, -0.993609052034203
As indicated earlier, the output is generated by cascading the sections. Thus obtaining something such as the following sequence:
f1.FilterBuffer(srcBuf, srcPos, tmpBuf1, 0, nLen);
f2.FilterBuffer(tmpBuf1, 0, tmpBuf2, 0, nLen);
f3.FilterBuffer(tmpBuf2, 0, dstBuf, dstPos, nLen);
Note that some temporary storage could be reused, but this was left out for clarity.