Computing the discrete Fourier transform of audio

Posted 2019-06-03 02:12

I am quite new to signal processing, so forgive me if I ramble a bit. I have downloaded and installed FFTW for Windows. The documentation is OK, but I still have some questions.

My overall aim is to capture raw audio data sampled at 44100 samples/sec from the computer's sound card (this task is already implemented using libraries and my own code), and then perform the DFT on blocks of this audio data.

I am only interested in finding a range of frequency components in the audio and I will not be performing any inverse DFT. In this case, is a real to real transformation all that is necessary, hence the fftw_plan_r2r_1d() function?

My blocks of data to be transformed are 11025 samples long. My function is called as shown below. This will result in a spectrum array of 11025 bins. How do I know the maximum frequency component in the result?

I believe that the bin spacing is Fs/n = 44100/11025, so 4 Hz. Does that mean I will have a frequency spectrum in the array from 0 Hz all the way up to 44100 Hz in steps of 4 Hz, or only up to the Nyquist frequency (half the sample rate), 22050 Hz?

This would be a problem for me as I only wish to search for frequencies from 60Hz up to 3000Hz. Is there some way to limit the transform range?

I don't see any arguments for the function, or maybe there is another way?

Many thanks in advance for any help with this.

p = fftw_plan_r2r_1d(11025, audioData, spectrum, FFTW_REDFT00, FFTW_ESTIMATE);

1 answer
成全新的幸福
#2 · 2019-06-03 02:52

To answer some of your individual questions from the above:

  • you need a real-to-complex transform, not real-to-real
  • you will calculate the magnitude of the complex output bins at the frequencies of interest (magnitude = sqrt(re*re + im*im))
  • the frequency resolution is indeed Fs / N = 44100 / 11025 = 4 Hz, i.e. the width of each output bin is 4 Hz
  • for a real-to-complex transform you get N/2 + 1 output bins giving you frequencies from 0 to Fs / 2
  • you just ignore frequencies in which you are not interested - the FFT is very efficient so you can afford to "waste" unwanted output bins (unless you are only interested in a relatively small number of output frequencies)

Additional notes:

  • plan creation does not actually perform an FFT - typically you create a plan once and then use it many times (by calling fftw_execute)
  • for performance you probably want to use the single precision calls (e.g. fftwf_execute rather than fftw_execute, and similarly for plan creation etc)

There are many similar questions and answers on StackOverflow which you might also want to read; searching by tag is a good way to find them.

Also note that dsp.stackexchange.com is the preferred site for questions on DSP theory, as opposed to specific programming problems.
