This question is related to "DSP : audio processing : squart or log to leverage fft?",
in which I was lost about which algorithm to choose.
Now,
Goal :
I want to get all the frequencies of my signal, that I get from an audio file.
Context:
I use numpy and scikits.audiolab. I did a lot of reading on DSP: I went to dspguru.com, and read papers and good blogs around the net.
The code I use is this one :
import numpy as np
from scikits.audiolab import Sndfile
f = Sndfile('first.ogg', 'r')
# Sndfile instances can be queried for the audio file meta-data
fs = f.samplerate
nc = f.channels
enc = f.encoding
print(fs,nc,enc)
# Reading is straightforward
data = f.read_frames(10)
print(data)
print(np.fft.rfft(data))
I am new to DSP.
My question
I would like to separate out all the frequencies of a signal so that I can compare different signals. I use numpy.fft.rfft on the array of sound samples, but this operation alone is not enough. So what is the best way to get the magnitudes of all the frequencies correctly?
I saw that multiplying the resulting values gets rid of the complex numbers and turns the whole thing into real numbers.
What comes next? Is that it?
If you need me to clarify anything, just ask.
Thanks a lot!
Mathematically, the Fourier transform returns complex values because the transform multiplies the signal by exp(-i*omega*t) and integrates over time. So the PC gives you the spectrum as complex numbers corresponding to the cosine and sine transforms. To get the amplitude, you just need to take the absolute value: np.abs(spectrum). To get the power spectrum, square the absolute value. The complex representation is valuable because it gives you not only the amplitude but also the phase of each frequency, which may be useful in DSP as well.
You say "I want to get all the frequencies of my signal, that I get from an audio file." but what you really want is the magnitude of the frequencies.
In your code, it looks like (I don't know Python) you only read the first 10 samples. Assuming your file is mono, that's fine, but you probably want to look at a larger set of samples, say 1024. Once you do that, you'll want to repeat on the next set of N samples. You may or may not want to overlap the sets of samples, and you may want to apply a window function, but what you've done here is a good start.
What sleepyhead says is true. The output of the FFT is complex. To find the magnitude of a given frequency, you need the length (absolute value) of the complex number, which is simply sqrt(r^2 + i^2), where r and i are its real and imaginary parts.
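As a minimal sketch of the magnitude calculation described above, the snippet below uses a synthetic sine wave in place of the audio file. The sample rate fs, the 440 Hz tone and the frame length n are assumptions made purely for illustration, not values from the question.

```python
import numpy as np

fs = 8000          # assumed sample rate in Hz (illustrative, not from the file)
n = 1024           # number of samples in the analysed frame
t = np.arange(n) / fs
data = np.sin(2 * np.pi * 440.0 * t)    # synthetic 440 Hz tone standing in for the audio

spectrum = np.fft.rfft(data)            # complex values, n // 2 + 1 bins
freqs = np.fft.rfftfreq(n, d=1.0 / fs)  # frequency in Hz of each bin

magnitude = np.abs(spectrum)            # same as sqrt(real**2 + imag**2)
power = magnitude ** 2                  # power spectrum
phase = np.angle(spectrum)              # phase in radians

peak_hz = freqs[np.argmax(magnitude)]   # frequency of the bin with the largest magnitude
print(peak_hz)
```

The peak lands on the bin nearest to the tone rather than exactly on 440 Hz, because the bins are spaced fs / n apart.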
If I understood correctly, you want to walk over all of the sound data and capture the amplitude. To do this, use a "while" loop over the data, reading 1024 samples at a time.
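One possible way to write that loop is sketched below, including the optional window and overlap mentioned earlier. It assumes data is a 1-D mono array; frame_magnitudes, frame_size and hop are hypothetical names chosen for this example, and a synthetic tone again stands in for the samples read from the file.

```python
import numpy as np

def frame_magnitudes(data, frame_size=1024, hop=512):
    """Return one magnitude spectrum per frame of a 1-D mono signal."""
    window = np.hanning(frame_size)      # Hann window to reduce spectral leakage
    spectra = []
    start = 0
    while start + frame_size <= len(data):
        frame = data[start:start + frame_size] * window
        spectra.append(np.abs(np.fft.rfft(frame)))
        start += hop                     # hop < frame_size gives overlapping frames
    return np.array(spectra)

# Synthetic mono signal standing in for the samples read from the audio file
fs = 8000                                # assumed sample rate in Hz
data = np.sin(2 * np.pi * 1000.0 * np.arange(4096) / fs)
mags = frame_magnitudes(data)
print(mags.shape)                        # (number of frames, frame_size // 2 + 1)
```

Setting hop equal to frame_size gives non-overlapping frames. Any samples left over at the end that do not fill a whole frame are ignored in this sketch.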