I'm trying to analyse the audio quality of a file by detecting the highest frequency present (compressed audio will generally be filtered to something less than 20KHz).
I'm reading WAV file data using a class from the soundstretch library which returns PCM samples as floats, then performing FFT on those samples with the fftw3 library. Then for each frequency (rounded to the nearest KHz), I am totalling up the amplitude for that frequency.
So for a low quality file that doesn't contain frequencies above 16KHz, I would expect there to be none or very little amplitude above 16KHz, however I'm not getting the results I would expect. Below is my code:
#include <iostream>
#include <math.h>
#include <fftw3.h>
#include <soundtouch/SoundTouch.h>
#include "include/WavFile.h"
using namespace std;
using namespace soundtouch;
#define BUFF_SIZE 6720
#define MAX_FREQ 22//KHz
static float freqMagnitude[MAX_FREQ];
static void calculateFrequencies(fftw_complex *data, size_t len, int Fs) {
for (int i = 0; i < len; i++) {
int re, im;
float freq, magnitude;
int index;
re = data[i][0];
im = data[i][1];
magnitude = sqrt(re * re + im * im);
freq = i * Fs / len;
index = freq / 1000;//round(freq);
if (index <= MAX_FREQ) {
freqMagnitude[index] += magnitude;
}
}
}
int main(int argc, char *argv[]) {
if (argc < 2) {
cout << "Incorrect args" << endl;
return -1;
}
SAMPLETYPE sampleBuffer[BUFF_SIZE];
WavInFile inFile(argv[1]);
fftw_complex *in, *out;
fftw_plan p;
in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * BUFF_SIZE);
out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * BUFF_SIZE);
p = fftw_plan_dft_1d(BUFF_SIZE, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
while (inFile.eof() == 0) {
size_t samplesRead = inFile.read(sampleBuffer, BUFF_SIZE);
for (int i = 0; i < BUFF_SIZE; i++) {
in[i][0] = (double) sampleBuffer[i];
}
fftw_execute(p); /* repeat as needed */
calculateFrequencies(out, samplesRead, inFile.getSampleRate());
}
for (int i = 0; i < MAX_FREQ; i += 2) {
cout << i << "KHz magnitude: " << freqMagnitude[i] << std::endl;
}
fftw_destroy_plan(p);
fftw_free(in);
fftw_free(out);
}
Can compile with: - (you'll need the soundtouch library and fftw3 library)
g++ -g -Wall MP3.cpp include/WavFile.cpp -lfftw3 -lm -lsoundtouch -I/usr/local/include -L/usr/local/lib
And here is the spectral analysis of the file I am testing on:
As you can see it's clipped at 16KHz, however my results are as follows:
0KHz magnitude: 4.61044e+07
2KHz magnitude: 5.26959e+06
4KHz magnitude: 4.68766e+06
6KHz magnitude: 4.12703e+06
8KHz magnitude: 12239.6
10KHz magnitude: 456
12KHz magnitude: 3
14KHz magnitude: 650468
16KHz magnitude: 1.83266e+06
18KHz magnitude: 1.40232e+06
20KHz magnitude: 1.1477e+06
I would expect there to be no amplitude over 16KHz, am I doing this right? Is my calculation for frequency correct? (I robbed it off another stackoverflow answer) Could it be something to do with there being 2 channels and I'm not separating the channels?
Cheers for any help guys.