Hi, I need to downsample a WAV audio file from 44.1 kHz to 8 kHz. I have to do all the work manually on a byte array; it's for academic purposes.
I am currently using two classes, Sink and Source, to pop and push arrays of bytes. Everything goes well until I reach the part where I need to downsample the data chunk using linear interpolation.
Since I'm downsampling from 44100 Hz to 8000 Hz, how do I interpolate a byte array containing something like 128 000 000 bytes? Right now I'm popping 5, 6 or 7 bytes depending on whether i%2 == 0, i%2 == 1 or i%80 == 0, and pushing the average of those 5, 6 or 7 bytes into the new file.
The result is indeed a smaller audio file than the original, but it cannot be played in Windows Media Player (it says there is an error while reading the file), and there is a lot of noise, although I can hear the right track behind the noise.
So, to sum things up, I need help with the linear interpolation part. Thanks in advance.
I don't think you should use the average of those samples: that is a mean (moving-average) filter, not exactly downsampling. Just take every 5th/6th/7th sample and write that to the new file.
That will probably produce some aliasing artifacts, but the result should still be recognizable.
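The every-nth-sample idea above, plus the linear interpolation the question asks about, could be sketched roughly as follows. This is only an illustration under assumptions the thread does not confirm: the data chunk is taken to be 16-bit little-endian signed mono PCM, in which case the work has to happen on whole samples rather than on single bytes. The function names are made up for the example.

```python
import struct

def bytes_to_samples(data):
    """Decode raw PCM bytes into integers (ASSUMES 16-bit little-endian signed mono)."""
    count = len(data) // 2
    return list(struct.unpack("<%dh" % count, data[:count * 2]))

def samples_to_bytes(samples):
    """Encode integers back into 16-bit little-endian signed PCM bytes."""
    return struct.pack("<%dh" % len(samples), *samples)

def pick_samples(samples, src_rate=44100, dst_rate=8000):
    """Keep the source sample at or just before each output instant (no averaging)."""
    out_len = len(samples) * dst_rate // src_rate
    return [samples[i * src_rate // dst_rate] for i in range(out_len)]

def resample_linear(samples, src_rate=44100, dst_rate=8000):
    """Linearly interpolate between the two source samples around each output instant."""
    out_len = len(samples) * dst_rate // src_rate
    out = []
    for i in range(out_len):
        # Output sample i sits at source position i * src_rate / dst_rate.
        # Split that position into an integer index and a fractional remainder.
        idx, rem = divmod(i * src_rate, dst_rate)
        a = samples[idx]
        b = samples[idx + 1] if idx + 1 < len(samples) else a
        # Integer arithmetic keeps the result between a and b.
        out.append(a + (b - a) * rem // dst_rate)
    return out
```

Computing the source position as i * 44100 / 8000 replaces the manual 5/6/7 pattern: the step is 5.5125 samples, and the integer index advances by 5 or 6 (occasionally more on average) on its own. Neither function low-pass filters first, so aliasing is still to be expected.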
Another, more complex solution, but probably one with better results quality-wise, would be to first convert your samples into a frequency distribution using an FFT or DFT and then convert it back at the appropriate sample rate. It's been a while since I have done such a thing, but it's definitely doable. You may need to fiddle around a bit to get it working properly, though.
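To make the frequency-domain idea a bit more concrete, here is a minimal sketch using a plain DFT on one block of samples. It is not a tuned implementation: it costs on the order of N times M operations, so a real program would use an FFT routine instead, and the name resample_via_dft is just a placeholder. It keeps only the frequency bins strictly below the new Nyquist limit and evaluates the inverse transform on the coarser time grid.

```python
import cmath
import math

def resample_via_dft(x, out_len):
    """Resample one block x to out_len points (out_len < len(x)) by truncating its spectrum."""
    n = len(x)
    k_max = (out_len - 1) // 2  # highest bin strictly below the new Nyquist frequency

    # Forward DFT, computed only for the bins that survive the truncation.
    spectrum = {}
    for k in range(-k_max, k_max + 1):
        spectrum[k] = sum(
            x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )

    # Inverse DFT evaluated at the out_len new sample instants.
    out = []
    for m in range(out_len):
        value = sum(
            spectrum[k] * cmath.exp(2j * math.pi * k * m / out_len)
            for k in range(-k_max, k_max + 1)
        )
        out.append(value.real / n)
    return out
```

For example, a 441-sample block resampled to 80 points corresponds to going from 44100 Hz to 8000 Hz. Everything above the new Nyquist frequency is discarded before the rate change, which is what suppresses the aliasing that plain sample-dropping produces.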
Also, if you transform the array in segments rather than taking the FT of the complete array, you get problems at the segment boundaries, where the signal is effectively treated as 0. When I played with those things a few years ago I didn't come up with a viable solution to this (it generates artifacts as well), but there probably is one if you read the right books :-)
As for WMP complaining about the file: you did update the header you write to match the new sample rate and data size, right?
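As a rough sketch of what updating the header could look like, assuming the file starts with the canonical 44-byte PCM header (RIFF, WAVE, a 16-byte fmt chunk, then data, with no extra chunks in between), the fields that depend on the sample rate and the data length can be patched like this. The function name patch_wav_header is hypothetical, and files with additional chunks would need real chunk parsing instead.

```python
import struct

def patch_wav_header(header, new_rate, data_len):
    """Return a copy of a canonical 44-byte PCM WAV header with rate and sizes updated."""
    if (len(header) != 44 or header[0:4] != b"RIFF"
            or header[8:12] != b"WAVE" or header[36:40] != b"data"):
        raise ValueError("not a canonical 44-byte PCM WAV header")

    channels, = struct.unpack_from("<H", header, 22)
    bits_per_sample, = struct.unpack_from("<H", header, 34)
    block_align = channels * bits_per_sample // 8  # bytes per sample frame

    patched = bytearray(header)
    struct.pack_into("<I", patched, 4, 36 + data_len)            # RIFF chunk size = file size - 8
    struct.pack_into("<I", patched, 24, new_rate)                # sample rate
    struct.pack_into("<I", patched, 28, new_rate * block_align)  # byte rate
    struct.pack_into("<H", patched, 32, block_align)             # block align
    struct.pack_into("<I", patched, 40, data_len)                # data chunk size in bytes
    return bytes(patched)
```

If the sample rate, byte rate or the two size fields still describe the original 44.1 kHz data, a player can reject the file or play it at the wrong speed, so these are worth checking first.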