Im running a little app that encodes and decodes an audio array with the speex codec in javascript: https://github.com/dbieber/audiorecorder
with a small array filled with a sin waveform
for(var i=0;i<16384;i++)
data.push(Math.sin(i/10));
this works. But I want to build a VOIP application and have more than one array. So if I split my array up in 2 parts encode>decode>merge, it doesn't sound the same as before.
Take a look at this:
fiddle: http://jsfiddle.net/exh63zqL/
Both buttons should give the same audio output.
How can i get the same output in both ways ? Is their a special mode in speex.js for split audio data?
Speex is a lossy codec, so the output is only an approximation of your initial sine wave.
Your sine frequency is about 7 KHz, which is near the upper codec 8KHz bandwith and as such even more likely to be altered.
What the codec outputs looks like a comb of dirach pulses that will sound like your initial sinusoid as heard through a phone, which is certainly different from the original.
See this fiddle where you can listen to what the codec makes of your original sine waves, be them split in half or not.
//Generate a continus sinus in 2 arrays
var len = 16384;
var buffer1 = [];
var buffer2 = [];
var buffer = [];
for(var i=0;i<len;i++){
buffer.push(Math.sin(i/10));
if(i < len/2)
buffer1.push(Math.sin(i/10));
else
buffer2.push(Math.sin(i/10));
}
//Encode and decode both arrays seperatly
var en = Codec.encode(buffer1);
var dec1 = Codec.decode(en);
var en = Codec.encode(buffer2);
var dec2 = Codec.decode(en);
//Merge the arrays to 1 output array
var merge = [];
for(var i in dec1)
merge.push(dec1[i]);
for(var i in dec2)
merge.push(dec2[i]);
//encode and decode the whole array
var en = Codec.encode(buffer);
var dec = Codec.decode(en);
//-----------------
//Down under is only for playing the 2 different arrays
//-------------------
var audioCtx = new window.AudioContext || new window.webkitAudioContext;
function play (sound)
{
var audioBuffer = audioCtx.createBuffer(1, sound.length, 44100);
var bufferData = audioBuffer.getChannelData(0);
bufferData.set(sound);
var source = audioCtx.createBufferSource();
source.buffer = audioBuffer;
source.connect(audioCtx.destination);
source.start();
}
$("#o").click(function() { play(dec); });
$("#c1").click(function() { play(dec1); });
$("#c2").click(function() { play(dec2); });
$("#m").click(function() { play(merge); });
If you merge the two half signal decoder outputs, you will hear an additional click due to the abrupt transition from one signal to the other, sounding basically like a relay commutation.
To avoid that you would have to smooth the values around the merging point of your two buffers.
Note that Speex is a lossy codec. So, by definition, it can't give same result as the encoded buffer. Besides, it designed to be a codec for voice. So the 1-2 kHz range will be the most efficient as it expects a specific form of signal. In some way, it can be compared to JPEG technology for raster images.
I've modified slightly your jsfiddle example so you can play with different parameters and compare results. Just providing a simple sinusoid with an unknown frequency is not a proper way to check a codec. However, in the example you can see different impact on the initial signal at different frequency.
buffer1.push(Math.sin(2*Math.PI*i*frequency/sampleRate));
I think you should build an example with a recorded voice and compare results in this case. It would be more proper.
In general to get the idea in detail you would have to examine digital signal processing. I can't even provide a proper link since it is a whole science and it is mathematically intensive. (the only proper book for reading I know is in Russian). If anyone here with strong mathematics background can share proper literature for this case I would appreciate.
EDIT: as mentioned by Kuroi Neko, there is a trouble with the boundaries of the buffer. And seems like it is impossible to save decoder state as mentioned in this post, because the library in use doesn't support it. If you look at the source code you see that they use a third party speex codec and do not provide full access to it's features. I think the best approach would be to find a decent library for speex that supports state recovery similar to this