What I'm doing
I'm using the getUserMedia API to record audio in the browser and send it to a WebSocket server. To test the recordings, I use Soundflower on a Mac as the input device, so I can play a WAV file instead of speaking into a microphone.
Client side (JavaScript)
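// Normalize vendor-prefixed APIs.
window.AudioContext = window.AudioContext || window.webkitAudioContext;
navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia;

var audioContext = new AudioContext();
// WEBSOCKET_URL and WEBSOCKET_PORT are assumed to be defined elsewhere.
var wsClient = new WebSocket("ws://" + WEBSOCKET_URL + ":" + WEBSOCKET_PORT);

navigator.getUserMedia({audio: true}, function (stream) {
    var input = audioContext.createMediaStreamSource(stream);
    // 4096-sample buffer; channel counts are left at their defaults.
    var recordNode = audioContext.createScriptProcessor(4096);
    recordNode.onaudioprocess = recorderProcess;
    input.connect(recordNode);
    // Some browsers only fire onaudioprocess when the node reaches a destination.
    recordNode.connect(audioContext.destination);
}, function (e) {
    console.error("No live audio input: " + e);
});

function recorderProcess(e) {
    // Float32Array with the samples of the first channel.
    var buffer = e.inputBuffer.getChannelData(0);
    // send() throws while the socket is still connecting, so skip chunks until it is open.
    if (wsClient.readyState === WebSocket.OPEN) {
        wsClient.send(buffer);
    }
}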
Server side (Python)
On the server side, I simply append the chunks to a file:
def onMessage(self, msg, binary):
    # Append every binary message (one audio chunk) to a raw file.
    if binary:
        with open("/tmp/test.raw", "ab") as f:
            f.write(msg)
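For reference, this is roughly how the raw file can be turned into something an audio editor opens. It is only a sketch under assumptions: the chunks are taken to be mono 32-bit little-endian floats (what a Float32Array produces on common hardware), and the sample rate has to match audioContext.sampleRate on the client; 44100 is just a placeholder default. The function name raw_float32_to_wav is made up for this example.

```python
import struct
import wave


def raw_float32_to_wav(raw_bytes, wav_path, sample_rate=44100):
    """Convert mono little-endian float32 samples to a 16-bit PCM WAV file.

    sample_rate is an assumption; it must match the client's AudioContext.
    Returns the number of samples written.
    """
    # Drop trailing bytes that do not form a complete 4-byte sample.
    usable = len(raw_bytes) - (len(raw_bytes) % 4)
    count = usable // 4
    samples = struct.unpack("<%df" % count, raw_bytes[:usable])
    # Clip to [-1.0, 1.0] and scale to the signed 16-bit range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(wav_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(struct.pack("<%dh" % count, *pcm))
    return count
```

Usage would be something like raw_float32_to_wav(open("/tmp/test.raw", "rb").read(), "/tmp/test.wav", 48000), with the last argument replaced by the real sample rate.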
The Problem
The problem I'm having is that the audio seems to get pre-processed by the browser, so the end result differs in quality from the original audio. The result also depends on the browser.
Here is an example:
The picture shows three waveforms: the original audio, the result of recording in Chrome, and the result of recording in Firefox. As you can see, the waveforms differ, especially in Chrome, where low-amplitude passages often get converted to zero.
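To put a number on the "converted to zero" observation, something like the following could be run on each recording. It is a rough sketch with the same assumptions as the raw file format above (mono, 32-bit little-endian floats); zero_fraction and its threshold parameter are names invented for this example.

```python
import struct


def zero_fraction(raw_bytes, threshold=0.0):
    """Return the share of float32 samples whose magnitude is <= threshold."""
    # Ignore trailing bytes that do not form a complete 4-byte sample.
    usable = len(raw_bytes) - (len(raw_bytes) % 4)
    count = usable // 4
    if count == 0:
        return 0.0
    samples = struct.unpack("<%df" % count, raw_bytes[:usable])
    quiet = sum(1 for s in samples if abs(s) <= threshold)
    return quiet / count
```

With threshold=0.0 this counts only samples that are exactly zero; comparing the value for the Chrome and Firefox recordings against the original would show how much of the quiet signal was flattened.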
An even bigger difference is visible in the spectrogram:
So both browsers seem to cut off higher frequencies, with Firefox being clearly more extreme.
All of this might not be a big deal, since the audio files all sound very similar to my ears. But I'm processing and analyzing the audio on the server side, and the pre-processing by the browsers is giving me worse end results.
Question
So what is going on? Do these browsers apply an extra pre-processing step to the audio? What kind of filters do they presumably apply? Can I avoid this somehow within the getUserMedia API? And is there a solution that gives consistently good audio quality in both Chrome and Firefox?
Disclaimer
I'm not an audio expert, so I can only analyze the results in a very amateur way, but I hope the graphics speak for themselves.