How to downsample audio recorded from the mic in real time

I am using the following JavaScript to record audio and send it to a websocket server:

const recordAudio = () =>
    new Promise(async resolve => {

        const constraints = {
            audio: {
                sampleSize: 16,
                channelCount: 1,
                sampleRate: 8000
            },
            video: false
        };

        var mediaRecorder;
        const stream = await navigator.mediaDevices.getUserMedia(constraints);

        var options = {
            audioBitsPerSecond: 128000,
            mimeType: 'audio/webm;codecs=pcm'
        };
        mediaRecorder = new MediaRecorder(stream, options);
        var track = stream.getAudioTracks()[0];
        var constraints2 = track.getConstraints();
        var settings = track.getSettings();


        const audioChunks = [];

        mediaRecorder.addEventListener("dataavailable", event => {
            audioChunks.push(event.data);
            webSocket.send(event.data);
        });

        const start = () => mediaRecorder.start(30);

        const stop = () =>
            new Promise(resolve => {
                mediaRecorder.addEventListener("stop", () => {
                    const audioBlob = new Blob(audioChunks);
                    const audioUrl = URL.createObjectURL(audioBlob);
                    const audio = new Audio(audioUrl);
                    const play = () => audio.play();
                    resolve({
                        audioBlob,
                        audioUrl,
                        play
                    });
                });

                mediaRecorder.stop();
            });

        resolve({
            start,
            stop
        });
    });

This is for real-time STT, and the websocket server refused to send any response. By debugging, I confirmed that the sampleRate is not changing to 8 kHz. Upon researching, I found that this is a known bug in both Chrome and Firefox. I found some other resources, like stackoverflow1 and IBM_STT, but I have no idea how to adapt them to my code. Those resources refer to a buffer, but all I have in my code is a MediaStream (stream) and event.data (a Blob). I am new to both JavaScript and the Audio API, so please pardon me if I did something wrong.

If this helps, I have equivalent Python code that sends data from the mic to the websocket server, and it works. Library used: PyAudio. Code:

import pyaudio

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16,  # 16-bit signed PCM (the constant, not a string)
                channels=1,
                rate=8000,
                input=True,
                frames_per_buffer=10)

print("* recording, please speak")

packet_size = int((30 / 1000) * 8000)  # 30 ms at 8 kHz: 240 samples, i.e. 480 bytes

frames = []

# while True:
for i in range(0, 1000):
    packet = stream.read(packet_size)
    ws.send(packet, binary=True)  # ws is the websocket connection, opened elsewhere

Tags: javascript audio-recording sample-rate

1 Answer

骚年 ilove

#2 · 2020-07-30 01:59

To downsample in real time, follow these steps:

First, get the stream instance:

const stream = await navigator.mediaDevices.getUserMedia(constraints);

Create an AudioContext and a media stream source from this stream.

var audioContext = new AudioContext();
var input = audioContext.createMediaStreamSource(stream);

Create a script processor so that you can work with the buffers. The one below takes 4096 samples from the stream at a time, continuously, and has 1 input channel and 1 output channel.
```
var scriptNode = audioContext.createScriptProcessor(4096, 1, 1);
```
Connect your input to scriptNode. You can connect scriptNode to the destination as your requirements dictate.
```
input.connect(scriptNode);
scriptNode.connect(audioContext.destination);
```

The script processor has an onaudioprocess handler where you can do whatever you want with each block of 4096 samples. The variable downsampled will contain 1/(sampling ratio) as many samples as the input. floatTo16BitPCM converts that to your required format, since the original data is in 32-bit float format.

scriptNode.onaudioprocess = function (audioProcessingEvent) {
    var inputBuffer = audioProcessingEvent.inputBuffer;
    // The output buffer contains the samples that will be modified and played
    var outputBuffer = audioProcessingEvent.outputBuffer;

    // Loop through the output channels (in this case there is only one)
    for (var channel = 0; channel < outputBuffer.numberOfChannels; channel++) {
        var inputData = inputBuffer.getChannelData(channel);
        var outputData = outputBuffer.getChannelData(channel);

        // downsample and floatTo16BitPCM are helper functions you supply
        var downsampled = downsample(inputData);
        var sixteenBitBuffer = floatTo16BitPCM(downsampled);
    }
};

Your sixteenBitBuffer will contain the data you require.

The downsample and floatTo16BitPCM functions are explained in this link from the Watson API: IBM Watson Speech to Text Api
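
For orientation, here is a minimal sketch of what two such helpers could look like. It is not the Watson SDK's code: the bodies are my own illustration, and this downsample takes the input and output rates as explicit parameters (an assumption; the call in the snippet above passes only the samples). It averages each group of input samples that maps onto one output sample, which is a crude low-pass filter, so a proper resampler may give better quality.

```javascript
// Hypothetical helpers: the names match the calls in the snippet above,
// but the bodies are an illustrative sketch, not the Watson SDK code.

// Reduce the sample rate by averaging each group of input samples that
// maps onto one output sample (a crude low-pass filter plus decimation).
function downsample(input, inputRate, outputRate) {
    if (outputRate >= inputRate) {
        return Float32Array.from(input); // nothing to reduce, return a copy
    }
    var ratio = inputRate / outputRate;
    var outputLength = Math.floor(input.length / ratio);
    var output = new Float32Array(outputLength);
    for (var i = 0; i < outputLength; i++) {
        var start = Math.floor(i * ratio);
        var end = Math.min(Math.floor((i + 1) * ratio), input.length);
        var sum = 0;
        for (var j = start; j < end; j++) {
            sum += input[j];
        }
        output[i] = sum / (end - start);
    }
    return output;
}

// Convert float samples in [-1, 1] to signed 16-bit PCM.
function floatTo16BitPCM(input) {
    var output = new Int16Array(input.length);
    for (var i = 0; i < input.length; i++) {
        var s = Math.max(-1, Math.min(1, input[i])); // clamp out-of-range values
        output[i] = s < 0 ? s * 0x8000 : s * 0x7FFF;
    }
    return output;
}
```

With these signatures, the call inside onaudioprocess would become something like downsample(inputData, audioContext.sampleRate, 8000), since audioContext.sampleRate reports the rate the browser is actually capturing at.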

You won't need a MediaRecorder instance. The Watson API is open source, and you can look at how they implemented it for their use case to find a more streamlined approach. You should be able to salvage the important functions from their code.
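
Since MediaRecorder is gone, the 16-bit samples have to be sent over the websocket by hand. The sketch below is one possible way to do that, assuming the server expects the same 30 ms packets as the Python example in the question (240 samples, 480 bytes at 8 kHz). createPacketizer is a hypothetical helper name, not part of any library, and whether the server really requires fixed-size packets is an assumption.

```javascript
// Hypothetical helper: collects Int16 samples and emits fixed-size packets.
// A packetSize of 240 samples is 30 ms at 8000 Hz, as in the Python example.
function createPacketizer(packetSize, onPacket) {
    var pending = new Int16Array(0);
    return function push(samples) {
        // Append the new samples to whatever was left over from the last call.
        var merged = new Int16Array(pending.length + samples.length);
        merged.set(pending, 0);
        merged.set(samples, pending.length);
        var offset = 0;
        while (merged.length - offset >= packetSize) {
            // slice() copies, so each packet owns its own buffer.
            onPacket(merged.slice(offset, offset + packetSize));
            offset += packetSize;
        }
        // Keep the incomplete tail for the next call.
        pending = merged.slice(offset);
    };
}

// Usage sketch (webSocket is the socket from the question):
// var push = createPacketizer(240, function (packet) {
//     webSocket.send(packet.buffer); // binary frame of 480 bytes
// });
// ...then, inside onaudioprocess:
// push(sixteenBitBuffer);
```

Buffering is needed because 4096 input samples rarely downsample to an exact multiple of 240, so the leftover samples are carried over to the next onaudioprocess call.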
