Noise reduction and compression in streaming audio

Posted 2019-04-09 19:27

Question:

Hope you can help. I am recording audio from a microphone and streaming it live across a network. The samples are 11025 Hz, 8-bit, mono. Although there is a small delay (1 second), it works great. I am now trying to implement noise reduction and compression, to make the audio less noisy and use less bandwidth. The audio samples are stored in a C# byte[] array, which I send and receive using Socket.

Could anyone suggest how, in C#, to implement compression and noise reduction? I do not mind using a third party library as long as it is free (LGPL license, etc) and can be utilized from C#. However, I would prefer actual working source code examples. Thanks in advance for any suggestion you have.

UPDATE:

I changed the bit depth from 8-bit to 16-bit audio and the noise problem is fixed. Apparently 8-bit audio from the mic had too low a signal-to-noise ratio. Voice sounds great at 11 kHz, 16-bit mono.

The requirements of this project have changed since I posted this, however. We are now trying to add video as well. I have a callback set up that receives live images from a webcam every 100 ms. I need to encode the audio and video, mux them, and transmit them over my socket to the server. The server re-transmits the stream to the other client, which demuxes it, decodes the audio and video, displays the video in a picture box, and outputs the audio to the speaker.

I am looking at ffmpeg to help out with the (de|en)coding/[de]muxing, and I am also looking at SharpFFmpeg as a C# interop library to ffmpeg.

I cannot find any good examples of doing this. I have scoured the Internet all week, with no real luck. Any help you can provide is much appreciated!

Here's some code, including my callback function for the mic recording:

        private const int AUDIO_FREQ = 11025;
        private const int CHANNELS = 1;
        private const int BITS = 16;
        private const int BYTES_PER_SEC = AUDIO_FREQ * CHANNELS * (BITS / 8);
        private const int BLOCKS_PER_SEC = 40;
        private const int BUFFER_SECS = 1;
        private const int BUF_SIZE = ((int)(BYTES_PER_SEC / BLOCKS_PER_SEC * BUFFER_SECS / 2)) * 2; // rounded DOWN to an even number so a buffer holds whole 16-bit samples

        private WaveLib.WaveOutPlayer m_Player;
        private WaveLib.WaveInRecorder m_Recorder;
        private WaveLib.FifoStream m_Fifo;

        WebCam MyWebCam;

        public void OnPickupHeadset()
        {
            stopRingTone();
            m_Fifo = new WaveLib.FifoStream();

            WaveLib.WaveFormat fmt = new WaveLib.WaveFormat(AUDIO_FREQ, BITS, CHANNELS);
            m_Player = new WaveLib.WaveOutPlayer(-1, fmt, BUF_SIZE, BLOCKS_PER_SEC,
                            new WaveLib.BufferFillEventHandler(PlayerCB));
            m_Recorder = new WaveLib.WaveInRecorder(-1, fmt, BUF_SIZE, BLOCKS_PER_SEC,
                            new WaveLib.BufferDoneEventHandler(RecorderCB));

            MyWebCam = null;
            try
            {
                MyWebCam = new WebCam();                
                MyWebCam.InitializeWebCam(ref pbMyPhoto, pbPhoto.Width, pbPhoto.Height);
                MyWebCam.Start();
            }
            catch { }

        }

        private byte[] m_PlayBuffer;
        private void PlayerCB(IntPtr data, int size)
        {
            try
            {
                if (m_PlayBuffer == null || m_PlayBuffer.Length != size)
                    m_PlayBuffer = new byte[size];

                if (m_Fifo.Length >= size)
                {
                    m_Fifo.Read(m_PlayBuffer, 0, size);
                }
                else
                {
                    // Read what we can 
                    int fifoLength = (int)m_Fifo.Length;
                    m_Fifo.Read(m_PlayBuffer, 0, fifoLength);

                    // Zero out rest of buffer
                    for (int i = fifoLength; i < m_PlayBuffer.Length; i++)
                        m_PlayBuffer[i] = 0;                        
                }

                // Return the play buffer
                Marshal.Copy(m_PlayBuffer, 0, data, size);
            }
            catch { }
        }


        private byte[] m_RecBuffer;
        private void RecorderCB(IntPtr data, int size)
        {
            try
            {
                if (m_RecBuffer == null || m_RecBuffer.Length != size)
                    m_RecBuffer = new byte[size];
                Marshal.Copy(data, m_RecBuffer, 0, size);

                // HERE'S WHERE I WOULD ENCODE THE AUDIO IF I KNEW HOW

                // Send data to server
                if (theForm.CallClient != null)
                {
                    SocketAsyncEventArgs args = new SocketAsyncEventArgs();
                    args.SetBuffer(m_RecBuffer, 0, m_RecBuffer.Length);
                    theForm.CallClient.SendAsync(args);
                }
            }
            catch { }
        }

        //Called from network stack when data received from server (other client)
        public void PlayBuffer(byte[] buffer, int length)
        {
            try
            {
                //HERE'S WHERE I WOULD DECODE THE AUDIO IF I KNEW HOW

                m_Fifo.Write(buffer, 0, length); 
            }
            catch { }
        }

So where should I go from here?

Answer 1:

Your goals here are somewhat mutually exclusive. Your 11025 Hz/8-bit/mono WAV files sound noisy (with a tremendous amount of "hiss") because of their low sample rate and bit resolution; for comparison, 44100 Hz/16-bit/stereo is the standard for CD-quality audio.
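As a rough illustration of how much bit resolution matters, the textbook estimate for an ideal N-bit uniform quantizer driven by a full-scale sine wave can be written out. This is an upper bound under those idealized assumptions, not a measurement of the asker's microphone; real speech rarely uses the full range, so the practical figure is lower.

```latex
\mathrm{SNR}_{\max} \approx 6.02\,N + 1.76\ \text{dB}
\qquad
N = 8:\ \approx 49.9\ \text{dB},
\qquad
N = 16:\ \approx 98.1\ \text{dB}
```

Each extra bit buys about 6 dB, which is consistent with the update in the question: moving from 8-bit to 16-bit removed the audible hiss even though the sample rate stayed at 11025 Hz.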

If you continue recording and streaming at that rate, you are going to have noisy audio - period. The only way to eliminate (or actually just attenuate) this noise would be to up-sample the audio to 44100Hz/16bit and then perform a noise reduction algorithm of some sort on it. This upsampling would have to be performed by the client application, since doing it on the server before streaming means you'd then be streaming audio 8X larger than your original (doing it on the server would also be utterly pointless, since you'd be better off just recording in the denser format in the first place).
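The size comparison above is plain arithmetic on the PCM parameters. The sketch below works it out; the class and method names are made up for illustration, and the "8X" figure assumes the upsampled audio stays mono (CD-style stereo would double it again).

```csharp
using System;

// Hypothetical helper, only for illustrating the data-rate arithmetic.
static class AudioBandwidth
{
    // Uncompressed PCM data rate in bytes per second.
    public static int BytesPerSecond(int sampleRateHz, int bitsPerSample, int channels)
    {
        return sampleRateHz * (bitsPerSample / 8) * channels;
    }

    static void Main()
    {
        int original = BytesPerSecond(11025, 8, 1);   // format in the original question
        int updated  = BytesPerSecond(11025, 16, 1);  // format after the question's update
        int cdMono   = BytesPerSecond(44100, 16, 1);  // upsampled, still mono
        int cdStereo = BytesPerSecond(44100, 16, 2);  // full CD-quality layout

        Console.WriteLine("11025 Hz /  8-bit / mono:   {0} B/s", original);
        Console.WriteLine("11025 Hz / 16-bit / mono:   {0} B/s", updated);
        Console.WriteLine("44100 Hz / 16-bit / mono:   {0} B/s ({1}x)", cdMono, cdMono / original);
        Console.WriteLine("44100 Hz / 16-bit / stereo: {0} B/s ({1}x)", cdStereo, cdStereo / original);
    }
}
```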

What you want to do is to record your original audio in a CD-quality format and then compress it to a standard format like MP3 or Ogg Vorbis. See this earlier question:

What's the best audio compression library for .NET?
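If pulling in an MP3 or Vorbis library is more than you need, one much simpler lossy option is G.711-style mu-law companding, which maps each 16-bit sample to 8 bits and so halves the bandwidth. This is a different technique from the formats recommended above and compresses far less than they do. The sketch below is a minimal version under the assumption that the buffers hold little-endian 16-bit mono PCM, as in the question's code; the `MuLawCodec` class and its method names are invented for this example.

```csharp
using System;

// Hypothetical mu-law codec sketch: 16-bit PCM in, 8-bit mu-law out.
static class MuLawCodec
{
    private const int Bias = 0x84;   // 132, standard mu-law bias
    private const int Clip = 32635;  // largest magnitude before adding the bias

    public static byte EncodeSample(short pcm)
    {
        int sample = pcm;
        int sign = (sample >> 8) & 0x80;          // 0x80 for negative samples
        if (sign != 0) sample = -sample;
        if (sample > Clip) sample = Clip;
        sample += Bias;

        // Find the segment (exponent): position of the highest set bit.
        int exponent = 7;
        int mask = 0x4000;
        while (exponent > 0 && (sample & mask) == 0) { exponent--; mask >>= 1; }

        int mantissa = (sample >> (exponent + 3)) & 0x0F;
        return (byte)(~(sign | (exponent << 4) | mantissa) & 0xFF);
    }

    public static short DecodeSample(byte mulaw)
    {
        int u = ~mulaw & 0xFF;
        int sign = u & 0x80;
        int exponent = (u >> 4) & 0x07;
        int mantissa = u & 0x0F;
        int sample = (((mantissa << 3) + Bias) << exponent) - Bias;
        return (short)(sign != 0 ? -sample : sample);
    }

    // Encodes the first 'count' bytes of little-endian 16-bit PCM.
    public static byte[] Encode(byte[] pcm16, int count)
    {
        var encoded = new byte[count / 2];
        for (int i = 0; i < encoded.Length; i++)
        {
            short sample = (short)(pcm16[2 * i] | (pcm16[2 * i + 1] << 8));
            encoded[i] = EncodeSample(sample);
        }
        return encoded;
    }

    // Decodes 'count' mu-law bytes back to little-endian 16-bit PCM.
    public static byte[] Decode(byte[] mulaw, int count)
    {
        var pcm16 = new byte[count * 2];
        for (int i = 0; i < count; i++)
        {
            short sample = DecodeSample(mulaw[i]);
            pcm16[2 * i] = (byte)(sample & 0xFF);
            pcm16[2 * i + 1] = (byte)((sample >> 8) & 0xFF);
        }
        return pcm16;
    }

    static void Main()
    {
        foreach (short s in new short[] { 0, 100, -100, 1000, -20000 })
            Console.WriteLine("{0} -> {1}", s, DecodeSample(EncodeSample(s)));
    }
}
```

In the question's code, `Encode(m_RecBuffer, size)` would go at the "encode" comment in RecorderCB and `Decode(buffer, length)` at the "decode" comment in PlayBuffer, with the decoded array written to the FIFO instead of the raw buffer.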

Update: I haven't used this, but:

http://www.ohloh.net/p/OggVorbisDecoder

I think you need an encoder, but I couldn't find one for Ogg Vorbis. I think you could try encoding to the WMV format, as well:

http://www.discussweb.com/c-programming/1728-encoding-wmv-file-c-net.html

Update 2: Sorry, my knowledge level of streaming is pretty low. If I were doing something like what you're doing, I would create an (uncompressed) AVI file from the audio and the still images (using avifil32.dll methods via PInvoke) first, then compress it to MPEG (or whatever format is standard - YouTube has a page where they talk about their preferred formats, and it's probably good to use one of these).

I'm not sure if this will do what you need, but this link:

http://csharpmagics.blogspot.com/

using this free player:

http://www.videolan.org/

might work.



Answer 2:

If you only want to compress the data to limit bandwidth usage, you can try using a GZipStream.
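A minimal sketch of that idea, compressing one audio block into a new byte[] before sending and expanding it on receipt. The `GzipBlocks` class and its method names are hypothetical; only `GZipStream`, `MemoryStream`, and `CompressionMode` come from the framework. Keep in mind that general-purpose lossless compression usually gains little on raw PCM audio, and that the gzip header and trailer add at least 18 bytes per block, which is significant for buffers as small as the ones in the question.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: gzip one audio block at a time.
static class GzipBlocks
{
    // Compresses the first 'count' bytes of 'data'.
    public static byte[] Compress(byte[] data, int count)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before reading the result,
            // otherwise the final bytes are not flushed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(data, 0, count);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }

    static void Main()
    {
        // Synthetic 16-bit test tone standing in for one recorded block.
        var block = new byte[550];
        for (int i = 0; i < block.Length / 2; i++)
        {
            short s = (short)(8000 * Math.Sin(2 * Math.PI * 440 * i / 11025.0));
            block[2 * i] = (byte)(s & 0xFF);
            block[2 * i + 1] = (byte)((s >> 8) & 0xFF);
        }

        byte[] packed = Compress(block, block.Length);
        byte[] unpacked = Decompress(packed);
        Console.WriteLine("raw {0} bytes, gzip {1} bytes, restored {2} bytes",
            block.Length, packed.Length, unpacked.Length);
    }
}
```

Because the compressed size varies from block to block, the receiver needs some way to find block boundaries on the socket, for example a length prefix sent before each compressed block.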