UDP packet-size/latency tradeoff when streaming au

Posted 2019-04-16 00:51

I'm building an application that streams live audio over UDP across the internet, and I want to minimize latency. The audio is sent as it's generated: it takes one second to generate one second of audio, so it cannot be sent faster than the audio rate.

My initial idea was to send small packets of compressed audio so the client can begin playback as soon as possible. Using the Opus codec I should be able to send packets as small as 5 ms of audio (2.5 ms is the minimum), which would mean the user could start playback pretty soon, let's say after 2 such packets have been delivered.

However, there is a lot of bandwidth overhead when using such a small packet size. Let's say each 5 ms packet of audio is 35 bytes; the IP and UDP headers add a total of 28 bytes, which is a lot of extra data.
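As a rough check of those numbers, here is a minimal sketch of the overhead arithmetic. It assumes IPv4 without options (20 bytes) plus the UDP header (8 bytes), and it assumes the compressed payload grows linearly with frame duration, which real Opus output only approximates. The function names and the list of durations are illustrative.

```python
IP_UDP_HEADER_BYTES = 20 + 8  # IPv4 header without options + UDP header


def overhead_fraction(payload_bytes, header_bytes=IP_UDP_HEADER_BYTES):
    """Fraction of each packet on the wire that is header rather than audio."""
    return header_bytes / (header_bytes + payload_bytes)


def payload_for(frame_ms, bytes_per_5ms=35):
    """Assumed payload size: scale the 35-byte / 5 ms example linearly."""
    return bytes_per_5ms * frame_ms / 5


if __name__ == "__main__":
    # Frame durations that Opus supports, from 5 ms upward.
    for frame_ms in (5, 10, 20, 40, 60):
        payload = payload_for(frame_ms)
        print(f"{frame_ms:>3} ms  payload {payload:6.0f} B  "
              f"header overhead {overhead_fraction(payload):.1%}")
```

With the 35-byte example, 28 of every 63 bytes on the wire are header, roughly 44%. The share falls as the frame duration grows, at the cost of waiting longer before each packet can be sent.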

My question is: is there any way to send live audio with larger packet sizes but with this low latency? For example, is it possible to begin sending data (partial UDP packets) while my application is still generating it, or must it wait until the entire packet's payload has been produced? (The length in bytes would be known in advance.)

If so, I could use larger packets but start streaming the data even sooner.

Or is network jitter likely to be so large that I would have to buffer much more than 5ms anyway?

2 Answers
别忘想泡老子
#2 · 2019-04-16 01:10

I would recommend you use up to 534 bytes. That's the limit if you want to avoid fragmentation, and with it the data loss that fragmentation can cause.
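For comparison, the usual header arithmetic behind limits of this kind can be sketched as follows. The MTU values are assumptions about the path: 576 bytes is the datagram size every IPv4 host must be prepared to accept (RFC 791), and 1500 bytes is the typical Ethernet MTU. The helper name is illustrative.

```python
UDP_HEADER = 8
IPV4_HEADER_MIN = 20   # IPv4 header with no options
IPV4_HEADER_MAX = 60   # IPv4 header with the maximum amount of options


def max_udp_payload(mtu, ip_header=IPV4_HEADER_MIN):
    """Largest UDP payload that fits in one unfragmented IPv4 packet."""
    return mtu - ip_header - UDP_HEADER


if __name__ == "__main__":
    print(max_udp_payload(576))                    # 548
    print(max_udp_payload(576, IPV4_HEADER_MAX))   # 508
    print(max_udp_payload(1500))                   # 1472
```

Whether a given path really carries 1500-byte packets depends on the tunnels and links along the way, so the smaller figures are the conservative choice.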

一夜七次
#3 · 2019-04-16 01:12

You will most definitely be buffering more than 5 ms. 5 ms is an extremely low buffer, even for the playback sound card itself. Only sound devices with special drivers (such as ASIO) are able to get that low, and that is about as low as they go. Are you sending those packets over your own LAN, where you can control and prioritize delivery? That is the only way to really guarantee performance. There are layer 2 protocols built specifically for this, such as EtherSound. It depends on what you are building and what your requirements are.
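The kind of receive-side buffering described above can be sketched as a small jitter buffer. This is only an illustration under stated assumptions: each packet carries a sequence number added by the sender, the class and parameter names are hypothetical, and sequence-number wraparound and adaptive sizing are left out.

```python
class JitterBuffer:
    """Reorders packets by sequence number and delays the start of playback."""

    def __init__(self, prefill_packets):
        self.prefill = prefill_packets  # packets to collect before playback starts
        self.pending = {}               # seq -> payload, waiting to be played
        self.next_seq = None            # next sequence number to play
        self.started = False

    def push(self, seq, payload):
        """Store an arriving packet; drop it if its playback slot has passed."""
        if self.started and seq < self.next_seq:
            return
        self.pending[seq] = payload
        if not self.started and len(self.pending) >= self.prefill:
            self.started = True
            self.next_seq = min(self.pending)

    def pop(self):
        """Called once per frame period by the playback side.

        Returns the next payload, or None if playback has not started yet or
        the packet is missing (the caller would conceal the gap).
        """
        if not self.started:
            return None
        payload = self.pending.pop(self.next_seq, None)
        self.next_seq += 1
        return payload
```

The added latency is roughly `prefill_packets` times the frame duration, so with 5 ms frames a prefill of 4 packets already means about 20 ms before the first sample plays.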

A common buffer size for network software is around 1400-1500 bytes, which is near the maximum that you can send per packet over a typical Ethernet network. This is what I recommend for your application.
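To see what a packet of that size means for latency in the question's scenario, here is a minimal sketch. It assumes the question's example rate of 35 bytes per 5 ms of audio stays constant, which real Opus output only approximates, and the function name is illustrative. Because the audio is produced in real time, a packet cannot leave until all of its audio exists, so payload size translates directly into delay.

```python
def packetization_delay_ms(payload_bytes, bytes_per_ms):
    """Time the sender waits to accumulate one packet's worth of audio."""
    return payload_bytes / bytes_per_ms


if __name__ == "__main__":
    rate = 35 / 5  # bytes per millisecond, from the 35-byte / 5 ms example
    for size in (35, 140, 534, 1400):
        delay = packetization_delay_ms(size, rate)
        print(f"{size:>5} B payload -> {delay:6.1f} ms of audio per packet")
```

At that rate a 1400-byte payload holds 200 ms of audio, so the sender-side wait alone is 200 ms before network delay and receive buffering are counted.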
