How to fragment H264 Packets in RTP compliant with

Posted 2019-02-03 13:06

I have FFmpeg streaming baseline-profile H.264 video, which I have to encapsulate in RTP and send to SIP phones for decoding. I am using Linphone (with the H.264 plugin for Windows) and Mirial for decoding. However, sometimes I get a huge frame (3 KB to 9 KB) from FFmpeg, which obviously doesn't fit in the MTU.

If I send these frames as is and rely on IP fragmentation, some phones play the stream well enough, but others choke and can't decode it. I think this is because the stream does not comply with RFC 3984, which specifies that packets that don't fit in the MTU have to be split across different NALUs, with the end of a frame marked by the RTP marker bit.

How do I know where I can "cut" an I or P frame? I noticed that fragmented H.264 packets (the ones without the marker bit set) sometimes end in 0xF8, but I couldn't find a pattern, and RFC 3984, which describes how to send these packets over RTP, doesn't seem to specify how to do it.

UPDATE: Does anyone know how to tell the x264 library to generate NALUs of a maximum size? That way I should be able to avoid this problem. Thanks, everyone.

2 answers
虎瘦雄心在
#2 · 2019-02-03 13:47

In x264, I believe the int i_slice_max_size member of x264_param_t can be used to control the size. Have a look in x264.h. I can't remember where I read this, but the post said this structure member can be used to control the NAL size. I haven't tried it myself.

int i_slice_max_size; /* Max size per slice in bytes; includes estimated NAL overhead. */

EDIT: I found the source

http://mailman.videolan.org/pipermail/x264-devel/2011-February/008263.html

姐就是有狂的资本
#3 · 2019-02-03 13:59

I am an author of RFC 3984bis (to become RFC 6184), which details exactly how to convert H.264 NALs into RFC 3984 packets. There are 3 modes: 0 (single-NAL), 1 (allows fragmenting and combining NALs), and 2 (lets you fragment, combine, and interleave the transmission order to change how a burst loss will affect a stream, among other things). See the SDP packetization-mode parameter. Only mode 0 is required.

Mode 0 (single-NAL) requires that you either rely on IP fragmentation of the UDP packets (discouraged) or tell the encoder not to generate NALs larger than MTU-X, where X is the header overhead of each packet. You should be able to tell the encoder this.
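As a rough illustration of the MTU-X budget, here is a minimal sketch. The header sizes are my assumptions (IPv4 without options, UDP, and a 12-byte RTP header without CSRC entries or extensions), and max_nal_size is a hypothetical helper name, not something from the RFC or from x264. Adjust the numbers for IPv6, SRTP, tunnels and so on.

```python
# Assumed per-packet overhead in bytes (not taken from the original post):
# IPv4 without options, UDP, RTP without CSRC list or header extension.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def max_nal_size(mtu: int = 1500) -> int:
    """Largest NAL unit that fits in one RTP packet without IP fragmentation."""
    overhead = IPV4_HEADER + UDP_HEADER + RTP_HEADER
    if mtu <= overhead:
        raise ValueError("MTU is smaller than the assumed header overhead")
    return mtu - overhead


if __name__ == "__main__":
    print(max_nal_size())      # 1460 for a 1500-byte Ethernet MTU
    print(max_nal_size(1400))  # 1360
```

A value like this (or something a bit smaller, to leave a safety margin) is what you would hand to the encoder as its maximum NAL or slice size.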

Mode 1 lets you fragment. See the RFC for how to set up an FU-A packet; the fragmentation info goes at the front of the payload. You can also use STAPs to aggregate small NALs, such as the SPS and PPS that are (normally) sent before IDRs. Each packet requires a normal RTP header with an incremented sequence number, but all packets of the same frame carry the same timestamp.
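To make the FU-A layout concrete, here is a minimal sketch of splitting one NAL into FU-A payloads as I read RFC 6184: a one-byte FU indicator (F and NRI bits copied from the NAL header, type 28) followed by a one-byte FU header (start bit, end bit, original NAL type), then a slice of the NAL body. The function name fragment_fu_a and the max_payload parameter are hypothetical, and the input is assumed to be a single NAL with the Annex B start code already stripped. Treat it as a sketch to check against the RFC, not a reference implementation.

```python
from typing import List

FU_A_TYPE = 28  # NAL unit type value that identifies an FU-A payload


def fragment_fu_a(nal: bytes, max_payload: int) -> List[bytes]:
    """Split one NAL unit (no start code) into RTP payloads of at most max_payload bytes."""
    if not nal:
        raise ValueError("empty NAL unit")
    if len(nal) <= max_payload:
        return [nal]  # small enough: send it as a single NAL unit packet
    if max_payload < 3:
        raise ValueError("max_payload must leave room for 2 FU bytes plus data")

    nal_header = nal[0]
    fu_indicator = (nal_header & 0xE0) | FU_A_TYPE  # keep F and NRI, set type to 28
    nal_type = nal_header & 0x1F                    # original type goes in the FU header
    body = nal[1:]                                  # the NAL header byte is not repeated
    chunk = max_payload - 2                         # space left after the two FU bytes

    payloads = []
    for offset in range(0, len(body), chunk):
        fu_header = nal_type
        if offset == 0:
            fu_header |= 0x80                       # S bit: first fragment
        if offset + chunk >= len(body):
            fu_header |= 0x40                       # E bit: last fragment
        payloads.append(bytes([fu_indicator, fu_header]) + body[offset:offset + chunk])
    return payloads
```

The receiver rebuilds the NAL header from the F and NRI bits of the FU indicator plus the type in the FU header, then appends the fragment bodies in sequence-number order.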

The marker bit is expected on the last RTP packet of a frame (not of a fragment or a NAL), but as a receiver you shouldn't count on it.
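For the sending side, a minimal sketch of wrapping the payloads of one frame in RTP headers might look like the following. It assumes the fixed 12-byte RTP header (version 2, no padding, no extension, no CSRC) and a dynamic payload type of 96, which is only a common choice and must match whatever your SDP negotiates. The name packetize_frame and its parameters are hypothetical.

```python
import struct
from typing import List


def packetize_frame(payloads: List[bytes], first_seq: int, timestamp: int,
                    ssrc: int, payload_type: int = 96) -> List[bytes]:
    """Prefix each payload of one frame with a 12-byte RTP header.

    All packets share the frame's timestamp; the sequence number increases by
    one per packet (wrapping at 16 bits); only the last packet gets the marker.
    """
    packets = []
    for i, payload in enumerate(payloads):
        is_last = i == len(payloads) - 1
        byte0 = 0x80                                    # V=2, P=0, X=0, CC=0
        byte1 = (0x80 if is_last else 0x00) | (payload_type & 0x7F)
        header = struct.pack("!BBHII", byte0, byte1,
                             (first_seq + i) & 0xFFFF,  # 16-bit sequence number
                             timestamp & 0xFFFFFFFF,    # same for the whole frame
                             ssrc & 0xFFFFFFFF)
        packets.append(header + payload)
    return packets
```

Here payloads is every RTP payload belonging to one frame, in order: single NALs, STAPs, and FU-A fragments alike. The next frame continues from first_seq + len(payloads) with a new timestamp.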
