I'm using the Bento4 library to mux an Annex B TS (MPEG-2 transport stream) file from my H.264 video and AAC audio streams, which are generated by VideoToolbox and AVFoundation respectively, as source data for an HLS (HTTP Live Streaming) stream. This question is not necessarily Bento4-specific: I'm trying to understand the underlying concepts so that I can accomplish the task, preferably by using Apple libs.
So far, I've figured out how to create an AP4_AvcSampleDescription by getting various kinds of data out of my CMVideoFormatDescriptionRef, and most importantly by generating an SPS and PPS using index 0 and 1 respectively of CMVideoFormatDescriptionGetH264ParameterSetAtIndex that I can just stick as byte buffers into Bento4. Great, that's all the header information I need so that I can ask Bento4 to mux video into a ts file!
Now I'm trying to mux audio into the same file. I'm using my CMAudioFormatDescriptionRef to get the required information to construct my AP4_MpegAudioSampleDescription, which Bento4 uses to make the necessary QT atoms and headers. However, one of the fields is a "decoder info" byte buffer, with no explanation of what it is and no code to generate one from data. I had hoped to find a CMAudioFormatDescriptionGetDecoderInfo or something similar, but I can't find anything like that. Is there such a function in any Apple library? Or is there a nice spec that I haven't found on how to generate this data?
Or alternatively, am I walking down the wrong path? Is there an easier way to mux ts files from a Mac/iOS code base?
Muxing audio into an MPEG-TS is surprisingly easy, and does not require a complex header like a video stream does! It only requires a 7-byte ADTS header before each sample buffer, before you write it as a PES.
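To make the 7-byte ADTS header concrete, here is a minimal sketch of how its fields can be packed. This is not Bento4's code: the function name MakeAdtsHeader and its parameters are hypothetical, and it assumes an MPEG-4 AAC stream with no CRC, one raw data block per frame, and the "variable bitrate" buffer fullness value 0x7FF.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: builds a 7-byte ADTS header (no CRC) for one AAC frame.
// audioObjectType:        MPEG-4 audio object type (2 = AAC-LC)
// samplingFrequencyIndex: index into the standard table (4 = 44.1 kHz)
// channelConfiguration:   1 = mono, 2 = stereo, ...
// payloadSize:            size in bytes of the raw AAC frame that follows
std::array<std::uint8_t, 7> MakeAdtsHeader(unsigned audioObjectType,
                                           unsigned samplingFrequencyIndex,
                                           unsigned channelConfiguration,
                                           std::size_t payloadSize) {
    assert(audioObjectType >= 1 && audioObjectType <= 4);  // ADTS profile is 2 bits
    assert(samplingFrequencyIndex < 13);
    assert(channelConfiguration < 8);

    // aac_frame_length counts the header itself and must fit in 13 bits.
    const std::size_t frameLength = payloadSize + 7;
    assert(frameLength < (1u << 13));

    const unsigned profile = audioObjectType - 1;  // ADTS stores object type minus 1
    const unsigned fullness = 0x7FF;               // assumed: variable bitrate

    std::array<std::uint8_t, 7> h{};
    h[0] = 0xFF;  // syncword, high 8 bits
    h[1] = 0xF1;  // syncword low 4 bits, MPEG-4, layer 0, protection_absent = 1
    h[2] = static_cast<std::uint8_t>((profile << 6) |
                                     (samplingFrequencyIndex << 2) |
                                     (channelConfiguration >> 2));
    h[3] = static_cast<std::uint8_t>(((channelConfiguration & 0x3) << 6) |
                                     ((frameLength >> 11) & 0x3));
    h[4] = static_cast<std::uint8_t>((frameLength >> 3) & 0xFF);
    h[5] = static_cast<std::uint8_t>(((frameLength & 0x7) << 5) |
                                     ((fullness >> 6) & 0x1F));
    h[6] = static_cast<std::uint8_t>((fullness & 0x3F) << 2);  // 1 raw data block
    return h;
}
```

The header would be written immediately before each raw AAC frame, and the pair then wrapped in a PES packet. For a 100-byte AAC-LC frame at 44.1 kHz stereo, MakeAdtsHeader(2, 4, 2, 100) yields FF F1 50 80 0D 7F FC.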
Bento4 only uses the "DecoderInfo" buffer in order to parse it into an AP4_Mp4AudioDecoderConfig instance, so that it can extract the information needed for the ADTS header. Instead of acquiring this data in such a roundabout way, I made a copy-paste of AP4_Mpeg2TsAudioSampleStream::WriteSample that writes a CMSampleBufferRef. It can easily be generalized for other audio frameworks, but I'll just paste it as-is here for reference:

The 'decoder info' byte buffer needed by Bento4 to create an AP4_MpegAudioSampleDescription instance is the codec initialization data, which is codec-specific. For AAC-LC audio, it is typically 2 bytes of data (for HE-AAC you would get a few more bytes), the details of which are specified in the AAC spec. For example, a 44.1kHz, stereo, AAC-LC stream will have [0x12,0x10] as init data. In most Apple APIs, this type of codec initialization data is conveyed through what they call 'Magic Cookies'. It is likely that the function CMAudioFormatDescriptionGetMagicCookie will return what you need here.
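As an illustration of where [0x12,0x10] comes from, here is a small sketch that packs the basic 2-byte AAC initialization data (the AudioSpecificConfig in the MPEG-4 audio spec). The function name MakeAacAudioSpecificConfig is hypothetical, and the sketch only covers the simple case: an object type below 31, a sampling rate taken from the standard index table, and the three trailing flag bits left at zero.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs the 2-byte AudioSpecificConfig for plain AAC.
// Bit layout (16 bits total):
//   5 bits  audio object type         (2 = AAC-LC)
//   4 bits  sampling frequency index  (4 = 44.1 kHz, 3 = 48 kHz)
//   4 bits  channel configuration     (1 = mono, 2 = stereo)
//   3 bits  frame length flag, depends-on-core-coder, extension flag (all 0 here)
std::array<std::uint8_t, 2> MakeAacAudioSpecificConfig(unsigned audioObjectType,
                                                       unsigned samplingFrequencyIndex,
                                                       unsigned channelConfiguration) {
    assert(audioObjectType >= 1 && audioObjectType < 31);  // 31 is an escape value
    assert(samplingFrequencyIndex < 15);                   // 15 means explicit rate
    assert(channelConfiguration < 16);

    const unsigned bits = (audioObjectType << 11) |
                          (samplingFrequencyIndex << 7) |
                          (channelConfiguration << 3);

    std::array<std::uint8_t, 2> config{};
    config[0] = static_cast<std::uint8_t>(bits >> 8);
    config[1] = static_cast<std::uint8_t>(bits & 0xFF);
    return config;
}
```

Calling MakeAacAudioSpecificConfig(2, 4, 2) gives 0x12 0x10, matching the 44.1kHz stereo AAC-LC example above. Comparing such a hand-built value against the bytes a magic cookie returns can be a quick sanity check on what the cookie actually contains.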