I'm attempting to stream an H.264 video feed to a web browser. Media Foundation is used for encoding a fragmented MPEG4 stream (MFCreateFMPEG4MediaSink with MFTranscodeContainerType_FMPEG4, MF_LOW_LATENCY and MF_READWRITE_ENABLE_HARDWARE_TRANSFORMS enabled). The stream is then connected to a web server through IMFByteStream.
Streaming of the H.264 video works fine when it's being consumed by a <video src=".."/> tag. However, the resulting latency is ~2 sec, which is too much for the application in question. My suspicion is that client-side buffering causes most of the latency. Therefore, I'm experimenting with Media Source Extensions (MSE) for programmatic control over the in-browser streaming. Chrome does, however, fail with the following error when consuming the same MPEG4 stream through MSE:
Failure parsing MP4: TFHD base-data-offset not allowed by MSE. See https://www.w3.org/TR/mse-byte-stream-format-isobmff/#movie-fragment-relative-addressing
Below is the mp4dump output for a moof/mdat fragment in the MPEG4 stream. It clearly shows that the TFHD contains an "illegal" base data offset parameter:
[moof] size=8+200
  [mfhd] size=12+4
    sequence number = 3
  [traf] size=8+176
    [tfhd] size=12+16, flags=1
      track ID = 1
      base data offset = 36690
    [trun] size=12+136, version=1, flags=f01
      sample count = 8
      data offset = 0
[mdat] size=8+1624
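A quick way to check a fragment for this condition without mp4dump is to walk the box tree and test the tfhd flags. The following is only a minimal sketch: it assumes 32-bit box sizes and a moof/traf/tfhd nesting as in the dump above, and the function names are made up for illustration. The flag value 0x000001 is base-data-offset-present as defined in ISO/IEC 14496-12.

```python
import struct

CONTAINERS = {b"moof", b"traf"}        # boxes this sketch descends into
BASE_DATA_OFFSET_PRESENT = 0x000001    # tfhd flag rejected by MSE

def find_tfhd_flags(buf, start=0, end=None):
    """Return the 24-bit flags of every tfhd box found in buf[start:end]."""
    end = len(buf) if end is None else end
    found = []
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            break  # truncated data or 64-bit/zero sizes, not handled here
        if box_type in CONTAINERS:
            found += find_tfhd_flags(buf, pos + 8, pos + size)
        elif box_type == b"tfhd" and size >= 12:
            # First payload word is version (8 bits) + flags (24 bits).
            word, = struct.unpack_from(">I", buf, pos + 8)
            found.append(word & 0xFFFFFF)
        pos += size
    return found

def has_base_data_offset(buf):
    """True if any tfhd in buf carries an explicit base_data_offset."""
    return any(f & BASE_DATA_OFFSET_PRESENT for f in find_tfhd_flags(buf))
```

Feeding it the bytes of one moof box returns True for a fragment like the one dumped above (flags=1).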
I'm using Chrome 65.0.3325.181 (Official Build) (32-bit), running on Win10 version 1709 (16299.309).
Is there any way of generating an MSE-compatible H.264/MPEG4 video stream using Media Foundation?
Status Update:
Based on roman-r's advice, I managed to fix the problem myself by intercepting the generated MPEG4 stream and performing the following modifications:
- Modify Track Fragment Header Box (tfhd):
  - remove base_data_offset parameter (reduces stream size by 8 bytes)
  - set default-base-is-moof flag
- Add missing Track Fragment Decode Time (tfdt) (increases stream size by 20 bytes)
  - set baseMediaDecodeTime parameter
- Modify Track fragment Run box (trun):
  - adjust data_offset parameter
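The modifications listed above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from the actual implementation. It assumes one traf with a single trun per moof, 32-bit box sizes, a trun that already has its data-offset-present flag set (as in the dump, flags=f01), and sample data that starts right after the 8-byte header of an mdat immediately following the moof. All function names are hypothetical, and the caller is assumed to track the running decode time.

```python
import struct

BASE_DATA_OFFSET_PRESENT = 0x000001   # tfhd flag
DEFAULT_BASE_IS_MOOF = 0x020000       # tfhd flag
DATA_OFFSET_PRESENT = 0x000001        # trun flag

def box(box_type, payload):
    """Serialize a box with a 32-bit size header."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload

def children(payload):
    """Split a container payload into (type, payload) pairs."""
    pos, out = 0, []
    while pos < len(payload):
        size, box_type = struct.unpack_from(">I4s", payload, pos)
        if size < 8 or pos + size > len(payload):
            raise ValueError("malformed or unsupported box")
        out.append((box_type, payload[pos + 8:pos + size]))
        pos += size
    return out

def fix_tfhd(payload):
    """Drop base_data_offset (8 bytes) and set default-base-is-moof."""
    word, = struct.unpack_from(">I", payload, 0)
    version, flags = word >> 24, word & 0xFFFFFF
    track_id, optional = payload[4:8], payload[8:]
    if flags & BASE_DATA_OFFSET_PRESENT:
        optional = optional[8:]          # base_data_offset is the first optional field
        flags &= ~BASE_DATA_OFFSET_PRESENT
    flags |= DEFAULT_BASE_IS_MOOF
    return struct.pack(">I", (version << 24) | flags) + track_id + optional

def fix_trun(payload, data_offset):
    """Overwrite the signed 32-bit data_offset that follows sample_count."""
    word, = struct.unpack_from(">I", payload, 0)
    if not word & DATA_OFFSET_PRESENT:
        raise ValueError("trun without data_offset is not handled in this sketch")
    return payload[:8] + struct.pack(">i", data_offset) + payload[12:]

def rewrite_moof(moof, base_media_decode_time):
    """Return an MSE-friendly copy of a complete moof box."""
    size, box_type = struct.unpack_from(">I4s", moof, 0)
    if box_type != b"moof" or size != len(moof):
        raise ValueError("expected exactly one complete moof box")

    def build(data_offset):
        out = b""
        for t, p in children(moof[8:]):
            if t != b"traf":
                out += box(t, p)
                continue
            traf = b""
            for t2, p2 in children(p):
                if t2 == b"tfhd":
                    traf += box(b"tfhd", fix_tfhd(p2))
                    # tfdt version 1: 64-bit baseMediaDecodeTime, 20 bytes in total
                    traf += box(b"tfdt", struct.pack(">IQ", 1 << 24, base_media_decode_time))
                elif t2 == b"trun":
                    traf += box(b"trun", fix_trun(p2, data_offset))
                else:
                    traf += box(t2, p2)
            out += box(b"traf", traf)
        return box(b"moof", out)

    new_size = len(build(0))        # the size does not depend on the offset value
    return build(new_size + 8)      # offset from moof start to the mdat payload
```

With default-base-is-moof set, data_offset is counted from the first byte of the moof box, which is why it becomes the new moof size plus the mdat header. The net size change is -8 + 20 = 12 bytes per fragment, matching the figures in the list.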
The field descriptions are documented in https://www.iso.org/standard/68960.html (free download).
Switching to MSE-based video streaming reduced the latency from ~2.0 to 0.7 sec. Unfortunately, this is still too much for my needs. The main source of the remaining latency appears to be the bundling of 8 frames/samples in each MP4 fragment. I don't know how to fix this.
There's a sample implementation available on https://github.com/forderud/AppWebStream
The problem was solved by following roman-r's advice and modifying the generated MPEG4 stream. See answer above.
Another way to do this, again using the same code @Fredrik mentioned: I write my own IMFByteStream and check the chunks written to it. FFMpeg writes the atoms almost one at a time, so you can check the atom name and apply the modifications. It amounts to the same thing. I wish there were an MSE-compliant Windows sink.
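To illustrate the idea of inspecting chunks as they are written, here is a minimal Python sketch of a splitter that buffers incoming writes and hands out complete top-level atoms by name. It stands in for the write method of a custom byte stream. The class name and the on_box callback are hypothetical, and 64-bit or open-ended box sizes are not handled. Buffering is there because the atoms arrive only "almost" one per write.

```python
import struct

class BoxSplitter:
    """Reassemble arbitrary write() chunks into complete top-level MP4 boxes."""

    def __init__(self, on_box):
        self._buf = bytearray()
        self._on_box = on_box  # callback(box_type: bytes, box: bytes)

    def write(self, chunk):
        """Accept a chunk of any size and emit every box it completes."""
        self._buf += chunk
        while len(self._buf) >= 8:
            size, box_type = struct.unpack_from(">I4s", self._buf, 0)
            if size < 8:
                raise ValueError("64-bit or open-ended box sizes are not handled")
            if len(self._buf) < size:
                break  # wait for the rest of this box
            self._on_box(box_type, bytes(self._buf[:size]))
            del self._buf[:size]
        return len(chunk)
```

The callback can then forward ftyp, moov and mdat boxes unchanged and patch only the moof boxes before they go out to the web server.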
Is there one that can generate .ts files for HLS?