TL;DR: I want to read raw h264 streams from AVI/MP4 files, even broken/incomplete.
Almost every document about h264 tells me that it consists of NAL packets. Okay. Almost everywhere told to me that the packet should start with a signature like 00 00 01
or 00 00 00 01
. For example, https://stackoverflow.com/a/18638298/8167678, https://stackoverflow.com/a/17625537/8167678
The format of H.264 is that it’s made up of NAL Units, each starting
with a start prefix of three bytes with the values 0x00, 0x00, 0x01
and each unit has a different type depending on the value of the 4th
byte right after these 3 starting bytes. One NAL Unit IS NOT one frame
in the video, each frame is made up of a number of NAL Units.
Okay.
I downloaded random_youtube_video.mp4 and strip out one frame from it:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.avi
And got:
Red part - this is part of AVI container, other - actual data.
As you can see, here I have 00 00 24 A9
instead of 00 00 00 01
This AVI file plays perfectly
I do same for mp4 container:
As you can see, here exact same bytes.
This MP4 file plays perfectly
I try to strip out raw data:
ffmpeg -i pic.avi -c copy pic.h264
This file can't play in VLC or even ffmpeg, which produced this file, can't parse it:
I downloaded mp4 stream analyzer and got:
MP4Box
tells me:
Cannot find H264 start code
Error importing pic.h264: BitStream Not Compliant
It very hard to learn internals of h264, when nothing works.
So, I have questions:
- What actual data inside mp4?
- What I must read to decode that data (I mean different annex-es)
- How to read stream and get decoded image (even with ffmpeg) from this "broken" raw stream?
UPDATE:
It seems bug in ffmpeg:
When I do double conversion:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.mp4
ffmpeg pic.mp4 -c copy pic.h264
But when I convert file directly:
ffmpeg -ss 10 -i random_youtube_video.mp4 -frames 1 -c copy pic.h264
I have NALs signatures and one extra NAL unit. Other bytes are same (selected).
This is bug?
UPDATE
Not, this is not bug, U must use option -bsf h264_mp4toannexb to save stream as "Annex B" format (with prefixes)
"I want to read raw h264 streams from AVI files, even broken/incomplete."
"Almost everywhere told to me that the packet should start with a signature like :
00 00 01
or 00 00 00 01
"
"...As you can see, here I have 00 00 24 A9
instead of 00 00 00 01
"
Your H264 is in AVCC format which means it uses data sizes (instead of data start codes). It is only Annex-B that will have your mentioned signature as start code.
You seek frames, not by looking for start codes, but instead you just do skipping by frame sizes to reach the final correct offset of a (requested) frame...
AVI processing :
Read size (four) bytes (32-bit integer, Little Endian).
Extract the next following bytes up to size amount.
This is your H.264 frame (in AVCC format), decode the bytes to view image.
To convert into Annex-B, try replacing first 4 bytes of H.264 frame bytes with 00 00 00 01
.
Consider your shown AVI bytes (see first picture) :
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00 00 00 00 4C 49 53 54 BA 24 00 00 6D 6F 76 69 ....LISTº$..movi
30 30 64 63 AD 24 00 00 00 00 24 A9 65 88 84 27 00dc.$....$©eˆ„'
C7 11 FE B3 C7 83 08 00 08 2A 7B 6E 59 B5 71 E1 Ç.þ³Çƒ...*{nYµqá
E3 9C 0E 73 E7 10 50 00 18 E9 25 F7 AA 7D 9C 30 ãœ.sç.P..é%÷ª}œ0
E6 2F 0F 20 00 3A 64 AA CA 5E 4F CA FF AE 20 04 æ/. .:dªÊ^OÊÿ® .
07 81 40 00 48 00 0A 28 71 21 84 48 06 18 90 0C ..@.H..(q!„H....
31 14 57 9E 7A CD 63 A0 E0 9B 96 69 C5 18 AE F2 1.WžzÍc à›–iÅ.®ò
E6 07 02 29 01 20 10 70 A1 0F 8C BC 73 F0 78 FA æ..). .p¡.Œ¼sðxú
9E 1D E1 C2 BF 8C 62 CE CE AC 14 5A A4 E1 45 44 ž.á¿ŒbÎά.Z¤áED
38 38 85 DB 12 57 3E F6 E0 FB AE 03 04 21 62 8D 88…Û.W>öàû®..!b.
F6 F1 1E 37 1C A2 FF 75 1C F1 02 66 0C 92 07 06 öñ.7.¢ÿu.ñ.f.’..
15 7C 90 15 6F 7D FC BD 13 1E 2B 0C 14 3C 0C 00 .|..o}ü½..+..<..
B0 EA 6F 53 B4 98 D7 80 7A 68 3E 34 69 20 D2 FA °êoS´˜×€zh>4i Òú
F0 91 FC 75 C6 00 01 18 C0 00 3B 9A C5 E2 7D BF ð‘üuÆ...À.;šÅâ}¿
Some explanation :
Ignore leading multiple 00
bytes.
4C 49 53 54 D6 3C 00 00 6D 6F 76 69
including 30 30 64 63
= AVI "List" header.
AD 24 00 00
== decimal 9389
is AVI's own size of H264 item (must read in Little Endian).
Notice that the AVI bytes include...
- a note of item's total size (AD 24 00 00
... or reverse for Little Endian : 00 00 24 AD
)
- followed by item data (00 00 24 A9 65 88 84 27 ... etc ... C5 E2 7D BF
).
This size includes both the 4 bytes of the AVI's"size" entry + expected bytes length of the item's own bytes. Can be written simply as:
AVI_Item_Size = ( 4 + item_H264_Frame.length );
H.264 video frame bytes in AVI :
Next follows the item data, which is the H.264 video frame. By sheer coincidence of formats/bytes layout, it too holds a 4-byte entry for data's size (since your H264 is in AVCC format, if it was Annex-B then you would be seeing start code bytes here instead of size bytes).
Unlike AVI bytes, these H264 size bytes are written in Big Endian format.
00 00 24 A9
= size of bytes for this video frame (instead of start code : 00 00 00 01
).
65 88 84 27 C7 11 FE B3 C7
= H.264 keyframe (always begins X5, where the X
value is based on other settings).
Remember after four size bytes (or even start codes) if followed by...
- byte
X5
= keyframe (IDR), example byte 65
.
- byte
X1
= P or B frame, example byte 41
.
- byte
X6
= SEI (Supplemental Enhancement Information).
- byte
X7
= SPS (Sequence Parameter Set).
- byte
X8
= PPS (Picture Parameter Set).
- bytes
00 00 00 X9
= Access unit delimiter.
You can find the H.264 if you search for exact same bytes within AVI file. See third picture, these are your H.264 bytes (they are cut & pasted into the AVI container).
Sometimes a frame is sliced into different NAL units. So if you extract a key frame and it only shows 1/2 or 1/3 instead of full image, just grab next one or two NAL and re-try the decode.