Based on the ID3v2.3.0 specification (http://id3.org/id3v2.3.0), the layout of the frame header is:
Frame ID $xx xx xx xx (four characters)
Size $xx xx xx xx
Flags $xx xx
But the same page, just a couple of lines below, says that frames that allow different types of text encoding have a text encoding description byte directly after the frame size. If ISO-8859-1 is used this byte should be $00, if Unicode is used it should be $01.
This is confusing, as the flags (2 bytes) should be directly after the frame size information, so I would expect the encoding byte to be after the flags information.
So which of these is correct?
Frame ID $xx xx xx xx (four characters)
Size $xx xx xx xx
Flags $xx xx
Encoding $xx
Text
or
Frame ID $xx xx xx xx (four characters)
Size $xx xx xx xx
Encoding $xx
Flags $xx xx
Text
A frame header is 10 bytes long: 4 bytes for the frame ID, 4 bytes for the length of the frame (header excluded), and 2 bytes for flags. Any other info will be found in the frame itself, not its header.
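To make the 10-byte layout concrete, here is a minimal Python sketch of reading one ID3v2.3 frame header. The names `FrameHeader` and `parse_frame_header` are made up for illustration, and the sketch assumes the size is a plain big-endian 32-bit integer, as in v2.3 (v2.4 uses a synchsafe integer instead).

```python
import struct
from typing import NamedTuple


class FrameHeader(NamedTuple):
    """Hypothetical container for the three header fields."""
    frame_id: str  # four characters, e.g. "TIT2"
    size: int      # length of the frame body, header excluded
    flags: int     # the two flag bytes as one 16-bit value


def parse_frame_header(data: bytes, offset: int = 0) -> FrameHeader:
    """Read the 10-byte frame header that starts at `offset`."""
    if len(data) - offset < 10:
        raise ValueError("need at least 10 bytes for a frame header")
    # ">" = big-endian, no padding; 4s = frame ID, I = size, H = flags
    frame_id, size, flags = struct.unpack_from(">4sIH", data, offset)
    return FrameHeader(frame_id.decode("ascii"), size, flags)


# A TIT2 frame whose body is 7 bytes long: encoding byte $00 + "Hello!"
frame = b"TIT2" + (7).to_bytes(4, "big") + b"\x00\x00" + b"\x00Hello!"
header = parse_frame_header(frame)
body = frame[10:10 + header.size]  # the body starts right after the flags
```

Note that the encoding byte is the first byte of `body`, not part of the header.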
The wording sure is confusing.
What is meant is that where you expect to read a string, the first byte tells you what to expect: $00 means ISO-8859-1 (a one-byte encoding) and $01 means Unicode (a two-byte encoding). With $01, the text starts with a byte order mark, either FF FE or FE FF, which tells you which byte is the most significant.
I'd advise you to open some MP3 files in a hex editor and dissect them.
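As a rough sketch of that rule, the Python function below decodes the body of a text frame (the bytes after the 10-byte header). The function name `decode_text_frame_body` is hypothetical, and decoding $01 with Python's `utf-16` codec is an assumption on my part: it reads the FF FE / FE FF mark and picks the byte order accordingly, which covers the two-byte Unicode described above.

```python
def decode_text_frame_body(body: bytes) -> str:
    """Decode a text frame body: one encoding byte followed by the text."""
    if not body:
        raise ValueError("empty frame body")
    encoding_byte, raw = body[0], body[1:]
    if encoding_byte == 0x00:
        # $00: ISO-8859-1, one byte per character
        text = raw.decode("latin-1")
    elif encoding_byte == 0x01:
        # $01: Unicode; the codec consumes the FF FE or FE FF byte order mark
        text = raw.decode("utf-16")
    else:
        raise ValueError(f"unexpected encoding byte ${encoding_byte:02X}")
    # Drop a trailing terminator if the writer added one
    return text.rstrip("\x00")


print(decode_text_frame_body(b"\x00Hi"))                  # Hi
print(decode_text_frame_body(b"\x01\xff\xfeH\x00i\x00"))  # Hi (little-endian)
print(decode_text_frame_body(b"\x01\xfe\xff\x00H\x00i"))  # Hi (big-endian)
```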
I think this might actually be a case of bad wording in the spec. I found two diagrams in the ID3v2 Chapter Frame Addendum showing examples of complete headers. That document describes two newly introduced frame types, which are not relevant to the question at hand. But fortunately, it also contains examples of an embedded 'Title/Songname/Content description' frame (TIT2) and a 'Subtitle/Description refinement' frame (TIT3), which are both text frames*.
According to the diagram, the Title frame (ID: TIT2) has the following structure: first the frame header (frame ID, size, flags), which is then directly followed by the ID-dependent fields (for a text frame, the text encoding byte and then the text).
This layout makes the most sense to me. If you still have doubts about the correct layout, you could check out the source of one of the existing implementations.
Sidenote: in the ID3v2.4.0 specification they reworded the confusing sentence.
* Only frames that allow different types of text encoding have a text encoding description byte. Unsurprisingly, most of these are text frames.