Recognize and skip invalid (or proprietary?) MP3 f

Posted 2019-08-17 07:41

I am writing an MP3 decoder (not to re-play any sound, but to analyze frequencies).

I can successfully identify ID3v1 and ID3v2 tags and skip them in their entirety (including the ID3v2 NUL padding), since I am not interested in this metadata. I'm just after the frequencies.

I can also obtain and correctly interpret all MP3 frame headers (running every available validity test, of which there aren't many). Here is a small excerpt from the Immediate window showing what each frame is about:

...
2131 until pos. 2226975 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2132 until pos. 2228020 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2133 until pos. 2229065 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2134 until pos. 2230110 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2135 until pos. 2231155 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2136 until pos. 2232200 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2137 until pos. 2233245 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
2138 until pos. 2234290 FFFBE264, EMp3Vrs1, EMp3LayIII, 320, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 1009 B
...
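For readers who want to reproduce the header interpretation shown in the excerpt, here is a minimal sketch in Python (not the VB.NET of the original decoder). The names `FrameHeader` and `parse_layer3_header` are made up for illustration, and the sketch assumes Layer III only, rejecting free-format bitrates for simplicity.

```python
from typing import NamedTuple, Optional

# Layer III bitrate tables in kbps; index 0 ("free format") and 15 are rejected below.
BITRATES_MPEG1 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
BITRATES_MPEG2 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160]
# Sample rates in Hz, keyed by the 2-bit version field (value 1 is reserved).
SAMPLE_RATES = {3: [44100, 48000, 32000], 2: [22050, 24000, 16000], 0: [11025, 12000, 8000]}
VERSIONS = {3: "MPEG-1", 2: "MPEG-2", 0: "MPEG-2.5"}
CHANNEL_MODES = ["Stereo", "JointStereo", "DualChannel", "Mono"]


class FrameHeader(NamedTuple):
    version: str
    bitrate_kbps: int
    sample_rate: int
    padding: int
    channel_mode: str
    has_crc: bool
    frame_length: int  # whole frame in bytes, 4 header bytes included


def parse_layer3_header(word: int) -> Optional[FrameHeader]:
    """Decode a 32-bit big-endian header word; None if it is not a usable Layer III header."""
    if (word >> 21) & 0x7FF != 0x7FF:       # all 11 sync bits must be set
        return None
    version = (word >> 19) & 0x3
    layer = (word >> 17) & 0x3
    bitrate_index = (word >> 12) & 0xF
    rate_index = (word >> 10) & 0x3
    if version == 1 or layer != 1:          # reserved version, or not Layer III
        return None
    if bitrate_index in (0, 15) or rate_index == 3:
        return None
    table = BITRATES_MPEG1 if version == 3 else BITRATES_MPEG2
    bitrate = table[bitrate_index]
    sample_rate = SAMPLE_RATES[version][rate_index]
    padding = (word >> 9) & 0x1
    # 1152 samples per frame for MPEG-1, 576 for MPEG-2/2.5 Layer III.
    factor = 144 if version == 3 else 72
    frame_length = factor * bitrate * 1000 // sample_rate + padding
    return FrameHeader(
        version=VERSIONS[version],
        bitrate_kbps=bitrate,
        sample_rate=sample_rate,
        padding=padding,
        channel_mode=CHANNEL_MODES[(word >> 6) & 0x3],
        has_crc=((word >> 16) & 0x1) == 0,  # protection bit 0 means a CRC follows
        frame_length=frame_length,
    )
```

For `0xFFFBE264` this yields a frame length of 1045 bytes, which matches the spacing of the positions in the excerpt (2228020 - 2226975 = 1045). Subtracting the 4 header bytes and the 32 bytes of MPEG-1 stereo side information gives the 1009 B reported as data.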

If I take the MP3 frames in their entirety (including the CRC if any, side information, Huffman-encoded data, and ancillary data if any), write them back through a FileStream object, and name the file .mp3, I can listen to the title perfectly well.

This works for MP3 files stored locally or somewhere on the LAN: I never encounter a bad header, and not one false alarm is raised. Success.

Enter the Web stream. If I feed it to my FileStream object, all goes well for a few hundred frames, but then all of a sudden a lot of invalid frames arrive:

...
1291 FFFB9264, EMp3Vrs1, EMp3LayIII, 128, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 382 B
1292 FFFB9244, EMp3Vrs1, EMp3LayIII, 128, 44100, 1, EMp3ChMJointStereo, 0, CRC: 0, Data: 382 B
1293 FFFB9264, EMp3Vrs1, EMp3LayIII, 128, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 382 B
1294 FFFB9264, EMp3Vrs1, EMp3LayIII, 128, 44100, 1, EMp3ChMJointStereo, 2, CRC: 0, Data: 382 B
34B5FF96 is not a valid header
FF96C517 is not a valid header
FFFFFFF8 is not a valid header
FFFFF8F1 is not a valid header
FFF8F1E1 is not a valid header
1295 FFF32191, EMp3Vrs2, EMp3LayIII, 16, 22050, 0, EMp3ChMDualChannel, 1, CRC: 0, Data: 68 B
There are 136 B of pre-header-data
...

These invalid headers are followed by a variable length sequence of unrecognized bytes before the next valid header appears.

Here's a hex dump of the stream part in question:

0008-40a0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-40b0:  00 00 00 00-00 00 00 34-b5 ff 96 c5-17 59 00 ca  .......4 .....Y.. <- 34B5FF96, FF96C517
0008-40c0:  00 a0 00 67-00 08 00 4f-00 1e 00 1f-00 e2 00 b3  ...g...O ........
0008-40d0:  00 ac 00 cf-00 69 00 bf-00 ff ff ff-f8 f1 e1 d0  .....i.. ........ <- FFFFFFF8, FFFFF8F1, FFF8F1E1
0008-40e0:  00 a0 00 e0-00 78 00 5d-00 c3 00 00-00 09 00 83  .....x.] ........
0008-40f0:  00 20 00 04-00 80 00 dd-00 d0 00 45-00 08 00 80  ........ ...E....
0008-4100:  00 26 00 96-00 c5 00 ed-00 18 00 9c-00 a7 00 a9  .&...... ........
0008-4110:  00 f5 00 1c-00 81 00 43-00 d8 00 61-00 78 00 ed  .......C ...a.x..
0008-4120:  00 d0 00 91-00 7f 00 a8-00 93 00 2a-00 2e 00 a2  ........ ...*....
0008-4130:  00 20 00 ee-00 a3 00 e9-00 35 00 75-00 77 00 ff  ........ .5.u.w.. <- FFF32191
0008-4140:  f3 21 91 19-0c da 9d 48-96 be 61 e2-cc db 5d d1  .!.....H ..a...].
0008-4150:  cd 40 8b bb-a3 8a 22 9e-26 65 36 aa-47 90 63 e2  .@....". &e6.G.c.
0008-4160:  46 72 21 fe-cb 78 0a 08-f1 48 24 da-89 25 55 78  Fr!..x.. .H$..%Ux
0008-4170:  6a 39 d2 65-68 11 14 6d-41 bb b5 45-91 05 3d b0  j9.eh..m A..E..=.
0008-4180:  03 18 4b 39-fb c2 dd 01-8e 95 15 34-39 93 b9 1f  ..K9.... ...49...
0008-4190:  47 c4 bf d8-61 04 85 08-a0 41 8c ca-7b b9 19 aa  G...a... .A..{...
0008-4197:  93 05 18 50-5c 51 d7                             ...P\Q.

I assume that TuneIn transports some metadata here, but I am not able to figure out which protocol is in use, if any.

The problem is that these blocks obviously span more bytes than I think they do: the next header I deem valid is an invalid header in disguise (FFF32191 does not fit the 128 kbps, 44100 Hz, joint-stereo pattern of the other frames) and thus likely still belongs to that possible metadata chunk.
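The "invalid header in disguise" observation can be turned into a check: a candidate header is compared with a known-good reference header on the fields that normally stay fixed within one stream. The sketch below is in Python, and the name `same_stream` and the choice of fields (sync, version, layer, sample rate) are assumptions made for illustration. Bitrate, padding, and mode extension are deliberately left out because they may change from frame to frame, as frame 1292 (FFFB9244) shows.

```python
# Bits 31-21: sync, 20-19: version, 18-17: layer, 11-10: sample-rate index.
STABLE_FIELDS_MASK = 0xFFFE0C00


def same_stream(reference: int, candidate: int) -> bool:
    """True if both 32-bit header words agree on sync, version, layer and sample rate."""
    return (reference ^ candidate) & STABLE_FIELDS_MASK == 0


reference = 0xFFFB9264                      # a frame known to be good
print(same_stream(reference, 0xFFFB9244))   # differs only in mode extension
print(same_stream(reference, 0xFFF32191))   # MPEG-2 at 22050 Hz: different stream parameters
```

This is a heuristic rather than a rule from the standard: a stream could in principle change its parameters midway, so a mismatch is best treated as "probably not a real header here" and answered by scanning onward.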

I am quite confident about this because, when I also save these MP3 frames as I did with the local files, they play just fine (as if I were recording from the Web, so at 128 kbps only) until the errors appear after several hundred frames. Then intermittent noise sets in, squeaking and whistling every few tenths of a second.

The frustrating thing is: if I play the same address in a browser, it plays just fine.

My question: What do those browsers know that I am not able to figure out? (I just want to skip the correct number of bytes to obtain the next valid frame.)

(At one point I was so frustrated that I thought, completely unjustifiably, that TuneIn inserts these bytes maliciously to keep people like me from recording "their" music. But then again, the browsers know how to deal with these streams, so... apologies, TuneIn.)


Edit

Analyzing the dump a bit further back, I found something interesting, namely an ASCII string reading "LAME3.98.4".

0008-3d70:  9c 5f 26 ff-fb 92 64 fb-80 03 07 64-5d eb 0b 39  ._&...d. ...d]..9 <- FFFB9264 (frame 1293)
0008-3d80:  fe 60 89 ab-1d 41 87 1e-0a e1 2f 75-e6 24 a7 e9  .`...A.. ../u.$..
0008-3d90:  75 a6 2d 28-f2 9a ba 2c-23 07 79 68-e8 94 18 a4  u.-(..., #.yh....
0008-3da0:  68 d4 08 0e-f0 48 35 67-7e d2 ef 9e-73 13 ba a5  h....H5g ~...s...
0008-3db0:  fc f2 db d9-07 28 6c ce-3a 15 cb cf-39 af 99 5d  .....(l. :...9..]
0008-3dc0:  25 22 89 19-7c c4 22 a2-3b 51 e9 a7-ff ff ff f4  %"..|.". ;Q......
0008-3dd0:  59 83 1a 84-53 85 d6 99-25 20 49 8b-18 7f 25 5e  Y...S... %.I...%^
0008-3de0:  cd 41 69 75-e5 86 d6 8e-39 a3 96 1c-45 9e 69 66  .Aiu.... 9...E.if
0008-3df0:  d5 a6 b4 6d-e9 99 46 96-eb a3 73 74-4f de f2 96  ...m..F. ..stO...
0008-3e00:  34 48 60 70-10 5c 5f d9-2e dd af 44-2c c5 5a 48  4H`p.\_. ...D,.ZH
0008-3e10:  51 64 63 0d-92 af 62 0f-bb 55 ae b4-9d d1 8a f6  Qdc...b. .U......
0008-3e20:  66 41 e8 c3-68 54 ae 6d-0e 13 32 aa-bd ff ff f1  fA..hT.m ..2.....
0008-3e30:  56 00 4b 2a-24 49 25 15-98 77 98 71-36 d7 2d c2  V.K*$I%. .w.q6.-.
0008-3e40:  29 ce 8a b5-1b 72 84 e9-3f 03 4a da-74 e4 66 29  )....r.. ?.J.t.f)
0008-3e50:  fc 7d e7 fd-53 68 f4 7e-3b bb 2e 1b-97 e1 f1 8a  .}..Sh.~ ;.......
0008-3e60:  ba fd da 8b-8e 73 96 3c-20 40 ce 13-53 20 f0 6a  .....s.< .@..S..j
0008-3e70:  6d 9d cf c6-fa 84 f1 48-84 67 ef 51-af 8c ec 9f  m......H .g.Q....
0008-3e80:  7f ff ce 15-32 ca b1 ac-f5 e5 48 e8-0c 38 23 c3  ....2... ..H..8#.
0008-3e90:  05 02 b5 55-4c 41 4d 45-33 2e 39 38-2e 34 55 55  ...ULAME 3.98.4UU <- LAME3.98.4
0008-3ea0:  55 55 08 83-c5 04 58 55-e4 b3 30 3a-c9 da 85 3d  UU....XU ..0:...=
0008-3eb0:  11 80 7d 6d-62 41 5b d8-42 9a c2 a0-56 72 77 83  ..}mbA[. B...Vrw.
0008-3ec0:  4a d4 79 4b-28 de 4c 7f-2d 2c 7d b9-e0 bb 1d d8  J.yK(.L. -,}.....
0008-3ed0:  b6 fd b6 f3-ed 9a ba 09-49 00 6d 5f-fd 8a 77 cf  ........ I.m_..w.
0008-3ee0:  df 3f f4 70-3a 29 1c 4a-b7 39 6f 15-8c 74 fa fa  .?.p:).J .9o..t..
0008-3ef0:  f3 be 67 1f-db ae 2e 5e-90 dd 74 9c-ae 76 82 c1  ..g....^ ..t..v..
0008-3f00:  7b 3d 6a 03-05 0e aa a7-41 d6 df ff-ff 14 1a e3  {=j..... A.......
0008-3f10:  d8 a2 52 42-09 ff fb 92-64 f8 00 02-ff 4d 57 e1  ..RB.... d....MW. <- FFFB9264 (frame 1294)
0008-3f20:  e5 35 e0 4e-a9 7b dc 2c-c2 7f cb b5-31 75 a7 95  .5.N.{., ....1u..
0008-3f30:  35 f1 88 25-ec f4 f3 0e-73 a0 c0 6d-ee a0 bf 15  5..%.... s..m....
0008-3f40:  d8 b9 5d 7d-ce d4 c5 84-5a 4a 97 15-ba 22 08 09  ..]}.... ZJ..."..
0008-3f50:  b8 ec e8 3f-b1 22 89 b0-72 6d d7 db-75 b7 3b f4  ...?.".. rm..u.;.
0008-3f60:  b7 56 dd e3-43 0e 36 99-33 00 00 00-00 00 00 00  .V..C.6. 3.......
0008-3f70:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3f80:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3f90:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3fa0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3fb0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3fc0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3fd0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3fe0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-3ff0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4000:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4010:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4020:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4030:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4040:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4050:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4060:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4070:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4080:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-4090:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........
0008-40a0:  00 00 00 00-00 00 00 00-00 00 00 00-00 00 00 00  ........ ........ <- Start of above dump
0008-40b0:  00 00 00 00-00 00 00 34-b5 ff 96 c5-17 59 00 ca  .......4 .....Y.. <- Invalid header

LAME 3.98.4 is dated 2010-04-14. But what is it doing there? Answer: it's normal; see Brad's comment in his answer.

2 answers
唯我独甜
#2 · 2019-08-17 08:18

That was a tough one to solve.

When it comes to reading the Huffman-encoded data in its precalculated length (for MP3 at 128 kbps and 44.1 kHz that's 381 or 382 B), I relied on an IO.Stream's synchronous Read() method actually being synchronous, that is, waiting until the requested number of bytes is available (because my smart books say that when fewer bytes are returned, the stream has ended). Since the address is a Web stream, the bytes arrive at a steady rate. Thus I wrote:

            ReDim gabMainData(0 To iDataSize - 1)
            s.Read(gabMainData, 0, iDataSize)

It turns out that every now and then (first after a few hundred frames, but sometimes only after 1200), Read can return fewer bytes even though the Web stream has not ended at all, leaving many NUL bytes at the end of the appropriately dimensioned byte array.

In these cases, an additional ReadByte() loop helps:

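            ReDim gabMainData(0 To iDataSize - 1)
            'Read() may legitimately return fewer bytes than requested.
            iNumRead = s.Read(gabMainData, 0, iDataSize)
            'Fetch the missing bytes one at a time; ReadByte() blocks until a byte arrives.
            Do While iNumRead < iDataSize
                iByte = s.ReadByte()
                If iByte = -1 Then
                    'End of stream activities.
                    ...
                    Exit Do 'Never fall through to CByte(-1), which would overflow.
                End If
                gabMainData(iNumRead) = CByte(iByte)
                iNumRead += 1
            Loop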

The test run after this fix now reports 100,000 frames without a single hiccup, just as if it were a CD, so I am confident about it.

Going to make this my standard Read() method now.

If there's a better way to do this task synchronously, kindly let me know.
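One common alternative to topping up byte by byte is to call the bulk read again for the remaining count until the buffer is full or the stream reports its end. The sketch below shows the pattern in Python rather than VB.NET. `read_exactly` and `TrickleStream` are hypothetical names, and it assumes a blocking stream whose `read(n)` returns an empty result only at end of stream, much as .NET's Stream.Read returns 0 only at the end.

```python
def read_exactly(stream, size: int) -> bytes:
    """Keep reading until `size` bytes are collected or the stream ends."""
    buf = bytearray()
    while len(buf) < size:
        chunk = stream.read(size - len(buf))  # may return fewer bytes than asked for
        if not chunk:                         # empty result: end of stream
            break
        buf += chunk
    return bytes(buf)


class TrickleStream:
    """Test double that hands out at most `step` bytes per read, like a slow network."""

    def __init__(self, data: bytes, step: int):
        self._data = data
        self._pos = 0
        self._step = step

    def read(self, n: int) -> bytes:
        chunk = self._data[self._pos:self._pos + min(n, self._step)]
        self._pos += len(chunk)
        return chunk


payload = read_exactly(TrickleStream(bytes(range(10)), step=3), 8)
print(len(payload))  # all 8 requested bytes, gathered over several short reads
```

A caller can tell a genuine end of stream from a short read by comparing the length of the result with the requested size.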

混吃等死
#3 · 2019-08-17 08:29

"I assume that TuneIn transports some metadata here, but I am not able to figure out which protocol is in use, if any."

This doesn't have anything to do with TuneIn... it's a SHOUTcast server, and it uses ICY-style metadata. In any case, unless you actually request the metadata (with Icy-MetaData: 1 in the HTTP request headers), you won't get it. You'll end up with a normal raw MPEG stream.
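For completeness, this is roughly what a client has to do if it does request the metadata. As commonly described for SHOUTcast-style servers, the response then carries an icy-metaint header giving the number of audio bytes between metadata blocks. Each block starts with one length byte counting 16-byte units, and a length byte of 0 means there is no metadata this time. The Python sketch below works on a complete in-memory buffer for simplicity. `split_icy` is a made-up name, and `metaint` is assumed to have been taken from the response headers.

```python
from typing import List, Tuple


def split_icy(data: bytes, metaint: int) -> Tuple[bytes, List[str]]:
    """Separate audio bytes from interleaved ICY metadata blocks."""
    audio = bytearray()
    metadata: List[str] = []
    pos = 0
    while pos < len(data):
        audio += data[pos:pos + metaint]     # `metaint` bytes of pure audio
        pos += metaint
        if pos >= len(data):
            break
        length = data[pos] * 16              # length byte counts 16-byte units
        pos += 1
        if length:
            block = data[pos:pos + length]   # NUL-padded text such as StreamTitle='...';
            metadata.append(block.rstrip(b"\x00").decode("utf-8", errors="replace"))
        pos += length
    return bytes(audio), metadata


sample = b"AAAA" + bytes([1]) + b"StreamTitle='x';" + b"BBBB" + bytes([0]) + b"CC"
audio, meta = split_icy(sample, metaint=4)
print(audio, meta)
```

A real client would apply the same counting to the live stream instead of a finished buffer, but the bookkeeping is identical: count audio bytes, read one length byte, skip or parse the block, and start counting again.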

If you want to know more about the data, check out my answers here:

I don't see any SHOUTcast-style ICY metadata in your dump.

"all goes well for a few hundred frames, but all of a sudden a lot of invalid frames are transmitted"

It's interesting that it works initially, but then fails.

Usually for these streams, the first couple frames are wonky as the server doesn't have to know or care about the data flowing through it. It just has a fixed buffer size of 128KB or so, and arbitrarily chunks so that when a client connects, it gets that buffer plus whatever comes after it. That is, the client is "needle dropped" right into the stream and is expected to synchronize itself.

This is done with the sync word 0xFFF* or 0xFFE*. Anything before the sync should be discarded. Any frames requiring an unavailable bit reservoir should be discarded. Eventually after a few frames, you'll have a steady stream of data to decode.
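The synchronization step can be sketched as follows, in Python and restricted to Layer III. A position only counts as a frame start if the header found there predicts a frame length and another compatible header sits exactly that many bytes later. The names `frame_length` and `find_sync` are hypothetical, and requiring one confirming header is an assumption made here (stricter decoders may ask for more). The bit-reservoir check mentioned above is not covered.

```python
from typing import Optional

_BITRATES_V1 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]
_BITRATES_V2 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160]
_RATES = {3: [44100, 48000, 32000], 2: [22050, 24000, 16000], 0: [11025, 12000, 8000]}


def frame_length(word: int) -> Optional[int]:
    """Length in bytes of the Layer III frame announced by `word`, or None if invalid."""
    if (word >> 21) & 0x7FF != 0x7FF:
        return None
    version, layer = (word >> 19) & 3, (word >> 17) & 3
    bitrate_index, rate_index = (word >> 12) & 0xF, (word >> 10) & 3
    if version == 1 or layer != 1 or bitrate_index in (0, 15) or rate_index == 3:
        return None
    kbps = (_BITRATES_V1 if version == 3 else _BITRATES_V2)[bitrate_index]
    factor = 144 if version == 3 else 72
    return factor * kbps * 1000 // _RATES[version][rate_index] + ((word >> 9) & 1)


def find_sync(buf: bytes, start: int = 0) -> int:
    """Offset of the first header confirmed by a compatible header one frame later, else -1."""
    for i in range(start, len(buf) - 3):
        word = int.from_bytes(buf[i:i + 4], "big")
        length = frame_length(word)
        if length is None:
            continue
        j = i + length
        if j + 4 > len(buf):
            continue                          # not enough data to confirm this candidate
        following = int.from_bytes(buf[j:j + 4], "big")
        # Same sync, version, layer and sample rate as the candidate.
        if frame_length(following) is not None and (word ^ following) & 0xFFFE0C00 == 0:
            return i
    return -1


frame = bytes.fromhex("fffb9264") + bytes(414)              # one 418-byte frame, zeroed payload
stream = bytes.fromhex("0034b5ff96c517") + frame + frame    # junk bytes from the dump, then frames
print(find_sync(stream))
```

Everything before the returned offset would be discarded. On a live stream the same search runs again whenever the header at the expected position fails validation.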

Double check to make sure you don't have a buffer around from a previous file/URL, and then re-sync to the stream on initial connect.

If this isn't the problem, I'm honestly not sure what to suggest other than that some stations use broken codecs. You'd be surprised how many internet radio stations are using copies of LAME from 15 years ago.

I might also suggest, if possible, sticking with an existing codec and doing your analysis after converting it back to normal PCM.
