Extract audio frames from AMR-NB file

Posted 2019-07-17 06:41

I wrote an algorithm to extract each frame from an AMR file. I treat the first 6 bytes of the file as the header and everything after it as audio frames. Each audio frame consists of a frame header and audio data. The frame header gives the frame's size in bytes (via the CMR mode table, see http://www.developer.nokia.com/Community/Wiki/AMR_format). The frame size index is stored in the first byte of the frame, in the second through fifth bits, counting the MSB as the first bit.

The algorithm does not work, so I printed each byte in binary (0s and 1s). It turns out that the frame size index is sometimes greater than 7, while the CMR table only has entries 0 to 7.

Below is the CMR table:

CMR    MODE         FRAME SIZE (in bytes)
0      AMR 4.75     13
1      AMR 5.15     14
2      AMR 5.9      16
3      AMR 6.7      18
4      AMR 7.4      20
5      AMR 7.95     21
6      AMR 10.2     27
7      AMR 12.2     32

and my output (each byte of the AMR file) is:

0 -> 0 0 0 0 0 0 0 0 
1 -> 0 0 0 0 0 0 0 0 
2 -> 0 0 0 0 0 0 0 0 
3 -> 0 0 0 1 1 0 0 0 
4 -> 0 1 1 0 0 1 1 0 
5 -> 0 0 1 0 1 1 1 0 
6 -> 1 0 0 1 1 1 1 0 
7 -> 0 0 0 0 1 1 1 0 
8 -> 1 1 0 0 1 1 0 0 
9 -> 1 1 1 0 0 1 1 0 
10 -> 0 0 0 0 1 1 1 0 
11 -> 0 0 1 0 1 1 0 0 
12 -> 0 0 0 0 0 0 0 0 
13 -> 0 0 0 0 0 0 0 0 
14 -> 0 0 0 0 0 0 0 0 
15 -> 0 0 0 0 0 0 0 0 
16 -> 1 0 0 1 0 1 1 0 
17 -> 1 1 0 0 1 1 1 0 
18 -> 1 1 1 1 0 1 1 0 
19 -> 1 0 1 1 0 1 1 0 
20 -> 1 1 0 0 1 1 0 0 
21 -> 1 1 1 0 0 1 1 0 
22 -> 0 0 0 0 1 1 1 0 
23 -> 0 0 1 0 1 1 0 0 
24 -> 0 0 0 0 0 0 0 0 
25 -> 0 0 0 0 0 0 0 0 
26 -> 0 1 0 0 0 0 0 0 
27 -> 1 0 0 1 1 0 0 0 
28 -> 1 0 1 1 0 1 1 0 
29 -> 1 1 1 1 0 1 1 0 
30 -> 1 1 1 1 0 1 1 0 
31 -> 0 1 1 0 1 1 1 0 
32 -> 0 0 0 0 0 0 0 0 
33 -> 0 0 0 0 0 0 0 0 
34 -> 0 0 0 0 0 0 0 0 
35 -> 0 0 1 1 0 1 1 0 
36 -> 1 0 1 1 0 1 1 0 
37 -> 0 1 1 0 1 1 1 0 
38 -> 0 0 0 1 0 1 1 0 
39 -> 0 0 1 0 0 1 1 0 
40 -> 0 0 0 0 0 0 0 0 

I took byte 6: 10011110 -> 0011 is 3, and the corresponding CMR frame size for 3 is 18. I skip 18 bytes and reach byte 6 + 18 = 24: 00000000. The frame size for 0 is 13, so I skip another 13 bytes -> 24 + 13 = 37: 01101110 -> 1101 is 13, WHICH ISN'T IN the CMR table.

What am I doing wrong? I assume the binary printing is correct. Below is the algorithm for reading each frame (not the one that prints the binary output):

private void displayNrOfFrames() throws Exception {
    FileInputStream fis = null;

    try {
        fis = new FileInputStream(mFile);
        long result = fis.skip(6);
        if (result != 6) {
            throw new Exception("Could not skip first 6 bytes(header) of AMR.");
        }

        int number = 0;
        int bit = 0;
        byte b;
        BitSet bs;
        while ((b = Integer.valueOf(fis.read()).byteValue()) != -1) {
            bs = Util.fromByte(b);
            number = 0;
            // convert bits [1..4] to number
            for (int i = 1; i <= 4; i++) {
                bit = bs.get(i) ? 1 : 0;
                number += bit * Math.pow(2, 4 - i);
            }
            System.out.println(number);
            if (!CMR_MAP.containsKey(number)) {
                throw new Exception("Could not parse AMR file.");
            }
            // skip the number of bytes of this frame.
            fis.skip(CMR_MAP.get(number));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

[EDIT]

It turns out that my byte-to-BitSet conversion is wrong, which causes the algorithm to fail. Byte 6 should represent the number 121, but it is mistakenly represented as 158. The binary output is wrong too, since it uses the same conversion. I hadn't checked the conversion method (which I didn't post here). Sorry for the disturbance.
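The two numbers in the edit are bit-reversals of each other: 158 is 10011110 and 121 is 01111001. Below is a minimal sketch of that relationship. Since the conversion helper was not posted, it assumes that `Util.fromByte` stored the least significant bit at index 0, so that reading indices 0..7 as MSB-first reverses every byte. The class and method names are illustrative only.

```java
final class BitOrderDemo {
    /** Reverses the 8 bits of an unsigned byte value (0..255). */
    static int reverseBits(int value) {
        // Integer.reverse flips all 32 bits, leaving the byte in the top 8 bits.
        return Integer.reverse(value & 0xFF) >>> 24;
    }

    /** Undoes the bit reversal on a run of printed byte values and decodes them as ASCII. */
    static String decodeReversed(int[] printedValues) {
        StringBuilder sb = new StringBuilder();
        for (int v : printedValues) {
            sb.append((char) reverseBits(v));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Byte 6 as printed in the question (10011110 = 158).
        System.out.println(reverseBits(158));
        // Bytes 4..11 of the dump, read MSB-first exactly as printed.
        int[] printed = {102, 46, 158, 14, 204, 230, 14, 44};
        System.out.println(decodeReversed(printed));
    }
}
```

Under that assumption, byte 6 comes out as 121 and bytes 4 to 11 of the dump decode to `ftyp3gp4`, which would suggest the file is a 3GP container rather than a raw AMR stream. That is an inference from the printed dump, not something stated in the post.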

Tags: java audio amr 3gp
1 Answer
萌系小妹纸
#2 · 2019-07-17 07:10

I hope I am not too late with this reply.

First things first: according to the same reference, the first 6 bytes (the file header) should be 0x23, 0x21, 0x41, 0x4D, 0x52, 0x0A. This value is constant and should always be present. If it is missing, the file is probably corrupted and should not be used. So you should validate the first 6 bytes rather than blindly skipping them.
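A minimal sketch of such a check, assuming the file is read through a plain `InputStream`. The class and method names are made up for illustration; only the six byte values come from the reference.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class AmrHeaderCheck {
    // "#!AMR\n": the six header bytes listed above.
    static final byte[] AMR_NB_MAGIC = {0x23, 0x21, 0x41, 0x4D, 0x52, 0x0A};

    /** Returns true if the given bytes start with the AMR-NB magic number. */
    static boolean startsWithMagic(byte[] data) {
        if (data.length < AMR_NB_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(data, AMR_NB_MAGIC.length), AMR_NB_MAGIC);
    }

    /** Consumes the first 6 bytes of the stream and checks them against the magic number. */
    static boolean consumeHeader(InputStream in) throws IOException {
        byte[] header = new byte[AMR_NB_MAGIC.length];
        int off = 0;
        while (off < header.length) {
            int n = in.read(header, off, header.length - off);
            if (n < 0) {
                return false; // stream is shorter than the header
            }
            off += n;
        }
        return startsWithMagic(header);
    }

    public static void main(String[] args) throws IOException {
        byte[] good = {0x23, 0x21, 0x41, 0x4D, 0x52, 0x0A, 0x3C};
        System.out.println(consumeHeader(new ByteArrayInputStream(good)));
        System.out.println(startsWithMagic(new byte[] {0, 0, 0, 0x18, 0x66, 0x74}));
    }
}
```

Unlike `fis.skip(6)`, this reads the bytes and compares them, so a file that does not begin with the expected header is rejected before any frame parsing starts.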

Now, the AMR codec supports DTX (discontinuous transmission). DTX is simply a way of saving bandwidth by producing less data when the vocoder detects silence. Your AMR parser should be prepared to handle DTX. For AMR-NB (AMR narrowband, or simply AMR), DTX is signalled using mode 8. So your CMR map should contain the entry below:

8 AMR SID 6 (SID is the silence insertion descriptor; it indicates that a silence period is starting)

After the SID, there will be the actual silence frames, which are 1 byte long (just the header, NO DATA), so you should also have an entry for

15 AMR NO_DATA 1

Modes 9-11 should be discarded. Modes 12-14 are reserved for future use (these are generally discarded as well). All of the above assumes that single-channel AMR is being used.
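Putting the table from the question together with the two extra entries, a frame walk could be sketched as follows. This is only an illustration under some assumptions: the whole file is already in memory, the 6-byte header has been validated elsewhere, the listed sizes include the 1-byte frame header, and frame types 9-14 (whose sizes are not given here) are rejected with an exception rather than skipped. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class AmrFrameWalker {
    // Total frame size in bytes (header byte included), indexed by the 4-bit frame type.
    // 0..7 are the speech modes, 8 is SID, 15 is NO_DATA; -1 marks types 9..14.
    static final int[] FRAME_SIZE =
            {13, 14, 16, 18, 20, 21, 27, 32, 6, -1, -1, -1, -1, -1, -1, 1};
    static final int HEADER_LENGTH = 6;

    /** Returns the frame type of every frame found after the 6-byte file header. */
    static List<Integer> frameTypes(byte[] file) {
        List<Integer> types = new ArrayList<>();
        int pos = HEADER_LENGTH;
        while (pos < file.length) {
            // Second through fifth bits of the frame header, counting the MSB as first.
            int frameType = (file[pos] >> 3) & 0x0F;
            int size = FRAME_SIZE[frameType];
            if (size < 0) {
                throw new IllegalArgumentException(
                        "Unsupported frame type " + frameType + " at offset " + pos);
            }
            if (pos + size > file.length) {
                throw new IllegalArgumentException("Truncated frame at offset " + pos);
            }
            types.add(frameType);
            pos += size; // size already counts the header byte
        }
        return types;
    }

    public static void main(String[] args) {
        // Synthetic file: 6 header bytes, one 12.2 frame, one SID frame, one NO_DATA frame.
        byte[] file = new byte[6 + 32 + 6 + 1];
        file[6] = 0x3C;  // frame type 7, Q bit set
        file[38] = 0x44; // frame type 8 (SID), Q bit set
        file[44] = 0x7C; // frame type 15 (NO_DATA), Q bit set
        System.out.println(frameTypes(file));
    }
}
```

Because the sizes include the header byte, the position advances by exactly `size` from the start of each frame. A stream-based loop that has already consumed the header byte would need to skip `size - 1` instead.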

In the output you pasted:

6 -> 1 0 0 1 1 1 1 0

This is supposed to be the AMR ToC header:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F|  FT   |Q|P|P|
   +-+-+-+-+-+-+-+-+

For storage, the F bit should be 0, but in your example it is 1. The last two bits (padding bits) must also be zero, but in your example they are not. I believe your example is not telling the full story here.
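The field layout above can be unpacked with shifts and masks. A small sketch, with hypothetical helper names, applied to the byte 10011110 (0x9E) from the question's printout:

```java
final class AmrTocByte {
    /** First bit (MSB): F in the diagram above. */
    static int fBit(int b) {
        return (b >> 7) & 0x01;
    }

    /** Next four bits: the frame type (FT). */
    static int frameType(int b) {
        return (b >> 3) & 0x0F;
    }

    /** Sixth bit: the Q bit. */
    static int qBit(int b) {
        return (b >> 2) & 0x01;
    }

    /** Last two bits: padding. */
    static int padding(int b) {
        return b & 0x03;
    }

    /** True if the F bit and both padding bits are zero, as described above for storage. */
    static boolean looksLikeStoredFrameHeader(int b) {
        return fBit(b) == 0 && padding(b) == 0;
    }

    public static void main(String[] args) {
        int b = 0x9E; // 10011110, byte 6 as printed in the question
        System.out.println("F=" + fBit(b) + " FT=" + frameType(b)
                + " Q=" + qBit(b) + " P=" + padding(b)
                + " plausible=" + looksLikeStoredFrameHeader(b));
    }
}
```

For 0x9E this gives F = 1, FT = 3, Q = 1 and padding = 2, so the byte fails both checks as printed.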
