Algorithm and package to modify the pitch of the s

Posted 2019-06-09 16:50

Question:

I want to create a new audio file from an existing one, modifying the pitch over different time ranges of the file. For example, if the file is 36 seconds long, I want to change the pitch by some value for the first 2 seconds, then by another value from the 6th to the 9th second, and so on.

Basically, I am trying to modify the audio file based on a text message that the user provides. Say the user inputs "kill bill": for each character in the message (k, i, l, b, ...) I have an array that stores a different duration, with a table covering all 26 letters a, b, c, d, and so on. Based on these durations, I am trying to modify the file at those particular positions. The issue is that I don't have much hands-on experience with audio; I even tried doing the same in Java but was unable to.

If anyone could suggest some other parameter of the audio file that could be changed without making the change very noticeable, suggestions are welcome.

I am referring to these values. The code is in Java, but just ignore that; I will convert it to Python later. The values are in milliseconds.

    public static void convertMsgToAudio(String msg){

        int len = msg.length();
        duration = new double[len];
        msg = msg.toUpperCase();
        System.out.println("Msg 2 : " + msg);

        int i;
        //char ch;
        for(i=0;i<msg.length();i++){

            if(msg.charAt(i) == 'A'){
                duration[i] = 50000;
            }
            else if (msg.charAt(i) == 'B'){
                duration[i] = 100000; // value in milliseconds 
            }
            else if (msg.charAt(i) == 'C'){
                duration[i] = 150000;
            }
            else if (msg.charAt(i) == 'D'){
                duration[i] = 200000;               
            }
            else if (msg.charAt(i) == 'E'){
                duration[i] = 250000;
            }
            else if (msg.charAt(i) == 'F'){
                duration[i] = 300000;
            }
            else if (msg.charAt(i) == 'G'){
                duration[i] = 350000;
            }
            else if (msg.charAt(i) == 'H'){
                duration[i] = 400000;
            }
            else if (msg.charAt(i) == 'I'){
                duration[i] = 450000;
            }
            else if (msg.charAt(i) == 'J'){
                duration[i] = 500000;
            }
            else if (msg.charAt(i) == 'K'){
                duration[i] = 550000;
            }
            else if (msg.charAt(i) == 'L'){
                duration[i] = 600000;
            }
            else if (msg.charAt(i) == 'M'){
                duration[i] = 650000;
            }
            else if (msg.charAt(i) == 'N'){
                duration[i] = 700000;
            }
            else if (msg.charAt(i) == 'O'){
                duration[i] = 750000;
            }
            else if (msg.charAt(i) == 'P'){
                duration[i] = 800000;
            }
            else if (msg.charAt(i) == 'Q'){
                duration[i] = 850000;
            }
            else if (msg.charAt(i) == 'R'){
                duration[i] = 900000;
            }
            else if (msg.charAt(i) == 'S'){
                duration[i] = 950000;
            }
            else if (msg.charAt(i) == 'T'){
                duration[i] = 1000000;
            }
            else if (msg.charAt(i) == 'U'){
                duration[i] = 1100000;
            }
            else if (msg.charAt(i) == 'V'){
                duration[i] = 1200000;
            }
            else if (msg.charAt(i) == 'W'){
                duration[i] = 1300000;
            }
            else if (msg.charAt(i) == 'X'){
                duration[i] = 1400000;
            }
            else if (msg.charAt(i) == 'Y'){
                duration[i] = 1500000;
            }
            else if (msg.charAt(i) == 'Z'){
                duration[i] = 1600000;
            }

        }

    }

Now I am trying to do the same in Python. Please help. I am very new to this concept, but I think I am trying something new, which should always be encouraged.

This is the first time I have faced so many issues with a concept, so please help.
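For reference, the Java lookup above could be expressed in Python roughly as follows. This is only a sketch: the names `build_duration_table`, `DURATIONS` and `convert_msg_to_durations` are hypothetical, and mapping characters outside A-Z (such as the space in "kill bill") to 0.0 is an assumption that mirrors the default value of an unassigned Java `double`.

```python
import string


def build_duration_table():
    """Map 'A'..'Z' to the values used in the Java snippet above."""
    table = {}
    for i, ch in enumerate(string.ascii_uppercase):
        if i < 20:
            # A..T: 50000, 100000, ..., 1000000 (steps of 50000)
            table[ch] = 50000.0 * (i + 1)
        else:
            # U..Z: 1100000, 1200000, ..., 1600000 (steps of 100000)
            table[ch] = 1000000.0 + 100000.0 * (i - 19)
    return table


DURATIONS = build_duration_table()


def convert_msg_to_durations(msg):
    """Return one value per character of msg.

    Characters outside A-Z map to 0.0 (an assumption, see the text above).
    """
    return [DURATIONS.get(ch, 0.0) for ch in msg.upper()]
```

Calling `convert_msg_to_durations("kill bill")` yields one entry per character, including the space.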

Answer 1:

A simple way is to work on raw PCM data directly. In this format the audio data is just a sequence of values in the range -32768 to 32767, stored as 2 bytes per entry (assuming 16-bit signed, mono) and sampled at regular intervals (e.g. 44100 Hz).
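If the audio is in a WAV file rather than a headerless raw file, the samples can be obtained with Python's standard `wave` module. The following is a minimal sketch assuming 16-bit mono audio; the helper names `read_pcm16_mono` and `write_pcm16_mono` are made up for illustration.

```python
import struct
import wave


def read_pcm16_mono(path):
    """Return (samples, sample_rate) for a 16-bit mono WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono audio")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # "<" forces little-endian, which is how WAV stores 16-bit samples
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return list(samples), rate


def write_pcm16_mono(path, samples, rate):
    """Write a list of ints in -32768..32767 as a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

The returned sample rate is what converts a position in seconds into an index into the sample list (index = seconds * rate).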

To alter the pitch you can just "read" this data at a different rate, e.g. at 45000 Hz (higher pitch) or 43000 Hz (lower pitch), and this is easily done with a resampling procedure. For example:

 import struct

 # Raw PCM: 16-bit signed, mono, native byte order, no header
 with open("pcm.raw", "rb") as f:
     data = f.read()
 parsed = struct.unpack("%ih" % (len(data)//2), data)
 # Here parsed is a tuple of numbers

 pos = 0.0     # position in the source file (in samples, fractional)
 speed = 1.0   # current read speed = original sampling speed
 result = []

 while pos < len(parsed)-1:
     # Compute a new sample (linear interpolation between two neighbours)
     ip = int(pos)
     v = int(parsed[ip] + (pos - ip)*(parsed[ip+1] - parsed[ip]))
     result.append(v)

     pos += speed     # Next position
     speed += 0.0001  # raise the pitch a little with every output sample

 # write the result to disk (note the * to unpack the list into arguments)
 with open("out.raw", "wb") as f:
     f.write(struct.pack("%ih" % len(result), *result))

This is a very simple approach to the problem. Note, however, that raising the pitch this way also shortens the audio; avoiding that requires more sophisticated math than simple interpolation.

I used this approach, for example, to gradually raise the pitch of a song by one tone over its length (I wanted to see if this was noticeable... it isn't).
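To change the pitch only in selected time ranges, as the question asks, the same interpolation loop can pick its read speed from the current position. The sketch below assumes the samples are already in a list; the function name `resample_segments` and the `(start_sec, end_sec, speed)` tuple layout are hypothetical choices, not part of any library.

```python
def resample_segments(samples, rate, segments):
    """Resample `samples` with a read speed that depends on the position.

    segments: list of (start_sec, end_sec, speed) tuples, with times
    measured in the source audio. Outside every segment the speed is 1.0,
    so the audio there is left unchanged.
    """
    def speed_at(pos):
        t = pos / rate  # position in seconds within the source
        for start, end, speed in segments:
            if start <= t < end:
                return speed
        return 1.0

    result = []
    pos = 0.0
    while pos < len(samples) - 1:
        ip = int(pos)
        frac = pos - ip
        # Linear interpolation between the two neighbouring samples
        result.append(int(samples[ip] + frac * (samples[ip + 1] - samples[ip])))
        pos += speed_at(pos)
    return result
```

For instance, `resample_segments(samples, 44100, [(0, 2, 1.02), (6, 9, 0.98)])` would raise the pitch slightly in the first 2 seconds and lower it between seconds 6 and 9. As with the loop above, each modified range also becomes shorter or longer in the output, and an abrupt speed change at a segment boundary may be more audible than a gradual one.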