TarsosDSP Pitch Analysis for Dummies

Posted 2020-06-04 05:33

Question:

I am working on a program that analyzes the pitch of a sound file. I came across a very good API called "TarsosDSP" which offers various kinds of pitch analysis. However, I am having a lot of trouble setting it up. Can someone give me some quick pointers on how to use this API (especially the PitchProcessor class), please? Some snippets of code would be greatly appreciated because I am really new to sound analysis.

Thanks

EDIT: I found some documentation at http://husk.eecs.berkeley.edu/courses/cs160-sp14/index.php/Sound_Programming with example code that shows how to set up the PitchProcessor, …

int bufferReadResult = mRecorder.read(mBuffer, 0, mBufferSize);
// (note: this is NOT android.media.AudioFormat)
be.hogent.tarsos.dsp.AudioFormat mTarsosFormat = new be.hogent.tarsos.dsp.AudioFormat(SAMPLE_RATE, 16, 1, true, false);
AudioEvent audioEvent = new AudioEvent(mTarsosFormat, bufferReadResult);
audioEvent.setFloatBufferWithByteBuffer(mBuffer);
pitchProcessor.process(audioEvent);

…I am quite lost. What exactly are mBuffer and mBufferSize? How do I find these values? And where do I input my audio files?

Answer 1:

The basic flow of audio in the TarsosDSP framework is as follows: the incoming audio stream, originating from an audio file or a microphone, is read and chopped into frames of e.g. 1024 samples. Each frame travels through a pipeline that modifies or analyses it (e.g. performs pitch analysis).
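
The chopping step can be sketched in plain Java without the library. This is only an illustration of the idea, not TarsosDSP's implementation: the class FrameChopper, its chop method, and the choice to zero-pad the final frame are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: chops a sample stream into fixed-size frames. */
final class FrameChopper {

    /**
     * Splits samples into frames of frameSize samples. Consecutive frames
     * share `overlap` samples. The last frame is zero-padded if the input
     * runs out (a choice made for this sketch).
     */
    static List<float[]> chop(float[] samples, int frameSize, int overlap) {
        if (frameSize <= 0 || overlap < 0 || overlap >= frameSize) {
            throw new IllegalArgumentException("need 0 <= overlap < frameSize");
        }
        int step = frameSize - overlap; // how far the window advances per frame
        List<float[]> frames = new ArrayList<>();
        for (int start = 0; start < samples.length; start += step) {
            float[] frame = new float[frameSize]; // zero-initialised by Java
            int n = Math.min(frameSize, samples.length - start);
            System.arraycopy(samples, start, frame, 0, n);
            frames.add(frame);
        }
        return frames;
    }

    public static void main(String[] args) {
        float[] oneSecond = new float[44100]; // one second of silence at 44.1 kHz
        List<float[]> frames = chop(oneSecond, 1024, 0);
        // prints "44 frames of 1024 samples"
        System.out.println(frames.size() + " frames of " + frames.get(0).length + " samples");
    }
}
```

One second of 44.1 kHz audio does not divide evenly into 1024-sample frames, which is why the sketch has to decide what to do with the final, partial frame.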

In TarsosDSP the AudioDispatcher is responsible for chopping the audio into frames. It also wraps each audio frame in an AudioEvent object. This AudioEvent object is sent through a chain of AudioProcessors.
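
To make the dispatcher-and-chain idea concrete, here is a rough plain-Java stand-in. FrameProcessor and MiniDispatcher are hypothetical names invented for this sketch and are not TarsosDSP classes; the real AudioDispatcher and AudioProcessor carry more state (the AudioEvent, overlap handling, and so on).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for an audio processor: handles one frame at a time. */
interface FrameProcessor {
    void process(float[] frame, double timeStampSeconds);
}

/** Hypothetical stand-in for a dispatcher: frames the audio and feeds the chain. */
final class MiniDispatcher implements Runnable {
    private final float[] samples;
    private final int frameSize;
    private final float sampleRate;
    private final List<FrameProcessor> chain = new ArrayList<>();

    MiniDispatcher(float[] samples, int frameSize, float sampleRate) {
        this.samples = samples;
        this.frameSize = frameSize;
        this.sampleRate = sampleRate;
    }

    void addProcessor(FrameProcessor processor) {
        chain.add(processor);
    }

    @Override
    public void run() {
        // For brevity this sketch drops a trailing partial frame.
        for (int start = 0; start + frameSize <= samples.length; start += frameSize) {
            float[] frame = Arrays.copyOfRange(samples, start, start + frameSize);
            double timeStamp = start / (double) sampleRate; // frame start in seconds
            for (FrameProcessor processor : chain) {
                processor.process(frame, timeStamp); // processors run in the order added
            }
        }
    }

    public static void main(String[] args) {
        float[] audio = new float[8192];
        for (int i = 0; i < audio.length; i++) {
            audio[i] = (float) Math.sin(2 * Math.PI * 440 * i / 44100.0);
        }
        MiniDispatcher dispatcher = new MiniDispatcher(audio, 2048, 44100f);
        dispatcher.addProcessor((frame, t) -> {
            double sumSquares = 0;
            for (float s : frame) sumSquares += s * s;
            System.out.println(t + " s, RMS " + Math.sqrt(sumSquares / frame.length));
        });
        dispatcher.run();
    }
}
```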

So in the code you quoted, mBuffer is the audio frame and mBufferSize is the size of the buffer in samples. You can choose the buffer size yourself, but for pitch detection 2048 samples is reasonable.
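
As a rough illustration of what such a buffer holds, the sketch below converts raw bytes to samples, assuming the format from the quoted snippet (16 bits, 1 channel, signed, little-endian), and computes how much time one frame covers. PcmFrames and its methods are hypothetical helpers written for this example and are not part of TarsosDSP.

```java
import java.util.Arrays;

/** Hypothetical helpers for reasoning about a buffer of 16-bit mono PCM audio. */
final class PcmFrames {

    /** Converts 16-bit signed little-endian mono PCM bytes to floats in [-1, 1). */
    static float[] pcm16LeToFloat(byte[] bytes) {
        float[] out = new float[bytes.length / 2]; // two bytes per sample
        for (int i = 0; i < out.length; i++) {
            int lo = bytes[2 * i] & 0xFF; // low byte, treated as unsigned
            int hi = bytes[2 * i + 1];    // high byte, keeps the sign
            out[i] = ((hi << 8) | lo) / 32768f;
        }
        return out;
    }

    /** Duration of one frame in milliseconds. */
    static double frameMillis(int frameSize, float sampleRate) {
        return 1000.0 * frameSize / sampleRate;
    }

    public static void main(String[] args) {
        byte[] raw = {0x00, 0x00, (byte) 0xFF, 0x7F, 0x00, (byte) 0x80};
        System.out.println(Arrays.toString(pcm16LeToFloat(raw)));
        System.out.println(frameMillis(2048, 44100f) + " ms per frame");
    }
}
```

At 44100 Hz a 2048-sample frame covers about 46 ms, so a 100 Hz tone (10 ms period) completes between four and five cycles within one frame.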

For pitch detection you could do something like this with the TarsosDSP library:

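    // Called once per analysed frame with the estimated pitch.
    PitchDetectionHandler handler = new PitchDetectionHandler() {
        @Override
        public void handlePitch(PitchDetectionResult pitchDetectionResult,
                AudioEvent audioEvent) {
            System.out.println(audioEvent.getTimeStamp() + " " + pitchDetectionResult.getPitch());
        }
    };
    // Read from the default microphone in frames of 2048 samples with no overlap.
    AudioDispatcher adp = AudioDispatcherFactory.fromDefaultMicrophone(2048, 0);
    // YIN pitch estimation at a 44100 Hz sample rate; the buffer size matches the dispatcher's.
    adp.addAudioProcessor(new PitchProcessor(PitchEstimationAlgorithm.YIN, 44100, 2048, handler));
    // Runs on the calling thread and pulls audio through the processor chain.
    adp.run();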

This code first creates a handler that simply prints the timestamp and the detected pitch. The AudioDispatcher is attached to the default microphone with a buffer size of 2048 samples and no overlap. A PitchProcessor, the audio processor that detects pitch, is then added to the AudioDispatcher; it is given the handler so that it can report each result.

The last line starts the process.
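
For intuition about what a pitch-detecting processor does with a single frame, here is a deliberately naive sketch based on plain autocorrelation. It is not the YIN algorithm and not TarsosDSP code; NaivePitch, its estimate method, and the minHz/maxHz search range are assumptions made for this example, and the result is only accurate to within a few hertz because the lag is a whole number of samples.

```java
/** Hypothetical, deliberately simple pitch estimator (plain autocorrelation, not YIN). */
final class NaivePitch {

    /**
     * Returns the estimated pitch of one frame in Hz, or -1 if no
     * positive correlation peak is found (for example, for silence).
     */
    static float estimate(float[] frame, float sampleRate, float minHz, float maxHz) {
        int minLag = Math.max(1, (int) (sampleRate / maxHz)); // shortest period to try
        int maxLag = Math.min(frame.length / 2, (int) (sampleRate / minHz)); // longest period
        int bestLag = -1;
        double best = 0;
        for (int lag = minLag; lag <= maxLag; lag++) {
            double sum = 0;
            // Correlate the frame with a copy of itself shifted by `lag` samples.
            for (int i = 0; i + lag < frame.length; i++) {
                sum += (double) frame[i] * frame[i + lag];
            }
            if (sum > best) { // strongest self-similarity so far
                best = sum;
                bestLag = lag;
            }
        }
        return bestLag > 0 ? sampleRate / bestLag : -1f;
    }

    public static void main(String[] args) {
        float sampleRate = 44100f;
        float[] frame = new float[2048];
        for (int i = 0; i < frame.length; i++) {
            frame[i] = (float) Math.sin(2 * Math.PI * 440 * i / sampleRate); // 440 Hz test tone
        }
        // Should print a value close to 440 Hz.
        System.out.println(estimate(frame, sampleRate, 50f, 1000f) + " Hz");
    }
}
```

The frame has to be long enough to contain at least two periods of the lowest pitch of interest, which is one reason a buffer of 2048 samples is a reasonable choice.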