Interpreting AudioBuffer.mData to display audio v

Posted 2019-09-01 13:15

Question:

I am trying to process audio data in real time so that I can display an on-screen spectrum analyzer/visualization based on sound input from the microphone. I am using AVFoundation's AVCaptureAudioDataOutputSampleBufferDelegate to capture the audio data, which triggers the delegate function captureOutput, shown below:

func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {

    autoreleasepool {

        //The parameters are non-optional, so only data readiness needs checking
        guard CMSampleBufferDataIsReady(sampleBuffer) else { return }

        //Check this is AUDIO (and not VIDEO) being received
        if (connection.audioChannels.count > 0)
        {
            //Determine number of frames in buffer
            var numFrames = CMSampleBufferGetNumSamples(sampleBuffer)

            //Get AudioBufferList
            var audioBufferList = AudioBufferList(mNumberBuffers: 1, mBuffers: AudioBuffer(mNumberChannels: 0, mDataByteSize: 0, mData: nil))
            var blockBuffer: CMBlockBuffer?
            CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(sampleBuffer, nil, &audioBufferList, MemoryLayout<AudioBufferList>.size, nil, nil, UInt32(kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment), &blockBuffer)

            let audioBuffers = UnsafeBufferPointer<AudioBuffer>(start: &audioBufferList.mBuffers, count: Int(audioBufferList.mNumberBuffers))

            for audioBuffer in audioBuffers {
                let data = Data(bytes: audioBuffer.mData!, count: Int(audioBuffer.mDataByteSize))

                let i16array = data.withUnsafeBytes {
                    UnsafeBufferPointer<Int16>(start: $0, count: data.count/2).map(Int16.init(bigEndian:))
                }

                for dataItem in i16array
                {
                    print(dataItem)
                }

            }

        }
    }
}

The code above prints positive and negative numbers of type Int16 as expected, but I need help converting these raw numbers into meaningful data such as power and decibels for my visualizer.

Answer 1:

I was on the right track. Thanks to RobertHarvey's comment on my question: the Accelerate framework's FFT functions are required to build a spectrum analyzer. But before I could use these functions, I needed to convert the raw data into an Array of type Float, as many of the functions require a Float array.

Firstly, we load the raw data into a Data object:

//Read data from AudioBuffer into a variable
let data = Data(bytes: audioBuffer.mData!, count: Int(audioBuffer.mDataByteSize))

I like to think of a Data object as a "list" of 1-byte sized chunks of info (8 bits each), but if I check the number of frames I have in my sample and the total size of my Data object in bytes, they don't match:

//Get number of frames in sample and total size of Data
var numFrames = CMSampleBufferGetNumSamples(sampleBuffer) //= 1024 frames in my case
var dataSize = audioBuffer.mDataByteSize //= 2048 bytes in my case

The total size (in bytes) of my data is twice the number of frames in my CMSampleBuffer. This means that each frame of audio is 2 bytes long (consistent with one channel of 16-bit samples). To read the data meaningfully, I need to convert my Data object, which is a "list" of 1-byte chunks, into an array of 2-byte chunks. Int16 contains 16 bits (or 2 bytes, exactly what we need), so let's create an Array of Int16:

//Convert to Int16 array (Swift 5: copy inside the closure so no pointer escapes it)
let samples = data.withUnsafeBytes {
    Array($0.bindMemory(to: Int16.self))
}
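
As a rough illustration of what that reinterpretation does, the sketch below assembles Int16 samples from raw bytes by hand. It assumes 16-bit little-endian PCM, which is worth checking against the stream's format description rather than taking for granted. int16Samples(fromLittleEndian:) is a hypothetical helper made up for this example, not part of any framework.

```swift
import Foundation

/// Hypothetical helper: combines each pair of bytes into one Int16 sample,
/// assuming little-endian 16-bit PCM (low byte first).
func int16Samples(fromLittleEndian bytes: [UInt8]) -> [Int16] {
    var samples: [Int16] = []
    samples.reserveCapacity(bytes.count / 2)
    // Step through the bytes two at a time; a trailing odd byte is ignored
    var index = 0
    while index + 1 < bytes.count {
        let low = UInt16(bytes[index])
        let high = UInt16(bytes[index + 1])
        // Shift the high byte up and reinterpret the 16 bits as a signed value
        samples.append(Int16(bitPattern: (high << 8) | low))
        index += 2
    }
    return samples
}

// Four bytes become two samples: 0x7FFF and 0x8000
let decoded = int16Samples(fromLittleEndian: [0xFF, 0x7F, 0x00, 0x80])
print(decoded) // [32767, -32768]
```

This is also why 2048 bytes yield 1024 samples: every sample consumes exactly two bytes.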

Now that we have an Array of Int16, we can convert it to an Array of Float:

//Convert to Float Array
let factor = Float(Int16.max)
var floats: [Float] = Array(repeating: 0.0, count: samples.count)
for i in 0..<samples.count {
    floats[i] = Float(samples[i]) / factor
}
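
If all you need is an overall level meter rather than a spectrum, the normalised floats are already enough. The following is a minimal sketch, not taken from any library: rmsLevel and decibels are hypothetical helper names, and the -160 dB floor is an arbitrary choice to avoid taking the logarithm of zero.

```swift
import Foundation

/// Root-mean-square level of normalised samples (roughly -1.0...1.0).
func rmsLevel(_ samples: [Float]) -> Float {
    guard !samples.isEmpty else { return 0 }
    let sumOfSquares = samples.reduce(Float(0)) { $0 + $1 * $1 }
    return (sumOfSquares / Float(samples.count)).squareRoot()
}

/// Converts a linear amplitude to decibels relative to full scale (dBFS).
/// minimumDB is an assumed floor that stands in for log10(0) = -infinity.
func decibels(_ amplitude: Float, minimumDB: Float = -160) -> Float {
    guard amplitude > 0 else { return minimumDB }
    return max(20 * log10(amplitude), minimumDB)
}

let halfScale: [Float] = [0.5, -0.5, 0.5, -0.5]
let level = rmsLevel(halfScale)   // 0.5
let levelDB = decibels(level)     // about -6.02 dBFS
print(level, levelDB)
```

A full-scale signal sits at 0 dBFS and quieter signals are negative, so a meter usually maps some range such as -60...0 dB onto its height.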

With the Float array in hand, we can use the Accelerate framework's math functions to convert the raw Float values into meaningful ones such as magnitude and decibels. Links to documentation:

Apple's Accelerate Framework

Fast Fourier Transform (FFT)
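
To get a feel for what an FFT returns before diving into those docs, here is a deliberately naive discrete Fourier transform in plain Swift. It is not the Accelerate API and is far too slow for real-time use (O(n²) instead of O(n log n)); it is only meant to show that each output bin holds the strength of one frequency. dftMagnitudes is a hypothetical name, and the scaling by n/2 is one convention among several.

```swift
import Foundation

/// Naive discrete Fourier transform, O(n²): returns the magnitude of each
/// bin from 0 up to (but not including) the Nyquist bin at n/2.
func dftMagnitudes(_ samples: [Float]) -> [Float] {
    let n = samples.count
    var magnitudes = [Float](repeating: 0, count: n / 2)
    for k in 0..<(n / 2) {
        var real: Float = 0
        var imaginary: Float = 0
        for (t, sample) in samples.enumerated() {
            let angle = 2 * Float.pi * Float(k * t) / Float(n)
            real += sample * cos(angle)
            imaginary -= sample * sin(angle)
        }
        // Scale by n/2 so a full-scale sine reports a magnitude near 1
        let power = real * real + imaginary * imaginary
        magnitudes[k] = power.squareRoot() / Float(n / 2)
    }
    return magnitudes
}

// A sine wave completing 4 cycles over 64 samples should peak in bin 4
let cycles = 4
let sine: [Float] = (0..<64).map { t in
    sin(2 * Float.pi * Float(cycles * t) / 64)
}
let spectrum = dftMagnitudes(sine)
let peakBin = spectrum.indices.max { spectrum[$0] < spectrum[$1] }!
print(peakBin) // 4
```

Bin k corresponds to the frequency k * sampleRate / n, so with real audio the peak bin tells you which pitch dominates the buffer.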

I found Apple's documentation rather overwhelming. Luckily, I found a really good example online which I was able to re-purpose for my needs, called TempiFFT. Implementation as follows:

//Initiate FFT
let fft = TempiFFT(withSize: numFrames, sampleRate: 44100.0)
fft.windowType = TempiFFTWindowType.hanning

//Pass array of Floats
fft.fftForward(floats)

//I only want to display 20 bands on my analyzer
fft.calculateLinearBands(minFrequency: 0, maxFrequency: fft.nyquistFrequency, numberOfBands: 20)

//Then use a loop to iterate through the bands in your spectrum analyzer
var magnitudeArr = [Float](repeating: Float(0), count: 20)
var magnitudeDBArr = [Float](repeating: Float(0), count: 20)
for i in 0..<20
{
    magnitudeArr[i] = fft.magnitudeAtBand(i)
    magnitudeDBArr[i] = TempiFFT.toDB(fft.magnitudeAtBand(i))
    //..I didn't, but you could perform drawing functions here...
}
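
For labelling the bars, it helps to know which frequencies each band covers. The sketch below assumes the bands simply split the range from minFrequency to maxFrequency into equal widths, which is my reading of calculateLinearBands; how TempiFFT assigns individual FFT bins at the band edges may differ. bandRanges is a hypothetical helper, and the 1024-frame buffer size is the value from my case above.

```swift
import Foundation

/// Hypothetical helper: splits minFrequency...maxFrequency into equal-width
/// bands and returns each band's lower and upper edge in Hz.
func bandRanges(minFrequency: Float, maxFrequency: Float, numberOfBands: Int) -> [(low: Float, high: Float)] {
    guard numberOfBands > 0 else { return [] }
    let width = (maxFrequency - minFrequency) / Float(numberOfBands)
    return (0..<numberOfBands).map { band -> (low: Float, high: Float) in
        let low = minFrequency + Float(band) * width
        return (low: low, high: low + width)
    }
}

let sampleRate: Float = 44100
let nyquist = sampleRate / 2       // 22050 Hz, the highest representable frequency
let binWidth = sampleRate / 1024   // about 43.07 Hz per FFT bin for 1024 frames
let bands = bandRanges(minFrequency: 0, maxFrequency: nyquist, numberOfBands: 20)

// First band spans 0...1102.5 Hz, last band spans 20947.5...22050 Hz
print(bands[0], bands[19], binWidth)
```

With 20 linear bands each bar covers about 1.1 kHz, so most musical content lands in the first few bars; that is the usual argument for trying logarithmic bands instead.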

Other useful references:

Converting Data into Array of Int16

Converting Array of Int16 to Array of Float