
How to read VBR audio in novacaine (as opposed to PCM)

Posted 2019-03-06 03:40

Question:

The creator of novacaine offered example code in which audio data is read from a file and fed to a ring buffer. When the file reader is created, though, the output is forced to be PCM:

- (id)initWithAudioFileURL:(NSURL *)urlToAudioFile samplingRate:(float)thisSamplingRate numChannels:(UInt32)thisNumChannels
{

...

    // We're going to impose a format upon the input file
    // Single-channel float does the trick.
    _outputFormat.mSampleRate = self.samplingRate;
    _outputFormat.mFormatID = kAudioFormatLinearPCM;
    _outputFormat.mFormatFlags = kAudioFormatFlagIsFloat;
    _outputFormat.mBytesPerPacket = 4*self.numChannels;
    _outputFormat.mFramesPerPacket = 1;
    _outputFormat.mBytesPerFrame = 4*self.numChannels;
    _outputFormat.mChannelsPerFrame = self.numChannels;
    _outputFormat.mBitsPerChannel = 32;

}
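
To make the arithmetic behind those fields explicit, here is a minimal sketch in plain C. PcmFormatSketch and make_interleaved_float_format are hypothetical names, a stand-in for a subset of the AudioStreamBasicDescription fields so that the sketch compiles without the CoreAudio headers. It assumes interleaved 32-bit float samples, which is what the 4*numChannels values above imply.

```c
#include <stdint.h>

/* Hypothetical stand-in for a subset of AudioStreamBasicDescription,
   so the sketch compiles without CoreAudio headers. */
typedef struct {
    double   sampleRate;
    uint32_t bytesPerPacket;
    uint32_t framesPerPacket;
    uint32_t bytesPerFrame;
    uint32_t channelsPerFrame;
    uint32_t bitsPerChannel;
} PcmFormatSketch;

/* Derive the size fields for interleaved 32-bit float PCM. */
PcmFormatSketch make_interleaved_float_format(double sampleRate, uint32_t channels)
{
    PcmFormatSketch f;
    f.sampleRate       = sampleRate;
    f.bitsPerChannel   = 32;                                  /* one float per sample */
    f.channelsPerFrame = channels;
    f.bytesPerFrame    = (f.bitsPerChannel / 8) * channels;   /* 4 * channels */
    f.framesPerPacket  = 1;                                   /* uncompressed: one frame per packet */
    f.bytesPerPacket   = f.bytesPerFrame * f.framesPerPacket; /* also 4 * channels */
    return f;
}
```

For stereo at 44100 Hz this gives 8 bytes per frame and 8 bytes per packet, so every packet has the same, known size.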

I'm trying to contribute to the novacaine project by allowing it to:

  • Read from the iPod library (which can only be accessed via AVAssetReader, rather than the audio file services library)
  • Read and write VBR packets rather than PCM.
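
To illustrate what changes with VBR packets, here is a rough sketch in plain C. As far as I understand Core Audio, a VBR stream has no fixed bytes-per-packet value, so each packet comes with a description that records its offset and size. PacketDescSketch is a hypothetical stand-in that mirrors the fields of AudioStreamPacketDescription (mStartOffset, mVariableFramesInPacket, mDataByteSize), and the two helper functions are assumptions made up for this example, not part of novacaine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in mirroring AudioStreamPacketDescription,
   so the sketch compiles without Apple headers. */
typedef struct {
    int64_t  startOffset;            /* byte offset of the packet in its buffer */
    uint32_t variableFramesInPacket; /* 0 when the format has a constant frame count */
    uint32_t dataByteSize;           /* size of this packet in bytes */
} PacketDescSketch;

/* Sum the payload size of all packets; with VBR this cannot be
   computed as count * bytesPerPacket. */
size_t total_packet_bytes(const PacketDescSketch *descs, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += descs[i].dataByteSize;
    }
    return total;
}

/* Count decoded frames: use the per-packet value when it is set,
   otherwise fall back to the format's constant framesPerPacket. */
uint64_t total_frames(const PacketDescSketch *descs, size_t count,
                      uint32_t framesPerPacket)
{
    uint64_t frames = 0;
    for (size_t i = 0; i < count; ++i) {
        frames += descs[i].variableFramesInPacket
                      ? descs[i].variableFramesInPacket
                      : framesPerPacket;
    }
    return frames;
}
```

The point of the sketch is that any code which currently assumes a fixed number of bytes per packet would have to carry these descriptions alongside the audio bytes.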

Here is my equivalent of the function above (see the NOTE: comments):

- (id)initWithAudioAssetURL:(NSURL *)urlToAsset samplingRate:(float)thisSamplingRate numChannels:(UInt32)thisNumChannels
{
    self = [super init];
    if (self)
    {

        // Zero-out our timer, so we know we're not using our callback yet
        self.callbackTimer = nil;

        // Open a reference to the audio Asset Track and setup the reader
        self.assetURL = urlToAsset;
        AVURLAsset *songAsset = [AVURLAsset URLAssetWithURL:self.assetURL options:nil];
        NSError * error = nil;
        AVAssetReader* reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];
        AVAssetTrack* track = [songAsset.tracks objectAtIndex:0];

        //NOTE: we use the track's native settings here, as opposed to forcing it to be PCM
        //like the example above
        _readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:track
                                                                  outputSettings:NULL];

        _nativeTrackASBD = [self getTrackNativeSettings:track];

        [reader addOutput:_readerOutput];
        [reader startReading];


        // Set a few defaults and presets
        self.samplingRate = thisSamplingRate;
        self.numChannels = thisNumChannels;
        self.latency = .011609977; // 512 samples / ( 44100 samples / sec ) default

        // Arbitrary buffer sizes that don't matter so much as long as they're "big enough"
        self.outputBufferSize = 100000; //buffer sample sizes vary around 60-70k, we keep it @ 100k to be safe
        self.numSamplesReadPerPacket = 8192;
        self.desiredPrebufferedSamples = self.numSamplesReadPerPacket*2;

        //NOTE: these buffers are float, whereas the audio read from the asset is in SInt16
        self.outputBuffer = (float *)calloc(2*self.samplingRate, sizeof(float));
        self.holdingBuffer = (float *)calloc(2*self.samplingRate, sizeof(float));


        // Allocate a ring buffer (this is what's going to buffer our audio)
        ringBuffer = new RingBuffer(self.outputBufferSize, self.numChannels);


        // Fill up the buffers, so we're ready to play immediately
        [self bufferNewAudioFromAsset];

    }
    return self;
}
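
The constants in that initializer can be checked with a little arithmetic. The sketch below is plain C with hypothetical helper names (latency_seconds, scratch_buffer_floats, scratch_buffer_seconds). It assumes the float buffers hold interleaved samples, which the code above does not state explicitly.

```c
#include <stddef.h>

/* Duration of a block of frames, e.g. 512 frames at 44100 Hz. */
double latency_seconds(unsigned frames, double sampleRate)
{
    return frames / sampleRate;
}

/* Number of floats allocated by calloc(2*samplingRate, sizeof(float)). */
size_t scratch_buffer_floats(double sampleRate)
{
    return (size_t)(2.0 * sampleRate);
}

/* How much interleaved audio that allocation can hold, in seconds. */
double scratch_buffer_seconds(double sampleRate, unsigned channels)
{
    return (double)scratch_buffer_floats(sampleRate) / (sampleRate * channels);
}
```

At 44100 Hz the latency value works out to about 0.0116 s, and the 88200-float buffers hold two seconds of mono or one second of stereo audio.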

Looking at the code, everything seems to be done in float (the audio buffers, the output format, etc.). Is there a reason for this? Keep in mind that the iOS canonical audio format is SInt16, not float. For example, see this branch of the Novocaine::renderCallback function:

else if ( sm.numBytesPerSample == 2 ) // then we need to convert SInt16 -> Float (and also scale)
{
    float scale = (float)INT16_MAX;
    vDSP_vsmul(sm.outData, 1, &scale, sm.outData, 1, inNumberFrames*sm.numOutputChannels);

    for (int iBuffer=0; iBuffer < ioData->mNumberBuffers; ++iBuffer) {  

        int thisNumChannels = ioData->mBuffers[iBuffer].mNumberChannels;

        for (int iChannel = 0; iChannel < thisNumChannels; ++iChannel) {
            vDSP_vfix16(sm.outData+iChannel, sm.numOutputChannels, (SInt16 *)ioData->mBuffers[iBuffer].mData+iChannel, thisNumChannels, inNumberFrames);
        }
    }

}
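
For readers without the Accelerate framework at hand, the scale-then-convert step above can be approximated in portable C. This is only a sketch: float_to_sint16_interleaved is a hypothetical name, it writes interleaved output instead of striding per channel into separate buffers, and the clamping is my own addition for out-of-range input, not something the snippet above does explicitly.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert interleaved float samples in [-1, 1] to interleaved SInt16. */
void float_to_sint16_interleaved(const float *in, int16_t *out,
                                 size_t frames, size_t channels)
{
    const float scale = (float)INT16_MAX;
    for (size_t i = 0; i < frames * channels; ++i) {
        float s = in[i] * scale;                       /* the vDSP_vsmul step */
        if (s > (float)INT16_MAX) s = (float)INT16_MAX; /* clamp (added here) */
        if (s < (float)INT16_MIN) s = (float)INT16_MIN;
        out[i] = (int16_t)s;                           /* truncates toward zero */
    }
}
```

Calling it with 3 frames of 2 channels converts 6 samples. A full-scale input of 1.0f maps to 32767 and 0.5f maps to 16383.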

What do I need to change to make this library compatible with reading and writing VBR data?
