How to generate audio wave form programmatically w

2020-05-15 00:29发布

问题:

How to generate audio wave form programmatically while recording Voice in iOS?

m working on voice modulation audio frequency in iOS... everything is working fine ...just need some best simple way to generate audio wave form on detection noise...

Please dont refer me the code tutorials of...speakhere and auriotouch... i need some best suggestions from native app developers.

I have recorded the audio and i made it play after recording . I have created waveform and attached screenshot . But it has to been drawn in the view as audio recording in progress

-(UIImage *) audioImageGraph:(SInt16 *) samples
                normalizeMax:(SInt16) normalizeMax
                 sampleCount:(NSInteger) sampleCount
                channelCount:(NSInteger) channelCount
                 imageHeight:(float) imageHeight {

    CGSize imageSize = CGSizeMake(sampleCount, imageHeight);
    UIGraphicsBeginImageContext(imageSize);
    CGContextRef context = UIGraphicsGetCurrentContext();

    CGContextSetFillColorWithColor(context, [UIColor blackColor].CGColor);
    CGContextSetAlpha(context,1.0);
    CGRect rect;
    rect.size = imageSize;
    rect.origin.x = 0;
    rect.origin.y = 0;

    CGColorRef leftcolor = [[UIColor whiteColor] CGColor];
    CGColorRef rightcolor = [[UIColor redColor] CGColor];

    CGContextFillRect(context, rect);

    CGContextSetLineWidth(context, 1.0);

    float halfGraphHeight = (imageHeight / 2) / (float) channelCount ;
    float centerLeft = halfGraphHeight;
    float centerRight = (halfGraphHeight*3) ;
    float sampleAdjustmentFactor = (imageHeight/ (float) channelCount) / (float) normalizeMax;

    for (NSInteger intSample = 0 ; intSample < sampleCount ; intSample ++ ) {
        SInt16 left = *samples++;
        float pixels = (float) left;
        pixels *= sampleAdjustmentFactor;
        CGContextMoveToPoint(context, intSample, centerLeft-pixels);
        CGContextAddLineToPoint(context, intSample, centerLeft+pixels);
        CGContextSetStrokeColorWithColor(context, leftcolor);
        CGContextStrokePath(context);

        if (channelCount==2) {
            SInt16 right = *samples++;
            float pixels = (float) right;
            pixels *= sampleAdjustmentFactor;
            CGContextMoveToPoint(context, intSample, centerRight - pixels);
            CGContextAddLineToPoint(context, intSample, centerRight + pixels);
            CGContextSetStrokeColorWithColor(context, rightcolor);
            CGContextStrokePath(context);
        }
    }

    // Create new image
    UIImage *newImage = UIGraphicsGetImageFromCurrentImageContext();

    // Tidy up
    UIGraphicsEndImageContext();

    return newImage;
}

Next a method that takes a AVURLAsset, and returns PNG Data

- (NSData *) renderPNGAudioPictogramForAssett:(AVURLAsset *)songAsset {

    NSError * error = nil;


    AVAssetReader * reader = [[AVAssetReader alloc] initWithAsset:songAsset error:&error];

    AVAssetTrack * songTrack = [songAsset.tracks objectAtIndex:0];

    NSDictionary* outputSettingsDict = [[NSDictionary alloc] initWithObjectsAndKeys:

                                        [NSNumber numberWithInt:kAudioFormatLinearPCM],AVFormatIDKey,
                                        //     [NSNumber numberWithInt:44100.0],AVSampleRateKey, /*Not Supported*/
                                        //     [NSNumber numberWithInt: 2],AVNumberOfChannelsKey,    /*Not Supported*/

                                        [NSNumber numberWithInt:16],AVLinearPCMBitDepthKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsBigEndianKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                        [NSNumber numberWithBool:NO],AVLinearPCMIsNonInterleaved,

                                        nil];


    AVAssetReaderTrackOutput* output = [[AVAssetReaderTrackOutput alloc] initWithTrack:songTrack outputSettings:outputSettingsDict];

    [reader addOutput:output];
    [output release];

    UInt32 sampleRate,channelCount;

    NSArray* formatDesc = songTrack.formatDescriptions;
    for(unsigned int i = 0; i < [formatDesc count]; ++i) {
        CMAudioFormatDescriptionRef item = (CMAudioFormatDescriptionRef)[formatDesc objectAtIndex:i];
        const AudioStreamBasicDescription* fmtDesc = CMAudioFormatDescriptionGetStreamBasicDescription (item);
        if(fmtDesc ) {

            sampleRate = fmtDesc->mSampleRate;
            channelCount = fmtDesc->mChannelsPerFrame;

            //    NSLog(@"channels:%u, bytes/packet: %u, sampleRate %f",fmtDesc->mChannelsPerFrame, fmtDesc->mBytesPerPacket,fmtDesc->mSampleRate);
        }
    }


    UInt32 bytesPerSample = 2 * channelCount;
    SInt16 normalizeMax = 0;

    NSMutableData * fullSongData = [[NSMutableData alloc] init];
    [reader startReading];


    UInt64 totalBytes = 0;


    SInt64 totalLeft = 0;
    SInt64 totalRight = 0;
    NSInteger sampleTally = 0;

    NSInteger samplesPerPixel = sampleRate / 50;


    while (reader.status == AVAssetReaderStatusReading){

        AVAssetReaderTrackOutput * trackOutput = (AVAssetReaderTrackOutput *)[reader.outputs objectAtIndex:0];
        CMSampleBufferRef sampleBufferRef = [trackOutput copyNextSampleBuffer];

        if (sampleBufferRef){
            CMBlockBufferRef blockBufferRef = CMSampleBufferGetDataBuffer(sampleBufferRef);

            size_t length = CMBlockBufferGetDataLength(blockBufferRef);
            totalBytes += length;


            NSAutoreleasePool *wader = [[NSAutoreleasePool alloc] init];

            NSMutableData * data = [NSMutableData dataWithLength:length];
            CMBlockBufferCopyDataBytes(blockBufferRef, 0, length, data.mutableBytes);


            SInt16 * samples = (SInt16 *) data.mutableBytes;
            int sampleCount = length / bytesPerSample;
            for (int i = 0; i < sampleCount ; i ++) {

                SInt16 left = *samples++;

                totalLeft  += left;



                SInt16 right;
                if (channelCount==2) {
                    right = *samples++;

                    totalRight += right;
                }

                sampleTally++;

                if (sampleTally > samplesPerPixel) {

                    left  = totalLeft / sampleTally;

                    SInt16 fix = abs(left);
                    if (fix > normalizeMax) {
                        normalizeMax = fix;
                    }


                    [fullSongData appendBytes:&left length:sizeof(left)];

                    if (channelCount==2) {
                        right = totalRight / sampleTally;


                        SInt16 fix = abs(right);
                        if (fix > normalizeMax) {
                            normalizeMax = fix;
                        }


                        [fullSongData appendBytes:&right length:sizeof(right)];
                    }

                    totalLeft   = 0;
                    totalRight  = 0;
                    sampleTally = 0;

                }
            }



            [wader drain];


            CMSampleBufferInvalidate(sampleBufferRef);

            CFRelease(sampleBufferRef);
        }
    }


    NSData * finalData = nil;

    if (reader.status == AVAssetReaderStatusFailed || reader.status == AVAssetReaderStatusUnknown){
        // Something went wrong. return nil

        return nil;
    }

    if (reader.status == AVAssetReaderStatusCompleted){

        NSLog(@"rendering output graphics using normalizeMax %d",normalizeMax);

        UIImage *test = [self audioImageGraph:(SInt16 *)
                         fullSongData.bytes
                                 normalizeMax:normalizeMax
                                  sampleCount:fullSongData.length / 4
                                 channelCount:2
                                  imageHeight:100];

        finalData = imageToData(test);
    }




    [fullSongData release];
    [reader release];

    return finalData;
}

I have

回答1:

If you want real-time graphics derived from mic input, then use the RemoteIO Audio Unit, which is what most native iOS app developers use for low latency audio, and Metal or Open GL for drawing waveforms, which will give you the highest frame rates. You will need completely different code from that provided in your question to do so, as AVAssetRecording, Core Graphic line drawing and png rendering are far far too slow to use.

Update: with iOS 8 and newer, the Metal API may be able to render graphic visualizations with even greater performance than OpenGL.

Uodate 2: Here are some code snippets for recording live audio using Audio Units and drawing bit maps using Metal in Swift 3: https://gist.github.com/hotpaw2/f108a3c785c7287293d7e1e81390c20b



回答2:

You should check out EZAudio (https://github.com/syedhali/EZAudio), specifically the EZRecorder and the EZAudioPlot (or GPU-accelerated EZAudioPlotGL).

There is also an example project that does exactly what you want, https://github.com/syedhali/EZAudio/tree/master/EZAudioExamples/iOS/EZAudioRecordExample

EDIT: Here's the code inline

/// In your interface

/**
 Use a OpenGL based plot to visualize the data coming in
 */
@property (nonatomic,weak) IBOutlet EZAudioPlotGL *audioPlot;
/**
 The microphone component
 */
@property (nonatomic,strong) EZMicrophone *microphone;
/**
 The recorder component
 */
@property (nonatomic,strong) EZRecorder *recorder;

...

/// In your implementation

// Create an instance of the microphone and tell it to use this view controller instance as the delegate
-(void)viewDidLoad {
    self.microphone = [EZMicrophone microphoneWithDelegate:self startsImmediately:YES];
}

// EZMicrophoneDelegate will provide these callbacks
-(void)microphone:(EZMicrophone *)microphone
 hasAudioReceived:(float **)buffer
   withBufferSize:(UInt32)bufferSize
withNumberOfChannels:(UInt32)numberOfChannels {
  dispatch_async(dispatch_get_main_queue(),^{
    // Updates the audio plot with the waveform data
    [self.audioPlot updateBuffer:buffer[0] withBufferSize:bufferSize];
  });
}

-(void)microphone:(EZMicrophone *)microphone hasAudioStreamBasicDescription:(AudioStreamBasicDescription)audioStreamBasicDescription {
  // The AudioStreamBasicDescription of the microphone stream. This is useful when configuring the EZRecorder or telling another component what audio format type to expect.

  // We can initialize the recorder with this ASBD
  self.recorder = [EZRecorder recorderWithDestinationURL:[self testFilePathURL]
                                         andSourceFormat:audioStreamBasicDescription];

}

-(void)microphone:(EZMicrophone *)microphone
    hasBufferList:(AudioBufferList *)bufferList
   withBufferSize:(UInt32)bufferSize
withNumberOfChannels:(UInt32)numberOfChannels {

  // Getting audio data as a buffer list that can be directly fed into the EZRecorder. This is happening on the audio thread - any UI updating needs a GCD main queue block. This will keep appending data to the tail of the audio file.
  if( self.isRecording ){
    [self.recorder appendDataFromBufferList:bufferList
                             withBufferSize:bufferSize];
  }

}


回答3:

I was searching the same thing. (Making wave from the data of the audio recorder). I found some library that might be helpful and worth to check the code to understand the logic behind this.

The calculation is all based with sin and mathematic formula. This is much simple if you take a look to the code!

https://github.com/stefanceriu/SCSiriWaveformView

or

https://github.com/raffael/SISinusWaveView

This is only few examples that you can find on the web.