iOS AVFoundation audio/video out of sync

Posted 2020-06-03 08:40

Question:

The Problem:

On every playback, the audio lags the video by 1 to 2 seconds.


The Setup:

The assets are loaded as AVURLAsset instances from a media stream.

To build the composition, I'm using an AVMutableComposition with AVMutableCompositionTrack objects whose timescales differ. The audio and video are both streamed to the device. The timescale for audio is 44100; the timescale for video is 600.
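For readers unfamiliar with timescales, here is a minimal sketch of how a duration expressed in one timescale maps to another. It is plain C rather than Core Media: `RationalTime`, `rt_seconds`, and `rt_convert` are hypothetical names that only model the value/timescale idea behind CMTime, and the sketch assumes non-negative values and positive timescales. It is meant to show that mismatched timescales alone do not change a duration, not to diagnose the delay.

```c
#include <stdint.h>

/* Hypothetical stand-in for a CMTime-like rational time:
   seconds = value / timescale. Not the Core Media type. */
typedef struct {
    int64_t value;
    int32_t timescale;
} RationalTime;

/* Duration in seconds represented by t. */
static double rt_seconds(RationalTime t) {
    return (double)t.value / (double)t.timescale;
}

/* Re-express t in new_scale, rounding to the nearest tick.
   Assumes t.value >= 0 and both timescales > 0; very large values
   could overflow the intermediate product. */
static RationalTime rt_convert(RationalTime t, int32_t new_scale) {
    int64_t scaled = t.value * (int64_t)new_scale;
    RationalTime out;
    out.value = (scaled + t.timescale / 2) / t.timescale;
    out.timescale = new_scale;
    return out;
}
```

For instance, a video duration of 900 ticks at timescale 600 is 1.5 seconds; converting it with `rt_convert` to timescale 44100 gives 66150 ticks, which is still 1.5 seconds.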

The playback is done with AVPlayer.


Attempted Solutions:

  • Using videoAssetTrack.timeRange for [composition insertTimeRange].
  • Using CMTimeRangeMake(kCMTimeZero, videoAssetTrack.duration);
  • Using CMTimeRangeMake(kCMTimeZero, videoAssetTrack.timeRange.duration);

The Code:

+(AVMutableComposition*)overlayAudio:(AVURLAsset*)audioAsset
                          withVideo:(AVURLAsset*)videoAsset
{
    AVMutableComposition* mixComposition = [AVMutableComposition composition];

    AVAssetTrack* audioTrack = [self getTrackFromAsset:audioAsset withMediaType:AVMediaTypeAudio];
    AVAssetTrack* videoTrack = [self getTrackFromAsset:videoAsset withMediaType:AVMediaTypeVideo];
    CMTime duration = videoTrack.timeRange.duration;

    AVMutableCompositionTrack* audioComposition = [self composeTrack:audioTrack withComposition:mixComposition andDuration:duration andMedia:AVMediaTypeAudio];
    AVMutableCompositionTrack* videoComposition = [self composeTrack:videoTrack withComposition:mixComposition andDuration:duration andMedia:AVMediaTypeVideo];
    [self makeAssertionAgainstAudio:audioComposition andVideo:videoComposition];
    return mixComposition;
}

+(AVAssetTrack*)getTrackFromAsset:(AVURLAsset*)asset withMediaType:(NSString*)mediaType
{
    // firstObject returns nil instead of throwing when the asset has no track of this type.
    return [[asset tracksWithMediaType:mediaType] firstObject];
}

+(AVAssetExportSession*)configureExportSessionWithAsset:(AVMutableComposition*)composition toUrl:(NSURL*)url
{
    AVAssetExportSession* exportSession = [[AVAssetExportSession alloc] initWithAsset:composition presetName:AVAssetExportPresetHighestQuality];
    exportSession.outputFileType = @"com.apple.quicktime-movie";
    exportSession.outputURL = url;
    exportSession.shouldOptimizeForNetworkUse = YES;

    return exportSession;
}

-(IBAction)playVideo
{
    [avPlayer pause];
    avPlayerItem = [AVPlayerItem playerItemWithAsset:mixComposition];
    avPlayer = [[AVPlayer alloc] initWithPlayerItem:avPlayerItem];

    avPlayerLayer = [AVPlayerLayer playerLayerWithPlayer:avPlayer];
    [avPlayerLayer setFrame:CGRectMake(0, 0, 305, 283)];
    [avPlayerLayer setVideoGravity:AVLayerVideoGravityResizeAspectFill];
    [playerView.layer addSublayer:avPlayerLayer];

    [avPlayer seekToTime:kCMTimeZero];
    [avPlayer play];
}

Comments:

I don't understand much of the AVFoundation framework. It is entirely possible that I am simply misusing the snippets I have provided (e.g. why "insertTimeRange" for a composition?).

I can provide any other information needed to resolve this, including debug asset track property values, network telemetry, streaming information, etc.

Answer 1:

If the offset is consistent, it appears there is an enforced delay needed to sample the audio properly. Apple's guides are usually easier to read than their accompanying books; here is the specific technical note on the delay:

https://developer.apple.com/library/ios/technotes/tn2258/_index.html

The programming guides explain what the delay is and why it exists.