I'm trying to reverse audio on iOS with AVAsset and AVAssetWriter.
The following code works, but the output file is shorter than the input.
For example, the input file has a duration of 1:59, but the output is 1:50 with the same audio content.
- (void)reverse:(AVAsset *)asset
{
AVAssetReader* reader = [[AVAssetReader alloc] initWithAsset:asset error:nil];
AVAssetTrack* audioTrack = [[asset tracksWithMediaType:AVMediaTypeAudio] objectAtIndex:0];
NSMutableDictionary* audioReadSettings = [NSMutableDictionary dictionary];
[audioReadSettings setValue:[NSNumber numberWithInt:kAudioFormatLinearPCM]
forKey:AVFormatIDKey];
AVAssetReaderTrackOutput* readerOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack outputSettings:audioReadSettings];
[reader addOutput:readerOutput];
[reader startReading];
NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
[NSNumber numberWithInt: kAudioFormatMPEG4AAC], AVFormatIDKey,
[NSNumber numberWithFloat:44100.0], AVSampleRateKey,
[NSNumber numberWithInt:2], AVNumberOfChannelsKey,
[NSNumber numberWithInt:128000], AVEncoderBitRateKey,
[NSData data], AVChannelLayoutKey,
nil];
AVAssetWriterInput *writerInput = [[AVAssetWriterInput alloc] initWithMediaType:AVMediaTypeAudio
outputSettings:outputSettings];
NSString *exportPath = [NSTemporaryDirectory() stringByAppendingPathComponent:@"out.m4a"];
NSURL *exportURL = [NSURL fileURLWithPath:exportPath];
NSError *writerError = nil;
AVAssetWriter *writer = [[AVAssetWriter alloc] initWithURL:exportURL
fileType:AVFileTypeAppleM4A
error:&writerError];
[writerInput setExpectsMediaDataInRealTime:NO];
[writer addInput:writerInput];
[writer startWriting];
[writer startSessionAtSourceTime:kCMTimeZero];
CMSampleBufferRef sample = [readerOutput copyNextSampleBuffer];
NSMutableArray *samples = [[NSMutableArray alloc] init];
while (sample != NULL) {
sample = [readerOutput copyNextSampleBuffer];
if (sample == NULL)
continue;
[samples addObject:(__bridge id)(sample)];
CFRelease(sample);
}
NSArray* reversedSamples = [[samples reverseObjectEnumerator] allObjects];
for (id reversedSample in reversedSamples) {
if (writerInput.readyForMoreMediaData) {
[writerInput appendSampleBuffer:(__bridge CMSampleBufferRef)(reversedSample)];
}
else {
[NSThread sleepForTimeInterval:0.05];
}
}
[writerInput markAsFinished];
dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_HIGH, 0);
dispatch_async(queue, ^{
[writer finishWriting];
});
}
UPDATE:
If I write the samples directly in the first while loop, everything is OK (even with the writerInput.readyForMoreMediaData check). In this case the resulting file has exactly the same duration as the original. But if I write the same samples from the reversed NSArray, the result is shorter.
Print out the size of each buffer in number of samples in the "reading" readerOutput while loop, and again in the "writing" writerInput for loop. This way you can see all the buffer sizes and check whether they add up.
For example, are you missing or skipping a buffer? If writerInput.readyForMoreMediaData is false, you "sleep" but then proceed to the next reversedSample in reversedSamples, so that buffer is effectively dropped from the writerInput.
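The effect of skipping a buffer when the input is not ready can be sketched without AVFoundation. This is only an illustration: MockWriterInput is a hypothetical stand-in for AVAssetWriterInput that reports "not ready" on every third poll, and the buffer sizes are made-up frame counts.

```swift
// Hypothetical stand-in for AVAssetWriterInput: "not ready" on every third poll.
final class MockWriterInput {
    private var polls = 0
    private(set) var framesWritten = 0

    var isReadyForMoreMediaData: Bool {
        polls += 1
        return polls % 3 != 0
    }

    func append(frameCount: Int) {
        framesWritten += frameCount
    }
}

// Buffer sizes in frames, e.g. as reported by CMSampleBufferGetNumSamples.
let bufferSizes = [8192, 8192, 8192, 8192, 5056]
let framesRead = bufferSizes.reduce(0, +)

// Pattern from the question: when the input is not ready, sleep and move on.
let skipping = MockWriterInput()
for size in bufferSizes {
    if skipping.isReadyForMoreMediaData {
        skipping.append(frameCount: size)
    }
    // else: the real code sleeps here, then the loop advances and the buffer is lost
}

// Alternative: poll until the input is ready, then append the same buffer.
let waiting = MockWriterInput()
for size in bufferSizes {
    while !waiting.isReadyForMoreMediaData {
        // the real code would sleep briefly here
    }
    waiting.append(frameCount: size)
}

print("read \(framesRead), skipping wrote \(skipping.framesWritten), waiting wrote \(waiting.framesWritten)")
```

If the totals differ in your real logs the same way, buffers are being dropped in the write loop.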
UPDATE (based on comments):
I found two problems in the code:
- The output settings are incorrect: the input file is mono (1 channel), but the output settings are configured for 2 channels. It should be:
[NSNumber numberWithInt:1], AVNumberOfChannelsKey
Look at the info on the output and input files:
- The second problem is that you are reversing the order of 643 buffers of 8192 audio samples, instead of reversing the order of the individual audio samples. To see each buffer, I changed your debugging from looking at the size of each sample to looking at the number of samples in each buffer, which is 8192. So line 76 is now:
size_t sampleSize = CMSampleBufferGetNumSamples(sample);
The output looks like:
2015-03-19 22:26:28.171 audioReverse[25012:4901250] Reading [0]: 8192
2015-03-19 22:26:28.172 audioReverse[25012:4901250] Reading [1]: 8192
...
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [640]: 8192
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [641]: 8192
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Reading [642]: 5056
2015-03-19 22:26:28.651 audioReverse[25012:4901250] Writing [0]: 5056
2015-03-19 22:26:28.652 audioReverse[25012:4901250] Writing [1]: 8192
...
2015-03-19 22:26:29.134 audioReverse[25012:4901250] Writing [640]: 8192
2015-03-19 22:26:29.135 audioReverse[25012:4901250] Writing [641]: 8192
2015-03-19 22:26:29.135 audioReverse[25012:4901250] Writing [642]: 8192
This shows that you're reversing the order of the buffers of 8192 samples, but within each buffer the audio is still "facing forward". We can see this in this screenshot I took of a correctly reversed (sample-by-sample) file versus your buffer reversal:
I think your current scheme can work if you also reverse the samples within each 8192-sample buffer. I personally would not recommend using NSArray enumerators for signal processing, but it can work if you operate at the sample level.
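The difference between the two kinds of reversal can be seen on a toy signal. This is a minimal sketch with plain arrays in place of sample buffers; chunked is a hypothetical helper, and a buffer size of 4 stands in for 8192.

```swift
// Split `samples` into consecutive buffers of at most `size` elements.
func chunked(_ samples: [Int16], size: Int) -> [[Int16]] {
    stride(from: 0, to: samples.count, by: size).map {
        Array(samples[$0..<min($0 + size, samples.count)])
    }
}

let original: [Int16] = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
let buffers = chunked(original, size: 4)   // [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]

// Reversing only the buffer order leaves each buffer "facing forward".
let bufferOrderOnly = Array(buffers.reversed().joined())

// Reversing the buffer order and the contents of each buffer.
let fullyReversed = Array(buffers.reversed().map { Array($0.reversed()) }.joined())

print(bufferOrderOnly)   // [9, 10, 5, 6, 7, 8, 1, 2, 3, 4]
print(fullyReversed)     // [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
```

Only the second result matches a sample-by-sample reversal of the original.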
It is not sufficient to write the audio samples in the reverse order. The sample data needs to be reversed itself, and its timing information needs to be properly set.
In Swift, we create an extension to AVAsset.
The samples must be processed as decompressed samples. To that end create audio reader settings with kAudioFormatLinearPCM:
let kAudioReaderSettings = [
AVFormatIDKey: Int(kAudioFormatLinearPCM) as AnyObject,
AVLinearPCMBitDepthKey: 16 as AnyObject,
AVLinearPCMIsBigEndianKey: false as AnyObject,
AVLinearPCMIsFloatKey: false as AnyObject,
AVLinearPCMIsNonInterleaved: false as AnyObject]
Use our AVAsset extension method audioReader:
func audioReader(outputSettings: [String : Any]?) -> (audioTrack:AVAssetTrack?, audioReader:AVAssetReader?,audioReaderOutput:AVAssetReaderTrackOutput?)
to create an audioReader (AVAssetReader) and audioReaderOutput (AVAssetReaderTrackOutput) for reading the audio samples.
We need to keep track of the audio samples and the new timing information:
var audioSamples:[CMSampleBuffer] = []
var timingInfos:[CMSampleTimingInfo] = []
Now start reading samples. For each audio sample, obtain its timing information and produce new timing information relative to the end of the audio track (because we will be writing the samples back in reverse order).
In other words, we adjust the presentation times of the samples.
while let sampleBuffer = audioReaderOutput.copyNextSampleBuffer() {
// process sample
}
So to “process sample” we use CMSampleBufferGetSampleTimingInfoArray to get the timingInfo (CMSampleTimingInfo):
var timingInfo = CMSampleTimingInfo()
var timingInfoCount = CMItemCount()
CMSampleBufferGetSampleTimingInfoArray(sampleBuffer, entryCount: 0, arrayToFill: &timingInfo, entriesNeededOut: &timingInfoCount)
Get the presentation time and duration:
let presentationTime = timingInfo.presentationTimeStamp
let duration = CMSampleBufferGetDuration(sampleBuffer)
Calculate the end time for the sample:
let endTime = CMTimeAdd(presentationTime, duration)
And now calculate the new presentation time relative to the end of the track:
let newPresentationTime = CMTimeSubtract(self.duration, endTime)
And use it to set the timingInfo:
timingInfo.presentationTimeStamp = newPresentationTime
Finally, save the audio sample buffer and its timing info. We need both later when we create the reversed sample:
timingInfos.append(timingInfo)
audioSamples.append(sampleBuffer)
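The presentation-time arithmetic is easier to follow with plain frame counts. The following is only a sketch: integers stand in for CMTime values, and the buffer lengths are illustrative.

```swift
// Buffer lengths in frames, in reading order (values are illustrative).
let lengths = [8192, 8192, 5056]
let totalFrames = lengths.reduce(0, +)   // stands in for the track duration

// Original start frame of each buffer: 0, 8192, 16384
var starts: [Int] = []
var cursor = 0
for length in lengths {
    starts.append(cursor)
    cursor += length
}

// newStart = trackDuration - (start + length), the same formula as
// CMTimeSubtract(self.duration, CMTimeAdd(presentationTime, duration))
let newStarts = zip(starts, lengths).map { start, length in
    totalFrames - (start + length)
}
print(newStarts)   // [13248, 5056, 0]
```

The buffer that was read last gets a new start of 0, so it is the first one in the reversed track, and the recomputed starts are contiguous when taken from the last buffer to the first.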
We need an AVAssetWriter:
guard let assetWriter = try? AVAssetWriter(outputURL: destinationURL, fileType: AVFileType.m4a) else {
// error handling
return
}
Now when writing the samples in reverse order with assetWriter they need to be compressed also, and we need settings for that. We also need a ‘source format hint’ and can acquire this from an uncompressed sample buffer:
let sampleBuffer = audioSamples[0]
let sourceFormat = CMSampleBufferGetFormatDescription(sampleBuffer)
let audioCompressionSettings = [AVFormatIDKey: kAudioFormatMPEG4AAC] as [String : Any]
Now we can create the AVAssetWriterInput, add it to the writer and start writing:
let assetWriterInput = AVAssetWriterInput(mediaType: AVMediaType.audio, outputSettings:audioCompressionSettings, sourceFormatHint: sourceFormat)
assetWriter.add(assetWriterInput)
assetWriter.startWriting()
assetWriter.startSession(atSourceTime: CMTime.zero)
Now iterate through the sample buffers in reverse order, and for each one reverse the samples it contains.
We have an extension for CMSampleBuffer that does just that, called ‘reverse’.
In lieu of using requestMediaDataWhenReady, we do this as follows:
let nbrSamples = audioSamples.count
for index in 0..<nbrSamples {
while assetWriterInput.isReadyForMoreMediaData == false {
RunLoop.current.run(until: Date(timeIntervalSinceNow: 0.5))
}
if assetWriterInput.isReadyForMoreMediaData == true {
// process samples in reverse order
let sampleBuffer = audioSamples[nbrSamples - 1 - index]
// each buffer keeps the timing info that was computed from its own position
let timingInfo = timingInfos[nbrSamples - 1 - index]
// reverse samples data - note that it uses the timing info
if let reversedBuffer = sampleBuffer.reverse(timingInfo: [timingInfo]) {
// append data
if assetWriterInput.append(reversedBuffer) == false {
break
}
}
}
}
The last thing to explain is how the ‘reverse’ method reverses an audio sample.
We create an extension to CMSampleBuffer with a method that returns a properly timed, reversed copy of the sample buffer:
func reverse(timingInfo:[CMSampleTimingInfo]) -> CMSampleBuffer?
The data that has to be reversed needs to be obtained using the method:
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer
The CMSampleBuffer header file describes this method as follows:
“Creates an AudioBufferList containing the data from the CMSampleBuffer, and a CMBlockBuffer which references (and manages the lifetime of) the data in that AudioBufferList.”
Call it as follows, where ‘self’ refers to the CMSampleBuffer we are reversing since this is an extension:
var blockBuffer: CMBlockBuffer? = nil
let audioBufferList: UnsafeMutableAudioBufferListPointer = AudioBufferList.allocate(maximumBuffers: 1)
CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer(
self,
bufferListSizeNeededOut: nil,
bufferListOut: audioBufferList.unsafeMutablePointer,
bufferListSize: AudioBufferList.sizeInBytes(maximumBuffers: 1),
blockBufferAllocator: nil,
blockBufferMemoryAllocator: nil,
flags: kCMSampleBufferFlag_AudioBufferList_Assure16ByteAlignment,
blockBufferOut: &blockBuffer
)
Now you can access the raw data as:
// mData is optional, so unwrap it before use
guard let data: UnsafeMutableRawPointer = audioBufferList.unsafePointer.pointee.mBuffers.mData else {
return nil
}
To reverse the data, we access it as an array of Int16 samples called sampleArray. In Swift this is done as follows:
var samples = data.assumingMemoryBound(to: Int16.self)
let sizeofInt16 = MemoryLayout<Int16>.size
let dataSize = audioBufferList.unsafePointer.pointee.mBuffers.mDataByteSize
let dataCount = Int(dataSize) / sizeofInt16
var sampleArray = Array(UnsafeBufferPointer(start: samples, count: dataCount)) as [Int16]
Now reverse the array sampleArray:
sampleArray.reverse()
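One caveat, which only matters if your source is not mono: with interleaved multi-channel PCM, reversing the flat Int16 array also reverses the channel order inside every frame, so left and right end up swapped. A minimal sketch of a frame-wise reversal follows; reverseFrames is a hypothetical helper and not part of the extension described here.

```swift
// Reverse interleaved PCM frame by frame, so the channel order inside each frame is kept.
func reverseFrames(_ samples: [Int16], channelCount: Int) -> [Int16] {
    precondition(channelCount > 0 && samples.count % channelCount == 0,
                 "sample count must be a multiple of the channel count")
    var reversed: [Int16] = []
    reversed.reserveCapacity(samples.count)
    var frameStart = samples.count - channelCount
    while frameStart >= 0 {
        reversed.append(contentsOf: samples[frameStart..<frameStart + channelCount])
        frameStart -= channelCount
    }
    return reversed
}

// Three stereo frames: (L1, R1), (L2, R2), (L3, R3)
let stereo: [Int16] = [1, -1, 2, -2, 3, -3]
print(reverseFrames(stereo, channelCount: 2))   // [3, -3, 2, -2, 1, -1]
print(Array(stereo.reversed()))                 // [-3, 3, -2, 2, -1, 1]
```

For mono input (channelCount of 1) the result is the same as a plain reverse().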
Using the reversed samples we need to create a new CMSampleBuffer that contains the reversed samples and the new timing info which we generated previously while we read the audio samples from the source file.
Now we replace the data in the CMBlockBuffer we previously obtained with CMSampleBufferGetAudioBufferListWithRetainedBlockBuffer.
A pointer to the array's storage is only guaranteed to be valid inside a scoped call such as withUnsafeBytes, so do the replacement there:
let status = sampleArray.withUnsafeBytes { bytes in
CMBlockBufferReplaceDataBytes(with: bytes.baseAddress!, blockBuffer: blockBuffer!, offsetIntoDestination: 0, dataLength: Int(dataSize))
}
guard status == noErr else {
return nil
}
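The pointer handling in this step is worth isolating. A pointer taken from a Swift array is only guaranteed to be valid for the duration of the call it is passed to, so the copy is best done inside a scoped accessor. The sketch below shows that pattern without Core Media: a manually allocated buffer stands in for the block buffer's memory, and copyBytes is a hypothetical helper.

```swift
// Copy the bytes of `samples` into `destination`, which must hold at least
// samples.count * MemoryLayout<Int16>.stride bytes.
func copyBytes(of samples: [Int16], into destination: UnsafeMutableRawPointer) {
    samples.withUnsafeBytes { source in
        guard let base = source.baseAddress else { return }   // empty array
        destination.copyMemory(from: base, byteCount: source.count)
    }
}

let reversedSamples: [Int16] = [4, 3, 2, 1]
let byteCount = reversedSamples.count * MemoryLayout<Int16>.stride
let destination = UnsafeMutableRawPointer.allocate(
    byteCount: byteCount,
    alignment: MemoryLayout<Int16>.alignment)

copyBytes(of: reversedSamples, into: destination)

// Read the bytes back as Int16 values to confirm the copy.
let typed = destination.bindMemory(to: Int16.self, capacity: reversedSamples.count)
let roundTrip = Array(UnsafeBufferPointer(start: typed, count: reversedSamples.count))
destination.deallocate()

print(roundTrip)   // [4, 3, 2, 1]
```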
Finally create the new sample buffer using CMSampleBufferCreate. This function needs two arguments we can get from the original sample buffer, namely the formatDescription and numberOfSamples:
let formatDescription = CMSampleBufferGetFormatDescription(self)
let numberOfSamples = CMSampleBufferGetNumSamples(self)
var newBuffer:CMSampleBuffer?
Now create the new sample buffer with the reversed blockBuffer and most notably the new timing information that was passed as an argument to the function ‘reverse’ that we are defining:
guard CMSampleBufferCreate(allocator: kCFAllocatorDefault, dataBuffer: blockBuffer, dataReady: true, makeDataReadyCallback: nil, refcon: nil, formatDescription: formatDescription, sampleCount: numberOfSamples, sampleTimingEntryCount: timingInfo.count, sampleTimingArray: timingInfo, sampleSizeEntryCount: 0, sampleSizeArray: nil, sampleBufferOut: &newBuffer) == noErr else {
// on failure return nil, consistent with the other error paths
return nil
}
return newBuffer
And that’s all there is to it!
As a final note the Core Audio and AVFoundation headers provide a lot of useful information, such as CoreAudioTypes.h, CMSampleBuffer.h, and many more.
Here is a complete example in Swift 5 that reverses video and audio into the same output asset, with the audio processed using the recommendations above:
private func reverseVideo(inURL: URL, outURL: URL, queue: DispatchQueue, _ completionBlock: ((Bool)->Void)?) {
Log.info("Start reverse video!")
let asset = AVAsset.init(url: inURL)
guard
let reader = try? AVAssetReader.init(asset: asset),
let videoTrack = asset.tracks(withMediaType: .video).first,
let audioTrack = asset.tracks(withMediaType: .audio).first
else {
assert(false)
completionBlock?(false)
return
}
let width = videoTrack.naturalSize.width
let height = videoTrack.naturalSize.height
// Video reader
let readerVideoSettings: [String : Any] = [ String(kCVPixelBufferPixelFormatTypeKey) : kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,]
let readerVideoOutput = AVAssetReaderTrackOutput.init(track: videoTrack, outputSettings: readerVideoSettings)
reader.add(readerVideoOutput)
// Audio reader
let readerAudioSettings: [String : Any] = [
AVFormatIDKey: kAudioFormatLinearPCM,
AVLinearPCMBitDepthKey: 16 ,
AVLinearPCMIsBigEndianKey: false ,
AVLinearPCMIsFloatKey: false,]
let readerAudioOutput = AVAssetReaderTrackOutput.init(track: audioTrack, outputSettings: readerAudioSettings)
reader.add(readerAudioOutput)
//Start reading content
reader.startReading()
//Reading video samples
var videoBuffers = [CMSampleBuffer]()
while let nextBuffer = readerVideoOutput.copyNextSampleBuffer() {
videoBuffers.append(nextBuffer)
}
//Reading audio samples
var audioBuffers = [CMSampleBuffer]()
var timingInfos = [CMSampleTimingInfo]()
while let nextBuffer = readerAudioOutput.copyNextSampleBuffer() {
var timingInfo = CMSampleTimingInfo()
var timingInfoCount = CMItemCount()
CMSampleBufferGetSampleTimingInfoArray(nextBuffer, entryCount: 0, arrayToFill: &timingInfo, entriesNeededOut: &timingInfoCount)
let duration = CMSampleBufferGetDuration(nextBuffer)
let endTime = CMTimeAdd(timingInfo.presentationTimeStamp, duration)
// subtract from the duration of the whole asset, not of this sample buffer
let newPresentationTime = CMTimeSubtract(asset.duration, endTime)
timingInfo.presentationTimeStamp = newPresentationTime
timingInfos.append(timingInfo)
audioBuffers.append(nextBuffer)
}
//Stop reading
let status = reader.status
reader.cancelReading()
guard status == .completed, let firstVideoBuffer = videoBuffers.first, let firstAudioBuffer = audioBuffers.first else {
assert(false)
completionBlock?(false)
return
}
//Start video time
let sessionStartTime = CMSampleBufferGetPresentationTimeStamp(firstVideoBuffer)
//Writer for video
let writerVideoSettings: [String:Any] = [
AVVideoCodecKey : AVVideoCodecType.h264,
AVVideoWidthKey : width,
AVVideoHeightKey: height,
]
let writerVideoInput: AVAssetWriterInput
if let formatDescription = videoTrack.formatDescriptions.last {
writerVideoInput = AVAssetWriterInput.init(mediaType: .video, outputSettings: writerVideoSettings, sourceFormatHint: (formatDescription as! CMFormatDescription))
} else {
writerVideoInput = AVAssetWriterInput.init(mediaType: .video, outputSettings: writerVideoSettings)
}
writerVideoInput.transform = videoTrack.preferredTransform
writerVideoInput.expectsMediaDataInRealTime = false
//Writer for audio
let writerAudioSettings: [String:Any] = [
AVFormatIDKey : kAudioFormatMPEG4AAC,
AVSampleRateKey : 44100,
AVNumberOfChannelsKey: 2,
AVEncoderBitRateKey:128000,
AVChannelLayoutKey: NSData(),
]
let sourceFormat = CMSampleBufferGetFormatDescription(firstAudioBuffer)
let writerAudioInput: AVAssetWriterInput = AVAssetWriterInput.init(mediaType: .audio, outputSettings: writerAudioSettings, sourceFormatHint: sourceFormat)
// offline processing, so the data does not arrive in real time
writerAudioInput.expectsMediaDataInRealTime = false
guard
let writer = try? AVAssetWriter.init(url: outURL, fileType: .mp4),
writer.canAdd(writerVideoInput),
writer.canAdd(writerAudioInput)
else {
assert(false)
completionBlock?(false)
return
}
let pixelBufferAdaptor = AVAssetWriterInputPixelBufferAdaptor.init(assetWriterInput: writerVideoInput, sourcePixelBufferAttributes: nil)
let group = DispatchGroup.init()
group.enter()
writer.add(writerVideoInput)
writer.add(writerAudioInput)
writer.startWriting()
writer.startSession(atSourceTime: sessionStartTime)
var videoFinished = false
var audioFinished = false
//Write video samples in reverse order
var currentSample = 0
writerVideoInput.requestMediaDataWhenReady(on: queue) {
for i in currentSample..<videoBuffers.count {
currentSample = i
if !writerVideoInput.isReadyForMoreMediaData {
return
}
let presentationTime = CMSampleBufferGetPresentationTimeStamp(videoBuffers[i])
guard let imageBuffer = CMSampleBufferGetImageBuffer(videoBuffers[videoBuffers.count - i - 1]) else {
Log.info("VideoWriter reverseVideo: warning, could not get imageBuffer from SampleBuffer...")
continue
}
if !pixelBufferAdaptor.append(imageBuffer, withPresentationTime: presentationTime) {
Log.info("VideoWriter reverseVideo: warning, could not append imageBuffer...")
}
}
// finish write video samples
writerVideoInput.markAsFinished()
Log.info("Video writing finished!")
videoFinished = true
if(audioFinished){
group.leave()
}
}
//Write audio samples in reverse order
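let totalAudioSamples = audioBuffers.count
var currentAudioSample = 0
writerAudioInput.requestMediaDataWhenReady(on: queue) {
// iterate over all buffers, resuming where the previous invocation stopped
for i in currentAudioSample..<totalAudioSamples {
currentAudioSample = i
if !writerAudioInput.isReadyForMoreMediaData {
return
}
let audioSample = audioBuffers[totalAudioSamples-1-i]
// each buffer keeps the timing info that was computed from its own position
let timingInfo = timingInfos[totalAudioSamples-1-i]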
// reverse samples data using timing info
if let reversedBuffer = audioSample.reverse(timingInfo: [timingInfo]) {
// append data
if writerAudioInput.append(reversedBuffer) == false {
break
}
}
}
// finish
writerAudioInput.markAsFinished()
Log.info("Audio writing finished!")
audioFinished = true
if(videoFinished){
group.leave()
}
}
group.notify(queue: queue) {
writer.finishWriting {
if writer.status != .completed {
Log.info("VideoWriter reverse video: error - \(String(describing: writer.error))")
completionBlock?(false)
} else {
Log.info("Ended reverse video!")
completionBlock?(true)
}
}
}
}
Happy coding!