I've found lots of examples online for working with audio in iOS, but most of them are pretty outdated and don't apply to what I'm trying to accomplish. Here's my project:
I need to capture audio samples from two sources - microphone input and stored audio files. I need to perform FFT on these samples to produce a "fingerprint" for the entire clip, as well as apply some additional filters. The ultimate goal is to build a sort of song-recognition software similar to Shazam, etc.
What is the best way to capture the individual audio samples in iOS 8 for performing a Fast Fourier Transform? I imagine ending up with a large array of them, but I suspect that it might not work quite like that. Secondly, how can I use the Accelerate framework for processing the audio? It seems to be the most efficient way to perform complex analysis on audio in iOS.
All the examples I've seen online are using older versions of iOS and Objective-C, and I haven't been able to successfully translate them into Swift. Does iOS 8 provide some new frameworks for this sort of thing?
AVAudioEngine is the way to go for this. From Apple's docs:
- For playback and recording of a single track, use AVAudioPlayer and AVAudioRecorder.
- For more complex audio processing, use AVAudioEngine. AVAudioEngine includes AVAudioInputNode and AVAudioOutputNode for audio input and output. You can also use AVAudioNode objects for processing and mixing effects into your audio.
I'll be straight with you: AVAudioEngine is an extremely finicky API with vague documentation, rarely-helpful error messaging, and almost no online code examples demonstrating more than the most basic tasks. BUT if you take the time to get over the small learning curve, you can really do some magical things with it relatively easily.
I've built a simple "playground" view controller that demonstrates both microphone and audio file sampling working in tandem:
import UIKit
import AVFoundation

class AudioEnginePlaygroundViewController: UIViewController {

    private var audioEngine: AVAudioEngine!
    private var mic: AVAudioInputNode!
    private var micTapped = false

    override func viewDidLoad() {
        super.viewDidLoad()
        configureAudioSession()
        audioEngine = AVAudioEngine()
        mic = audioEngine.inputNode!
    }

    static func getController() -> AudioEnginePlaygroundViewController {
        let me = AudioEnginePlaygroundViewController(nibName: "AudioEnginePlaygroundViewController", bundle: nil)
        return me
    }

    @IBAction func toggleMicTap(_ sender: Any) {
        // Tapping again removes the existing tap.
        if micTapped {
            mic.removeTap(onBus: 0)
            micTapped = false
            return
        }

        let micFormat = mic.inputFormat(forBus: 0)
        mic.installTap(onBus: 0, bufferSize: 2048, format: micFormat) { (buffer, when) in
            // Samples of the first channel for this buffer; process them here.
            let sampleData = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))
        }
        micTapped = true
        startEngine()
    }

    @IBAction func playAudioFile(_ sender: Any) {
        stopAudioPlayback()
        let playerNode = AVAudioPlayerNode()

        let audioUrl = Bundle.main.url(forResource: "test_audio", withExtension: "wav")!
        let audioFile = readableAudioFileFrom(url: audioUrl)
        audioEngine.attach(playerNode)
        audioEngine.connect(playerNode, to: audioEngine.outputNode, format: audioFile.processingFormat)
        startEngine()

        // Remove the tap once the file has finished playing.
        playerNode.scheduleFile(audioFile, at: nil) {
            playerNode.removeTap(onBus: 0)
        }

        playerNode.installTap(onBus: 0, bufferSize: 4096, format: playerNode.outputFormat(forBus: 0)) { (buffer, when) in
            // Samples of the first channel for this buffer; process them here.
            let sampleData = UnsafeBufferPointer(start: buffer.floatChannelData![0], count: Int(buffer.frameLength))
        }

        playerNode.play()
    }

    // MARK: Internal Methods

    private func configureAudioSession() {
        do {
            try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayAndRecord, with: [.mixWithOthers, .defaultToSpeaker])
            try AVAudioSession.sharedInstance().setActive(true)
        } catch {
            print("Could not configure the audio session: \(error)")
        }
    }

    private func readableAudioFileFrom(url: URL) -> AVAudioFile {
        var audioFile: AVAudioFile!
        do {
            try audioFile = AVAudioFile(forReading: url)
        } catch {
            print("Could not open the audio file: \(error)")
        }
        return audioFile
    }

    private func startEngine() {
        guard !audioEngine.isRunning else {
            return
        }

        do {
            try audioEngine.start()
        } catch {
            print("Could not start the audio engine: \(error)")
        }
    }

    private func stopAudioPlayback() {
        audioEngine.stop()
        audioEngine.reset()
    }
}
The audio samples are handed to you through the block you pass to installTap, which is called repeatedly as audio passes through the tapped node (either the mic or the audio file player) in real time. You can access individual samples by indexing the sampleData pointer that I've created in each block.
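As for the Accelerate part of the question, here is a rough sketch of how one buffer of samples could be turned into a spectrum with vDSP. It is not a definitive implementation: the function name powerSpectrum(of:) is made up for illustration, the Hann window is just one common choice, and the code assumes the frame length is a power of two (tap buffers are not guaranteed to be, so you may need to trim or accumulate samples first).

```swift
import Accelerate

/// Hypothetical helper: returns the squared magnitudes of the first n/2
/// frequency bins of `samples`. Assumes `samples.count` is a power of two.
func powerSpectrum(of samples: [Float]) -> [Float] {
    let n = samples.count
    precondition(n > 1 && (n & (n - 1)) == 0, "sample count must be a power of two")

    // For a power of two, the number of trailing zero bits equals log2(n).
    let log2n = vDSP_Length(n.trailingZeroBitCount)
    guard let setup = vDSP_create_fftsetup(log2n, FFTRadix(kFFTRadix2)) else { return [] }
    defer { vDSP_destroy_fftsetup(setup) }

    // Apply a Hann window to reduce spectral leakage.
    var window = [Float](repeating: 0, count: n)
    vDSP_hann_window(&window, vDSP_Length(n), Int32(vDSP_HANN_NORM))
    var windowed = [Float](repeating: 0, count: n)
    vDSP_vmul(samples, 1, window, 1, &windowed, 1, vDSP_Length(n))

    var real = [Float](repeating: 0, count: n / 2)
    var imag = [Float](repeating: 0, count: n / 2)
    var power = [Float](repeating: 0, count: n / 2)

    real.withUnsafeMutableBufferPointer { realPtr in
        imag.withUnsafeMutableBufferPointer { imagPtr in
            var split = DSPSplitComplex(realp: realPtr.baseAddress!, imagp: imagPtr.baseAddress!)

            // Pack the real signal into split-complex form (even/odd samples).
            windowed.withUnsafeBufferPointer { inPtr in
                inPtr.baseAddress!.withMemoryRebound(to: DSPComplex.self, capacity: n / 2) { complexPtr in
                    vDSP_ctoz(complexPtr, 2, &split, 1, vDSP_Length(n / 2))
                }
            }

            // In-place forward real FFT. Note: the Nyquist term is packed
            // into imag[0], so bin 0 mixes the DC and Nyquist components.
            vDSP_fft_zrip(setup, &split, 1, log2n, FFTDirection(FFT_FORWARD))

            // Squared magnitude of each bin.
            vDSP_zvmags(&split, 1, &power, 1, vDSP_Length(n / 2))
        }
    }
    return power
}
```

Inside a tap block you could call it with something like powerSpectrum(of: Array(sampleData.prefix(1024))), assuming the buffer holds at least 1024 frames. Creating the FFT setup on every call is wasteful, so in real code you would probably create it once and reuse it.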
Recording in iOS:
- Create and maintain an instance of an AVAudioRecorder, as in var audioRecorder: AVAudioRecorder? = nil
- Initialize your AVAudioRecorder with a URL to store the samples and some record settings
The recording session sequence:
- invoke prepareToRecord()
- invoke record()
- invoke stop()
Complete Swift/AVAudioRecorder Example
At the heart of your recording method, you could have:
func record() {
    self.prepareToRecord()
    if let recorder = self.audioRecorder {
        recorder.record()
    }
}
To prepare the recording (streaming to a file), you could have:
func prepareToRecord() {
    var error: NSError?
    let documentsPath = NSSearchPathForDirectoriesInDomains(.DocumentDirectory, .UserDomainMask, true)[0] as! NSString
    let soundFileURL: NSURL? = NSURL.fileURLWithPath("\(documentsPath)/recording.caf")

    self.audioRecorder = AVAudioRecorder(URL: soundFileURL, settings: recordSettings as [NSObject : AnyObject], error: &error)
    if let recorder = self.audioRecorder {
        recorder.prepareToRecord()
    }
}
Finally, to stop the recording, use this:
func stopRecording() {
    if let recorder = self.audioRecorder {
        recorder.stop()
    }
}
The example above also needs import AVFoundation and some recordSettings, left to your choice. An example of recordSettings may look like this:
let recordSettings = [
    AVFormatIDKey: kAudioFormatAppleLossless,
    AVEncoderAudioQualityKey: AVAudioQuality.Max.rawValue,
    AVEncoderBitRateKey: 320000,
    AVNumberOfChannelsKey: 2,
    AVSampleRateKey: 44100.0
]
Do this and you're done.
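The snippets above use early Swift syntax, where initializers took an NSError pointer. As a hedged sketch only, the same prepare/record/stop flow in more recent Swift might look roughly like this. The SimpleRecorder class name is made up for illustration, the settings simply mirror the values above, and the code assumes the audio session has already been configured for recording and that the user has granted microphone access.

```swift
import AVFoundation

// Settings mirroring the example values above (lossless, stereo, 44.1 kHz).
let recordSettings: [String: Any] = [
    AVFormatIDKey: Int(kAudioFormatAppleLossless),
    AVEncoderAudioQualityKey: AVAudioQuality.max.rawValue,
    AVEncoderBitRateKey: 320_000,
    AVNumberOfChannelsKey: 2,
    AVSampleRateKey: 44_100.0
]

// Hypothetical wrapper class; the name is an assumption for this sketch.
final class SimpleRecorder {
    private var audioRecorder: AVAudioRecorder?

    /// Creates the recorder, writing to recording.caf in the Documents folder.
    func prepareToRecord() throws {
        let documents = FileManager.default.urls(for: .documentDirectory, in: .userDomainMask)[0]
        let soundFileURL = documents.appendingPathComponent("recording.caf")

        // The initializer now throws instead of filling in an NSError pointer.
        let recorder = try AVAudioRecorder(url: soundFileURL, settings: recordSettings)
        recorder.prepareToRecord()
        audioRecorder = recorder
    }

    /// Starts recording, preparing the recorder first if needed.
    func record() throws {
        if audioRecorder == nil {
            try prepareToRecord()
        }
        audioRecorder?.record()
    }

    /// Stops recording and closes the file.
    func stopRecording() {
        audioRecorder?.stop()
    }
}
```

Throwing from prepareToRecord() and record() lets the caller decide how to report a failure, rather than silently ending up with a nil recorder.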
You may also want to check out this Stack Overflow answer, which includes a demo project.