I have the Dragon Mobile SDK running nicely on Windows Phone 7 and I would like to get the equivalent functionality working for iOS. Since the SDK wraps the microphone, it's not really possible to use the .NET assemblies in my MonoTouch project (even if I did have the source). It appears that the best way to do this is to create a binding library (as Miguel describes here).
It sure seems like a lot of work though, and I would love to reuse as opposed to reinventing the wheel if someone's done it already...
Here are some more details for how I got this to work.
- I downloaded the binding sample. You may be tempted to skip this step, but you really have to start with this project if you want to get this to work.
I created an objective-c library with Xcode (which I called SpeechKitLibrary) that has a dual purpose - one is to define the SpeechKitApplicationKey (which is an extern dependency that SpeechKit needs):
const unsigned char SpeechKitApplicationKey[] = {...};
and the other is to define a class which utilizes the SpeechKit framework, and links with it. (in Xcode, add the SpeechKit framework in the frameworks section of the project).
The .m file I wrote looks something like this... (you can figure out the .h file - super simple). I'm not 100% sure you need all of this, but I wanted to make sure the static archive library that came out of this step would import the right symbols. You may be able to avoid this somehow, but in my experiments I found that I needed to do something like this...
// the SpeechKitWrapper isn't actually used - rather, it is a way to exercise all the API's that
// the binding library needs from the SpeechKit framework, so that those can be linked into the generated .a file.
@implementation SpeechKitWrapper
@synthesize status;
- (id)initWithDelegate:(id <SKRecognizerDelegate>)delegate
{
self = [super init];
if (self) {
del = delegate;
[self setStatus:@"initializing"];
SpeechKit setupWithID:@"NMDPTRIAL_ogazitt20120220010133"
host:@"sandbox.nmdp.nuancemobility.net"
port:443
useSSL:NO
delegate:nil];
NSString *text = [NSString stringWithFormat:@"initialized. sessionid = %@", [SpeechKit sessionID]];
[self setStatus:text];
SKEarcon* earconStart = [SKEarcon earconWithName:@"beep.wav"];
[SpeechKit setEarcon:earconStart forType:SKStartRecordingEarconType];
voiceSearch = [[SKRecognizer alloc] initWithType:SKDictationRecognizerType
detection:SKLongEndOfSpeechDetection
language:@"en_US"
delegate:delegate];
text = [NSString stringWithFormat:@"recognizer connecting. sessionid = %@", [SpeechKit sessionID]];
[self setStatus:text];
}
return self;
}
@end
I then compiled/linked this static archive for the three different architectures - i386, arm6, and arm7. The Makefile in the BindingSample is the template for how to do this. But the net is that you get three libraries - libSpeechKitLibrary-{i386,arm6,arm7}.a. The makefile then creates a universal library (libSpeechKitLibraryUniversal.a) using the OSX lipo(1) tool.
Only now are you ready to create a binding library. You can reuse the AssemblyInfo.cs in the binding sample (which will show how to create an import on the universal library for all architectures - and will drive some compile flags)...
[assembly: LinkWith ("libSpeechKitLibraryUniversal.a", LinkTarget.Simulator | LinkTarget.ArmV6 | LinkTarget.ArmV7, ForceLoad = true)]
You compile the ApiDefinition.cs file with btouch as per the Makefile (I think I needed to repeat some of the info in StructsAndEnums.cs to make it work). Note - the only functionality I didn't get to work is the "SetEarcon" stuff - since this is an archive library and not a framework, I can't bundle a wav as a resource file... and I couldn't figure out how to get the SetEarcon method to accept a resource out of my app bundle.
using System;
using MonoTouch.Foundation;
namespace Nuance.SpeechKit
{
// SKEarcon.h
public enum SKEarconType
{
SKStartRecordingEarconType = 1,
SKStopRecordingEarconType = 2,
SKCancelRecordingEarconType = 3,
};
// SKRecognizer.h
public enum SKEndOfSpeechDetection
{
SKNoEndOfSpeechDetection = 1,
SKShortEndOfSpeechDetection = 2,
SKLongEndOfSpeechDetection = 3,
};
public static class SKRecognizerType
{
public static string SKDictationRecognizerType = "dictation";
public static string SKWebSearchRecognizerType = "websearch";
};
// SpeechKitErrors.h
public enum SpeechKitErrors
{
SKServerConnectionError = 1,
SKServerRetryError = 2,
SKRecognizerError = 3,
SKVocalizerError = 4,
SKCancelledError = 5,
};
// SKEarcon.h
[BaseType(typeof(NSObject))]
interface SKEarcon
{
[Export("initWithContentsOfFile:")]
IntPtr Constructor(string path);
[Static, Export("earconWithName:")]
SKEarcon FromName(string name);
}
// SKRecognition.h
[BaseType(typeof(NSObject))]
interface SKRecognition
{
[Export("results")]
string[] Results { get; }
[Export("scores")]
NSNumber[] Scores { get; }
[Export("suggestion")]
string Suggestion { get; }
[Export("firstResult")]
string FirstResult();
}
// SKRecognizer.h
[BaseType(typeof(NSObject))]
interface SKRecognizer
{
[Export("audioLevel")]
float AudioLevel { get; }
[Export ("initWithType:detection:language:delegate:")]
IntPtr Constructor (string type, SKEndOfSpeechDetection detection, string language, SKRecognizerDelegate del);
[Export("stopRecording")]
void StopRecording();
[Export("cancel")]
void Cancel();
/*
[Field ("SKSearchRecognizerType", "__Internal")]
NSString SKSearchRecognizerType { get; }
[Field ("SKDictationRecognizerType", "__Internal")]
NSString SKDictationRecognizerType { get; }
*/
}
[BaseType(typeof(NSObject))]
[Model]
interface SKRecognizerDelegate
{
[Export("recognizerDidBeginRecording:")]
void OnRecordingBegin (SKRecognizer recognizer);
[Export("recognizerDidFinishRecording:")]
void OnRecordingDone (SKRecognizer recognizer);
[Export("recognizer:didFinishWithResults:")]
[Abstract]
void OnResults (SKRecognizer recognizer, SKRecognition results);
[Export("recognizer:didFinishWithError:suggestion:")]
[Abstract]
void OnError (SKRecognizer recognizer, NSError error, string suggestion);
}
// speechkit.h
[BaseType(typeof(NSObject))]
interface SpeechKit
{
[Static, Export("setupWithID:host:port:useSSL:delegate:")]
void Initialize(string id, string host, int port, bool useSSL, [NullAllowed] SpeechKitDelegate del);
[Static, Export("destroy")]
void Destroy();
[Static, Export("sessionID")]
string GetSessionID();
[Static, Export("setEarcon:forType:")]
void SetEarcon(SKEarcon earcon, SKEarconType type);
}
[BaseType(typeof(NSObject))]
[Model]
interface SpeechKitDelegate
{
[Export("destroyed")]
void Destroyed();
}
[BaseType(typeof(NSObject))]
interface SpeechKitWrapper
{
[Export("initWithDelegate:")]
IntPtr Constructor(SKRecognizerDelegate del);
[Export("status")]
string Status { get; set; }
}
}
You now have an assembly that can be referenced by your monotouch application project. The important thing now is to remember to link with all the frameworks that are dependencies (not only SpeeckKit, but also SK's dependencies) - you do this by passing mtouch some additional arguments:
-gcc_flags "-F<insert_framework_path_here> -framework SpeechKit -framework SystemConfiguration -framework Security -framework AVFoundation -framework AudioToolbox"
That's all, folks! Hope this was helpful...
If anyone (kos or otherwise) gets the SetEarcon method to work, please post a solution :-)
Nuance's SDK Agreement is not permissive enough for anyone to even publish bindings for their iOS SDK for use with MonoTouch. But the library itself should work just fine.
That being said, the SDK has only a handful of types to map and would be fairly trivial to RE-do the work anyone else might have already done. You can check out how to bind assemblies using the reference guide here:
http://docs.xamarin.com/ios/advanced_topics/binding_objective-c_types
There's also a BindingSample project that helps users better understand how to bind native components using btouch:
https://github.com/xamarin/monotouch-samples/tree/master/BindingSample
Thanks again Anuj for your answer. I thought I'd leave a tip or two about how to do this. The binding library wasn't difficult to build (still tweaking it but it's not a difficult task).
The more obscure part was figuring out how to get the SpeechKit framework linked. The samples only show how to link a .a or .dylib. After spending a little time with the ld(1) man page on OSX, it looks like the correct ld (and therefore gcc) arguments for linking with a framework are the following:
-gcc_flags "-F<insert_framework_path_here> -framework SpeechKit"
You put this in a textbox in the project properties - under Build :: iPhone Build :: Additional mtouch arguments
Note that -L doesn't work because this isn't a library; also note that -force_load and -ObjC referenced here don't appear necessary because, again, this is a framework and not a library.