TTS: Ivona SDK for iOS - implementation in Project

Posted 2019-09-12 08:18

Question:

I am currently trying out the IVONA SDK for iOS. The voices are amazing and sound very natural.
But the voice I am using (German female) has a voice file of 230 MB.
If I want to use 4 voices, my app will be approximately 1 GB.

And also no use for offline. Is this voice just for the test phase, or is it also for production?

I think it's horrible that implementing a few voices in a small TTS application makes the app so huge.

Can someone give me an answer to that?

Answer 1:

Perhaps the best solution would be to include no voices and let users download whichever voice they prefer. You could also unlock each voice as a separate in-app purchase if you want to monetize the voices individually.
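A minimal sketch of the bookkeeping side of that idea, assuming the app ships without voices and stores downloaded voice files in a folder of its own. `VoiceStore`, the directory layout, and the voice names are hypothetical. The download itself (for example a URLSession download task or an in-app purchase flow) and the IVONA SDK call that loads a voice are left out, since neither is shown in this thread.

```swift
import Foundation

// Hypothetical helper that manages voice files fetched after installation.
struct VoiceStore {
    // Folder that holds the downloaded voices, e.g. inside Application Support.
    let directory: URL

    // Local path where the file for `voice` is expected.
    func fileURL(for voice: String) -> URL {
        return directory.appendingPathComponent(voice)
    }

    // True if the voice file is already on disk.
    func isInstalled(_ voice: String) -> Bool {
        return FileManager.default.fileExists(atPath: fileURL(for: voice).path)
    }

    // Moves a finished download (e.g. the temporary file handed over by a
    // download task) into the store, replacing any older copy.
    func install(_ voice: String, from downloadedFile: URL) throws {
        let fm = FileManager.default
        try fm.createDirectory(at: directory, withIntermediateDirectories: true)
        let target = fileURL(for: voice)
        if fm.fileExists(atPath: target.path) {
            try fm.removeItem(at: target)
        }
        try fm.moveItem(at: downloadedFile, to: target)
    }
}
```

The app could call `isInstalled` before speaking and offer a download when the requested voice is missing.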



Answer 2:

Voices for testing are indeed the same as for production. But IVONA offers each voice in different sizes: you could opt for the IVONA voices made for automotive/navigation systems. These voices are limited, so they are only about 70 MB in size, and they are sampled at 16 kHz instead of 22 kHz. If you have a navigation app, these are for you. Otherwise just give it a try; you may want to ask your contact at IVONA about this.

In my project we use 5 of these voices; each "vox" file is between 65 and 74 MB. Even these smaller voices grow the bundle considerably (though not as much as your 230 MB), so we decided to download them on demand (via IAP, hosted at Apple). Consider that users normally need only one language, so it would be a waste of space to bundle more than one voice with the app.
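To put the numbers from this thread side by side, here is a small back-of-the-envelope calculation. The sizes are the ones quoted above; the variable names are made up for illustration.

```swift
// Sizes in MB, taken from the figures quoted in this thread.
let fullVoice = 230
let reducedVoice = 65...74

let fourFullVoices = 4 * fullVoice                 // 920 MB, roughly 1 GB
let fiveReducedMin = 5 * reducedVoice.lowerBound   // 325 MB
let fiveReducedMax = 5 * reducedVoice.upperBound   // 370 MB

print("4 full voices bundled: \(fourFullVoices) MB")
print("5 reduced voices bundled: \(fiveReducedMin)-\(fiveReducedMax) MB")
print("1 reduced voice on demand: \(reducedVoice.lowerBound)-\(reducedVoice.upperBound) MB")
```

So a user who needs a single language downloads 65 to 74 MB instead of carrying several hundred MB in the bundle.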

Another option is to prepare a set of recorded samples and bundle them instead of the IVONA voice. Of course, this only works if you have a limited set of texts without dynamic parts. For the few dynamic parts, such as numbers, you could write a small sound queueing engine that splices samples together.
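As a rough sketch of the splicing idea for numbers, the function below turns a number into the ordered list of clips to queue. The clip names ("0" to "19", "20", "30", ..., "90", "hundred", "thousand") are hypothetical, and English word order is assumed. German, for example, says the units before the tens, so it would need its own rules.

```swift
// Returns the names of the pre-recorded clips to play, in order, for `number`.
// Hypothetical clip set: "0"..."19", "20", "30", ..., "90", "hundred", "thousand".
func clipNames(for number: Int) -> [String] {
    precondition((0...999_999).contains(number), "sketch covers 0...999999 only")
    if number == 0 { return ["0"] }

    // Clips for a value in 0...999 (empty list for 0).
    func belowThousand(_ n: Int) -> [String] {
        var clips: [String] = []
        let hundreds = n / 100
        let rest = n % 100
        if hundreds > 0 { clips += [String(hundreds), "hundred"] }
        if rest > 0 && rest < 20 {
            clips.append(String(rest))              // "1"..."19" are single clips
        } else if rest >= 20 {
            clips.append(String(rest / 10 * 10))    // tens clip, e.g. "40"
            if rest % 10 > 0 { clips.append(String(rest % 10)) }
        }
        return clips
    }

    var clips: [String] = []
    let thousands = number / 1000
    if thousands > 0 { clips += belowThousand(thousands) + ["thousand"] }
    clips += belowThousand(number % 1000)
    return clips
}

print(clipNames(for: 342))  // ["3", "hundred", "40", "2"]
```

On iOS, one way to play the resulting clips back to back might be AVFoundation's AVQueuePlayer, with one player item per clip file.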