I would like to build an app which analyses the emotional content of speech from the mic.
This does not involve speech recognition, although that is sometimes used as an extra feature. Emotional analysis is based on prosodic features of the voice (pitch change, speed, tone, etc.).
I know this can be done on a desktop computer, but I don't want users to have to upload their recordings (phone conversations) to a server in order to get emotional feedback.
What I need is an API that either provides the whole analysis or lets me extract those features (e.g. the average speed of the conversation).
Is there such a thing out there?
Thanks in advance!
Check out the openEAR package; it should provide everything at a state-of-the-art level:
http://sourceforge.net/projects/openart/
You can read about it here:
http://www.mmk.ei.tum.de/publ/pdf/09/09eyb1.pdf
The Munich openEAR toolkit is a complete package for automatic speech emotion recognition. Its acronym stands for open Emotion and Affect Recognition Toolkit. It is based on the openSMILE feature extractor and thus is capable of real-time on-line emotion recognition. Pre-trained models on various standard corpora are included, as well as scripts and tools to quickly build and evaluate custom model sets. The classifiers currently included are Support Vector Machines using the LibSVM library. Soon to come are also Bidirectional Long Short-Term Memory Recurrent Neural Nets, Discriminative Multinomial Bayesian Networks, and Lazy Learners.
openEAR is free software licensed under the GPL. The first release (including model sets and pre-compiled openSMILE) will be available soon on Sourceforge: openEAR. Meanwhile, please refer to the openSMILE project, where we provide the feature extraction engine.
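To give a rough idea of what "extracting prosodic features" means in practice, here is a minimal sketch in plain Python. It is not the openEAR or openSMILE API; the function names (`rms_energy`, `estimate_pitch_hz`, `prosodic_summary`) and the 75-400 Hz search range are assumptions made for illustration. It estimates pitch per frame with a simple autocorrelation search and summarises pitch level, pitch variation and loudness. A real toolkit does far more (voicing detection, smoothing, many more features), and speaking rate is not covered here.

```python
import math
from statistics import mean, pstdev


def rms_energy(frame):
    """Root-mean-square energy of one frame (a rough loudness measure)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    Returns None for (near-)silent frames. fmin/fmax bound the search to
    an assumed speaking range; tune them for your speakers.
    """
    if rms_energy(frame) < 1e-4:
        return None
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and a copy of itself shifted by `lag`.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


def prosodic_summary(samples, sample_rate, frame_len=1024):
    """Summarise pitch level, pitch variation and loudness over a recording."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    pitches = [p for p in (estimate_pitch_hz(f, sample_rate) for f in frames)
               if p is not None]
    return {
        "mean_pitch_hz": mean(pitches) if pitches else None,
        "pitch_std_hz": pstdev(pitches) if pitches else None,
        "mean_energy": mean(rms_energy(f) for f in frames) if frames else None,
    }


if __name__ == "__main__":
    # Synthetic 200 Hz tone standing in for microphone samples in [-1, 1].
    sr = 8000
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / sr) for n in range(4096)]
    print(prosodic_summary(tone, sr))
```

Summaries like these could then be fed to a classifier, as openEAR does with its SVM models. Because everything runs locally on the sample buffer, nothing has to be uploaded to a server.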