I would like to synchronize a spoken recording against a known text. Is there a speech-to-text / natural language processing library that would facilitate this? I imagine I'd want to detect word boundaries and compute candidate matches from a dictionary. Most of the questions I've found on SO concern written language.
Desired, but not required:
- Open Source
- Compatible with American English out-of-the-box
- Cross-platform
- Thoroughly documented
Edit: I realize this is a very broad, even naive, question, so thanks in advance for your guidance.
What I've found so far:
- OpenEars (iOS Sphinx/Flite wrapper)