Synchronizing text and audio. Is there a NLP/speec

Posted 2019-03-09 12:45

I would like to synchronize a spoken recording against a known text. Is there a speech-to-text / natural language processing library that would facilitate this? I imagine I'd want to detect word boundaries and compute candidate matches from a dictionary. Most of the questions I've found on SO concern written language.

Desired, but not required:

Open Source
Compatible with American English out-of-the-box
Cross-platform
Thoroughly documented

Edit: I realize this is a very broad, even naive, question, so thanks in advance for your guidance.

What I've found so far:

OpenEars (iOS Sphinx/Flite wrapper)

Tags: nlp speech-recognition pattern-recognition

1 answer

Root（大扎）

#2 · 2019-03-09 13:23

Forced Alignment

It sounds like you want to do forced alignment between your audio and the known text.

Pretty much every research- or industry-grade speech recognition system can do this, because forced alignment is an important part of training a recognizer on data that lacks phone-level alignments between the audio and the transcript.
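To make the idea concrete, here is a minimal sketch of the dynamic-programming core of forced alignment. It assumes you already have a per-frame score table `log_probs[t][k]`, the log-likelihood that audio frame `t` belongs to the `k`-th token of the known transcript. In a real recognizer those scores come from the acoustic model and the tokens are usually phones or HMM states rather than whole words. The function name `forced_align` and the toy numbers are made up for illustration and are not part of any library.

```python
import math


def forced_align(log_probs, num_tokens):
    """Find the best monotonic assignment of frames to transcript tokens.

    log_probs[t][k] is the (assumed, precomputed) log-likelihood that
    frame t belongs to token k. Every token must cover at least one
    frame, and tokens must appear in transcript order.
    Returns a list of (start_frame, end_frame_exclusive), one per token.
    """
    num_frames = len(log_probs)
    if num_tokens < 1 or num_frames < num_tokens:
        raise ValueError("need at least one frame per token")

    neg_inf = -math.inf
    # score[t][k]: best total log-likelihood of frames 0..t ending in token k
    score = [[neg_inf] * num_tokens for _ in range(num_frames)]
    # advanced[t][k]: True if the best path entered token k exactly at frame t
    advanced = [[False] * num_tokens for _ in range(num_frames)]
    score[0][0] = log_probs[0][0]  # the first frame must belong to the first token

    for t in range(1, num_frames):
        for k in range(num_tokens):
            stay = score[t - 1][k]
            move = score[t - 1][k - 1] if k > 0 else neg_inf
            if move > stay:
                score[t][k] = move + log_probs[t][k]
                advanced[t][k] = True
            else:
                score[t][k] = stay + log_probs[t][k]

    # Trace back from the last frame of the last token to recover boundaries.
    starts = [0] * num_tokens
    k = num_tokens - 1
    for t in range(num_frames - 1, 0, -1):
        if advanced[t][k]:
            starts[k] = t
            k -= 1
    ends = starts[1:] + [num_frames]
    return list(zip(starts, ends))


# Toy example: two tokens ("hello", "world") over five frames.
toy_scores = [
    [-0.1, -3.0],
    [-0.2, -2.5],
    [-0.1, -3.0],
    [-3.0, -0.1],
    [-2.5, -0.2],
]
print(forced_align(toy_scores, 2))  # [(0, 3), (3, 5)]
```

Multiplying the frame indices by the frame step (often 10 ms in speech front ends, though that depends on the system) turns the boundaries into timestamps. Because the transcript is known, the search only decides where each token starts and ends, which is why alignment is far cheaper and more reliable than open recognition.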

Alignment with CMUSphinx

The Sphinx4-1.0 beta 5 release of CMU's open source speech recognition system now includes a demo on how to do alignment between a transcript and long speech recordings.

