Sphinx4 ConfidenceResult and SpeechResult

Published 2019-06-04 03:46

Question:

I'm trying to get the confidence score of a SpeechResult by doing

ConfidenceResult cr = scorer.score(result);

Where result is a SpeechResult and scorer is a ConfidenceScorer. As it turns out this isn't allowed. Is there some way around this that I'm not seeing, besides using a Result type?

Answer 1:

Yes, you can do this, although it's a bit roundabout. A confidence result is actually a Sausage (no, not kidding, that's what it's called: SphinxDocs:Sausage). It's also known as a Word Confusion Network, but it's sometimes referred to as a sausage because of what the graph looks like; see Fig. 1 of Hakkani-Tur et al. That paper is a great reference for understanding confidence in speech recognition. It is a bit long, but I highly recommend reading the sections you find relevant if you're interested in further work in speech. It describes the Pivot Algorithm, which Sphinx 4 uses in the class PivotSausageMaker.
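To make the sausage idea concrete, here is a minimal sketch of a word confusion network as a plain data structure. It is not the Sphinx4 Sausage class: ToySausage, its methods, and the example words and probabilities are all made up for illustration. Each slot holds the competing words for one position together with a posterior probability, and the best hypothesis simply takes the top word from each slot.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy word confusion network ("sausage"). Hypothetical; NOT the Sphinx4 Sausage class. */
final class ToySausage {
    // One slot per word position; each slot maps candidate words to posterior probabilities.
    private final List<Map<String, Double>> slots = new ArrayList<>();

    /** Appends a new, empty confusion set and returns its index. */
    int addSlot() {
        slots.add(new LinkedHashMap<>());
        return slots.size() - 1;
    }

    /** Adds a competing word with its posterior probability to the given slot. */
    void addWord(int slot, String word, double posterior) {
        slots.get(slot).put(word, posterior);
    }

    /** Returns the highest-posterior word of every slot (null for an empty slot). */
    List<String> bestHypothesis() {
        List<String> best = new ArrayList<>();
        for (Map<String, Double> slot : slots) {
            String bestWord = null;
            double bestPosterior = -1.0;
            for (Map.Entry<String, Double> e : slot.entrySet()) {
                if (e.getValue() > bestPosterior) {
                    bestPosterior = e.getValue();
                    bestWord = e.getKey();
                }
            }
            best.add(bestWord);
        }
        return best;
    }

    /** Posterior of a word in a slot, or 0.0 if the word does not compete there. */
    double confidence(int slot, String word) {
        return slots.get(slot).getOrDefault(word, 0.0);
    }

    /** Builds a two-slot example with invented words and probabilities. */
    static ToySausage example() {
        ToySausage s = new ToySausage();
        int first = s.addSlot();
        s.addWord(first, "recognize", 0.6);
        s.addWord(first, "wreck", 0.4);
        int second = s.addSlot();
        s.addWord(second, "speech", 0.7);
        s.addWord(second, "beach", 0.3);
        return s;
    }

    public static void main(String[] args) {
        ToySausage s = example();
        System.out.println(s.bestHypothesis());
        System.out.println(s.confidence(1, "beach"));
    }
}
```

The per-word confidence is just the posterior stored in the slot, which is why a sausage can double as a confidence result.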

Anyway, the point is that you can get a Lattice from your SpeechResult. A Lattice is a graph that is a condensed form of all the hypotheses the recognizer produced. You can give your Lattice to a SausageMaker and call SausageMaker.makeSausage(), which will give you a Sausage, which is a ConfidenceResult. (Note: calling SausageMaker.score(Result result) just makes a Lattice from the result and then calls its own makeSausage method.) Unfortunately, ASR confidence values are not very clear-cut, and how best to compute, process, and understand them is an open research topic.

Another possibility would be to use the confidence scores in the WordResults you can get from your SpeechResult.
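As a rough sketch of what you might do with per-word scores once you have them, the snippet below flags words whose confidence falls under a threshold. It does not call Sphinx4: ScoredWord is a hypothetical stand-in for a WordResult, and it assumes the confidence has already been converted to a linear probability between 0 and 1, which may not be the scale Sphinx4 hands back. The words, values, and threshold are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; does not use or mirror the Sphinx4 API. */
final class WordConfidenceFilter {

    /** Stand-in for a recognized word with an assumed linear confidence in [0, 1]. */
    static final class ScoredWord {
        final String word;
        final double confidence;

        ScoredWord(String word, double confidence) {
            this.word = word;
            this.confidence = confidence;
        }
    }

    /** Returns, in order, the words whose confidence is strictly below the threshold. */
    static List<String> lowConfidenceWords(List<ScoredWord> words, double threshold) {
        List<String> flagged = new ArrayList<>();
        for (ScoredWord w : words) {
            if (w.confidence < threshold) {
                flagged.add(w.word);
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        List<ScoredWord> words = Arrays.asList(
                new ScoredWord("recognize", 0.92),
                new ScoredWord("beach", 0.35));
        // The 0.5 threshold is arbitrary; a real one would need tuning on your data.
        System.out.println(lowConfidenceWords(words, 0.5));
    }
}
```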

Hope that helps!