Why isn't speech recognition advancing? [closed]

Posted 2019-03-09 04:24

What's so difficult about the subject that algorithm designers are having a hard time tackling it?

Is it really that complex?

I'm having a hard time grasping why this topic is so problematic. Can anyone give me an example as to why this is the case?

21 answers
叛逆
#2 · 2019-03-09 04:43

The basic problem is that human language is ambiguous. Therefore, in order to understand speech, the computer (or human) needs to understand the context of what is being spoken. That context is actually the physical world the speaker and listener inhabit. And no AI program has yet demonstrated a deep understanding of the physical world.

我欲成王,谁敢阻挡
#3 · 2019-03-09 04:43

To recognize speech well, you need to know what people mean - and computers aren't there yet at all.

▲ chillily
#4 · 2019-03-09 04:44

The variety in language would be the predominant factor making it difficult. Dialects and accents make this more complicated. Also, context: "The book was read." "The book was red." How do you determine the difference? The extra effort needed for this would make it easier to just type the thing in the first place.

Now, there would probably be more effort devoted to this if it was more necessary, but advances in other forms of data input have come along so quickly that it is not deemed that necessary.

Of course, there are areas where it would be great, even extremely useful or helpful. Situations where you have your hands full or can't look at a screen for input. Helping the disabled etc. But most of these are niche markets which have their own solutions. Maybe some of these are working more towards this, but most environments where computers are used are not good candidates for speech recognition. I prefer my working environment to be quiet. And endless chatter to computers would make crosstalk a realistic problem.

On top of this, unless you are dictating prose to the computer, any other type of input is easier and quicker using keyboard, mouse or touch. I did once try coding using voice input. The whole thing was painful from beginning to end.
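To make the "read" / "red" example above concrete, here is a minimal sketch of how a recognizer might use surrounding words to pick between two spellings that sound the same. Everything in it is an assumption made for illustration: the `CONTEXT_COUNTS` table is hand-written, the function name `disambiguate` is hypothetical, and real systems use statistical language models trained on large corpora rather than a lookup table like this.

```python
from collections import Counter

# Hypothetical, hand-made counts of how often each spelling appears
# near a given context word. A real system would learn these from data.
CONTEXT_COUNTS = {
    "read": Counter({"aloud": 5, "yesterday": 4, "chapter": 3, "by": 3}),
    "red": Counter({"cover": 5, "color": 4, "bright": 3, "blue": 2}),
}


def disambiguate(heard_sentence, candidates=("read", "red")):
    """Pick the candidate spelling best supported by the other words.

    Returns None when the context gives no reason to prefer one spelling.
    """
    words = heard_sentence.lower().replace(".", "").split()
    # Score each candidate by summing its counts for every word in the sentence.
    scores = {c: sum(CONTEXT_COUNTS[c][w] for w in words) for c in candidates}
    best = max(scores.values())
    winners = [c for c, s in scores.items() if s == best]
    return winners[0] if len(winners) == 1 else None


print(disambiguate("The book was ___ aloud by the teacher"))
print(disambiguate("The book was ___ with a bright cover"))
print(disambiguate("The book was ___."))
```

With extra context words the sketch can choose a spelling, but for the bare sentence "The book was ___." it has nothing to go on and returns `None`, which is exactly the ambiguity described above.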

聊天终结者
#5 · 2019-03-09 04:48

Most of the time, we humans understand based on context, so that a particular sentence is in harmony with the whole conversation. Unfortunately, computers have a big handicap in this sense. A computer just tries to capture the words, not what's between them.

We can understand a foreigner whose English accent is very poor, perhaps by guessing what he is trying to say rather than what he is actually saying.

我欲成王,谁敢阻挡
#6 · 2019-03-09 04:50

Because if people find it hard to understand other people with a strong accent, why do you think computers will be any better at it?

萌系小妹纸
#7 · 2019-03-09 04:50

Speech synthesis is very complex by itself - many parameters are combined to form the resulting speech. Breaking it apart is hard even for people - sometimes you mishear one word for another.
