Unscrambling words in a sentence using Natural Lan

Posted 2019-06-02 07:42

Question:

I have a sentence in English. I want to jumble its words and feed that set of words to a program that unscrambles them, following the normal rules of English grammar, to output the original sentence. I assume this would require Natural Language Generation algorithms.

For example:

Sentence: Mary has gone for a walk with her dog.
Set of words: {has, for, a, with, her, dog, Mary, gone, walk}

The output should be the same sentence.

I assume the set of words alone will never be enough to regenerate the original sentence. What additional information must be included to recover it? Please guide me as to where I should start.

Answer 1:

A language model takes in a text or sentence (any sequence of words) and assigns it a probability based on how well the model "recognizes" that text.

To solve your problem, you could take a language model and use it to compute the probability of each possible permutation of the input words. The most probable sentence according to the model is probably the most coherent one.

For a situation like yours, an n-gram model (with n ≥ 2; I think 2 or 3 should be enough) or a Hidden Markov model that uses part-of-speech tags should do the trick.
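As a rough illustration of the permutation-scoring idea, here is a minimal sketch using a bigram model with add-one smoothing. The tiny `CORPUS`, the function names, and the choice of smoothing are all assumptions made for the example; a usable model would be trained on far more text.

```python
import math
from collections import Counter
from itertools import permutations

# Toy training corpus (an assumption for illustration only).
CORPUS = [
    "mary has gone for a walk with her dog",
    "john has gone for a run with his friend",
    "she went for a walk with her dog",
]

def train_bigram(sentences):
    """Count context words and bigrams, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])             # every token that has a successor
        bigrams.update(zip(tokens, tokens[1:]))  # adjacent word pairs
    return contexts, bigrams, len(vocab)

def log_prob(words, model):
    """Log-probability of a word sequence under the add-one smoothed bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + list(words) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

def unscramble(words, model):
    """Return the permutation of `words` that the model scores highest."""
    return max(permutations(words), key=lambda order: log_prob(order, model))

model = train_bigram(CORPUS)
print(" ".join(unscramble(["dog", "with", "a", "her", "walk", "for"], model)))
```

Note that this brute-force search tries all n! orderings, so it is only practical for short sentences; longer inputs would need something like beam search instead.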



Answer 2:

You will not be able to solve your problem without additional information. Take this example:

{"happy", "you", "are"}

Can you reconstruct the sentence? Is it "You are happy" or is it "Are you happy"? Note that the words are the same but the meaning changes radically. No matter how good an algorithm you write, it will not be able to reconstruct the sentence if you cannot.
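A small sketch makes the point concrete: once order is discarded, the two sentences are the same input. The helper name `same_word_set` is made up for this example.

```python
from collections import Counter

def same_word_set(a, b):
    """True if two sentences use exactly the same words, ignoring order and case."""
    return Counter(a.lower().split()) == Counter(b.lower().split())

print(same_word_set("You are happy", "Are you happy"))
```

Any program that sees only the bag of words receives identical input for both sentences, so it must give the same answer for both.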



Answer 3:

You need to do the following to get started:

  1. Maintain a dictionary of English words classified as nouns, adjectives, verbs, etc.
  2. Build grammar rules for the English language, which you can get from any English tutorial.
  3. Try to rearrange the words to match the grammar rules.

Note: English is a very ambiguous language, so you might end up with something else.

For example:

grammar rule: article noun verb

input words: dog, barks, the

dictionary lookup: dog => noun, barks => verb, the => article

Rearrange the words according to the rule.

There can be multiple rules, and a word can also have multiple types, so try all possibilities.
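The steps above could be sketched roughly as follows. The `LEXICON` and `RULES` contents are hypothetical toy data, and `rearrange` is just an illustrative name; a real system would need a much larger dictionary and rule set.

```python
from itertools import permutations, product

# Hypothetical toy lexicon: each word maps to the parts of speech it can take.
LEXICON = {
    "the": {"article"},
    "dog": {"noun"},
    "barks": {"verb", "noun"},  # a word can have more than one type
}

# Hypothetical grammar rules, each a sequence of part-of-speech tags.
RULES = [
    ("article", "noun", "verb"),
]

def rearrange(words, lexicon=LEXICON, rules=RULES):
    """Return every ordering of `words` whose tags match at least one rule."""
    matches = []
    for order in permutations(words):
        # Try every combination of possible tags for this ordering.
        for tags in product(*(sorted(lexicon[w]) for w in order)):
            if tags in rules:
                matches.append(order)
                break  # one matching tagging is enough for this ordering
    return matches

print(rearrange(["dog", "barks", "the"]))
```

Words missing from the lexicon raise a `KeyError` in this sketch, and if several orderings match a rule, all of them are returned, which is where the ambiguity mentioned in the note shows up.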