A StringToken Parser which gives Google Search sty

Posted 2019-02-02 01:51

Seeking a method to:

Take whitespace-separated tokens in a String and return a suggested word for each.


For example:
Google Search can take "fonetic wrd nterpreterr",
and at the top of the results page it shows "Did you mean: phonetic word interpreter"

A solution in any of the C* languages or Java would be preferred.


Are there any existing open libraries that provide this functionality?

Or is there a way to use a Google API to request a suggested word?

8 answers
萌系小妹纸
#2 · 2019-02-02 01:58

You could plug in Lucene, which has a spell-checking facility based on the Levenshtein distance.

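Here's an example from the Wiki. Note that in Lucene's SpellChecker.suggestSimilar(String word, int numSug) the second argument (2 here) is the maximum number of suggestions to return, not an edit distance.

// spellChecker is an org.apache.lucene.search.spell.SpellChecker backed by a dictionary index
String[] l = spellChecker.suggestSimilar("sevanty", 2);
// l[0] = "seventy"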
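
To make the underlying idea concrete, here is a minimal sketch of Levenshtein-based suggestion in plain Java. It is not Lucene's implementation and does not use any Lucene API; the class name ClosestWord and the methods levenshtein and suggest are hypothetical names chosen for illustration. It simply scans a small in-memory word list, which would be far too slow for a real dictionary.

```java
import java.util.List;

// Illustrative sketch only (not Lucene): picks the dictionary word with the
// smallest Levenshtein distance to the input. All names here are hypothetical.
class ClosestWord {

    /** Dynamic-programming Levenshtein distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows so prev always holds the latest row
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the closest dictionary word (first wins on ties), or null if the list is empty. */
    static String suggest(String word, List<String> dictionary) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : dictionary) {
            int d = levenshtein(word, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }
}
```

Calling ClosestWord.suggest("sevanty", Arrays.asList("seven", "seventy", "sanity")) would pick "seventy", since it is one substitution away from the input.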
一纸荒年 Trace。
#3 · 2019-02-02 01:58
叼着烟拽天下
#4 · 2019-02-02 02:06

If you have a dictionary stored as a trie, there is a fairly straightforward way to find best-matching entries, where characters can be inserted, deleted, or replaced.

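void match(trie t, char* w, string s, int budget){
  if (budget < 0) return;
  /* all of w consumed at the end of a dictionary word: s is a match */
  /* (isWord is an assumed end-of-word flag on the trie node) */
  if (*w=='\0' && t.isWord) print s;
  foreach (char c, subtrie t1 in t){
    /* try matching c, or replacing *w with c */
    if (*w!='\0') match(t1, w+1, s+c, (*w==c ? budget : budget-1));
    /* try c as a character that is missing from w */
    match(t1, w, s+c, budget-1);
  }
  /* try treating *w as an extra character and skipping it */
  if (*w!='\0') match(t, w+1, s, budget-1);
}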

The idea is that first you call it with a budget of zero, and see if it prints anything out. Then try a budget of 1, and so on, until it prints out some matches. The bigger the budget the longer it takes. You might want to only go up to a budget of 2.
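
As a rough, runnable illustration of the budgeted trie search described above, here is a minimal Java sketch. The class FuzzyTrie, its Node type, the isWord flag, and the suggest helper are all hypothetical names introduced for this example. It assumes the string being built should always spell the dictionary word, and that a match only counts when the walk ends on a node that terminates a word.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Illustrative sketch; all names are hypothetical.
class FuzzyTrie {
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // true if the path from the root spells a dictionary word
    }

    private final Node root = new Node();

    void add(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.isWord = true;
    }

    /** Returns every dictionary word within `budget` edits of w, sorted. */
    List<String> match(String w, int budget) {
        // A set, because several edit paths can reach the same word.
        TreeSet<String> found = new TreeSet<>();
        match(root, w, 0, new StringBuilder(), budget, found);
        return new ArrayList<>(found);
    }

    private void match(Node t, String w, int i, StringBuilder s, int budget,
                       TreeSet<String> found) {
        if (budget < 0) return;
        if (i == w.length() && t.isWord) found.add(s.toString());
        for (Map.Entry<Character, Node> e : t.children.entrySet()) {
            char c = e.getKey();
            s.append(c);
            // Match c for free, or replace w[i] with c at a cost of 1.
            if (i < w.length()) {
                match(e.getValue(), w, i + 1, s,
                      w.charAt(i) == c ? budget : budget - 1, found);
            }
            // The dictionary word has a character c that is missing from w.
            match(e.getValue(), w, i, s, budget - 1, found);
            s.setLength(s.length() - 1); // backtrack
        }
        // w has an extra character that the dictionary word lacks: skip it.
        if (i < w.length()) match(t, w, i + 1, s, budget - 1, found);
    }

    /** Raises the budget from 0 to maxBudget and returns the first non-empty result. */
    List<String> suggest(String w, int maxBudget) {
        for (int budget = 0; budget <= maxBudget; budget++) {
            List<String> hits = match(w, budget);
            if (!hits.isEmpty()) return hits;
        }
        return new ArrayList<>();
    }
}
```

With the words "phonetic", "word", "ward", "world" and "interpreter" added, suggest("wrd", 2) stops at a budget of 1 and returns both "ward" and "word", so a real system would still need some ranking (word frequency, for instance) to choose between candidates at the same distance.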

Added: It's not too hard to extend this to handle common prefixes and suffixes. For example, English prefixes like "un", "anti" and "dis" can be in the dictionary, and can then link back to the top of the dictionary. For suffixes like "ism", "'s", and "ed" there can be a separate trie containing just the suffixes, and most words can link to that suffix trie. Then it can handle strange words like "antinationalizationalization".

闹够了就滚
#5 · 2019-02-02 02:07

You can use the yahoo web service here: http://developer.yahoo.com/search/web/V1/spellingSuggestion.html

However, it is only a web service (i.e. there are no language-specific APIs), but it outputs JSON or XML, so it is pretty easy to adapt to any language.

Emotional °昔
#6 · 2019-02-02 02:07

You can also use the Google APIs to spell check. There is an ASP implementation here (I can't take credit for it, though).

疯言疯语
#7 · 2019-02-02 02:08

First off:

Use the one of your choice. I suspect it runs the query against a spell-checking engine with a word suggestion limit of exactly one. It then does nothing if the entire query is valid; otherwise it replaces each misspelled word with that word's best match. In other words, it follows this algorithm (an empty return string means that the query had no problems):

startup()
{
   set the spelling engine's word suggestion limit to 1
}

option 1()
{
   // NextWord is taken to return the position of the next misspelled word, or -1 if there is none
   int currentPosition = engine.NextWord(start the search at word 0, queryString);

   if(currentPosition == -1)
      return empty string; // Query is a-ok.

   while(currentPosition != -1)
   {
       queryString = engine.ReplaceWord(engine.CurrentWord, queryString, the suggestion with index 0);
       currentPosition = engine.NextWord(currentPosition, queryString);
   }

   return queryString;
}
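
The same flow can be sketched in Java without committing to any particular spelling engine. In this minimal example the engine is abstracted as a function from a word to its single best match, returning the word unchanged when it is already valid. That interface, the class name DidYouMean, and the method name suggest are all assumptions made for illustration, not the API of a real library.

```java
import java.util.function.UnaryOperator;

// Illustrative sketch; the engine interface and all names are hypothetical.
class DidYouMean {

    /**
     * Replaces each token with its best match. Returns the corrected query,
     * or an empty string if no token needed to change.
     *
     * @param bestMatch assumed engine: maps a word to its best suggestion and
     *                  returns the word itself (or null) when it has nothing better
     */
    static String suggest(String query, UnaryOperator<String> bestMatch) {
        String trimmed = query.trim();
        if (trimmed.isEmpty()) return "";

        String[] tokens = trimmed.split("\\s+"); // whitespace-separated tokens
        boolean changed = false;
        for (int i = 0; i < tokens.length; i++) {
            String replacement = bestMatch.apply(tokens[i]);
            if (replacement != null && !replacement.equals(tokens[i])) {
                tokens[i] = replacement;
                changed = true;
            }
        }
        // An empty string signals that the query had no problems.
        return changed ? String.join(" ", tokens) : "";
    }
}
```

For instance, with an engine backed by a small map such as w -> fixes.getOrDefault(w, w), where fixes maps "fonetic" to "phonetic", "wrd" to "word" and "nterpreterr" to "interpreter", the query "fonetic wrd nterpreterr" comes back as "phonetic word interpreter", while an already correct query yields the empty string.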