How to find list of possible words from a letter m

2019-01-01 03:17发布

站内文章 / 前沿技术

62 0

梦该遗忘

女 | 书童

私信

可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效，请关闭广告屏蔽插件后再试):

问题:

Lately I have been playing a game on my iPhone called Scramble. Some of you may know this game as Boggle. Essentially, when the game starts you get a matrix of letters like so:

F X I E
A M L O
E W B X
A S T U

The goal of the game is to find as many words as you can that can be formed by chaining letters together. You can start with any letter, and all the letters that surround it are fair game, and then once you move on to the next letter, all the letters that surround that letter are fair game, except for any previously used letters. So in the grid above, for example, I could come up with the words LOB, TUX, SEA, FAME, etc. Words must be at least 3 characters, and no more than NxN characters, which would be 16 in this game but can vary in some implementations. While this game is fun and addictive, I am apparently not very good at it and I wanted to cheat a little bit by making a program that would give me the best possible words (the longer the word the more points you get).

Sample Boggle http://www.boggled.org/sample.gif

I am, unfortunately, not very good with algorithms or their efficiencies and so forth. My first attempt uses a dictionary such as this one (~2.3MB) and does a linear search trying to match combinations with dictionary entries. This takes a very long time to find the possible words, and since you only get 2 minutes per round, it is simply not adequate.

I am interested to see if any Stackoverflowers can come up with more efficient solutions. I am mostly looking for solutions using the Big 3 Ps: Python, PHP, and Perl, although anything with Java or C++ is cool too, since speed is essential.

CURRENT SOLUTIONS:

Adam Rosenfield, Python, ~20s
John Fouhy, Python, ~3s
Kent Fredric, Perl, ~1s
Darius Bacon, Python, ~1s
rvarcher, VB.NET (live link), ~1s
Paolo Bergantino, PHP (live link), ~5s (~2s locally)

BOUNTY:

I am adding a bounty to this question as my way of saying thanks to all the people who pitched in with their programs. Unfortunately I can only give the accepted answer to one of you, so I\'ll measure who has the fastest boggle solver 7 days from now and award the winner the bounty.

Bounty awarded. Thanks to everyone that participated.

回答1:

My answer works like the others here, but I\'ll post it because it looks a bit faster than the other Python solutions, from setting up the dictionary faster. (I checked this against John Fouhy\'s solution.) After setup, the time to solve is down in the noise.

grid = \"fxie amlo ewbx astu\".split()
nrows, ncols = len(grid), len(grid[0])

# A dictionary word that could be a solution must use only the grid\'s
# letters and have length >= 3. (With a case-insensitive match.)
import re
alphabet = \'\'.join(set(\'\'.join(grid)))
bogglable = re.compile(\'[\' + alphabet + \']{3,}$\', re.I).match

words = set(word.rstrip(\'\\n\') for word in open(\'words\') if bogglable(word))
prefixes = set(word[:i] for word in words
               for i in range(2, len(word)+1))

def solve():
    for y, row in enumerate(grid):
        for x, letter in enumerate(row):
            for result in extending(letter, ((x, y),)):
                yield result

def extending(prefix, path):
    if prefix in words:
        yield (prefix, path)
    for (nx, ny) in neighbors(path[-1]):
        if (nx, ny) not in path:
            prefix1 = prefix + grid[ny][nx]
            if prefix1 in prefixes:
                for result in extending(prefix1, path + ((nx, ny),)):
                    yield result

def neighbors((x, y)):
    for nx in range(max(0, x-1), min(x+2, ncols)):
        for ny in range(max(0, y-1), min(y+2, nrows)):
            yield (nx, ny)

Sample usage:

# Print a maximal-length word and its path:
print max(solve(), key=lambda (word, path): len(word))

Edit: Filter out words less than 3 letters long.

Edit 2: I was curious why Kent Fredric\'s Perl solution was faster; it turns out to use regular-expression matching instead of a set of characters. Doing the same in Python about doubles the speed.

回答2:

The fastest solution you\'re going to get will probably involve storing your dictionary in a trie. Then, create a queue of triplets (x, y, s), where each element in the queue corresponds to a prefix s of a word which can be spelled in the grid, ending at location (x, y). Initialize the queue with N x N elements (where N is the size of your grid), one element for each square in the grid. Then, the algorithm proceeds as follows:

While the queue is not empty:
  Dequeue a triple (x, y, s)
  For each square (x\', y\') with letter c adjacent to (x, y):
    If s+c is a word, output s+c
    If s+c is a prefix of a word, insert (x\', y\', s+c) into the queue

If you store your dictionary in a trie, testing if s+c is a word or a prefix of a word can be done in constant time (provided you also keep some extra metadata in each queue datum, such as a pointer to the current node in the trie), so the running time of this algorithm is O(number of words that can be spelled).

[Edit] Here\'s an implementation in Python that I just coded up:

#!/usr/bin/python

class TrieNode:
    def __init__(self, parent, value):
        self.parent = parent
        self.children = [None] * 26
        self.isWord = False
        if parent is not None:
            parent.children[ord(value) - 97] = self

def MakeTrie(dictfile):
    dict = open(dictfile)
    root = TrieNode(None, \'\')
    for word in dict:
        curNode = root
        for letter in word.lower():
            if 97 <= ord(letter) < 123:
                nextNode = curNode.children[ord(letter) - 97]
                if nextNode is None:
                    nextNode = TrieNode(curNode, letter)
                curNode = nextNode
        curNode.isWord = True
    return root

def BoggleWords(grid, dict):
    rows = len(grid)
    cols = len(grid[0])
    queue = []
    words = []
    for y in range(cols):
        for x in range(rows):
            c = grid[y][x]
            node = dict.children[ord(c) - 97]
            if node is not None:
                queue.append((x, y, c, node))
    while queue:
        x, y, s, node = queue[0]
        del queue[0]
        for dx, dy in ((1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)):
            x2, y2 = x + dx, y + dy
            if 0 <= x2 < cols and 0 <= y2 < rows:
                s2 = s + grid[y2][x2]
                node2 = node.children[ord(grid[y2][x2]) - 97]
                if node2 is not None:
                    if node2.isWord:
                        words.append(s2)
                    queue.append((x2, y2, s2, node2))

    return words

Example usage:

d = MakeTrie(\'/usr/share/dict/words\')
print(BoggleWords([\'fxie\',\'amlo\',\'ewbx\',\'astu\'], d))

Output:

[\'fa\', \'xi\', \'ie\', \'io\', \'el\', \'am\', \'ax\', \'ae\', \'aw\', \'mi\', \'ma\', \'me\', \'lo\', \'li\', \'oe\', \'ox\', \'em\', \'ea\', \'ea\', \'es\', \'wa\', \'we\', \'wa\', \'bo\', \'bu\', \'as\', \'aw\', \'ae\', \'st\', \'se\', \'sa\', \'tu\', \'ut\', \'fam\', \'fae\', \'imi\', \'eli\', \'elm\', \'elb\', \'ami\', \'ama\', \'ame\', \'aes\', \'awl\', \'awa\', \'awe\', \'awa\', \'mix\', \'mim\', \'mil\', \'mam\', \'max\', \'mae\', \'maw\', \'mew\', \'mem\', \'mes\', \'lob\', \'lox\', \'lei\', \'leo\', \'lie\', \'lim\', \'oil\', \'olm\', \'ewe\', \'eme\', \'wax\', \'waf\', \'wae\', \'waw\', \'wem\', \'wea\', \'wea\', \'was\', \'waw\', \'wae\', \'bob\', \'blo\', \'bub\', \'but\', \'ast\', \'ase\', \'asa\', \'awl\', \'awa\', \'awe\', \'awa\', \'aes\', \'swa\', \'swa\', \'sew\', \'sea\', \'sea\', \'saw\', \'tux\', \'tub\', \'tut\', \'twa\', \'twa\', \'tst\', \'utu\', \'fama\', \'fame\', \'ixil\', \'imam\', \'amli\', \'amil\', \'ambo\', \'axil\', \'axle\', \'mimi\', \'mima\', \'mime\', \'milo\', \'mile\', \'mewl\', \'mese\', \'mesa\', \'lolo\', \'lobo\', \'lima\', \'lime\', \'limb\', \'lile\', \'oime\', \'oleo\', \'olio\', \'oboe\', \'obol\', \'emim\', \'emil\', \'east\', \'ease\', \'wame\', \'wawa\', \'wawa\', \'weam\', \'west\', \'wese\', \'wast\', \'wase\', \'wawa\', \'wawa\', \'boil\', \'bolo\', \'bole\', \'bobo\', \'blob\', \'bleo\', \'bubo\', \'asem\', \'stub\', \'stut\', \'swam\', \'semi\', \'seme\', \'seam\', \'seax\', \'sasa\', \'sawt\', \'tutu\', \'tuts\', \'twae\', \'twas\', \'twae\', \'ilima\', \'amble\', \'axile\', \'awest\', \'mamie\', \'mambo\', \'maxim\', \'mease\', \'mesem\', \'limax\', \'limes\', \'limbo\', \'limbu\', \'obole\', \'emesa\', \'embox\', \'awest\', \'swami\', \'famble\', \'mimble\', \'maxima\', \'embolo\', \'embole\', \'wamble\', \'semese\', \'semble\', \'sawbwa\', \'sawbwa\']

Notes: This program doesn\'t output 1-letter words, or filter by word length at all. That\'s easy to add but not really relevant to the problem. It also outputs some words multiple times if they can be spelled in multiple ways. If a given word can be spelled in many different ways (worst case: every letter in the grid is the same (e.g. \'A\') and a word like \'aaaaaaaaaa\' is in your dictionary), then the running time will get horribly exponential. Filtering out duplicates and sorting is trivial to due after the algorithm has finished.

回答3:

For a dictionary speedup, there is one general transformation/process you can do to greatly reduce the dictionary comparisons ahead of time.

Given that the above grid contains only 16 characters, some of them duplicate, you can greatly reduce the number of total keys in your dictionary by simply filtering out entries that have unattainable characters.

I thought this was the obvious optimization but seeing nobody did it I\'m mentioning it.

It reduced me from a dictionary of 200,000 keys to only 2,000 keys simply during the input pass. This at the very least reduces memory overhead, and that\'s sure to map to a speed increase somewhere as memory isn\'t infinitely fast.

Perl Implementation

My implementation is a bit top-heavy because I placed importance on being able to know the exact path of every extracted string, not just the validity therein.

I also have a few adaptions in there that would theoretically permit a grid with holes in it to function, and grids with different sized lines ( assuming you get the input right and it lines up somehow ).

The early-filter is by far the most significant bottleneck in my application, as suspected earlier, commenting out that line bloats it from 1.5s to 7.5s.

Upon execution it appears to think all the single digits are on their own valid words, but I\'m pretty sure thats due to how the dictionary file works.

Its a bit bloated, but at least I reuse Tree::Trie from cpan

Some of it was inspired partially by the existing implementations, some of it I had in mind already.

Constructive Criticism and ways it could be improved welcome ( /me notes he never searched CPAN for a boggle solver, but this was more fun to work out )

updated for new criteria

#!/usr/bin/perl 

use strict;
use warnings;

{

  # this package manages a given path through the grid.
  # Its an array of matrix-nodes in-order with
  # Convenience functions for pretty-printing the paths
  # and for extending paths as new paths.

  # Usage:
  # my $p = Prefix->new(path=>[ $startnode ]);
  # my $c = $p->child( $extensionNode );
  # print $c->current_word ;

  package Prefix;
  use Moose;

  has path => (
      isa     => \'ArrayRef[MatrixNode]\',
      is      => \'rw\',
      default => sub { [] },
  );
  has current_word => (
      isa        => \'Str\',
      is         => \'rw\',
      lazy_build => 1,
  );

  # Create a clone of this object
  # with a longer path

  # $o->child( $successive-node-on-graph );

  sub child {
      my $self    = shift;
      my $newNode = shift;
      my $f       = Prefix->new();

      # Have to do this manually or other recorded paths get modified
      push @{ $f->{path} }, @{ $self->{path} }, $newNode;
      return $f;
  }

  # Traverses $o->path left-to-right to get the string it represents.

  sub _build_current_word {
      my $self = shift;
      return join q{}, map { $_->{value} } @{ $self->{path} };
  }

  # Returns  the rightmost node on this path

  sub tail {
      my $self = shift;
      return $self->{path}->[-1];
  }

  # pretty-format $o->path

  sub pp_path {
      my $self = shift;
      my @path =
        map { \'[\' . $_->{x_position} . \',\' . $_->{y_position} . \']\' }
        @{ $self->{path} };
      return \"[\" . join( \",\", @path ) . \"]\";
  }

  # pretty-format $o
  sub pp {
      my $self = shift;
      return $self->current_word . \' => \' . $self->pp_path;
  }

  __PACKAGE__->meta->make_immutable;
}

{

  # Basic package for tracking node data
  # without having to look on the grid.
  # I could have just used an array or a hash, but that got ugly.

# Once the matrix is up and running it doesn\'t really care so much about rows/columns,
# Its just a sea of points and each point has adjacent points.
# Relative positioning is only really useful to map it back to userspace

  package MatrixNode;
  use Moose;

  has x_position => ( isa => \'Int\', is => \'rw\', required => 1 );
  has y_position => ( isa => \'Int\', is => \'rw\', required => 1 );
  has value      => ( isa => \'Str\', is => \'rw\', required => 1 );
  has siblings   => (
      isa     => \'ArrayRef[MatrixNode]\',
      is      => \'rw\',
      default => sub { [] }
  );

# Its not implicitly uni-directional joins. It would be more effient in therory
# to make the link go both ways at the same time, but thats too hard to program around.
# and besides, this isn\'t slow enough to bother caring about.

  sub add_sibling {
      my $self    = shift;
      my $sibling = shift;
      push @{ $self->siblings }, $sibling;
  }

  # Convenience method to derive a path starting at this node

  sub to_path {
      my $self = shift;
      return Prefix->new( path => [$self] );
  }
  __PACKAGE__->meta->make_immutable;

}

{

  package Matrix;
  use Moose;

  has rows => (
      isa     => \'ArrayRef\',
      is      => \'rw\',
      default => sub { [] },
  );

  has regex => (
      isa        => \'Regexp\',
      is         => \'rw\',
      lazy_build => 1,
  );

  has cells => (
      isa        => \'ArrayRef\',
      is         => \'rw\',
      lazy_build => 1,
  );

  sub add_row {
      my $self = shift;
      push @{ $self->rows }, [@_];
  }

  # Most of these functions from here down are just builder functions,
  # or utilities to help build things.
  # Some just broken out to make it easier for me to process.
  # All thats really useful is add_row
  # The rest will generally be computed, stored, and ready to go
  # from ->cells by the time either ->cells or ->regex are called.

  # traverse all cells and make a regex that covers them.
  sub _build_regex {
      my $self  = shift;
      my $chars = q{};
      for my $cell ( @{ $self->cells } ) {
          $chars .= $cell->value();
      }
      $chars = \"[^$chars]\";
      return qr/$chars/i;
  }

  # convert a plain cell ( ie: [x][y] = 0 )
  # to an intelligent cell ie: [x][y] = object( x, y )
  # we only really keep them in this format temporarily
  # so we can go through and tie in neighbouring information.
  # after the neigbouring is done, the grid should be considered inoperative.

  sub _convert {
      my $self = shift;
      my $x    = shift;
      my $y    = shift;
      my $v    = $self->_read( $x, $y );
      my $n    = MatrixNode->new(
          x_position => $x,
          y_position => $y,
          value      => $v,
      );
      $self->_write( $x, $y, $n );
      return $n;
  }

# go through the rows/collums presently available and freeze them into objects.

  sub _build_cells {
      my $self = shift;
      my @out  = ();
      my @rows = @{ $self->{rows} };
      for my $x ( 0 .. $#rows ) {
          next unless defined $self->{rows}->[$x];
          my @col = @{ $self->{rows}->[$x] };
          for my $y ( 0 .. $#col ) {
              next unless defined $self->{rows}->[$x]->[$y];
              push @out, $self->_convert( $x, $y );
          }
      }
      for my $c (@out) {
          for my $n ( $self->_neighbours( $c->x_position, $c->y_position ) ) {
              $c->add_sibling( $self->{rows}->[ $n->[0] ]->[ $n->[1] ] );
          }
      }
      return \\@out;
  }

  # given x,y , return array of points that refer to valid neighbours.
  sub _neighbours {
      my $self = shift;
      my $x    = shift;
      my $y    = shift;
      my @out  = ();
      for my $sx ( -1, 0, 1 ) {
          next if $sx + $x < 0;
          next if not defined $self->{rows}->[ $sx + $x ];
          for my $sy ( -1, 0, 1 ) {
              next if $sx == 0 && $sy == 0;
              next if $sy + $y < 0;
              next if not defined $self->{rows}->[ $sx + $x ]->[ $sy + $y ];
              push @out, [ $sx + $x, $sy + $y ];
          }
      }
      return @out;
  }

  sub _has_row {
      my $self = shift;
      my $x    = shift;
      return defined $self->{rows}->[$x];
  }

  sub _has_cell {
      my $self = shift;
      my $x    = shift;
      my $y    = shift;
      return defined $self->{rows}->[$x]->[$y];
  }

  sub _read {
      my $self = shift;
      my $x    = shift;
      my $y    = shift;
      return $self->{rows}->[$x]->[$y];
  }

  sub _write {
      my $self = shift;
      my $x    = shift;
      my $y    = shift;
      my $v    = shift;
      $self->{rows}->[$x]->[$y] = $v;
      return $v;
  }

  __PACKAGE__->meta->make_immutable;
}

use Tree::Trie;

sub readDict {
  my $fn = shift;
  my $re = shift;
  my $d  = Tree::Trie->new();

  # Dictionary Loading
  open my $fh, \'<\', $fn;
  while ( my $line = <$fh> ) {
      chomp($line);

 # Commenting the next line makes it go from 1.5 seconds to 7.5 seconds. EPIC.
      next if $line =~ $re;    # Early Filter
      $d->add( uc($line) );
  }
  return $d;
}

sub traverseGraph {
  my $d     = shift;
  my $m     = shift;
  my $min   = shift;
  my $max   = shift;
  my @words = ();

  # Inject all grid nodes into the processing queue.

  my @queue =
    grep { $d->lookup( $_->current_word ) }
    map  { $_->to_path } @{ $m->cells };

  while (@queue) {
      my $item = shift @queue;

      # put the dictionary into \"exact match\" mode.

      $d->deepsearch(\'exact\');

      my $cword = $item->current_word;
      my $l     = length($cword);

      if ( $l >= $min && $d->lookup($cword) ) {
          push @words,
            $item;    # push current path into \"words\" if it exactly matches.
      }
      next if $l > $max;

      # put the dictionary into \"is-a-prefix\" mode.
      $d->deepsearch(\'boolean\');

    siblingloop: foreach my $sibling ( @{ $item->tail->siblings } ) {
          foreach my $visited ( @{ $item->{path} } ) {
              next siblingloop if $sibling == $visited;
          }

          # given path y , iterate for all its end points
          my $subpath = $item->child($sibling);

          # create a new path for each end-point
          if ( $d->lookup( $subpath->current_word ) ) {

             # if the new path is a prefix, add it to the bottom of the queue.
              push @queue, $subpath;
          }
      }
  }
  return \\@words;
}

sub setup_predetermined { 
  my $m = shift; 
  my $gameNo = shift;
  if( $gameNo == 0 ){
      $m->add_row(qw( F X I E ));
      $m->add_row(qw( A M L O ));
      $m->add_row(qw( E W B X ));
      $m->add_row(qw( A S T U ));
      return $m;
  }
  if( $gameNo == 1 ){
      $m->add_row(qw( D G H I ));
      $m->add_row(qw( K L P S ));
      $m->add_row(qw( Y E U T ));
      $m->add_row(qw( E O R N ));
      return $m;
  }
}
sub setup_random { 
  my $m = shift; 
  my $seed = shift;
  srand $seed;
  my @letters = \'A\' .. \'Z\' ; 
  for( 1 .. 4 ){ 
      my @r = ();
      for( 1 .. 4 ){
          push @r , $letters[int(rand(25))];
      }
      $m->add_row( @r );
  }
}

# Here is where the real work starts.

my $m = Matrix->new();
setup_predetermined( $m, 0 );
#setup_random( $m, 5 );

my $d = readDict( \'dict.txt\', $m->regex );
my $c = scalar @{ $m->cells };    # get the max, as per spec

print join \",\\n\", map { $_->pp } @{
  traverseGraph( $d, $m, 3, $c ) ;
};

Arch/execution info for comparison:

model name      : Intel(R) Core(TM)2 Duo CPU     T9300  @ 2.50GHz
cache size      : 6144 KB
Memory usage summary: heap total: 77057577, heap peak: 11446200, stack peak: 26448
       total calls   total memory   failed calls
 malloc|     947212       68763684              0
realloc|      11191        1045641              0  (nomove:9063, dec:4731, free:0)
 calloc|     121001        7248252              0
   free|     973159       65854762

Histogram for block sizes:
  0-15         392633  36% ==================================================
 16-31          43530   4% =====
 32-47          50048   4% ======
 48-63          70701   6% =========
 64-79          18831   1% ==
 80-95          19271   1% ==
 96-111        238398  22% ==============================
112-127          3007  <1% 
128-143        236727  21% ==============================

More Mumblings on that Regex Optimization

The regex optimization I use is useless for multi-solve dictionaries, and for multi-solve you\'ll want a full dictionary, not a pre-trimmed one.

However, that said, for one-off solves, its really fast. ( Perl regex are in C! :) )

Here is some varying code additions:

sub readDict_nofilter {
  my $fn = shift;
  my $re = shift;
  my $d  = Tree::Trie->new();

  # Dictionary Loading
  open my $fh, \'<\', $fn;
  while ( my $line = <$fh> ) {
      chomp($line);
      $d->add( uc($line) );
  }
  return $d;
}

sub benchmark_io { 
  use Benchmark qw( cmpthese :hireswallclock );
   # generate a random 16 character string 
   # to simulate there being an input grid. 
  my $regexen = sub { 
      my @letters = \'A\' .. \'Z\' ; 
      my @lo = ();
      for( 1..16 ){ 
          push @lo , $_ ; 
      }
      my $c  = join \'\', @lo;
      $c = \"[^$c]\";
      return qr/$c/i;
  };
  cmpthese( 200 , { 
      filtered => sub { 
          readDict(\'dict.txt\', $regexen->() );
      }, 
      unfiltered => sub {
          readDict_nofilter(\'dict.txt\');
      }
  });
}

           s/iter unfiltered   filtered
unfiltered   8.16         --       -94%
filtered    0.464      1658%         --

_{ps: 8.16 * 200 = 27 minutes.}

回答4:

You could split the problem up into two pieces:

Some kind of search algorithm that will enumerate possible strings in the grid.
A way of testing whether a string is a valid word.

Ideally, (2) should also include a way of testing whether a string is a prefix of a valid word – this will allow you to prune your search and save a whole heap of time.

Adam Rosenfield\'s Trie is a solution to (2). It\'s elegant and probably what your algorithms specialist would prefer, but with modern languages and modern computers, we can be a bit lazier. Also, as Kent suggests, we can reduce our dictionary size by discarding words that have letters not present in the grid. Here\'s some python:

def make_lookups(grid, fn=\'dict.txt\'):
    # Make set of valid characters.
    chars = set()
    for word in grid:
        chars.update(word)

    words = set(x.strip() for x in open(fn) if set(x.strip()) <= chars)
    prefixes = set()
    for w in words:
        for i in range(len(w)+1):
            prefixes.add(w[:i])

    return words, prefixes

Wow; constant-time prefix testing. It takes a couple of seconds to load the dictionary you linked, but only a couple :-) (notice that words <= prefixes)

Now, for part (1), I\'m inclined to think in terms of graphs. So I\'ll build a dictionary that looks something like this:

graph = { (x, y):set([(x0,y0), (x1,y1), (x2,y2)]), }

i.e. graph[(x, y)] is the set of coordinates that you can reach from position (x, y). I\'ll also add a dummy node None which will connect to everything.

Building it\'s a bit clumsy, because there\'s 8 possible positions and you have to do bounds checking. Here\'s some correspondingly-clumsy python code:

def make_graph(grid):
    root = None
    graph = { root:set() }
    chardict = { root:\'\' }

    for i, row in enumerate(grid):
        for j, char in enumerate(row):
            chardict[(i, j)] = char
            node = (i, j)
            children = set()
            graph[node] = children
            graph[root].add(node)
            add_children(node, children, grid)

    return graph, chardict

def add_children(node, children, grid):
    x0, y0 = node
    for i in [-1,0,1]:
        x = x0 + i
        if not (0 <= x < len(grid)):
            continue
        for j in [-1,0,1]:
            y = y0 + j
            if not (0 <= y < len(grid[0])) or (i == j == 0):
                continue

            children.add((x,y))

This code also builds up a dictionary mapping (x,y) to the corresponding character. This lets me turn a list of positions into a word:

def to_word(chardict, pos_list):
    return \'\'.join(chardict[x] for x in pos_list)

Finally, we do a depth-first search. The basic procedure is:

The search arrives at a particular node.
Check if the path so far could be part of a word. If not, don\'t explore this branch any further.
Check if the path so far is a word. If so, add to the list of results.
Explore all children not part of the path so far.

Python:

def find_words(graph, chardict, position, prefix, results, words, prefixes):
    \"\"\" Arguments:
      graph :: mapping (x,y) to set of reachable positions
      chardict :: mapping (x,y) to character
      position :: current position (x,y) -- equals prefix[-1]
      prefix :: list of positions in current string
      results :: set of words found
      words :: set of valid words in the dictionary
      prefixes :: set of valid words or prefixes thereof
    \"\"\"
    word = to_word(chardict, prefix)

    if word not in prefixes:
        return

    if word in words:
        results.add(word)

    for child in graph[position]:
        if child not in prefix:
            find_words(graph, chardict, child, prefix+[child], results, words, prefixes)

Run the code as:

grid = [\'fxie\', \'amlo\', \'ewbx\', \'astu\']
g, c = make_graph(grid)
w, p = make_lookups(grid)
res = set()
find_words(g, c, None, [], res, w, p)

and inspect res to see the answers. Here\'s a list of words found for your example, sorted by size:

 [\'a\', \'b\', \'e\', \'f\', \'i\', \'l\', \'m\', \'o\', \'s\', \'t\',
 \'u\', \'w\', \'x\', \'ae\', \'am\', \'as\', \'aw\', \'ax\', \'bo\',
 \'bu\', \'ea\', \'el\', \'em\', \'es\', \'fa\', \'ie\', \'io\', \'li\',
 \'lo\', \'ma\', \'me\', \'mi\', \'oe\', \'ox\', \'sa\', \'se\', \'st\',
 \'tu\', \'ut\', \'wa\', \'we\', \'xi\', \'aes\', \'ame\', \'ami\',
 \'ase\', \'ast\', \'awa\', \'awe\', \'awl\', \'blo\', \'but\', \'elb\',
 \'elm\', \'fae\', \'fam\', \'lei\', \'lie\', \'lim\', \'lob\', \'lox\',
 \'mae\', \'maw\', \'mew\', \'mil\', \'mix\', \'oil\', \'olm\', \'saw\',
 \'sea\', \'sew\', \'swa\', \'tub\', \'tux\', \'twa\', \'wae\', \'was\',
 \'wax\', \'wem\', \'ambo\', \'amil\', \'amli\', \'asem\', \'axil\',
 \'axle\', \'bleo\', \'boil\', \'bole\', \'east\', \'fame\', \'limb\',
 \'lime\', \'mesa\', \'mewl\', \'mile\', \'milo\', \'oime\', \'sawt\',
 \'seam\', \'seax\', \'semi\', \'stub\', \'swam\', \'twae\', \'twas\',
 \'wame\', \'wase\', \'wast\', \'weam\', \'west\', \'amble\', \'awest\',
 \'axile\', \'embox\', \'limbo\', \'limes\', \'swami\', \'embole\',
 \'famble\', \'semble\', \'wamble\']

The code takes (literally) a couple of seconds to load the dictionary, but the rest is instant on my machine.

回答5:

My attempt in Java. It takes about 2 s to read file and build trie, and around 50 ms to solve the puzzle. I used the dictionary linked in the question (it has a few words that I didn\'t know exist in English such as fae, ima)

0 [main] INFO gineer.bogglesolver.util.Util  - Reading the dictionary
2234 [main] INFO gineer.bogglesolver.util.Util  - Finish reading the dictionary
2234 [main] INFO gineer.bogglesolver.Solver  - Found: FAM
2234 [main] INFO gineer.bogglesolver.Solver  - Found: FAME
2234 [main] INFO gineer.bogglesolver.Solver  - Found: FAMBLE
2234 [main] INFO gineer.bogglesolver.Solver  - Found: FAE
2234 [main] INFO gineer.bogglesolver.Solver  - Found: IMA
2234 [main] INFO gineer.bogglesolver.Solver  - Found: ELI
2234 [main] INFO gineer.bogglesolver.Solver  - Found: ELM
2234 [main] INFO gineer.bogglesolver.Solver  - Found: ELB
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AXIL
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AXILE
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AXLE
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AMI
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AMIL
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AMLI
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AME
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AMBLE
2234 [main] INFO gineer.bogglesolver.Solver  - Found: AMBO
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AES
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWEST
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MIX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MIL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MILE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MILO
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MAX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MAE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MAW
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MEW
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MEWL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MES
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MESA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: MWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIMA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIMAX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIME
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIMES
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIMB
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIMBO
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LIMBU
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LEI
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LEO
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LOB
2250 [main] INFO gineer.bogglesolver.Solver  - Found: LOX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: OIME
2250 [main] INFO gineer.bogglesolver.Solver  - Found: OIL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: OLE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: OLM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: EMIL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: EMBOLE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: EMBOX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: EAST
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAF
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAME
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAMBLE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WEA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WEAM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WEM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WEA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WES
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WEST
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAS
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WASE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: WAST
2250 [main] INFO gineer.bogglesolver.Solver  - Found: BLEO
2250 [main] INFO gineer.bogglesolver.Solver  - Found: BLO
2250 [main] INFO gineer.bogglesolver.Solver  - Found: BOIL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: BOLE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: BUT
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AES
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWL
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AWEST
2250 [main] INFO gineer.bogglesolver.Solver  - Found: ASE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: ASEM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: AST
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEAX
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEAM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEMI
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEMBLE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEW
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SEA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SWAM
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SWAMI
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SAW
2250 [main] INFO gineer.bogglesolver.Solver  - Found: SAWT
2250 [main] INFO gineer.bogglesolver.Solver  - Found: STU
2250 [main] INFO gineer.bogglesolver.Solver  - Found: STUB
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TWAE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TWA
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TWAE
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TWAS
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TUB
2250 [main] INFO gineer.bogglesolver.Solver  - Found: TUX

Source code consists of 6 classes. I\'ll post them below (if this is not the right practice on StackOverflow, please tell me).

gineer.bogglesolver.Main

package gineer.bogglesolver;

import org.apache.log4j.BasicConfigurator;
import org.apache.log4j.Logger;

public class Main
{
    private final static Logger logger = Logger.getLogger(Main.class);

    public static void main(String[] args)
    {
        BasicConfigurator.configure();

        Solver solver = new Solver(4,
                        \"FXIE\" +
                        \"AMLO\" +
                        \"EWBX\" +
                        \"ASTU\");
        solver.solve();

    }
}

gineer.bogglesolver.Solver

package gineer.bogglesolver;

import gineer.bogglesolver.trie.Trie;
import gineer.bogglesolver.util.Constants;
import gineer.bogglesolver.util.Util;
import org.apache.log4j.Logger;

public class Solver
{
    private char[] puzzle;
    private int maxSize;

    private boolean[] used;
    private StringBuilder stringSoFar;

    private boolean[][] matrix;
    private Trie trie;

    private final static Logger logger = Logger.getLogger(Solver.class);

    public Solver(int size, String puzzle)
    {
        trie = Util.getTrie(size);
        matrix = Util.connectivityMatrix(size);

        maxSize = size * size;
        stringSoFar = new StringBuilder(maxSize);
        used = new boolean[maxSize];

        if (puzzle.length() == maxSize)
        {
            this.puzzle = puzzle.toCharArray();
        }
        else
        {
            logger.error(\"The puzzle size does not match the size specified: \" + puzzle.length());
            this.puzzle = puzzle.substring(0, maxSize).toCharArray();
        }
    }

    public void solve()
    {
        for (int i = 0; i < maxSize; i++)
        {
            traverseAt(i);
        }
    }

    private void traverseAt(int origin)
    {
        stringSoFar.append(puzzle[origin]);
        used[origin] = true;

        //Check if we have a valid word
        if ((stringSoFar.length() >= Constants.MINIMUM_WORD_LENGTH) && (trie.containKey(stringSoFar.toString())))
        {
            logger.info(\"Found: \" + stringSoFar.toString());
        }

        //Find where to go next
        for (int destination = 0; destination < maxSize; destination++)
        {
            if (matrix[origin][destination] && !used[destination] && trie.containPrefix(stringSoFar.toString() + puzzle[destination]))
            {
                traverseAt(destination);
            }
        }

        used[origin] = false;
        stringSoFar.deleteCharAt(stringSoFar.length() - 1);
    }

}

gineer.bogglesolver.trie.Node

package gineer.bogglesolver.trie;

import gineer.bogglesolver.util.Constants;

class Node
{
    Node[] children;
    boolean isKey;

    public Node()
    {
        isKey = false;
        children = new Node[Constants.NUMBER_LETTERS_IN_ALPHABET];
    }

    public Node(boolean key)
    {
        isKey = key;
        children = new Node[Constants.NUMBER_LETTERS_IN_ALPHABET];
    }

    /**
     Method to insert a string to Node and its children

     @param key the string to insert (the string is assumed to be uppercase)
     @return true if the node or one of its children is changed, false otherwise
     */
    public boolean insert(String key)
    {
        //If the key is empty, this node is a key
        if (key.length() == 0)
        {
            if (isKey)
                return false;
            else
            {
                isKey = true;
                return true;
            }
        }
        else
        {//otherwise, insert in one of its child

            int childNodePosition = key.charAt(0) - Constants.LETTER_A;
            if (children[childNodePosition] == null)
            {
                children[childNodePosition] = new Node();
                children[childNodePosition].insert(key.substring(1));
                return true;
            }
            else
            {
                return children[childNodePosition].insert(key.substring(1));
            }
        }
    }

    /**
     Returns whether key is a valid prefix for certain key in this trie.
     For example: if key \"hello\" is in this trie, tests with all prefixes \"hel\", \"hell\", \"hello\" return true

     @param prefix the prefix to check
     @return true if the prefix is valid, false otherwise
     */
    public boolean containPrefix(String prefix)
    {
        //If the prefix is empty, return true
        if (prefix.length() == 0)
        {
            return true;
        }
        else
        {//otherwise, check in one of its child
            int childNodePosition = prefix.charAt(0) - Constants.LETTER_A;
            return children[childNodePosition] != null && children[childNodePosition].containPrefix(prefix.substring(1));
        }
    }

    /**
     Returns whether key is a valid key in this trie.
     For example: if key \"hello\" is in this trie, tests with all prefixes \"hel\", \"hell\" return false

     @param key the key to check
     @return true if the key is valid, false otherwise
     */
    public boolean containKey(String key)
    {
        //If the prefix is empty, return true
        if (key.length() == 0)
        {
            return isKey;
        }
        else
        {//otherwise, check in one of its child
            int childNodePosition = key.charAt(0) - Constants.LETTER_A;
            return children[childNodePosition] != null && children[childNodePosition].containKey(key.substring(1));
        }
    }

    public boolean isKey()
    {
        return isKey;
    }

    public void setKey(boolean key)
    {
        isKey = key;
    }
}

gineer.bogglesolver.trie.Trie

package gineer.bogglesolver.trie;

public class Trie
{
    Node root;

    public Trie()
    {
        this.root = new Node();
    }

    /**
     Method to insert a string to Node and its children

     @param key the string to insert (the string is assumed to be uppercase)
     @return true if the node or one of its children is changed, false otherwise
     */
    public boolean insert(String key)
    {
        return root.insert(key.toUpperCase());
    }

    /**
     Returns whether key is a valid prefix for certain key in this trie.
     For example: if key \"hello\" is in this trie, tests with all prefixes \"hel\", \"hell\", \"hello\" return true

     @param prefix the prefix to check
     @return true if the prefix is valid, false otherwise
     */
    public boolean containPrefix(String prefix)
    {
        return root.containPrefix(prefix.toUpperCase());
    }

    /**
     Returns whether key is a valid key in this trie.
     For example: if key \"hello\" is in this trie, tests with all prefixes \"hel\", \"hell\" return false

     @param key the key to check
     @return true if the key is valid, false otherwise
     */
    public boolean containKey(String key)
    {
        return root.containKey(key.toUpperCase());
    }


}

gineer.bogglesolver.util.Constants

package gineer.bogglesolver.util;

public class Constants
{

    public static final int NUMBER_LETTERS_IN_ALPHABET = 26;
    public static final char LETTER_A = \'A\';
    public static final int MINIMUM_WORD_LENGTH = 3;
    public static final int DEFAULT_PUZZLE_SIZE = 4;
}

gineer.bogglesolver.util.Util

package gineer.bogglesolver.util;

import gineer.bogglesolver.trie.Trie;
import org.apache.log4j.Logger;

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class Util
{
    private final static Logger logger = Logger.getLogger(Util.class);
    private static Trie trie;
    private static int size = Constants.DEFAULT_PUZZLE_SIZE;

    /**
     Returns the trie built from the dictionary.  The size is used to eliminate words that are too long.

     @param size the size of puzzle.  The maximum lenght of words in the returned trie is (size * size)
     @return the trie that can be used for puzzle of that size
     */
    public static Trie getTrie(int size)
    {
        if ((trie != null) && size == Util.size)
            return trie;

        trie = new Trie();
        Util.size = size;

        logger.info(\"Reading the dictionary\");
        final File file = new File(\"dictionary.txt\");
        try
        {
            Scanner scanner = new Scanner(file);
            final int maxSize = size * size;
            while (scanner.hasNext())
            {
                String line = scanner.nextLine().replaceAll(\"[^\\\\p{Alpha}]\", \"\");

                if (line.length() <= maxSize)
                    trie.insert(line);
            }
        }
        catch (FileNotFoundException e)
        {
            logger.error(\"Cannot open file\", e);
        }

        logger.info(\"Finish reading the dictionary\");
        return trie;
    }

    static boolean[] connectivityRow(int x, int y, int size)
    {
        boolean[] squares = new boolean[size * size];
        for (int offsetX = -1; offsetX <= 1; offsetX++)
        {
            for (int offsetY = -1; offsetY <= 1; offsetY++)
            {
                final int calX = x + offsetX;
                final int calY = y + offsetY;
                if ((calX >= 0) && (calX < size) && (calY >= 0) && (calY < size))
                    squares[calY * size + calX] = true;
            }
        }

        squares[y * size + x] = false;//the current x, y is false

        return squares;
    }

    /**
     Returns the matrix of connectivity between two points.  Point i can go to point j iff matrix[i][j] is true
     Square (x, y) is equivalent to point (size * y + x).  For example, square (1,1) is point 5 in a puzzle of size 4

     @param size the size of the puzzle
     @return the connectivity matrix
     */
    public static boolean[][] connectivityMatrix(int size)
    {
        boolean[][] matrix = new boolean[size * size][];
        for (int x = 0; x < size; x++)
        {
            for (int y = 0; y < size; y++)
            {
                matrix[y * size + x] = connectivityRow(x, y, size);
            }
        }
        return matrix;
    }
}

回答6:

I think you will probably spend most of your time trying to match words that can\'t possibly be built by your letter grid. So, the first thing I would do is try to speed up that step and that should get you most of the way there.

For this, I would re-express the grid as a table of possible \"moves\" that you index by the letter-transition you are looking at.

Start by assigning each letter a number from your entire alphabet (A=0, B=1, C=2, ... and so forth).

Let\'s take this example:

h b c d
e e g h
l l k l
m o f p

And for now, lets use the alphabet of the letters we have (usually you\'d probably want to use the same whole alphabet every time):

 b | c | d | e | f | g | h | k | l | m |  o |  p
---+---+---+---+---+---+---+---+---+---+----+----
 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11

Then you make a 2D boolean array that tells whether you have a certain letter transition available:

     |  0  1  2  3  4  5  6  7  8  9 10 11  <- from letter
     |  b  c  d  e  f  g  h  k  l  m  o  p
-----+--------------------------------------
 0 b |     T     T     T  T     
 1 c |  T     T  T     T  T
 2 d |     T           T  T
 3 e |  T  T     T     T  T  T  T
 4 f |                       T  T     T  T
 5 g |  T  T  T  T        T  T  T
 6 h |  T  T  T  T     T     T  T
 7 k |           T  T  T  T     T     T  T
 8 l |           T  T  T  T  T  T  T  T  T
 9 m |                          T     T
10 o |              T        T  T  T
11 p |              T        T  T
 ^
 to letter

Now go through your word list and convert the words to transitions:

hello (6, 3, 8, 8, 10):
6 -> 3, 3 -> 8, 8 -> 8, 8 -> 10

Then check if these transitions are allowed by looking them up in your table:

[6][ 3] : T
[3][ 8] : T
[8][ 8] : T
[8][10] : T

If they are all allowed, there\'s a chance that this word might be found.

For example the word \"helmet\" can be ruled out on the 4th transition (m to e: helMEt), since that entry in your table is false.

And the word hamster can be ruled out, since the first (h to a) transition is not allowed (doesn\'t even exist in your table).

Now, for the probably very few remaining words that you didn\'t eliminate, try to actually find them in the grid the way you\'re doing it now or as suggested in some of the other answers here. This is to avoid false positives that result from jumps between identical letters in your grid. For example the word \"help\" is allowed by the table, but not by the grid.

Some further performance improvement tips on this idea:

Instead of using a 2D array, use a 1D array and simply compute the index of the second letter yourself. So, instead of a 12x12 array like above, make a 1D array of length 144. If you then always use the same alphabet (i.e. a 26x26 = 676x1 array for the standard english alphabet), even if not all letters show up in your grid, you can pre-compute the indices into this 1D array that you need to test to match your dictionary words. For example, the indices for \'hello\' in the example above would be
```
hello (6, 3, 8, 8, 10):
42 (from 6 + 3x12), 99, 104, 128
-> \"hello\" will be stored as 42, 99, 104, 128 in the dictionary
```
Extend the idea to a 3D table (expressed as a 1D array), i.e. all allowed 3-letter combinations. That way you can eliminate even more words immediately and you reduce the number of array lookups for each word by 1: For \'hello\', you only need 3 array lookups: hel, ell, llo. It will be very quick to build this table, by the way, as there are only 400 possible 3-letter-moves in your grid.

Pre-compute the indices of the moves in your grid that you need to include in your table. For the example above, you need to set the following entries to \'True\':

(0,0) (0,1) -> here: h, b : [6][0]
(0,0) (1,0) -> here: h, e : [6][3]
(0,0) (1,1) -> here: h, e : [6][3]
(0,1) (0,0) -> here: b, h : [0][6]
(0,1) (0,2) -> here: b, c : [0][1]
.
:

Also represent your game grid in a 1-D array with 16 entries and have the table pre-computed in 3. contain the indices into this array.

I\'m sure if you use this approach you can get your code to run insanely fast, if you have the dictionary pre-computed and already loaded into memory.

BTW: Another nice thing to do, if you are building a game, is to run these sort of things immediately in the background. Start generating and solving the first game while the user is still looking at the title screen on your app and getting his finger into position to press \"Play\". Then generate and solve the next game as the user plays the previous one. That should give you a lot of time to run your code.

(I like this problem, so I\'ll probably be tempted to implement my proposal in Java sometime in the next days to see how it would actually perform... I\'ll post the code here once I do.)

UPDATE:

Ok, I had some time today and implemented this idea in Java:

class DictionaryEntry {
  public int[] letters;
  public int[] triplets;
}

class BoggleSolver {

  // Constants
  final int ALPHABET_SIZE = 5;  // up to 2^5 = 32 letters
  final int BOARD_SIZE    = 4;  // 4x4 board
  final int[] moves = {-BOARD_SIZE-1, -BOARD_SIZE, -BOARD_SIZE+1, 
                                  -1,                         +1,
                       +BOARD_SIZE-1, +BOARD_SIZE, +BOARD_SIZE+1};


  // Technically constant (calculated here for flexibility, but should be fixed)
  DictionaryEntry[] dictionary; // Processed word list
  int maxWordLength = 0;
  int[] boardTripletIndices; // List of all 3-letter moves in board coordinates

  DictionaryEntry[] buildDictionary(String fileName) throws IOException {
    BufferedReader fileReader = new BufferedReader(new FileReader(fileName));
    String word = fileReader.readLine();
    ArrayList<DictionaryEntry> result = new ArrayList<DictionaryEntry>();
    while (word!=null) {
      if (word.length()>=3) {
        word = word.toUpperCase();
        if (word.length()>maxWordLength) maxWordLength = word.length();
        DictionaryEntry entry = new DictionaryEntry();
        entry.letters  = new int[word.length()  ];
        entry.triplets = new int[word.length()-2];
        int i=0;
        for (char letter: word.toCharArray()) {
          entry.letters[i] = (byte) letter - 65; // Convert ASCII to 0..25
          if (i>=2)
            entry.triplets[i-2] = (((entry.letters[i-2]  << ALPHABET_SIZE) +
                                     entry.letters[i-1]) << ALPHABET_SIZE) +
                                     entry.letters[i];
          i++;
        }
        result.add(entry);
      }
      word = fileReader.readLine();
    }
    return result.toArray(new DictionaryEntry[result.size()]);
  }

  boolean isWrap(int a, int b) { // Checks if move a->b wraps board edge (like 3->4)
    return Math.abs(a%BOARD_SIZE-b%BOARD_SIZE)>1;
  }

  int[] buildTripletIndices() {
    ArrayList<Integer> result = new ArrayList<Integer>();
    for (int a=0; a<BOARD_SIZE*BOARD_SIZE; a++)
      for (int bm: moves) {
        int b=a+bm;
        if ((b>=0) && (b<board.length) && !isWrap(a, b))
          for (int cm: moves) {
            int c=b+cm;
            if ((c>=0) && (c<board.length) && (c!=a) && !isWrap(b, c)) {
              result.add(a);
              result.add(b);
              result.add(c);
            }
          }
      }
    int[] result2 = new int[result.size()];
    int i=0;
    for (Integer r: result) result2[i++] = r;
    return result2;
  }


  // Variables that depend on the actual game layout
  int[] board = new int[BOARD_SIZE*BOARD_SIZE]; // Letters in board
  boolean[] possibleTriplets = new boolean[1 << (ALPHABET_SIZE*3)];

  DictionaryEntry[] candidateWords;
  int candidateCount;

  int[] usedBoardPositions;

  DictionaryEntry[] foundWords;
  int foundCount;

  void initializeBoard(String[] letters) {
    for (int row=0; row<BOARD_SIZE; row++)
      for (int col=0; col<BOARD_SIZE; col++)
        board[row*BOARD_SIZE + col] = (byte) letters[row].charAt(col) - 65;
  }

  void setPossibleTriplets() {
    Arrays.fill(possibleTriplets, false); // Reset list
    int i=0;
    while (i<boardTripletIndices.length) {
      int triplet = (((board[boardTripletIndices[i++]]  << ALPHABET_SIZE) +
                       board[boardTripletIndices[i++]]) << ALPHABET_SIZE) +
                       board[boardTripletIndices[i++]];
      possibleTriplets[triplet] = true; 
    }
  }

  void checkWordTriplets() {
    candidateCount = 0;
    for (DictionaryEntry entry: dictionary) {
      boolean ok = true;
      int len = entry.triplets.length;
      for (int t=0; (t<len) && ok; t++)
        ok = possibleTriplets[entry.triplets[t]];
      if (ok) candidateWords[candidateCount++] = entry;
    }
  }

  void checkWords() { // Can probably be optimized a lot
    foundCount = 0;
    for (int i=0; i<candidateCount; i++) {
      DictionaryEntry candidate = candidateWords[i];
      for (int j=0; j<board.length; j++)
        if (board[j]==candidate.letters[0]) { 
          usedBoardPositions[0] = j;
          if (checkNextLetters(candidate, 1, j)) {
            foundWords[foundCount++] = candidate;
            break;
          }
        }
    }
  }

  boolean checkNextLetters(DictionaryEntry candidate, int letter, int pos) {
    if (letter==candidate.letters.length) return true;
    int match = candidate.letters[letter];
    for (int move: moves) {
      int next=pos+move;
      if ((next>=0) && (next<board.length) && (board[next]==match) && !isWrap(pos, next)) {
        boolean ok = true;
        for (int i=0; (i<letter) && ok; i++)
          ok = usedBoardPositions[i]!=next;
        if (ok) {
          usedBoardPositions[letter] = next;
          if (checkNextLetters(candidate, letter+1, next)) return true;
        }
      }
    }   
    return false;
  }


  // Just some helper functions
  String formatTime(long start, long end, long repetitions) {
    long time = (end-start)/repetitions;
    return time/1000000 + \".\" + (time/100000) % 10 + \"\" + (time/10000) % 10 + \"ms\";
  }

  String getWord(DictionaryEntry entry) {
    char[] result = new char[entry.letters.length];
    int i=0;
    for (int letter: entry.letters)
      result[i++] = (char) (letter+97);
    return new String(result);
  }

  void run() throws IOException {
    long start = System.nanoTime();

    // The following can be pre-computed and should be replaced by constants
    dictionary = buildDictionary(\"C:/TWL06.txt\");
    boardTripletIndices = buildTripletIndices();
    long precomputed = System.nanoTime();


    // The following only needs to run once at the beginning of the program
    candidateWords     = new DictionaryEntry[dictionary.length]; // WAAAY too generous
    foundWords         = new DictionaryEntry[dictionary.length]; // WAAAY too generous
    usedBoardPositions = new int[maxWordLength];
    long initialized = System.nanoTime(); 

    for (int n=1; n<=100; n++) {
      // The following needs to run again for every new board
      initializeBoard(new String[] {\"DGHI\",
                                    \"KLPS\",
                                    \"YEUT\",
                                    \"EORN\"});
      setPossibleTriplets();
      checkWordTriplets();
      checkWords();
    }
    long solved = System.nanoTime();


    // Print out result and statistics
    System.out.println(\"Precomputation finished in \" + formatTime(start, precomputed, 1)+\":\");
    System.out.println(\"  Words in the dictionary: \"+dictionary.length);
    System.out.println(\"  Longest word:            \"+maxWordLength+\" letters\");
    System.out.println(\"  Number of triplet-moves: \"+boardTripletIndices.length/3);
    System.out.println();

    System.out.println(\"Initialization finished in \" + formatTime(precomputed, initialized, 1));
    System.out.println();

    System.out.println(\"Board solved in \"+formatTime(initialized, solved, 100)+\":\");
    System.out.println(\"  Number of candidates: \"+candidateCount);
    System.out.println(\"  Number of actual words: \"+foundCount);
    System.out.println();

    System.out.println(\"Words found:\");
    int w=0;
    System.out.print(\"  \");
    for (int i=0; i<foundCount; i++) {
      System.out.print(getWord(foundWords[i]));
      w++;
      if (w==10) {
        w=0;
        System.out.println(); System.out.print(\"  \");
      } else
        if (i<foundCount-1) System.out.print(\", \");
    }
    System.out.println();
  }

  public static void main(String[] args) throws IOException {
    new BoggleSolver().run();
  }
}

Here are some results:

For the grid from the picture posted in the original question (DGHI...):

Precomputation finished in 239.59ms:
  Words in the dictionary: 178590
  Longest word:            15 letters
  Number of triplet-moves: 408

Initialization finished in 0.22ms

Board solved in 3.70ms:
  Number of candidates: 230
  Number of actual words: 163 

Words found:
  eek, eel, eely, eld, elhi, elk, ern, erupt, erupts, euro
  eye, eyer, ghi, ghis, glee, gley, glue, gluer, gluey, glut
  gluts, hip, hiply, hips, his, hist, kelp, kelps, kep, kepi
  kepis, keps, kept, kern, key, kye, lee, lek, lept, leu
  ley, lunt, lunts, lure, lush, lust, lustre, lye, nus, nut
  nuts, ore, ort, orts, ouph, ouphs, our, oust, out, outre
  outs, oyer, pee, per, pert, phi, phis, pis, pish, plus
  plush, ply, plyer, psi, pst, pul, pule, puler, pun, punt
  punts, pur, pure, puree, purely, pus, push, put, puts, ree
  rely, rep, reply, reps, roe, roue, roup, roups, roust, rout
  routs, rue, rule, ruly, run, runt, runts, rupee, rush, rust
  rut, ruts, ship, shlep, sip, sipe, spue, spun, spur, spurn
  spurt, strep, stroy, stun, stupe, sue, suer, sulk, sulker, sulky
  sun, sup, supe, super, sure, surely, tree, trek, trey, troupe
  troy, true, truly, tule, tun, tup, tups, turn, tush, ups
  urn, uts, yeld, yelk, yelp, yelps, yep, yeps, yore, you
  your, yourn, yous

For the letters posted as the example in the original question (FXIE...)

Precomputation finished in 239.68ms:
  Words in the dictionary: 178590
  Longest word:            15 letters
  Number of triplet-moves: 408

Initialization finished in 0.21ms

Board solved in 3.69ms:
  Number of candidates: 87
  Number of actual words: 76

Words found:
  amble, ambo, ami, amie, asea, awa, awe, awes, awl, axil
  axile, axle, boil, bole, box, but, buts, east, elm, emboli
  fame, fames, fax, lei, lie, lima, limb, limbo, limbs, lime
  limes, lob, lobs, lox, mae, maes, maw, maws, max, maxi
  mesa, mew, mewl, mews, mil, mile, milo, mix, oil, ole
  sae, saw, sea, seam, semi, sew, stub, swam, swami, tub
  tubs, tux, twa, twae, twaes, twas, uts, wae, waes, wamble
  wame, wames, was, wast, wax, west

For the following 5x5-grid:

R P R I T
A H H L N
I E T E P
Z R Y S G
O G W E Y

it gives this:

Precomputation finished in 240.39ms:
  Words in the dictionary: 178590
  Longest word:            15 letters
  Number of triplet-moves: 768

Initialization finished in 0.23ms

Board solved in 3.85ms:
  Number of candidates: 331
  Number of actual words: 240

Words found:
  aero, aery, ahi, air, airt, airth, airts, airy, ear, egest
  elhi, elint, erg, ergo, ester, eth, ether, eye, eyen, eyer
  eyes, eyre, eyrie, gel, gelt, gelts, gen, gent, gentil, gest
  geste, get, gets, gey, gor, gore, gory, grey, greyest, greys
  gyre, gyri, gyro, hae, haet, haets, hair, hairy, hap, harp
  heap, hear, heh, heir, help, helps, hen, hent, hep, her
  hero, hes, hest, het, hetero, heth, hets, hey, hie, hilt
  hilts, hin, hint, hire, hit, inlet, inlets, ire, leg, leges
  legs, lehr, lent, les, lest, let, lethe, lets, ley, leys
  lin, line, lines, liney, lint, lit, neg, negs, nest, nester
  net, nether, nets, nil, nit, ogre, ore, orgy, ort, orts
  pah, pair, par, peg, pegs, peh, pelt, pelter, peltry, pelts
  pen, pent, pes, pest, pester, pesty, pet, peter, pets, phi
  philter, philtre, phiz, pht, print, pst, rah, rai, rap, raphe
  raphes, reap, rear, rei, ret, rete, rets, rhaphe, rhaphes, rhea
  ria, rile, riles, riley, rin, rye, ryes, seg, sel, sen
  sent, senti, set, sew, spelt, spelter, spent, splent, spline, splint
  split, stent, step, stey, stria, striae, sty, stye, tea, tear
  teg, tegs, tel, ten, tent, thae, the, their, then, these
  thesp, they, thin, thine, thir, thirl, til, tile, tiles, tilt
  tilter, tilth, tilts, tin, tine, tines, tirl, trey, treys, trog
  try, tye, tyer, tyes, tyre, tyro, west, wester, wry, wryest
  wye, wyes, wyte, wytes, yea, yeah, year, yeh, yelp, yelps
  yen, yep, yeps, yes, yester, yet, yew, yews, zero, zori

For this I used the TWL06 Tournament Scrabble Word List, since the link in the original question no longer works. This file is 1.85MB, so it\'s a little bit shorter. And the buildDictionary function throws out all words with less than 3 letters.

Here are a couple of observations about the performance of this:

It\'s about 10 times slower than the reported performance of Victor Nicollet\'s OCaml implementation. Whether this is caused by the different algorithm, the shorter dictionary he used, the fact that his code is compiled and mine runs in a Java virtual machine, or the performance of our computers (mine is an Intel Q6600 @ 2.4MHz running WinXP), I don\'t know. But it\'s much faster than the results for the other implementations quoted at the end of the original question. So, whether this algorithm is superior to the trie dictionary or not, I don\'t know at this point

标签： algorithm puzzle boggle

梦该遗忘

女 | 书童

私信

收藏的人(0)

Ta的文章更多文章

0条评论

还没有人评论过~