Given a list of words, how would you go about arranging them into a crossword grid?
It wouldn't have to be like a "proper" crossword puzzle which is symmetrical or anything like that: basically just output a starting position and direction for each word.
Would there be any Java examples available?
I actually wrote a crossword generation program about ten years ago (it was cryptic but the same rules would apply for normal crosswords).
It had a list of words (and associated clues) stored in a file sorted by descending usage to date (so that lesser-used words were at the top of the file). A template, basically a bit-mask representing the black and free squares, was chosen randomly from a pool that was provided by the client.
Then, for each non-complete word in the puzzle (basically find the first blank square and see if the one to the right (across-word) or the one underneath (down-word) is also blank), a search was done of the file looking for the first word that fitted, taking into account the letters already in that word. If there was no word that could fit, you just marked the whole word as incomplete and moved on.
At the end would be some uncompleted words which the compiler would have to fill in (and add the word and a clue to the file if desired). If they couldn't come up with any ideas, they could edit the crossword manually to change constraints or just ask for a total re-generation.
Once the word/clue file got up to a certain size (and it was adding 50-100 clues a day for this client), there was rarely a case of more than two or three manual fix ups that had to be done for each crossword.
I would get an index of each letter used by each word to know possible crosses. Then I would choose the largest word and use it as base. Select the next large one and cross it. Rinse and repeat. It's probably an NP problem.
Another idea is creating a genetic algorithm where the strength metric is how many words you can put in the grid.
The hard part I find is when to know a certain list cannot possibly be crossed.
I just recently wrote my own in Python. You can find it here: http://bryanhelmig.com/python-crossword-puzzle-generator/. It doesn't create the dense NYT style crosswords, but the style of crosswords you might find in a child's puzzle book.
Unlike a few algorithms I found out there that implemented a random brute-force method of placing words like a few have suggested, I tried to implement a slightly smarter brute-force approach at word placement. Here's my process:
By the end, you have a decent crossword puzzle or word search puzzle, since they are about the same. It tends to run rather well, but let me know if you have any suggestions on improvement. Bigger grids run exponentially slower; bigger word lists linearly. Bigger word lists also have a much higher chance at better word placement numbers.
I've been thinking about this problem. My sense is that to create a truly dense crossword, you cannot hope that your limited word list is going to be enough. Therefore, you might want to take a dictionary and place it into a "trie" data structure. This will allow you to easily find words that fill the left over spaces. In a trie, it is fairly efficient to implement a traversal that, say, gives you all words of the form "c?t".
So, my general thinking is: create some sort of relatively brute force approach as some described here to create a low-density cross, and fill in the blanks with dictionary words.
If anyone else has taken this approach, please let me know.
I have coded up a 100%
jQuery
solution to this problem.Sample Demo: http://www.earthfluent.com/crossword-puzzle-demo.html
Source Code: https://github.com/HoldOffHunger/jquery-crossword-puzzle-generator
The intent of the algorithm I used:
I will describe the algorithm I used:
Group the words together according to those that share a common letter.
From these groups, build sets of a new data structure ("word blocks"), which is a primary word (that runs through all other words) and then the other words (that run through the primary word).
Start out the crossword puzzle with the very first of these word blocks in the very top-left position of the crossword puzzle.
For the rest of the word blocks, starting from the right-bottom most position of the crossword puzzle, move upward and leftward, until there are no more available slots to fill. If there are more empty columns upwards than leftwards, move upwards, and vice versa.
I'd generate two numbers: Length and Scrabble score. Assume that a low Scrabble score means it's easier to join on (low scores = lots of common letters). Sort the list by length descending and Scrabble score ascending.
Next, just go down the list. If the word doesn't cross with an existing word (check against each word by their length and Scrabble score, respectively), then put it into the queue, and check the next word.
Rinse and repeat, and this should generate a crossword.
Of course, I'm pretty sure that this is O(n!) and it's not guaranteed to complete the crossword for you, but perhaps somebody can improve it.