A word is an anagram if the letters in that word can be re-arranged to form a different word.
Task:
- The shortest source code by character count to find all sets of anagrams given a word list.
- Spaces and new lines should be counted as characters
Use the code ruler
---------10--------20--------30--------40--------50--------60--------70--------80--------90--------100-------110-------120
Input:
a list of words from stdin with each word separated by a new line.
e.g.
A
A's
AOL
AOL's
Aachen
Aachen's
Aaliyah
Aaliyah's
Aaron
Aaron's
Abbas
Abbasid
Abbasid's
Output:
All sets of anagrams, with each set separated by a separate line.
Example run:
./anagram < words
marcos caroms macros
lump's plum's
dewar's wader's
postman tampons
dent tend
macho mocha
stoker's stroke's
hops posh shop
chasity scythia
...
I have a 149 char perl solution which I'll post as soon as a few more people post :)
Have fun!
EDIT: Clarifications
- Assume anagrams are case insensitive (i.e. upper and lower case letters are equivalent)
- Only sets with more than 1 item should be printed
- Each set of anagrams should only be printed once
- Each word in an anagram set should only occur once
EDIT2: More Clarifications
- If two words differ only in capitalization, they should be collapsed into the same word, and it's up to you to decide which capitalization scheme to use for the collapsed word
- sets of words only have to end in a new line, as long as each word is separated in some way, e.g. comma separated, or space separated is valid. I understand some languages have quick array printing methods built in so this should allow you to take advantage of that if it doesn't output space separated arrays.
Haskell, 147 chars
prior sizes:
150159charsThis version, at 165 chars satisifies the new, clarified rules:
This version handles:
Perl, 59 characters
Note that this requires Perl 5.10 (for the
say
function).Powershell,
10497918683 charsUpdate for the new requirement (+8 chars):
To exclude the words that only differ in capitalization, we could just remove the duplicates (case-insensitvely) from the input list, i.e.
$input|sort -u
where-u
stands for-unique
.sort
is case-insenstive by default:Explanation of the
[char[]]$_|%{$_+0}|sort
-partIt's a key for the hashtable entry under which anagrams of a word are stored. My initial solution was:
$_.ToLower().ToCharArray()|sort
. Then I discovered I didn't needToLower()
for the key, as hashtable lookups are case-insensitive.[char[]]$_|sort
would be ideal, but sorting of the chars for the key needs to be case-insensitive (otherwiseCab
andabc
would be stored under different keys). Unfortunately,sort
is not case-insenstive for chars (only for strings).What we need is
[string[]][char[]]$_|sort
, but I found a shorter way of converting each char to string, which is to concat something else to it, in this case an integer0
, hence[char[]]$_|%{$_+0}|sort
. This doesn't affect the sorting order, and the actual key ends up being something like:d0 o0 r0 w0
. It's not pretty, but it does the job :)C++, 542 chars
Python, O(n^2)
Python, 167 characters, includes I/O
Without the input code (i.e. if we assume the wordlist already in a list
w
), it's only 134 characters: