I have an assignment. I have tried searching through stack overflow.
I need to read a text file full of phone numbers. This part I know how to do.
Next I need to convert the phone numbers to all possible permutations of words based on the following encoding:
E | J N Q | R W X | D S Y | F T | A M | C I V | B K U | L O P | G H Z
e | j n q | r w x | d s y | f t | a m | c i v | b k u | l o p | g h z
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
and check them against a dictionary file which i am given to read. (the reading part I already know how to do)
Can someone please suggest a good algorithm for encoding the phone numbers and checking against a dictionary?
Everything I have tried so far threw a OutOfMemory exception and the assignment specifies that this must not be the case.
Here it is:
static char[][] letters = {"Ee".toCharArray(), // 0
"JNQjnq".toCharArray(), // 1
"RWXrwx".toCharArray() // 2
//.... add remaining
};
static Set<String> words = new HashSet<>(Arrays.asList(
"Stack",
"Overflow",
"foo",
"bar"
//... add other words
));
// main method
public void generateAndCheckAll(String number) {
int[] num = new int[number.length()]; // store number as digits
int[] ind = new int[number.length()]; // store letter indices
int i = 0;
for (char c : number.toCharArray()) { // convert String to int array
num[i++] = c - '0';
}
do {
String word = generateWord(num, ind);
System.out.println(word + (words.contains(word) ? " is contained" : " is not contained"));
} while (next(num, ind));
}
// generate word by number array and index array
public String generateWord(int[] num, int[] ind) {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < ind.length; i++)
sb.append(letters[num[i]][ind[i]]);
return sb.toString();
}
// switch index array to next
public boolean next(int[] num, int[] ind) {
int i = ind.length - 1;
while (i >= 0 && ind[i] == letters[num[i]].length - 1) // search for the index that can be incremented
i--;
if (i < 0)
return false;
ind[i]++; // increment such index
for (int j = i + 1; j < num.length; j++) // change remaining indices to 0
ind[j] = 0;
return true;
}
Main idea here is to store array of letter indices on every iteration and generate a new word based on it. For example, for number 201
and index array [0, 0, 5]
the word REq
will be generated: 0-th letter for digit 2
, 0-th letter for digit 0
and 5-th letter for digit 1
.
Then this word is looked-up in dictionary and the index array is switched to next. Scanning from the end we search index that can be incremented (so that will not overflow number of letters for the digit at the given position): we cannot increment "5" (because there are only 6 letters for digit 1
), but we can increment "0". So index array is changed to [0, 1, 0]
and everything is repeated again and again until all indices are maximum allowed.
To see how it works in action, debug this code in step-by-step mode. It is really easy to understand and apply the same idea to many other similar tasks.
By the way this is not the most effective approach since dictionary size (10-100K I suppose) is very likely to be smaller than combination count (~60M for 10-digit number). If you don't really need to generate all combinations, you may perform preparation of dictionary:
Generate a number for each word in dictionary,
map each generated number to corresponding word (words) and collect those mapping in hash collection.
Then, for number you receive, you just look if there is a word (words) exists for a given number. This method requires long preparation stage and more space, but look-up speed is increased significantly.