trie building in java [duplicate]

2019-03-30 08:32发布

This question already has an answer here:

How do I construct a tree from a file ? I want to be able to read them from a file and then add to appropriate level

标签: java file insert
3条回答
爷的心禁止访问
2楼-- · 2019-03-30 08:39

It seems to me that you are trying to implement trie.

Look here for a nice implementation in java: http://www.cs.duke.edu/~ola/courses/cps108/fall96/joggle/trie/Trie.java

查看更多
▲ chillily
3楼-- · 2019-03-30 08:42

Adding

Starting at the root, search for the first (or current) letter. If that letter is found then move to that node and search for the next letter. If the letter is not found, search for a word that matches the current letter, if there is a similar word then add the current letter as a new node and move both words under that, otherwise add the word.

Note: This will result in a tree that is more optimized for searches then the tree shown in the example. (adamant and adapt will be grouped under another 'a' node)

Update: Take a look at the Wikipedia article for Trie

查看更多
SAY GOODBYE
4楼-- · 2019-03-30 08:46

If you have only two levels in the tree before leafs (the actual words), you can simply start with arrays with 28 elements being and translate the letters to the index (i.e. a==1, b==2, etc.). Elements of array can be some set/list that contains the full words. You can lazily create arrays and lists (i.e. create the root array but have nulls for the other arrays and list of words, then you create an array/list when/if needed).

Am I reading rules you should follow correctly?

P.S. I think that using arrays with full size each would not be too wasteful on space as well it should be very fast to address

Update: @user1747976, well each array would take around 28*4 or 28*8 bits + 12 bytes overhead. Hopefully you use compressed ops so it is 28*4+12=116bytes per array. Now it depends if you want to be memory efficient or processing efficient. To be memory efficient, you can use some kind of hashmap instead of arrays but I'm not sure the additional overhead will be less than what you use with arrays. Processing will be for sure worse though. You need to use some clever loop a number of times depending on tree dept requirement. Some ugly pseudo code for inserting into tree:

root=new Object[28];
word="something";
pos = root;
wordInd=1;
for (int i=1; i<=TREE_DEPTH ; i++) {
   targetpos = letterInd(letter(wordInd,word));
   if (i==TREE_DEPTH) {
      if (pos[targetpos] == null) pos[targetpos] = new HashSet<String>();
      (Set) pos[targetpos].add(word);
      break;
   } else {
      if (pos[targetpos] == null) pos[targetpos] = new Object[28];
      wordInd++;
      pos = pos[targetpos];
   }
}

Similar loop you can use for retrieving words.

查看更多
登录 后发表回答