Trie implementation with wildcard values

2019-06-21 22:47发布

问题:

I'm implementing an algorithm to do directory matching. So I'm given a set of valid paths that can include wildcards (denoted by "X"). Then when I pass in an input I need to know if that input matches with one of the paths in my valid set. I'm running into a problem with the wildcards when a wildcard value that is passed in actually matches with another valid value. Here is an example:

Set of valid paths:

/guest
/guest/X
/X/guest
/member
/member/friends
/member/X
/member/X/friends

Example input:

/member/garbage/friends    
/member/friends/friends

These should both return true, however only the first one does. In the first case because "garbage" does not match with another other possible paths but there is an option for a wildcard at this point it will skip it and move on, then it sees "friends" and it knows it's a match. However, my second case does not work because it sees friends and goes down a different path in my trie, not the wildcard path. Because there is a valid path with "friends" in this position it thinks it will be that. Then it sees "friends" again but from this point in the trie there are no valid paths with "friends" so it returns false.

My question is, how can I get it to recognize the other valid path even when it goes down the wrong branch in the trie. My search code and example trie diagram is below.

The search algorithm for my trie is as follows:

private TrieNode searchNode(String input) {
    Map<String, TrieNode> children = root.children;
    TrieNode t = null;

    // break up input into individual tokens
    String[] tokenizedLine = input.split("/");
    for(int i = 0; i < tokenizedLine.length; i++){
        String current = tokenizedLine[i];

        if(children.containsKey(current)) {
            t = children.get(current);
            children = t.children;
        }else if(children.containsKey("X")){
            t = children.get("X");
            children = t.children;
        }else{
            return null;
        }
    }

    return t;
}

An image of the trie that would be built with my sample path set: When I input /member/friends/friends my algorithm is going down the highlighted path and stopping because it does not see another "friends" after, but how can I get it to recognize the first "friends" as a wildcard value instead, so then it will continue and see the second "friends" after the wildcard?

Thanks for any help!

回答1:

Figured it out. I implemented some backtracking to keep track of the last node where it saw two possible paths. If it finds a dead end on one path it will go back to the last time it saw two possible paths and try the other. New Algorithm looks like this:

private TrieNode searchNode(String input) {
        Map<String, TrieNode> children = root.children;
        TrieNode t = null;

        // Variables for backtrack function
        TrieNode alternativeWildcardNode = null;
        int wildcardIndex = 0;
        Map<String, TrieNode> wildcardChildren = root.children;

        // break up input into individual tokens
        String[] tokenizedLine = input.split("/");
        for(int i = 0; i < tokenizedLine.length; i++){
            String current = tokenizedLine[i];

            //System.out.println(current);
            //System.out.println(Integer.toString(i));

            if(children.containsKey(current) && children.containsKey("X")) {
                // store current variable state in case we need to back track here
                alternativeWildcardNode = children.get("X");
                wildcardIndex = i;
                wildcardChildren = alternativeWildcardNode.children;

                t = children.get(current);
                children = t.children;
            }else if(children.containsKey(current)) {
                t = children.get(current);
                children = t.children;
            }else if(children.containsKey("X")){
                t = children.get("X");
                children = t.children;
            }else if(alternativeWildcardNode != null){
                // if we've reached a branch with no match, but had a possible wildcard previously
                // call reset state to the wildcard option instead of static
                t = alternativeWildcardNode;
                alternativeWildcardNode = null;
                i = wildcardIndex;
                children = wildcardChildren;
            }else{
                return null;
            }
        }

        return t;
    }