Intuition behind using backtracking (and not DFS)

2019-08-26 09:20发布

问题:

I am solving Word Search question on LeetCode.com:

Given a 2D board and a word, find if the word exists in the grid.

The word can be constructed from letters of sequentially adjacent cell, where "adjacent" cells are those horizontally or vertically neighboring. The same letter cell may not be used more than once.

The solution which I wrote with online help is as follows:

class Solution {
public:

    //compare this with Max Area of Island:
    //they 'look' similar, but this one uses a backtracking approach since we retract when we don't find a solution
    //In case of Max Area Of Island, we are not 'coming back' - technically we do come back due to recursion, but we don't track     
    //that since we don't acutally intend to do anything - we are just counting the 1s.

    bool exist(vector<vector<char>>& board, string& word) {
        if(board.empty()) return false;

        for(int i=0; i<board.size(); i++) {
            for(int j=0; j<board[0].size(); j++) {
                //if(word[0] == board[i][j])
                    if(existUtil(board, word, i, j, 0))    //matching the word[i][j] with 0th character of word
                        return true;
            }
        }

        return false;
    }

    bool existUtil(vector<vector<char>>& board, string& word, int i, int j, int match) {
        if(match==word.size()) return true;
        if(i<0 || i>=board.size() || j<0 || j>=board[0].size()) return false;
        if(board[i][j]!=word[match]) return false;

        board[i][j] = '*';
        bool ans = existUtil(board, word, i+1, j, match+1) || //[i+1,j]
               existUtil(board, word, i-1, j, match+1) || // [i,j+1]
               existUtil(board, word, i, j+1, match+1) || // [i-1,j]
               existUtil(board, word, i, j-1, match+1);   // [i,j-1]
        board[i][j] = word[match];

        return ans;
    }
};

My question is simple - why are we using a backtracking approach and not just a conventional DFS? Pretty similar to what we have done, we could start at each character and do a DFS to determine if we could find the target word. But we are not doing so, why?

I thought a lot about it and came with the following reasoning, but I am not sure - we use a backtracking approach because of the same letter cell may not be used more than once. So, when we are do backtracking, we replace the original character with a '*' and then later resubstitute it when we come back. But this somehow does not feel right, because we could have used a visited matrix instead.

Could someone please explain why we use backtracking and not a DFS?

回答1:

Q: My question is simple - why are we using a backtracking approach and not just a conventional DFS?

Because backtracking is far more efficient for solving this class of problems than the plain DFS.

The difference between DFS and backtracking is subtle, but we can summarize like this: DFS is a technique for searching a graph, while backtracking is a problem solving technique (which consists of DFS + pruning, such programs are called backtrackers). So, DFS visits each node until it finds the required value (in your case the target word), while backtracking is smarter - it doesn't even visit particular branches when it is certain that the target word would not be found there.

Imagine that you have a dictionary of all possible words and searching through the board to find all words that exist on the board (Boggle game). You start to traverse the board and stumble upon the letters 'J','A','C' in that order, so the current prefix is "JAC". Great. Let's look at the neighbors of the letter 'C', e.g. they are 'A', 'Q', 'D', 'F'. What would plain DFS do? It would skip 'A' because it came from that node to 'C', but it would then blindly visit each of the remaining nodes hoping to find some word, even though we know there are no words starting with "JACQ", "JACD" and "JACF". Backtracker would immediately prune branches with "JACQ", "JACD" and "JACF" by e.g. consulting an auxiliary trie data structure built from the dictionary. At some point even DFS would backtrack, but only when it does not have where to go - i.e. all surrounding letters have already been visited.

To conclude - in your example, the conventional DFS would for each node blindly check all neighbor nodes until it finds the target word or until all its neighbors are visited - it would only then backtrack. Backtracker on the other hand constantly checks whether we are on the "right track", and the key line in your code that performs this is:

if (board[i][j] != word[match]) return false;