I want to find a submatrix in a huge matrix, so I google and find the Baker-Bird Algorithm.
But, unfortunately I cannot understand it very much, and the tutorial about it is rare.
I cannot find some example code to study.
So what I want to ask is there some simple example code or pseudo code that I can study it?
Thx in advance.
Ok, from studying the link Kent Munthe Caspersen gave ( http://www.stringology.org/papers/Zdarek-PhD_thesis-2010.pdf page 30 on), I understand how the Baker-Bird Algorithm works.
For a submatrix to appear in a matrix, its columns must all match individually. You can scan down each column looking for matches, and then scan this post-processed matrix for rows indicating columns consecutively matching at the same spot.
Say we are looking for submatrices of the format
We search down each column for the column-matches 'abc' 'cba' or 'aab' and in a new matrix mark the ends of those complete matches in the corresponding cell - for example with A, B or C. (What the algorithm in the paper does is construct a state machine which transitions to a new state based on the old state number and which letter comes next, and then looks for states that indicate we just matched a column, which is more complex but more efficient as it only has to scan each column once instead of once per different column match we are interested in)
Once we have done this, we scan along each row looking for successive values indicating successive columns matched - in this case, we're looking for the string 'ABC' in a matrix row. If we find it, there was a sub-array match here.
Speedups are attained from using the state machine approach described above, and also from choice of string searching algorithm ( there are many with different time complexities: ( of which there are numerous: http://en.wikipedia.org/wiki/String_searching_algorithm )
(Note that the entire algorithm can, of course, be flipped to do rows first than columns, it's identical.)
What about the example in this PhD thesis p.31-33: http://www.stringology.org/papers/Zdarek-PhD_thesis-2010.pdf