2D String Matching: Baker-Bird Algorithm [closed]

2020-07-27 06:18发布

I want to find a submatrix in a huge matrix, so I google and find the Baker-Bird Algorithm.

But, unfortunately I cannot understand it very much, and the tutorial about it is rare.

I cannot find some example code to study.

So what I want to ask is there some simple example code or pseudo code that I can study it?

Thx in advance.

标签: c++ algorithm
2条回答
等我变得足够好
2楼-- · 2020-07-27 06:51

Ok, from studying the link Kent Munthe Caspersen gave ( http://www.stringology.org/papers/Zdarek-PhD_thesis-2010.pdf page 30 on), I understand how the Baker-Bird Algorithm works.

For a submatrix to appear in a matrix, its columns must all match individually. You can scan down each column looking for matches, and then scan this post-processed matrix for rows indicating columns consecutively matching at the same spot.

Say we are looking for submatrices of the format

a c a
b b a
c a b

We search down each column for the column-matches 'abc' 'cba' or 'aab' and in a new matrix mark the ends of those complete matches in the corresponding cell - for example with A, B or C. (What the algorithm in the paper does is construct a state machine which transitions to a new state based on the old state number and which letter comes next, and then looks for states that indicate we just matched a column, which is more complex but more efficient as it only has to scan each column once instead of once per different column match we are interested in)

Once we have done this, we scan along each row looking for successive values indicating successive columns matched - in this case, we're looking for the string 'ABC' in a matrix row. If we find it, there was a sub-array match here.

Speedups are attained from using the state machine approach described above, and also from choice of string searching algorithm ( there are many with different time complexities: ( of which there are numerous: http://en.wikipedia.org/wiki/String_searching_algorithm )


(Note that the entire algorithm can, of course, be flipped to do rows first than columns, it's identical.)

查看更多
我欲成王,谁敢阻挡
3楼-- · 2020-07-27 06:53

What about the example in this PhD thesis p.31-33: http://www.stringology.org/papers/Zdarek-PhD_thesis-2010.pdf

查看更多
登录 后发表回答