Anyone know a good and effective way to search/match for a byte pattern in an byte[] array and then return the positions.
For example
byte[] pattern = new byte[] {12,3,5,76,8,0,6,125};
byte[] toBeSearched = new byte[] {23,36,43,76,125,56,34,234,12,3,5,76,8,0,6,125,234,56,211,122,22,4,7,89,76,64,12,3,5,76,8,0,6,125}
Originally I posted some old code I used but was curious about Jb Evain's benchmarks. I found that my solution was stupid slow. It appears that bruno conde's SearchBytePattern is the fastest. I could not figure why especially since he uses an Array.Copy and an Extension method. But there is proof in Jb's tests, so kudos to bruno.
I simplified the bits even further, so hopefully this will be the clearest and simplest solution. (All the hard work done by bruno conde) The enhancements are:
converted to extension method
Here is a solution I came up with. I included notes I found along the way with the implementation. It can match forward, backward and with different (in/dec)remement amounts e.g. direction; starting at any offset in the haystack.
Any input would be awesome!
This is my own approach on the topic. I used pointers to ensure that it is faster on larger arrays. This function will return the first occurence of the sequence (which is what I needed in my own case).
I am sure you can modify it a little bit in order to return a list with all occurences.
What I do is fairly simple. I loop through the source array (haystack) until I find the first byte of the pattern (needle). When the first byte is found, I continue checking separately if the next bytes match the next bytes of the pattern. If not, I continue searching as normal, from the index (in the haystack) I was previously on, before trying to match the needle.
So here's the code:
Safe code below:
toBeSearched.Except(pattern) will return you differences toBeSearched.Intersect(pattern) will produce set of intersections Generally, you should look into extended methods within Linq extensions
Use LINQ Methods.
Very simple!
I tried to understand Sanchez's proposal and make faster search.Below code's performance nearly equal.But code is more understandable.