I needed an efficient sliding window function in Haskell, so I wrote the following:
windows n xz@(x:xs)
| length v < n = []
| otherwise = v : windows n xs
where
v = take n xz
My problem with this is that I think the complexity is O(n*m) where m is the length of the list and n is the window size. You count down the list once for take
, another time for length
, and you do it down the list of essentially m-n times. It seems like it can be more efficient than this, but I'm at a loss for how to make it more linear. Any takers?
You can use
Seq
fromData.Sequence
, which has O(1) enqueue and dequeue at both ends:Note that if you materialize the entire result your algorithm is necessarily O(N*M) since that is the size of your result. Using
Seq
just improves performance by a constant factor.Example use:
First let's get the windows without worrying about the short ones at the end:
Now we want to get rid of the short ones without checking the length of every one.
Since we know they are at the end, we could lose them like this:
But that's not great since we still go through xs an extra time to get its length. It also doesn't work on infinite lists, which your original solution did.
Instead let's write a function for using one list as a ruler to measure the amount to take from another:
Now we can write this:
Works on infinite lists too:
As Gabriel Gonzalez says, the time complexity is no better if you want to use the whole result. But if you only use some of the windows, we now manage to avoid doing the work of
take
andlength
on the ones you don't use.For the sliding window I also used unboxed Vetors as length, take, drop as well as splitAt are O(1) operations.
The code from Thomas M. DuBuisson is a by n shifted window, not a sliding, except if n =1. Therefore a (++) is missing, however this has a cost of O(n+m). Therefore careful, where you put it.
I tried it out with
+RTS -sstderr
and:and got real time 1.051s and 96.9% usage, keeping in mind that after the sliding window two O(m) operations are performed.
If you want O(1) length then why not use a structure that provides O(1) length? Assuming you aren't looking for windows from an infinite list, consider using:
Conversation of each window from a vector to a list might bite you some, I won't hazard an optimistic guess there, but I will bet that the performance is better than the list-only version.
You can't get better than O(m*n), since this is the size of the output data structure.
But you can avoid checking the lengths of the windows if you reverse the order of operations: First create n shifted lists and then just zip them together. Zipping will get rid of those that don't have enough elements automatically.
Zipping a list of lists is just a transposition, but unlike
transpose
fromData.List
it throws away outputs that would have less than n elements.Now it's easy to make the window function: Take m lists, each shifted by 1, and just zip them:
Works also for infinite lists.