Expected space consumption of skip lists

2019-05-07 06:34发布

问题:

What is the expected space used by the skip list after inserting n elements?

I expect that in the worst case the space consumption may grow indefinitely.

Wikipedia says “Space O(n)”.

How can this be proven one way or another?

回答1:

According to this thesis, which I find more reliable then wikipedia, wikipedia is wrong. Probabilistic Skip List is Theta(nlogn) worst case space complexity.

Despite the fact that on the average the PSL performs reasonably well, in the worst case its Theta(n lg n) space and Theta(n) time complexity are unacceptably high

The worst case is not infinite because you can limit yourself to f(n) number of lists, where f(n) = O(logn), and stop flipping coins when you reached this height. So, if you have f(n) complete rows, you get O(nlogn) total number of nodes, thus the space complexity in this case is O(nlogn), and not O(n).


EDIT:
If you are looking for expected space consumption, and not worst as initially was stated in the question then:
Let's denote "column" as a bottom node and all the nodes "up" from it.

E(#nodes) = Sigma(E(#nodes_for_column_i)) (i in [1,n]) 

The above equation is true because linearity of expected value.

E(#nodes_for_column_i) = 1 + 1/2 + ... + 1/n < 2 (for each i). This is because with probability 1 it has 1 node, with p=1/2, each of these has an extra node. with p'=1/2, each of these has an extra node (total p*p'=1/4) ,.... Thus we can derive:

E(#nodes) = n*E(#nodes_for_each_column) = n*(1+1/2+...+1/n) < 2n = O(n)


回答2:

Let's we have deterministic skiplist with N nodes. In addition to data values, list contains:

N pointers at level 1, N/2 pointers at level 2, N/4 pointers at level 3 and so on...

N + N/2 + N/4 + N/8 +..N/2^k is sum of geometric progression, and its limit is 2N, so maximum memory consumption is N*SizeOf(Data) + 2*N*SizeOf(Pointer) = O(N).

I did not take into account interlevel links, but their count is about of pointer count.



回答3:

Expected Space Complexity

A Skip List has logn layers. Each element in a Skip List appears in one or more layers. To measure the expected space complexity of a Skip List, we can evaluate the expected number of layers that an arbitrary element x appears in.

We know that x has a 100% chance of appearing only in the bottom layer, a 50% chance of appearing in both the bottom and 2nd bottom layers, a 25% chance of appearing in the bottom 3 layers, a 12.5% chance of appearing in the bottom 4 layers, and so on.

Mathematically, we can represent the expected number of layers that x appears in as follows...

Sum(i / 2^(i-1)) from i=1 to logn

... where i is the number of layers that x might appear in.

Intuitively, we can see that the above summation converges to a constant because the denominator grows at a much faster rate than the numerator. You can verify this by plugging the equation in to Wolfram Alpha (the summation converges to 4 for very large values of n). This implies that...

[Sum(i / 2^(i-1)) from i=1 to logn] = O(1)

So, we have demonstrated that storing an arbitrary element inside of an arbitrary Skip List requires constant space on average. To get the space complexity for storing n elements in a Skip List, we simply multiply by n.

n*O(1) = O(n)

Thus, Skip Lists have linear expected space complexity.


Worst-Case Space Complexity

This is actually much easier. We know that there are logn layers, and n elements. In the worst case, every element is in every layer. Hence, O(nlogn).