I'm trying to randomly shuffle a linked list using a divide-and-conquer algorithm, in linearithmic (n log n) time and logarithmic (log n) extra space.
I'm aware that I could do a Knuth shuffle, like the one used on a simple array of values, but I'm not sure how I would do this with divide-and-conquer. What I mean is, what am I actually dividing? Do I just divide down to each individual node in the list and then randomly assemble the list back together using some random value?
Or do I give each node a random number and then do a mergesort on the nodes based on the random numbers?
Bottom-up merge sort without compares: while merging, don't do any comparisons, just swap the elements.
What about the following? Perform the same procedure as merge sort. When merging, instead of selecting an element (one-by-one) from the two lists in sorted order, flip a coin. Choose whether to pick an element from the first or from the second list based on the result of the coin flip.
Algorithm.
The key point for space is that splitting the list into two does not require any extra space. The only extra space we need is the log n stack frames used during the recursion.
The point of the dummy node (inserted when the list length is odd, so that n is even) is to realize that inserting and then removing a dummy element keeps the distribution of the elements uniform.
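To make the procedure concrete, here is a rough Python sketch of the idea as I read it, using a plain Python list for brevity and a sentinel object in place of the dummy node. The names (coin_flip_shuffle, _DUMMY) are mine, and this is an illustration of the description above, not the answer's own code.

```python
import random

_DUMMY = object()  # sentinel standing in for the dummy node discussed above

def coin_flip_shuffle(items):
    # Split in half (padding to even length first, like the dummy node),
    # shuffle each half recursively, then merge by flipping a fair coin to
    # decide which half supplies the next element.
    items = list(items)
    if len(items) <= 1:
        return items
    padded = len(items) % 2 == 1
    if padded:
        items.append(_DUMMY)             # make the length even
    mid = len(items) // 2
    left = coin_flip_shuffle(items[:mid])
    right = coin_flip_shuffle(items[mid:])
    items = []
    while left or right:
        # Fair coin flip; if one half is exhausted, take from the other.
        if right and (not left or random.random() < 0.5):
            items.append(right.pop(0))
        else:
            items.append(left.pop(0))
    if padded:
        items.remove(_DUMMY)             # remove the dummy element again
    return items
```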
Analysis. Why is the distribution uniform? After the final merge, the probability P_i(n) of any given number ending up in position i is as follows. Either it was:

- in the i-th place in its own list, and the list won the coin toss the first i times; the probability of this is 1/2^i;
- in the (i-1)-st place in its own list, and the list won the coin toss i-1 times, including the last one, and lost once; the probability of this is (i-1 choose 1) times 1/2^i;
- in the (i-2)-nd place in its own list, and the list won the coin toss i-2 times, including the last one, and lost twice; the probability of this is (i-1 choose 2) times 1/2^i;
- and so on.

So the probability is

P_i(n) = \sum_{j=0}^{i-1} (i-1 choose j) * (1/2^i) * P_{i-j}(n/2).

Inductively, you can show that P_i(n) = 1/n. I'll let you verify the base case and assume that P_j(n/2) = 2/n. The term \sum_{j=0}^{i-1} (i-1 choose j) is exactly the number of (i-1)-bit binary numbers, i.e. 2^{i-1}. So we get

P_i(n) = (2/n) * (1/2^i) * \sum_{j=0}^{i-1} (i-1 choose j) = (2/n) * (1/2^i) * 2^{i-1} = 1/n.

I hope this makes sense. The only assumption we need is that n is even, and that the two lists are shuffled uniformly. This is achieved by adding (and then removing) the dummy node.

P.S. My original intuition was nowhere near rigorous, but I list it just in case. Imagine we assign numbers between 1 and n at random to the elements of the list. And now we run a merge sort with respect to these numbers. At any given step of the merge, it needs to decide which of the heads of the two lists is smaller. But the probability of one being greater than the other should be exactly 1/2, so we can simulate this by flipping a coin.
P.P.S. Is there a way to embed LaTeX here?
You can actually do better than that: the best list shuffle algorithm is O(n log n) time and just O(1) space. (You can also shuffle in O(n) time and O(n) space by constructing a pointer array for the list, shuffling it in place using Knuth and re-threading the list accordingly.)
Complexity proof
To see why O(n log n) time is minimal for O(1) space, observe that:
Linked-list data structure (because Python)
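A minimal sketch of such a data structure, assuming nothing more than a value and a next pointer (the names Node, from_list and to_list are mine, not necessarily the original's):

```python
class Node:
    """A singly linked list node: a value plus a pointer to the next node."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def from_list(values):
    """Build a linked list from a Python iterable and return its head."""
    head = None
    for value in reversed(list(values)):
        head = Node(value, head)
    return head

def to_list(head):
    """Collect the node values back into a Python list (handy for testing)."""
    out = []
    while head is not None:
        out.append(head.value)
        head = head.next
    return out
```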
Merge-style algorithm
Here is an O(n log n) time and O(1) space algorithm based on iterative merge sort. The basic idea is simple: shuffle the left half, then the right half, then merge them by randomly selecting from the two lists. Two things worth noting:
Another algorithm
Another interesting O(n log n) algorithm that produces not-quite-uniform shuffles involves simply riffle shuffling the list 3/2 log_2(n) times. As described in http://en.wikipedia.org/wiki/Gilbert%E2%80%93Shannon%E2%80%93Reeds_model, this leaves only a constant number of bits of information.
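For intuition, here is a small Python sketch of that idea on a plain list (riffle_shuffle is my own name and my own reading of the Gilbert-Shannon-Reeds model linked above): cut at a Binomial(n, 1/2) position, interleave the two packets by taking the next element from a packet with probability proportional to its remaining size, and repeat roughly 1.5 * log2(n) times.

```python
import math
import random

def riffle_shuffle(values, passes=None):
    """Approximate shuffle by repeated Gilbert-Shannon-Reeds riffles."""
    values = list(values)
    n = len(values)
    if n < 2:
        return values
    if passes is None:
        passes = max(1, math.ceil(1.5 * math.log2(n)))
    for _ in range(passes):
        cut = sum(random.random() < 0.5 for _ in range(n))  # Binomial(n, 1/2) cut point
        top, bottom = values[:cut], values[cut:]
        merged = []
        while top or bottom:
            # Drop the next card from a packet with probability proportional
            # to that packet's remaining size (the GSR model).
            if random.randrange(len(top) + len(bottom)) < len(top):
                merged.append(top.pop(0))
            else:
                merged.append(bottom.pop(0))
        values = merged
    return values
```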
Code
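As a hedged illustration of the merge-style algorithm described above (my own sketch, not necessarily the original implementation; merge_shuffle and _take are my names), here is a bottom-up Python version that uses the Node class from the earlier snippet. It merges runs by picking the next node from either run with probability proportional to the run's remaining length, which is one way to keep the merge uniform.

```python
import random

def merge_shuffle(head):
    """Bottom-up merge-style shuffle over a linked list of Node objects."""
    # Count the nodes once.
    n, node = 0, head
    while node is not None:
        n, node = n + 1, node.next

    width = 1
    while width < n:
        dummy = Node(None)                        # scratch head for the rebuilt list
        tail, cur = dummy, head
        while cur is not None:
            left, cur, llen = _take(cur, width)   # first run of up to `width` nodes
            right, cur, rlen = _take(cur, width)  # second run of up to `width` nodes
            # Merge the two runs in random order; picking a run with probability
            # proportional to its remaining length keeps the interleaving uniform.
            while llen > 0 or rlen > 0:
                if random.randrange(llen + rlen) < llen:
                    tail.next, left, llen = left, left.next, llen - 1
                else:
                    tail.next, right, rlen = right, right.next, rlen - 1
                tail = tail.next
        tail.next = None
        head = dummy.next
        width *= 2
    return head

def _take(node, k):
    """Detach a run of up to k nodes starting at `node`.
    Returns (run_head, rest_of_list, run_length)."""
    head, prev, length = node, None, 0
    while node is not None and length < k:
        prev, node, length = node, node.next, length + 1
    if prev is not None:
        prev.next = None                          # cut the run off from the rest
    return head, node, length
```

For example, to_list(merge_shuffle(from_list([1, 2, 3, 4, 5]))) returns the same five values in a random order.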
Up shuffle approach
This Lua version improves on foxcub's answer by removing the need for dummy nodes.
In order to slightly simplify the code in this answer, this version supposes that your lists know their sizes. In the event they don't, you can always find the size in O(n) time, but even better: a few simple adaptations to the code can avoid the need to compute it beforehand (such as subdividing one node out of two instead of splitting into first and second halves).

To avoid using dummy nodes, you have to compensate for the fact that the two intermediate lists can have different lengths by varying the probability of choosing from each list. This is done by testing a [0,1] uniform random number against the ratio of nodes popped from the first list to the total number of nodes popped (from both lists).
Down shuffle approach
You can also shuffle while you subdivide recursively, which in my humble tests showed slightly (but consistently) better performance. It might come from there being fewer instructions, or it might just be an artifact of cache warmup in LuaJIT, so you will have to profile for your use cases.
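For illustration, here is one way the "shuffle while subdividing" idea can be read, sketched in Python on a plain list for brevity (this is my own reading, with my own names, not the Lua code from the Gist): deal the elements into a uniformly random "left" group of size n // 2 and a "right" group, recurse on each group, and concatenate.

```python
import random

def down_shuffle(items):
    """Shuffle on the way down: random split first, then recurse and concatenate."""
    n = len(items)
    if n < 2:
        return list(items)
    left_slots = n // 2
    left, right = [], []
    for idx, item in enumerate(items):
        remaining = n - idx
        # Send this element left with probability left_slots / remaining,
        # so the left group ends up as a uniformly random subset of size n // 2.
        if random.randrange(remaining) < left_slots:
            left.append(item)
            left_slots -= 1
        else:
            right.append(item)
    # Uniform subset + uniformly shuffled halves => uniform concatenation.
    return down_shuffle(left) + down_shuffle(right)
```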
Tests
The full source is in my listShuffle.lua Gist.
It contains code that, when executed, prints a matrix representing, for each element of the input list, the number of times it appears at each position of the output list, after a specified number of runs. A fairly uniform matrix shows that the elements are distributed uniformly over the positions, and hence that the shuffle is uniform.
Here is an example run with 1,000,000 iterations on a (non-power-of-two) 3-element list:
I'd say that foxcub's answer is wrong. To prove it, I will introduce a helpful definition of a perfectly shuffled list (call it an array or a sequence or whatever you want).

Definition: Assume we have a list L containing the elements a1, a2 ... an and the indexes 1, 2, 3 ... n. If we expose L to a shuffle operation (to whose internals we have no access), L is perfectly shuffled if and only if, knowing the indexes of some k (k < n) elements, we can't deduce the indexes of the remaining n-k elements. That is, the remaining n-k elements are equally probable to be revealed at any of the remaining n-k indexes.

Example: if we have a four-element list [a, b, c, d] and, after shuffling it, we know that its first element is a ([a, .., .., ..]), then the probability for any of the elements b, c, d to occur in, let's say, the third cell equals 1/3.

Now, the smallest list for which the algorithm does not fulfil the definition has three elements. But the algorithm converts it to a 4-element list anyway, so we will try to show its incorrectness for a 4-element list.

Consider an input L = [a, b, c, d]. Following the first run of the algorithm, L will be divided into l1 = [a, c] and l2 = [b, d]. After shuffling these two sublists (but before merging them into the four-element result) we can get four equally probable pairs of 2-element lists:

l1shuffled = [a, c]  l2shuffled = [b, d]
l1shuffled = [a, c]  l2shuffled = [d, b]
l1shuffled = [c, a]  l2shuffled = [b, d]
l1shuffled = [c, a]  l2shuffled = [d, b]

Now try to answer two questions.

1. What is the probability that, after merging into the final result, a will be the first element of the list?

Simply enough, we can see that only two of the four pairs above (again, equally probable) can give such a result (p1 = 1/2). For each of these pairs, heads must be drawn during the first flip in the merge routine (p2 = 1/2). Thus the probability of having a as the first element of Lshuffled is p = p1*p2 = 1/4, which is correct.

2. Knowing that a is in the first position of Lshuffled, what is the probability of having c (we could as well choose b or d without loss of generality) in the second position of Lshuffled?

According to the above definition of a perfectly shuffled list, the answer should be 1/3, since there are three elements to put in the three remaining cells of the list. Let's see if the algorithm assures that.

After choosing a as the first element of Lshuffled we would now have either:

l1shuffled = [c]  l2shuffled = [b, d]

or:

l1shuffled = [c]  l2shuffled = [d, b]

The probability of choosing c in both cases is equal to the probability of flipping heads (p3 = 1/2). Thus the probability of having c as the second element of Lshuffled, when knowing that the first element of Lshuffled is a, equals 1/2. Since 1/2 != 1/3, this ends the proof of the incorrectness of the algorithm.

The interesting part is that the algorithm fulfils the necessary (but not sufficient) condition for a perfect shuffle, namely:

Given a list of n elements, for every index k (k < n), for every element ak: after shuffling the list m times, if we count the times ak occurred at index k, this count will tend to m/n in probability, as m tends to infinity.

Here is one possible solution:
This is basically quicksort, but with no pivot, and with random partitioning.

- The outer while loop replaces a recursive call.
- The inner while loop randomly moves each element into one of two sublists.
- After the inner while loop, we connect the sublists to one another.
- Then, we recurse on the smaller sublist, and loop on the larger.

Since the smaller sublist can never be more than half the size of the initial list, the worst-case depth of recursion is the log base two of the number of elements. The amount of memory needed is O(1) times the depth of recursion.

The average runtime, and the average number of calls to rand(), is O(N log N). A more precise runtime analysis requires an understanding of the phrase "almost surely."
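Here is a hedged Python sketch of the approach as described (partition_shuffle and Node are my own names, and the details are my reading rather than the original code): each pass deals the nodes at random into two sublists, shuffles each, and connects them. For clarity this sketch recurses on both sublists instead of recursing only on the smaller one and looping on the larger.

```python
import random

class Node:
    """Minimal stand-in for a singly linked list node (value + next)."""
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def partition_shuffle(head):
    """Quicksort-like shuffle: random partition, shuffle both sides, reconnect."""
    if head is None or head.next is None:
        return head
    # Inner loop: randomly move each node into one of two sublists.
    heads = [None, None]
    tails = [None, None]
    node = head
    while node is not None:
        nxt = node.next
        node.next = None
        side = random.randrange(2)       # fair coin flip per node
        if tails[side] is None:
            heads[side] = tails[side] = node
        else:
            tails[side].next = node
            tails[side] = node
        node = nxt
    # Note: if every node lands on the same side, the same sublist gets
    # re-partitioned, so termination is only "almost sure", as noted above.
    left = partition_shuffle(heads[0])
    right = partition_shuffle(heads[1])
    # Connect the sublists to one another.
    if left is None:
        return right
    tail = left
    while tail.next is not None:
        tail = tail.next
    tail.next = right
    return left
```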