Given an array A
of N
nonnegative numbers, I'm interested in finding the number of ways you can pick 5 numbers (from distinct positions in the array) such that their sum is S
.
There is an easy solution in O(N^3)
:
Let H be a hash table of (sum, position of leftmost element of sum)
for i = 0, N
for j = i + 1, N
H.add(A[i] + A[j], i)
numPossibilities = 0
for i = 0, N
for j = i + 1, N
for k = j + 1, N
numPossibilities += H.get(S - (A[i] + A[j] + A[k]), k)
Where H.get(x, y)
returns the number of elements in the hash whose sum has the same hash as x
and whose leftmost element is bigger than k.
Alternatively, we can add sums of 3 elements to the hash table and then continue with 2 nested for loops. The complexity remains the same however, and we just use more memory.
Assuming the inputs will be fairly random (so no worst-case hashing), is there an algorithm that can solve this in O(N^2)
or maybe O(N^2 log N)
, or even O(N^3)
if it holds in all cases? I'm thinking binary searching might help, but I don't see how to deal with overlapping indexes.
The above solution is a lot better in practice than the naive 5-for-loops solution, however I have a feeling we can do a lot better, hence this question.
If you can prove that no such algorithm exists, how can the above solution be optimized?
Clarification:
The above algorithm is indeed O(N^5)
in the worst case, such as when the given array contains nothing but the number 1 and we have S = 5
. On average however, the H.get
method is a lot closer to O(1)
, hence my average cubic complexity.
If you implement this and run it on 1000 random numbers in a big interval (say 0 upto Int32.MaxValue), you will see that it runs relatively fast. Still, it's not hard to find inputs for which it takes a long time. Even if we can't get it running fast enough for all equal numbers, what optimizations could we make?
Under the same assumptions, can we do better, asymptotically or at least in practice?
I think the fact that the numbers must have distinct positions is a red herring. You can use the
inclusion-exclusion principle to count the number of all positions (i,j,k,l,m) where x[i]+x[j]+x[k]+x[l]+x[m]=S and i,j,k,l,m are distinct:
sums with i!=j,i!=k,i!=l...,l!=m = all sums
- sums with i=j
- ...
- sums with l=m
+ sums with i=j and j=k
+ ...
+ sums with k=l and l=m
- ...
+ sums with i=j=k=l=m
Computing the sums on the right, except the first one, is doable in O(N^2 log N). For example, to find the number of positions (i,i,k,l,m) such that x[i]+x[i]+x[k]+x[l]+x[m]=S you can create sorted arrays with sums {2a+b} and {c+d} and check if they have elements x,y such that x+y=S.
Main algorithm
So it's enough to compute how many are there positions (i,j,k,l,m) where x[i]+x[j]+x[k]+x[l]+x[m]=S
and i,j,k,l,m are not necessarily different. Basically, you can use Moron's solution this way:
Create a sorted array of sums {a+b: a,b are numbers from array}; group equal elements into one, remembering count. For example, for array [1,1,3] you get nine sums [2,2,2,2,4,4,4,4,6] of the form a+b. Then you group same elements remembering counts: [(2,4),(4,4),(6,1)]. This step is O(N^2 log N).
For each e, count how many are there pairs of elements in the array that sum to S-e. As in Moron's solution, you have two pointers, one going right, one going left. If the sum is too low, move the first pointer increasing the sum; if the sum is too high, move the second pointer decreasing it.
Suppose the sum is correct. This means one points to (a,x) and second to (b,y) where a+b=S-e. Increase the counter by x*y and move both pointers (You could move only one pointer, but on the next step there would be no match, and the second pointer would be moved then.).
For example, for [(2,4),(4,4),(6,1)] array and S-e=8, the first pointer points at (2,4) and the second at (6,1). Since 2+6=8, you add 4 and move both pointers. Now they both point at (4,4), so you increase the counter by 16. Don't stop! The pointers pass each other, and you get first at (6,1), second at (2,4), increase the counter by 4.
So, in the end, there are 4+16+4=24 ways to get 8 as a sum of 4 elements of [1,1,3]:
>>> len([k for k in itertools.product([1,1,3],repeat=4) if sum(k) == 8])
24
Prelude Control.Monad> length [k | k <- replicateM 4 [1,1,3], sum k == 8]
24
Repeating that for each e, you'll get count of ways to get S as a sum of 5 elements.
For [1,1,1,1,1] and S-e=4, the sums array would be [(2,25)], and you'd get that there are 625 ways to get sum of 4.
For each e, this step is linear in size of the array (so it's O(N2)), so the loop takes O(N3).
On inclusion-exclusion:
Call a quintuple (i,j,k,l,m) "proper" if x[i]+x[j]+x[k]+x[l]+x[m]=S. The goal is to count the number of proper quintuples (i,j,k,l,m) where i,j,k,l,m are pairwise distinct. The main algorithm can count in O(N^3) how many are there proper quintuples which have not necessarily distinct components. The remaining thing is to count those "wrong" tuples.
Consider the subsets of proper quintuples
Axy={(i,j,k,l,m): indices on x-th and y-th place are the same}
For example, A24 is the set of proper quintuples (i,j,k,l,m) where j=l.
The set of wrong quintuples is:
A12 ∪ A13 ∪ ... ∪ A45
Counting its cardinality by inclusion-exclusion:
|A12 ∪ A13 ∪ ... ∪ A45| = |A12| + |A13| + ... + |A45| - |A12 ∩ A23| - ... - |A34 ∩ A45| + ... + |A12 ∩ A23 ∩ ... ∩ A35 ∩ A45|
There are 210=1024 summands here. But a lot of the cardinalities is the same.
The only things you have to count is:
- X1 = |A12| - quintuples with i=j
- X2 = |A12 ∩ A23| - quintuples with i=j=k
- X3 = |A12 ∩ A23 ∩ A34| - quintuples with i=j=k=l
- X4 = |A12 ∩ A23 ∩ A34 ∩ A45| - quintuples with i=j=k=l=m
- X5 = |A12 ∩ A34| - quintuples with i=j,k=l
- X6 = |A12 ∩ A23 ∩ A45| - quintuples with i=j=k,l=m
You can observe, by permuting, all other sets are represented here. For example, A24 has the same cardinality as A12.
Counting cardinalities of those 6 sets is rather easy. For the first one, you create arrays {2a+b} and {c+d} and count how many are there common elements; for the other ones, there are only 3 or less free variables, so even a simple loop will give you O(N^3).
To simplify the sum, I wrote the following Haskell program:
import Control.Monad
import Data.List
import qualified Data.Map as Map
-- Take equivalence relation, like [(1,2),(2,3)] and return its partition, like [3,1,1]
f xs = sort $ map length $ foldr f (map return [1..5]) xs
where f (x,y) a = let [v1] = filter (x `elem`) a
[v2] = filter (y `elem`) a
in if v1 == v2 then a else (a \\ [v1,v2]) ++ [v1++v2]
-- All 1024 subsets of [(1,2),(1,3), ..., (4,5)]
subsets = filterM (const [False, True]) [(i,j) | i <- [1..5], j <- [i+1..5]]
res = Map.fromListWith (+) $ map (\k -> (f k, (-1)^(length k))) subsets
*Main> res
Loading package array-0.3.0.1 ... linking ... done.
Loading package containers-0.3.0.0 ... linking ... done.
fromList [([1,1,1,1,1],1),([1,1,1,2],-10),([1,1,3],20),([1,2,2],15),([1,4],-30),([2,3],-20),([5],24)]
which means that the formula is
all subsets - 10X1 + 20X2 - 30X3 + 24X4 + 15X5 - 20X6.
Check:
How many are there quintuples in [0,0,0,...,0] summing up to 0? One way to compute that is directly, second way is to use the formula (and not care about distinct positions):
direct x = x*(x-1)*(x-2)*(x-3)*(x-4)
indirect x = x^5 - 10 * x^4 + 20 * x^3 + 15 * x^3 - 30 * x^2 - 20*x^2 + 24*x
*Main> direct 100
9034502400
*Main> indirect 100
9034502400
Other remarks:
Also, there's O(an log an) solution: Compute (xa1 + ... + xan)5 using FFT, the result is coefficient at xS. This allows some ai to be used twice, but you can subtract polynomials like (x2a1 + ... + x2an)5*(xa1 + ... + xan)3 etc. according to inclusion-exclusion principle.
In some restricted models of computation, it has been shown decision version of this problem requires O(N^3) time.
O(N^3) seem possible (though I haven't tried proving it).
Take all possible pairs and create a new array (say B) of size O(N^2) which holds the sum of all possible pairs. Also keep track of the index of two elements from the original array which gave that sum. - O(N^2)
Now sort the array - O(N^2LogN).
Now for each element a in the original array, try to find two elements from B which sum to S-a. Since B is sorted this can be done in O(B) time: Start with two pointers, one at the max and one at the min.
If Sum of those two > S-a, decrement the pointer near max.
If Sum of those two < S-a, increment the pointer near min.
If the sum is equal, then you have found one candidate pair and a new sorted sub-array in which to look for the next possible candidate pair. (You should ensure that the two elements of B come from four elements of A). (There might be potential issues here)
Thus you can count the number of times S-a occurs as a sum of two elements of B, which come from four elements of the original array (not including a).
So O(N^2) time for O(N) elements - O(N^3).
Hope that helps.
Maybe it's better to create an array with only distinct values first and count the occurence of them in the originally array. Because only the number of solutions is wanted and not the solutions itself, that could be faster if using combination calculations.
1) Sort array A
O(N log N)
2) Create a new array B
where all values are distinct.
Also save the count of the occurence of the value in the original array A
for every element in B
. O(N)
3) Create a new array C
with sums of two elements of B
.
Including sums of the same element if the count > 1.
Also save the both indexes of the elements from B
. O(|B|2)
4) Sort array C
by the sums O(|B|2 (log |B|2))
5) For every element in B
find two valid elements from C
so that the three values sum up to S
and the indexes are in the same order. In pseudo code:
num=0
for (i=0; i<n; i++)
j=i
k=|C|-1
while (j <= k)
if (c[j].sum + c[k].sum = S - b[i].value)
for (m=0; m<c[j].index.length; m++)
for (n=0; n<c[k].index.length; n++)
if (i < c[j].index[m].left < c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
num+=b[i].count * b[c[j].index[m].left].count * b[c[j].index[m].right].count * b[c[j].index[k].left].count * b[c[j].index[k].right].count
else if (b[i].count > 1 && i = c[j].index[m].left < c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
num+= binomialcoefficient(b[i].count, 2) * b[c[j].index[m].right].count * b[c[j].index[k].left].count * b[c[j].index[k].right].count
else if (b[c[j].index[m].left].count > 1 && i < c[j].index[m].left = c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
num+= b[i].count * binomialcoefficient(b[c[j].index[m].left].count, 2) * b[c[j].index[k].left].count * b[c[j].index[k].right].count
[..]
else if (b[i].count > 2 && i = c[j].index[m].left = c[j].index[m].right < c[j].index[k].left < c[j].index[k].right)
num+= binomialcoefficient(b[i].count, 3) * b[c[j].index[k].left].count * b[c[j].index[k].right].count
[..]
else if (b[i].count > 1 && b[c[j].index[m].right].count > 1 && i = c[j].index[m].left < c[j].index[m].right = c[j].index[k].left < c[j].index[k].right)
num+= binomialcoefficient(b[i].count, 2) * binomialcoefficient(b[c[j].index[m].right].count, 2) * b[c[j].index[k].right].count
[..]
else if (b[i].count > 4 && i = c[j].index[m].left = c[j].index[m].right = c[j].index[k].left = c[j].index[k].right)
num+= binomialcoefficient(b[i].count, 5)
if (c[j].sum + c[k].sum >= S - b[i].value)
k--
if (c[j].sum + c[k].sum <= S - b[i].value)
j++
I'm not really sure what time complexity this has. The outer for loop is bound by O(|B|), the while loop by O(|B|2), the inner for loops by O(|B|), because B
has only distinct values.
So obvisouly it's in O(|B|5). But its O(N) if all elements in A
have the same value and if all values are distinct and sufficiently random its maybe possible to bound the number of indexes per sum in C
by a constant, which would lead to O(N3).
The worst case could be somewhere with half the values equal and the other half random or with all numbers distinct but a lot of duplicate sums. But that would also lead to a much shorter while loop. I have a feeling that the while and the two inner for loops are bound by O(N2), so O(N3) in total for all cases, but i can't proof it.
Also an interesting question here is what is the maximum number of possibilities to pick up 5 numbers which sum to S for an array of N disctinct numbers. If it is in O(N5) the worst case of that algorithm is also O(N5).
Maybe try it out ;).