Is partitioning an array into halves with equal su

2019-01-18 15:48发布

问题:

This was an algorithm interview question about the Partition Problem.

You are given an array which consists of numbers with between 0 and 5 digits. Write a function which will return whether the array can be divided into 2 halves such a that sum of the two halves would be equal.

Is this an NP problem or can it be solved by dynamic programming?

"between 0 and 5 digit" means 0 ~ 99999, I think.


I found a good answer to this outside SO here.

回答1:

Both. The problem is in NP, and I'm pretty sure it can be solved in pseudo-polynomial time with dynamic programming.

http://en.wikipedia.org/wiki/Partition_problem

The general problem is NP-complete, and can be solved in psuedo-polynomial time with dynamic programming. T

This specific restriction of it to numbers of 0-5 digits probably isn't NP-complete. What's a 0-digit number, anyway?

Usually for the partition problem there is no requirement that the partitions be of equal size, just that they have equal sum. But here you say "divided into 2 half", I'm not sure whether "half" is intended to mean half the array, or just to mean half the total.

I'd guess that this difference, if it is a difference, probably doesn't affect the complexity of the DP solution, but I'm not sure. I don't have a proof either way off-hand, beyond, "um, yeah, looks like the kind of thing you can do with DP, I'd have to think about it".



回答2:

This is clearly NP - a solution can be verified in polynomial time.

However it is not NP-complete because the elements are bounded (number between 0 to 5 digit range).

The problem is known to undergo a "phase transition", being easy for m / n < 1 and hard otherwise



回答3:

Interview questions are designed, not exclusively, to test problem solving, but also a problem analysis. Therefore this problem isn't specific enough.

Can I choose any two subsets of elements, or maybe they have to be consistent parts? In the first case the problem is NP-Complete (see http://en.wikipedia.org/wiki/Partition_problem for more), in the second one it is of course trivial (interview questions often end in trivial programming tasks).

If the general problem is NP-Complete, maybe we can try specializing it? What do we know about the array? Maybe we can simplify the problem making some external assumptions?

I think this is the way of dealing with such problems, desired by the inteviewer.

Update: For specified number range the problem is equivalent to discrete knapsack problem (with size of the knapsack limited to a half of the sum of whole array and set of the weights of size 100000, that means, we can solve it in O(n) time) See http://en.wikipedia.org/wiki/Knapsack_problem for more about the Knapsack problem.



回答4:

That's right, if the numbers of the set are bounded, then the problem get's polinomial complexity.

You can use bin-search techniques.

Read the wikipage for the Partition_Problem and then, at the begining, you'll find a reference to the Subset_Sum_Problem that it's almost equivalent. You should find meaningful reading that wiki page also



回答5:

You can adapt this algorithm to the problem of partition, does not require many modifications.

Linear Algorithm for Solving the Sum Subset NP-Complete Problem pure Python Implementation

https://github.com/maxtuno/Universal/blob/master/linear_sum_subset_algorithm_oscar_riveros.py

#!/usr/bin/env python3

__author__ = "O. A. Riveros"
__copyright__ = "Copyright 2014, O. A. Riveros, All right reserved"
__license__ = "MIT"
__email__ = "oscar.riveros@gmail.com"

"""
O. A. Riveros
http://independent.academia.edu/oarr
http://twitter.com/maxtuno
"""

"""
first 100 primes
"""
data = [2,3,5,7,11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,97,101,103,107,109,113,127,131,137,139,149,151,157,163,167,173,179,181,191,193,197,199,211,223,227,229,233,239,241,251,257,263,269,271,277,281,283,293,307,311,313,317,331,337,347,349,353,359,367,373,379,383,389,397,401,409,419,421,431,433,439,443,449,457,461,463,467,479,487,491,499,503,509,521,523,541]

solution = [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]

"""
for custom entertainment
solution = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
"""

target = sum([data[i] for i in range(len(solution)) if solution[i] == 1])

l = len(data)

size = len('{0:b}'.format(sum(data)))

def rotate(l,n):
    return l[n:] + l[:n]

def slice(l, n):
    return list(int(l[i:i+n], 2) for i in range(0, len(l), n))

data_second_order = []
for i in range(l):
    data_second_order += rotate(data, i)

T = 0
for i in range(l):
    T += int(''.join('{0:b}'.format(k).zfill(size) for k in rotate(data_second_order, i)), 2)
    t = slice('{0:b}'.format(T).zfill(size*(l**2)), size)
    if (target in t):
        '''
        for review the results
        print('The target {} found in {} steps from {}.'.format(target, i + 1, t))
        '''
        print('The target {} found in {} steps'.format(target, i + 1))
        break

"""
The target 24133 found in 100 steps
"""