finding the optimized location for a list of coord

2019-07-13 09:26发布

问题:

I am a new to programming and particularly python but I am trying to learn it and I found it to be very fascinating so far.

I have a list of 30 fixed coordinates x and y.

x = np.array([[13,10,12,13,11,12,11,13,12,13,14,15,15,16,18,2,3,4,6,9,1,3,6,7,8,10,12,11,10,30]])
y = np.array([[12,11,10,9,8,7,6,6,7,8,11,12,13,15,14,18,12,11,10,13,15,16,18,17,16,15,14,13,12,3]])

I want to find an optimized (centralized) location that can connect up to a maximum of the 10 fixed coordinates by finding the minimum distance.

I tired to use an optimization algorithm but i can only get one optimum location for a list of coordinates (i.e. i cannot constraint the location to 10 coordinates).

Any help would be appreciated.

def objective(x):
    px = x[0]
    py = x[1]
    wdistance = sum(((px - wx)**2 + (py - wy)**2)**0.5)
    tdistance = (((px - tx)**2 + (py - ty)**2)**0.5)
    cost = wdistance * 2.5 + tdistance * 5 
    return cost

# initial estimate of the centralized location 
x0 = [10,10]

# boundary values for the centralized locations 
bnds = ((0,15), (0,15))
sol = minimize(objective, x0, bounds=bnds, method='SLSQP')

print(sol)

回答1:

As indicated in my comment, i think this problem is NP-hard and therefore it's not easy to tackle it.

It's a problem with some kind of cardinality-constraints, which means, it's hard/impossible to use continuous-optimization (assumed for nearly everything in scipy.optimize) in general.

Furthermore, minimizing the minimum-distance is not particularly smooth, also assumed by pretty much everything within scipy. But reformulations are possible (at least when constraints are supported!).

Without the cardinality-constraints (core of the discrete nature of this problem), it collapses into the Smallest Enclosing Circle Problem problem, easily solved in linear-time. More general wiki-article. But this does not help much for the given problem.

Here is a demo, probably not for the faint of heart (depending on the background-knowledge):

  • based on Mixed Integer Second Order Cone Programming

    • will solve to optimality (epsilon-approximation) given infinite time (ignoring numerical-issues; fp-math)
    • does not beat NP-hardness; but might be good in exploiting some kind of structure in your data
    • feels natural for euclidean-distance (norm) based problems
  • using cvxpy as modelling-tool

  • using ECOS_BB as solver
    • warning: it's a proof-of-concept / toy solver (which we will see in our results)
    • MISOCP / MICP is an active area of research, but the amount of non-commercial specialized solvers is sparse!
      • Instead of ECOS_BB, one probably should use Pajarito, maybe Bonmin or the commercials (Gurobi, CPLEX, Mosek and co.)
      • Pajarito, which looks interesting is julia based and (imho) not fun to use with python; therefore consider my code a demo with a demo-solver!

The following code will solve the 10 out of 20 problem (first 20 of your points; where one duplicate was thrown away for better visualization!)

It will fail for the full problem (quite ok approximation will be returned; but not optimal!). I blame ECOS_BB (but being very thankful to Han Wang for the code at the same time!). Your problem most surely will be solved easily by better alternatives. But don't expect any solver to do it for 1000 points :-).

Code:

import numpy as np
import cvxpy as cvx
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist

""" (Reduced) Data """
# removed a duplicate point (for less strange visualization)
x = np.array([13,10,12,13,11,12,11,13,13,14,15,15,16,18,2,3,4,6,9,1])#,3,6,7,8,10,12,11,10,30])
y = np.array([12,11,10,9,8,7,6,6,8,11,12,13,15,14,18,12,11,10,13,15])#,16,18,17,16,15,14,13,12,3])
N = x.shape[0]
M = 10

""" Mixed-integer Second-order cone problem """
c = cvx.Variable(2)                                # center position to optimize
d = cvx.Variable(N)              # helper-var: lower-bound on distance to center
d_prod = cvx.Variable(N) # helper-var: lower-bound on distance * binary/selected
b = cvx.Bool(N)                          # binary-variables used for >= M points

dists = pdist(np.vstack((x,y)).T)
U = np.amax(dists)    # upper distance-bound (not tight!) for bigM-linearization

helper_expr_c_0 = cvx.vec(c[0] - x)
helper_expr_c_1 = cvx.vec(c[1] - y)

helper_expr_norm = cvx.norm(cvx.hstack(helper_expr_c_0, helper_expr_c_1), axis=1)

constraints = []
constraints.append(cvx.sum_entries(b) >= M)           # M or more points covered
constraints.append(d >= helper_expr_norm)             # lower-bound of distances
constraints.append(d_prod <= U * b)                   # linearization of product
constraints.append(d_prod >= 0)                       # """
constraints.append(d_prod <= d)                       # """
constraints.append(d_prod >= d - U * (1-b))           # """

objective = cvx.Minimize(cvx.max_entries(d_prod))

problem = cvx.Problem(objective, constraints)
problem.solve(verbose=False)
print(problem.status)

# Visualization

center = np.array(c.value.flat)
tmp = np.array(np.round(b.value.T.flat), dtype=int)
selected_points = np.where(tmp == 1)[0]
non_selected_points = np.where(tmp == 0)[0]

ax = plt.subplot(aspect='equal')
ax.set_title('Optimal solution radius: {0:.2f}'.format(problem.value))
ax.scatter(x[non_selected_points], y[non_selected_points], c='blue', s=70, alpha=0.9)
ax.scatter(x[selected_points], y[selected_points], c='magenta', s=70, alpha=0.9)
ax.scatter(c[0].value, c[1].value, c='red', s=70, alpha=0.9)
circ = plt.Circle((center[0], center[1]), radius=problem.value, color='g', fill=True, alpha=0.1)
ax.add_patch(circ)
plt.tight_layout()
plt.show()

Output:

optimal

Plot:

EDIT

I ported the above code to julia to be able to use the mentioned solver Pajarito. Despite not much julia-programming for me in the past, it is working now:

  • using Convex.jl as modelling-tool
    • Nearly 1:1 mapping from the cvxpy-based code above!
  • using a combination of 3 open-source solvers:
    • Pajarito as MISOCP / MICP solver
      • using GLPK as inner MIP-solver (Coin CBC usually much better; but not working for me due to a possible bug within Pajarito)
      • using ECOS as inner Convex-solver
  • reused matplotlib-based plotting-code from above; handcopied result-values :-)

This approach is solving your full problem (again i dropped the duplicate-point) to optimality!

It outputs some numerical instability warning, but claims the result to be optimal! (Pajarito is still pretty new research-software and i did not tune the numerous options; and we use open-source solvers within which are not that robust compared to commercial alternatives)

I'm very satisfied with the result and i'm glad i gave Julia/Convex.jl/Pajarito a try!

Code:

using Convex, Pajarito, ECOS, GLPKMathProgInterface

# DATA
x = [13 10 12 13 11 12 11 13 13 14 15 15 16 18 2 3 4 6 9 1 3 6 7 8 10 12 11 10 30]
y = [12 11 10 9 8 7 6 6 8 11 12 13 15 14 18 12 11 10 13 15 16 18 17 16 15 14 13 12 3]
N = size(x)[2]
M = 10

# MISOCP
c = Variable(2)
d = Variable(N)
d_prod = Variable(N)
b = Variable(N, :Bin)

U = 100  # TODO

# Objective
problem = minimize(maximum(d_prod))

# Constraints
problem.constraints += sum(b) >= M
problem.constraints += d_prod <= U*b
problem.constraints += d_prod >= 0
problem.constraints += d_prod <= d
problem.constraints += d_prod >= d - U * (1-b)
for i = 1:N
    problem.constraints += d[i] >= norm([c[1] - x[i], c[2] - y[i]])  # ugly
end

# Solve
mip_solver_drives = false
log_level = 2
rel_gap = 1e-8
mip_solver = GLPKSolverMIP()
cont_solver = ECOSSolver(verbose=false)
solver = PajaritoSolver(
    mip_solver_drives=mip_solver_drives,
    log_level=log_level,
    rel_gap=rel_gap,
    mip_solver=mip_solver,
    cont_solver=cont_solver,
    init_exp=true,
)
# option taken from https://github.com/JuliaOpt/Pajarito.jl/blob/master/examples/gatesizing.jl
# not necessarily optimal

solve!(problem, solver)

# Print out results
println(problem.status)
println(problem.optval)
println(b.value)
println(c.value)

Output:

Problem dimensions:
       variables |       120
     constraints |       263
   nonzeros in A |       466

Cones summary:
Cone             | Count     | Min dim.  | Max dim.
    Second order |        29 |         3 |         3

Variable types:
      continuous |        91
          binary |        29

Transforming data...               1.10s

Creating conic subproblem...       0.52s

Building MIP model...              2.35s

Solving conic relaxation...        1.38s

Starting iterative algorithm
    0 |           +Inf |  +3.174473e-11 |         Inf |   6.434e+00
Warning: numerical instability (primal simplex, phase I)

Iter. | Best feasible  | Best bound     | Rel. gap    | Time (s)
    1 |  +2.713537e+00 |  +2.648653e+00 |   2.391e-02 |   1.192e+01
Warning: numerical instability (primal simplex, phase I)
Warning: numerical instability (primal simplex, phase I)
Warning: numerical instability (dual simplex, phase II)
... some more ...
    2 |  +2.713537e+00 |  +2.713537e+00 |   5.134e-12 |   1.341e+01

Iterative algorithm summary:
 - Status               =        Optimal
 - Best feasible        =  +2.713537e+00
 - Best bound           =  +2.713537e+00
 - Relative opt. gap    =      5.134e-12
 - Total time (s)       =       1.34e+01

Outer-approximation cuts added:
Cone             | Relax.    | Violated  | Nonviol.
    Second order |        58 |        32 |        24

0 numerically unstable cone duals encountered

Distance to feasibility (negative indicates strict feasibility):
Cone             | Variable  | Constraint
          Linear |        NA |  5.26e-13
    Second order |        NA |  2.20e-11

Distance to integrality of integer/binary variables:
          binary |  0.00e+00

Optimal
2.7135366682753825
[1.0; 1.0; 1.0; 1.0; 0.0; 0.0; 0.0; 0.0; 0.0; 1.0; 1.0; 1.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 0.0; 1.0; 1.0; 1.0; 0.0]
[12.625; 11.6875]

Visualization:

And one more plot for select-23 out of all points: