OR-tools consistently returns very sub-optimal TSP

2020-07-06 04:50发布

问题:

Generating some random Gaussian coordinates, I noticed the TSP-solver returns horrible solutions, however it also returns the same horrible solution over and over again for the same input.

Given this code:

import numpy
import math
from ortools.constraint_solver import pywrapcp
from ortools.constraint_solver import routing_enums_pb2

import matplotlib
%matplotlib inline
from matplotlib import pyplot, pylab
pylab.rcParams['figure.figsize'] = 20, 10


n_points = 200

orders = numpy.random.randn(n_points, 2)
coordinates = orders.tolist()

class Distance:
    def __init__(self, coords):
        self.coords = coords

    def distance(self, x, y):
        return math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)

    def __call__(self, x, y):
        return self.distance(self.coords[x], self.coords[y])

distance = Distance(coordinates)

search_parameters = pywrapcp.RoutingModel.DefaultSearchParameters()
search_parameters.first_solution_strategy = (
        routing_enums_pb2.FirstSolutionStrategy.LOCAL_CHEAPEST_ARC)

search_parameters.local_search_metaheuristic = routing_enums_pb2.LocalSearchMetaheuristic.TABU_SEARCH


routing = pywrapcp.RoutingModel(len(coordinates), 1)
routing.SetArcCostEvaluatorOfAllVehicles(distance)   

routing.SetDepot(0)
solver = routing.solver()
routing.CloseModel() # the documentation is a bit unclear on whether this is needed

assignment = routing.SolveWithParameters(search_parameters)

nodes = []
index = routing.Start(0)
while not routing.IsEnd(index):
    nodes.append(routing.IndexToNode(index))
    index = assignment.Value(routing.NextVar(index))

nodes.append(0)
for (a, b) in zip(nodes, nodes[1:]):
    a, b = coordinates[a], coordinates[b]
    pyplot.plot([a[0], b[0]], [a[1], b[1]], 'r' )

For example, for 10 points I get a nice solution:

For 20 It's worse, some obvious optimizations still exist (where one only would need to swap two points.

And for 200 it's horrible:

I'm wondering whether the code above actually does some LNS, or just returns the initial value, especially since most first_solution_strategy options suggest deterministic initialization.

Why does the TSP-solver above return consistent solutions for the same data, even though tabu-search and simulated annealing and such are stochastic. And, why is the 200-point solution so bad?

I played with several options in SearchParameters, especially enabling 'use_...' fields in local_search_operators. This had no effect, the same very sub-optimal solutions were returned.

回答1:

I think the problem is with the distance-measure :). I can remember a kScalingFactor in the C-code samples from or-tools, which was used to scale up distances, and then round (by casting) them to integers: or-tools expects distances to be integers.

Or course, distances between standard Gaussian random coordinates typically lie between 0 and maybe 2, hence most point pairs have the same distances when mapped to integers: garbage in, garbage out.

I fixed it by simply multiplying and casting to integers (just to be sure swig won't interpreter the floats as integers):

# ...
def distance(self, x, y):
    return int(10000 * math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2))
# ...

Then the results make a lot more sense:

10 points:

20 points:

200 points: