Generating some random Gaussian coordinates, I noticed the TSP-solver returns horrible solutions, however it also returns the same horrible solution over and over again for the same input.
Given this code:
import numpy
import math
from ortools.constraint_solver import pywrapcp
from ortools.constraint_solver import routing_enums_pb2
import matplotlib
%matplotlib inline
from matplotlib import pyplot, pylab
pylab.rcParams['figure.figsize'] = 20, 10
n_points = 200
orders = numpy.random.randn(n_points, 2)
coordinates = orders.tolist()
class Distance:
def __init__(self, coords):
self.coords = coords
def distance(self, x, y):
return math.sqrt((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2)
def __call__(self, x, y):
return self.distance(self.coords[x], self.coords[y])
distance = Distance(coordinates)
search_parameters = pywrapcp.RoutingModel.DefaultSearchParameters()
search_parameters.first_solution_strategy = (
routing_enums_pb2.FirstSolutionStrategy.LOCAL_CHEAPEST_ARC)
search_parameters.local_search_metaheuristic = routing_enums_pb2.LocalSearchMetaheuristic.TABU_SEARCH
routing = pywrapcp.RoutingModel(len(coordinates), 1)
routing.SetArcCostEvaluatorOfAllVehicles(distance)
routing.SetDepot(0)
solver = routing.solver()
routing.CloseModel() # the documentation is a bit unclear on whether this is needed
assignment = routing.SolveWithParameters(search_parameters)
nodes = []
index = routing.Start(0)
while not routing.IsEnd(index):
nodes.append(routing.IndexToNode(index))
index = assignment.Value(routing.NextVar(index))
nodes.append(0)
for (a, b) in zip(nodes, nodes[1:]):
a, b = coordinates[a], coordinates[b]
pyplot.plot([a[0], b[0]], [a[1], b[1]], 'r' )
For example, for 10 points I get a nice solution:
For 20 It's worse, some obvious optimizations still exist (where one only would need to swap two points.
And for 200 it's horrible:
I'm wondering whether the code above actually does some LNS, or just returns the initial value, especially since most first_solution_strategy
options suggest deterministic initialization.
Why does the TSP-solver above return consistent solutions for the same data, even though tabu-search and simulated annealing and such are stochastic. And, why is the 200-point solution so bad?
I played with several options in SearchParameters, especially enabling 'use_...' fields in local_search_operators
. This had no effect, the same very sub-optimal solutions were returned.
I think the problem is with the distance-measure :). I can remember a
kScalingFactor
in the C-code samples from or-tools, which was used to scale up distances, and then round (by casting) them to integers: or-tools expects distances to be integers.Or course, distances between standard Gaussian random coordinates typically lie between 0 and maybe 2, hence most point pairs have the same distances when mapped to integers: garbage in, garbage out.
I fixed it by simply multiplying and casting to integers (just to be sure swig won't interpreter the floats as integers):
Then the results make a lot more sense:
10 points:
20 points:
200 points: