I would like to optimize over all 30 by 30 matrices with entries that are 0 or 1. My objective function is the determinant. One way to do this would be some sort of stochastic gradient descent or simulated annealing.
I looked at scipy.optimize
but it doesn't seem to support this sort of optimization as far as I can tell. scipy.optimize.basinhopping
looked very tempting but it seems to require continuous variables.
Are there any tools in Python for this sort of general discrete optimization?
I have looked into this a bit.
A couple of things first off: 1) 56 million is the max value when the size of the matrix is 21x21, not 30x30:https://en.wikipedia.org/wiki/Hadamard%27s_maximal_determinant_problem#Connection_of_the_maximal_determinant_problems_for_.7B1.2C.C2.A0.E2.88.921.7D_and_.7B0.2C.C2.A01.7D_matrices.
But that is also an upper bound on -1, 1 matrices, not 1,0.
EDIT: Reading more carefully from that link:
So that table can be used for upper bounds, but remember they're divided by 2n−1. Also note that 22 is the smallest open case, so trying to find the maximum of a 30x30 matrix has not been done, and is not even close to being done just yet.
2) The reason David Zwicker's code gives an answer of 30 million is probably due to the fact that he's minimising. Not maximising.
See how he's got the minus sign there?
3) Also, the solution space for this problem is very big. I calculate the number of different matrices to be 2^(30*30) i.e. in the order of 10^270. So looking at each matrix is simply impossible, and even look at most of them is too.
I have a bit of code here (adapted from David Zwicker's code) that runs, but I have no idea how close it is to the actual maximum. It takes around 45 mins to do 10 million iterations on my PC, or only about 2 mins for 1 mill iterations. I get a max value of around 3.4 billion. But again, I have no idea how close this is to the theoretical maximum.
Does this help?
I think a genetic algorithm might work quite well in this case. Here's a quick example thrown together using
deap
, based loosely on their example here:It takes about 40 seconds to run 1000 generations on my laptop, which gets me from a minimum determinant value of about -5.7845x108 to -6.41504x1011. I haven't really played around much with the meta-parameters (population size, mutation rate, crossover rate etc.), so I'm sure it's possible to do a lot better.
Here's a greatly improved version that implements a much smarter crossover function that swaps blocks of rows or columns across individuals, and uses a
cachetools.LRUCache
to guarantee that each mutation step produces a novel configuration, and to skip evaluation of the determinant for configurations that have already been tried:My best score thus far is about
-3.23718x1013-3.92366x1013 after100001000 generations, which takes about 45 seconds on my machine.Based on the solution cthonicdaemon linked to in the comments, the maximum determinant for a 31x31 Hadamard matrix must be at least 75960984159088×230 ~= 8.1562x1022 (it's not yet proven whether that solution is optimal). The maximum determinant for an (n-1 x n-1) binary matrix is 21-n times the value for an (n x n) Hadamard matrix, i.e. 8.1562x1022 x 2-30 ~= 7.5961x1013, so the genetic algorithm gets within an order of magnitude of the current best known solution.
However, the fitness function seems to plateau around here, and I'm having a hard time breaking -4x1013. Since it's a heuristic search there is no guarantee that it will eventually find the global optimum.
I don't know of any straight-forward method for discrete optimization in
scipy
. One alternative is using thesimanneal
package from pip or github, which allows you to introduce your own move function, such that you can restrict it to moves within your domain: