It would be great if someone could point me towards an algorithm that would allow me to :
- create a random square matrix, with entries 0 and 1, such that
- every row and every column contain exactly two non-zero entries,
- two non-zero entries cannot be adjacent,
- all possible matrices are equiprobable.
Right now I manage to achieve points 1 and 2 doing the following : such a matrix can be transformed, using suitable permutations of rows and columns, into a diagonal block matrix with blocks of the form
1 1 0 0 ... 0
0 1 1 0 ... 0
0 0 1 1 ... 0
.............
1 0 0 0 ... 1
So I start from such a matrix using a partition of [0, ..., n-1] and scramble it by permuting rows and columns randomly. Unfortunately, I can't find a way to integrate the adjacency condition, and I am quite sure that my algorithm won't treat all the matrices equally.
Update
I have managed to achieve point 3. The answer was actually straight under my nose : the block matrix I am creating contains all the information needed to take into account the adjacency condition. First some properties and definitions:
- a suitable matrix defines permutations of
[1, ..., n]
that can be build like so: select a 1 in row1
. The column containing this entry contains exactly one other entry equal to 1 on a rowa
different from 1. Again, rowa
contains another entry 1 in a column which contains a second entry 1 on a rowb
, and so on. This starts a permutation1 -> a -> b ...
.
For instance, with the following matrix, starting with the marked entry
v
1 0 1 0 0 0 | 1
0 1 0 0 0 1 | 2
1 0 0 1 0 0 | 3
0 0 1 0 1 0 | 4
0 0 0 1 0 1 | 5
0 1 0 0 1 0 | 6
------------+--
1 2 3 4 5 6 |
we get permutation 1 -> 3 -> 5 -> 2 -> 6 -> 4 -> 1
.
- the cycles of such a permutation lead to the block matrix I mentioned earlier. I also mentioned scrambling the block matrix using arbitrary permutations on the rows and columns to rebuild a matrix compatible with the requirements.
But I was using any permutation, which led to some adjacent non-zero entries. To avoid that, I have to choose permutations that separate rows (and columns) that are adjacent in the block matrix. Actually, to be more precise, if two rows belong to a same block and are cyclically consecutive (the first and last rows of a block are considered consecutive too), then the permutation I want to apply has to move these rows into non-consecutive rows of the final matrix (I will call two rows incompatible in that case).
So the question becomes : How to build all such permutations ?
The simplest idea is to build a permutation progressively by randomly adding rows that are compatible with the previous one. As an example, consider the case n = 6
using partition 6 = 3 + 3
and the corresponding block matrix
1 1 0 0 0 0 | 1
0 1 1 0 0 0 | 2
1 0 1 0 0 0 | 3
0 0 0 1 1 0 | 4
0 0 0 0 1 1 | 5
0 0 0 1 0 1 | 6
------------+--
1 2 3 4 5 6 |
Here rows 1
, 2
and 3
are mutually incompatible, as are 4
, 5
and 6
. Choose a random row, say 3
.
We will write a permutation as an array: [2, 5, 6, 4, 3, 1]
meaning 1 -> 2
, 2 -> 5
, 3 -> 6
, ... This means that row 2
of the block matrix will become the first row of the final matrix, row 5
will become the second row, and so on.
Now let's build a suitable permutation by choosing randomly a row, say 3
:
p = [3, ...]
The next row will then be chosen randomly among the remaining rows that are compatible with 3
: 4
, 5
and 6
. Say we choose 4
:
p = [3, 4, ...]
Next choice has to be made among 1
and 2
, for instance 1
:
p = [3, 4, 1, ...]
And so on: p = [3, 4, 1, 5, 2, 6]
.
Applying this permutation to the block matrix, we get:
1 0 1 0 0 0 | 3
0 0 0 1 1 0 | 4
1 1 0 0 0 0 | 1
0 0 0 0 1 1 | 5
0 1 1 0 0 0 | 2
0 0 0 1 0 1 | 6
------------+--
1 2 3 4 5 6 |
Doing so, we manage to vertically isolate all non-zero entries. Same has to be done with the columns, for instance by using permutation p' = [6, 3, 5, 1, 4, 2]
to finally get
0 1 0 1 0 0 | 3
0 0 1 0 1 0 | 4
0 0 0 1 0 1 | 1
1 0 1 0 0 0 | 5
0 1 0 0 0 1 | 2
1 0 0 0 1 0 | 6
------------+--
6 3 5 1 4 2 |
So this seems to work quite efficiently, but building these permutations needs to be done with caution, because one can easily be stuck: for instance, with n=6
and partition 6 = 2 + 2 + 2
, following the construction rules set up earlier can lead to p = [1, 3, 2, 4, ...]
. Unfortunately, 5
and 6
are incompatible, so choosing one or the other makes the last choice impossible. I think I've found all situations that lead to a dead end. I will denote by r
the set of remaining choices:
p = [..., x, ?]
,r = {y}
withx
andy
incompatiblep = [..., x, ?, ?]
,r = {y, z}
withy
andz
being both incompatible withx
(no choice can be made)p = [..., ?, ?]
,r = {x, y}
withx
andy
incompatible (any choice would lead to situation 1)p = [..., ?, ?, ?]
,r = {x, y, z}
withx
,y
andz
being cyclically consecutive (choosingx
orz
would lead to situation 2, choosingy
to situation 3)p = [..., w, ?, ?, ?]
,r = {x, y, z}
withxwy
being a 3-cycle (neitherx
nory
can be chosen, choosingz
would lead to situation 3)p = [..., ?, ?, ?, ?]
,r = {w, x, y, z}
withwxyz
being a 4-cycle (any choice would lead to situation 4)p = [..., ?, ?, ?, ?]
,r = {w, x, y, z}
withxyz
being a 3-cycle (choosingw
would lead to situation 4, choosing any other would lead to situation 4)
Now it seems that the following algorithm gives all suitable permutations:
- As long as there are strictly more than 5 numbers to choose, choose randomly among the compatible ones.
- If there are 5 numbers left to choose: if the remaining numbers contain a 3-cycle or a 4-cycle, break that cycle (i.e. choose a number belonging to that cycle).
- If there are 4 numbers left to choose: if the remaining numbers contain three cyclically consecutive numbers, choose one of them.
- If there are 3 numbers left to choose: if the remaining numbers contain two cyclically consecutive numbers, choose one of them.
I am quite sure that this allows me to generate all suitable permutations and, hence, all suitable matrices.
Unfortunately, every matrix will be obtained several times, depending on the partition that was chosen.
Intro
Here is some prototype-approach, trying to solve the more general task of uniform combinatorial sampling, which for our approach here means: we can use this approach for everything which we can formulate as SAT-problem.
It's not exploiting your problem directly and takes a heavy detour. This detour to the SAT-problem can help in regards to theory (more powerful general theoretical results) and efficiency (SAT-solvers).
That being said, it's not an approach if you want to sample within seconds or less (in my experiments), at least while being concerned about uniformity.
Theory
The approach, based on results from complexity-theory, follows this work:
The basic idea:
The guarantees:
The drawbacks:
Parameters:
In practice, the parameters are:
N is important to reduce the number of possible solutions. Given N constant, the other variables of course also have some effect on that.
Theory says (if i interpret correctly), that we should use L = R = 0.5 * #dec-vars.
This is impossible in practice here, as xor-constraints hurt SAT-solvers a lot!
Here some more scientific slides about the impact of L and U.
They call xors of size 8-20 short-XORS, while we will need to use even shorter ones later!
Implementation
Final version
Here is a pretty hacky implementation in python, using the XorSample scripts from here.
The underlying SAT-solver in use is Cryptominisat.
The code basically boils down to:
Code: (i hope i did warn you already about the code-quality)
Output will look like (removed some warnings):
Core: CNF-formulation
We introduce one variable for every cell of the matrix. N=20 means 400 binary-variables.
Adjancency:
Precalculate all symmetry-reduced conflicts and add conflict-clauses.
Basic theory:
Row/Col-wise Cardinality:
This is tough to express in CNF and naive approaches need an exponential number of constraints.
We use some adder-circuit based encoding (SINZ, Carsten. Towards an optimal CNF encoding of boolean cardinality constraints) which introduces new auxiliary-variables.
Remark:
These SAT-encodings can be put into exact model-counters (for small N; e.g. < 9). The number of solutions equals Ante's results, which is a strong indication for a correct transformation!
There are also interesting approximate model-counters (also heavily based on xor-constraints) like approxMC which shows one more thing we can do with the SAT-formulation. But in practice i have not been able to use these (approxMC = autoconf; no comment).
Other experiments
I did also build a version using pblib, to use more powerful cardinality-formulations for the SAT-CNF formulation. I did not try to use the C++-based API, but only the reduced pbencoder, which automatically selects some best encoding, which was way worse than my encoding used above (which is best is still a research-problem; often even redundant-constraints can help).
Empirical analysis
For the sake of obtaining some sample-size (given my patience), i only computed samples for N=15. In this case we used:
I also computed some samples for N=20 with (100,3,6), but this takes a few mins and we reduced the lower bound!
Visualization
Here some animation (strengthening my love-hate relationship with matplotlib):
Edit: And a (reduced) comparison to brute-force uniform-sampling with N=5 (NXOR,L,U = 4, 10, 30):
(I have not yet decided on the addition of the plotting-code. It's as ugly as the above one and people might look too much into my statistical shambles; normalizations and co.)
Theory
Statistical analysis is probably hard to do as the underlying problem is of such combinatoric nature. It's even not entirely obvious how that final cell-PDF should look like. In the case of N=odd, it's probably non-uniform and looks like a chess-board (i did brute-force check N=5 to observe this).
One thing we can be sure about (imho): symmetry!
Given a cell-PDF matrix, we should expect, that the matrix is symmetric (A = A.T). This is checked in the visualization and the euclidean-norm of differences over time is plotted.
We can do the same on some other observation: observed pairings.
For N=3, we can observe the following pairs:
Now we can do this per-row and per-column and should expect symmetry too!
Sadly, it's probably not easy to say something about the variance and therefore the needed samples to speak about confidence!
Observation
According to my simplified perception, current-samples and the cell-PDF look good, although convergence is not achieved yet (or we are far away from uniformity).
The more important aspect are probably the two norms, nicely decreasing towards 0. (yes; one could tune some algorithm for that by transposing with prob=0.5; but this is not done here as it would defeat it's purpose).
Potential next steps
Summary
This solution use recursion in order to set the cell of the matrix one by one. If the random walk finish with an impossible solution then we rollback one step in the tree and we continue the random walk.
The algorithm is efficient and i think that the generated data are highly equiprobable.
The output is:
Just few thoughts. Number of matrices satisfying conditions for n <= 10:
Unfortunatelly there is no sequence with these numbers in OEIS.
There is one similar (A001499), without condition for neighbouring one's. Number of nxn matrices in this case is 'of order' as A001499's number of (n-1)x(n-1) matrices. That is to be expected since number of ways to fill one row in this case, position 2 one's in n places with at least one zero between them is ((n-1) choose 2). Same as to position 2 one's in (n-1) places without the restriction.
I don't think there is an easy connection between these matrix of order n and A001499 matrix of order n-1, meaning that if we have A001499 matrix than we can construct some of these matrices.
With this, for n=20, number of matrices is >10^30. Quite a lot :-/
Here is a very fast approach of generating the matrix row by row, written in Java:
It essentially generates each a random row at a time. If the row will violate any of the conditions, it is generated again (again randomly). I believe this will satisfy condition 4 as well.
Adding a quick note that it will spin forever for N-s where there is no solutions (like N=3).
(Updated test results, example run-through and code snippets below.)
You can use dynamic programming to calculate the number of solutions resulting from every state (in a much more efficient way than a brute-force algorithm), and use those (pre-calculated) values to create equiprobable random solutions.
Consider the example of a 7x7 matrix; at the start, the state is:
meaning that there are seven adjacent unused columns. After adding two ones to the first row, the state could be e.g.:
with two columns that now have a one in them. After adding ones to the second row, the state could be e.g.:
After three rows are filled, there is a possibility that a column will have its maximum of two ones; this effectively splits the matrix into two independent zones:
These zones are independent in the sense that the no-adjacent-ones rule has no effect when adding ones to different zones, and the order of the zones has no effect on the number of solutions.
In order to use these states as signatures for types of solutions, we have to transform them into a canonical notation. First, we have to take into account the fact that columns with only 1 one in them may be unusable in the next row, because they contain a one in the current row. So instead of a binary notation, we have to use a ternary notation, e.g.:
where the 2 means that this column was used in the current row (and not that there are 2 ones in the column). At the next step, we should then convert the twos back into ones.
Additionally, we can also mirror the seperate groups to put them into their lexicographically smallest notation:
Lastly, we sort the seperate groups from small to large, and then lexicographically, so that a state in a larger matrix may be e.g.:
Then, when calculating the number of solutions resulting from each state, we can use memoization using the canonical notation of each state as a key.
Creating a dictionary of the states and the number of solutions for each of them only needs to be done once, and a table for larger matrices can probably be used for smaller matrices too.
Practically, you'd generate a random number between 0 and the total number of solutions, and then for every row, you'd look at the different states you could create from the current state, look at the number of unique solutions each one would generate, and see which option leads to the solution that corresponds with your randomly generated number.
Note that every state and the corresponding key can only occur in a particular row, so you can store the keys in seperate dictionaries per row.
TEST RESULTS
A first test using unoptimized JavaScript gave very promising results. With dynamic programming, calculating the number of solutions for a 10x10 matrix now takes a second, where a brute-force algorithm took several hours (and this is the part of the algorithm that only needs to be done once). The size of the dictionary with the signatures and numbers of solutions grows with a diminishing factor approaching 2.5 for each step in size; the time to generate it grows with a factor of around 3.
These are the number of solutions, states, signatures (total size of the dictionaries), and maximum number of signatures per row (largest dictionary per row) that are created:
(Additional results obtained with C++, using a simple 128-bit integer implementation.)
EXAMPLE
The dictionary for a 5x5 matrix looks like this:
The total number of solutions is 16; if we randomly pick a number from 0 to 15, e.g. 13, we can find the corresponding (i.e. the 14th) solution like this:
This tells us that the 14th solution is the 2nd solution of option 00101. The next step is:
This tells us that the 2nd solution is the 1st solution of option 01010. The next step is:
This tells us that the 1st solution is the 1st solution of option 10100. The next step is:
This tells us that the 1st solutions is the 1st solution of option 01010. The last step is:
And the 5x5 matrix corresponding to randomly chosen number 13 is:
And here's a quick'n'dirty code example; run the snippet to generate the signature and solution count dictionary, and generate a random 10x10 matrix (it takes a second to generate the dictionary; once that is done, it generates random solutions in half a millisecond):
The code snippet below shows the dictionary of signatures and solution counts for a matrix of size 10x10. I've used a slightly different signature format from the explanation above: the zones are delimited by a '2' instead of a plus sign, and a column which has a one in the previous row is marked with a '3' instead of a '2'. This shows how the keys could be stored in a file as integers with 2×N bits (padded with 2's).