I have a bit of a technical issue, but I feel like it should be possible with MATLAB's powerful toolset.
What I have is a random n by n matrix of 0's and w's, say generated with
A=w*(rand(n,n)<p);
A typical value of w would be 3000, but that should not matter too much.
Now, this matrix has two important quantities, the vectors
c = sum(A,1);
r = sum(A,2)';
These are two row vectors, the first denotes the sum of each column and the second the sum of each row.
What I want to do next is randomize each value of w, for example between 0.5 and 2. This I would do as
rand_M = (0.5-2).*rand(n,n) + 0.5
A_rand = rand_M.*A;
However, I don't want to just pick these random numbers: I want them to be such that for every column and row, the sums are still equal to the elements of c and r. So to clean up the notation a bit, say we define
A_rand_c = sum(A_rand,1);
A_rand_r = sum(A_rand,2)';
I want that for all j = 1:n, A_rand_c(j) = c(j)
and A_rand_r(j) = r(j)
.
What I'm looking for is a way to redraw the elements of rand_M in a sort of algorithmic fashion I suppose, so that these demands are finally satisfied.
Now of course, unless I have infinite amounts of time this might not really happen. I therefore accept these quantities to fall into a specific range: A_rand_c(j)
has to be an element of [(1-e)*c(j),(1+e)*c(j)]
and A_rand_r(j)
of [(1-e)*r(j),(1+e)*r(j)]
. This e I define beforehand, say like 0.001 or something.
Would anyone be able to help me in the process of finding a way to do this? I've tried an approach where I just randomly repick the numbers, but this really isn't getting me anywhere. It does not have to be crazy efficient either, I just need it to work in finite time for networks of size, say, n = 50.
To be clear, the final output is the matrix A_rand
that satisfies these constraints.
Edit:
Alright, so after thinking a bit I suppose it might be doable with some while statement, that goes through every element of the matrix. The difficult part is that there are four possibilities: if you are in a specific element A_rand(i,j)
, it could be that A_rand_c(j)
and A_rand_r(i)
are both too small, both too large, or opposite. The first two cases are good, because then you can just redraw the random number until it is smaller than the current value and improve the situation. But the other two cases are problematic, as you will improve one situation but not the other. I guess it would have to look at which criteria is less satisfied, so that it tries to fix the one that is worse. But this is not trivial I would say..
You can take advantage of the fact that rows/columns with a single non-zero entry in
A
automatically give you results for that same entry inA_rand
. IfA(2,5) = w
and it is the only non-zero entry in its column, thenA_rand(2,5) = w
as well. What else could it be?You can alternate between finding these single-entry rows/cols, and assigning random numbers to entries where the value doesn't matter.
Here's a skeleton for the process:
A_rand=zeros(size(A))
is the matrix you are going to fillentries_left = A>0
is a binary matrix showing which entries inA_rand
you still need to fillcol_totals=sum(A,1)
is the amount you still need to add in every column ofA_rand
row_totals=sum(A,2)
is the amount you still need to add in every row ofA_rand
Picking the range for
random_value
might be a little tricky. The best I can think of is to draw it from a relatively narrow distribution centered aroundN*w*p
wherep
is the probability of an entry inA
being nonzero (this would be the average value of row/column totals).This doesn't scale well to large matrices as it will grow with
n^2
complexity. I tested it for a 200 by 200 matrix and it worked in about 20 seconds.