My task is to create a program that simulates a discrete-time Markov chain for an arbitrary number of events. Right now, the part I'm struggling with is creating the right stochastic matrix that will represent the probabilities. A right stochastic matrix is a matrix whose row entries sum to 1. For a given size, I more or less know how to write code that builds such a matrix; the problem is that I don't know how to do it for an arbitrary size.
For example, here is my code for a 3x3 matrix, along with an example of the output I was getting.
However, my code doesn't work every time: sometimes the third entry in a row is negative because the first two are too large. I don't know how to get around that. As far as I know, there isn't a function in Python that generates random numbers that sum to a particular value.
Any help is appreciated.
(Note that this isn't a homework problem, it's only for extra credit in my Math class and the professor doesn't mind the use of outside sources.)
One small point has been missed. A stochastic matrix is an M x N matrix of non-negative elements whose rows sum to 1.0. MBo's comment above states that:
This is only true if the stored matrix is comprised entirely of whole numbers (not necessarily integers). Otherwise the resulting matrix may contain negative numbers, and the larger the matrix, the more negative elements it tends to contain.
This can be accomplished using:
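A minimal sketch of one way to do this, assuming NumPy is available: generate non-negative random values and divide each row by its own sum, so every entry stays non-negative. The function name `random_stochastic_matrix` and the seed are illustrative choices, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for reproducibility

def random_stochastic_matrix(n_rows, n_cols):
    """Return an n_rows x n_cols matrix of non-negative entries whose rows sum to 1."""
    raw = rng.random((n_rows, n_cols))           # uniform values in [0, 1)
    return raw / raw.sum(axis=1, keepdims=True)  # divide each row by its own sum

P = random_stochastic_matrix(4, 4)
print(P.sum(axis=1))    # every row sum should be 1 up to floating-point rounding
print((P >= 0).all())   # no negative entries
```

Because the rows are rescaled by division rather than shifted by subtraction, no entry can become negative.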
Using @MBo's idea:
Here's a sample you can turn into a function; I leave that to you as homework.
Explanation
We create an N x M matrix.
We then calculate `(sum - 1) / N`, where N is the number of items in the row, to be subtracted from each item row-wise.
Then we apply it to each row of the matrix by using `np.apply_along_axis()` with `axis=1`, so that it is applied to each row.
Verify the result
Each row needs to sum up to 1
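The steps above could look roughly like the following sketch. It assumes a square matrix of size `n` and a seeded NumPy generator; the variable names are illustrative. Note that this subtraction keeps the row sums at 1 but can leave negative entries.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # seed chosen only for reproducibility
n = 3
a = rng.random((n, n))               # random values in [0, 1)

# Subtract (row_sum - 1) / n from every item of each row.
shifted = np.apply_along_axis(lambda row: row - (row.sum() - 1) / n, axis=1, arr=a)

# Verify: every row should sum to 1 up to floating-point rounding.
print(shifted.sum(axis=1))
print(np.allclose(shifted.sum(axis=1), 1.0))
```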
In my example I've used a `lambda` that is equivalent to this function.
You can pass a function to `apply_along_axis()` to be called on each 1-D slice along the axis; in our case, that's the rows.
There are other ways too, like `numpy.vectorize()` and `numpy.frompyfunc()`.
Writing a function and applying it with any of the methods above is better than looping through each item in each row: it is faster, takes less code, and makes the intent easier to read and understand.
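As a rough illustration of that comparison, here is a sketch with a named row function (`shift_row` is a hypothetical name) passed to `np.apply_along_axis()`, next to an explicit item-by-item loop that produces the same values.

```python
import numpy as np

def shift_row(row):
    """Shift one row so that its items sum to 1 (items may become negative)."""
    return row - (row.sum() - 1) / row.size

rng = np.random.default_rng(seed=2)  # seed chosen only for reproducibility
a = rng.random((4, 4))

# Function applied to each row.
by_function = np.apply_along_axis(shift_row, 1, a)

# Explicit loop over every item, for comparison.
by_loop = a.copy()
for j in range(a.shape[0]):
    s = a[j].sum()                       # row sum taken from the untouched input
    for i in range(a.shape[1]):
        by_loop[j, i] -= (s - 1) / a.shape[1]

print(np.allclose(by_function, by_loop))
```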
Generate NxN matrix with random values.
For every row:
Find sum of row S
S[j] = Sum(0..N-1){A[j, i]}
Then subtract (S-1)/N from every value in this row
A[j, i] = A[j, i] - (S[j] - 1) / N
If you need only non-negative values, generate non-negative random values and divide every value in a row by the sum of that row
A[j, i] = A[j, i] / S[j]
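Both variants can be sketched in plain Python for any size N. The function name `random_rows` and the `non_negative` flag are illustrative assumptions; only the standard `random` module is used.

```python
import random

def random_rows(n, non_negative=True):
    """Build an n x n list of lists whose rows each sum to 1."""
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    for j in range(n):
        s = sum(a[j])  # S[j]; a zero sum is possible in principle but vanishingly unlikely
        if non_negative:
            # Divide every value by the row sum: entries stay non-negative.
            a[j] = [x / s for x in a[j]]
        else:
            # Subtract (S[j] - 1) / N from every value: entries may go negative.
            a[j] = [x - (s - 1) / n for x in a[j]]
    return a

random.seed(0)  # seed chosen only for reproducibility
for row in random_rows(3):
    print(row, sum(row))
```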
Here is some code:
I assumed the random numbers have to be positive and the sum of the numbers in a row has to be 1. I used a precision given in the variable 'precision'; if this is 1000, the random numbers will have 3 digits after the decimal point. In my example 6 digits are used; you may use more.
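One way such a fixed-precision approach might be written is sketched below. This is an assumption about the technique, not a reconstruction of the answer's code: it picks random integer cut points between 0 and `precision`, takes the gaps between them as positive integers that sum to `precision`, and divides by `precision`. The names `random_row` and `random_matrix` are hypothetical.

```python
import random

def random_row(n, precision=1_000_000):
    """Return n positive numbers that sum to 1, each a multiple of 1/precision."""
    # Pick n - 1 distinct cut points strictly between 0 and precision.
    cuts = sorted(random.sample(range(1, precision), n - 1))
    bounds = [0] + cuts + [precision]
    # Gaps between consecutive bounds are positive integers summing to precision.
    parts = [hi - lo for lo, hi in zip(bounds, bounds[1:])]
    return [p / precision for p in parts]

def random_matrix(n, precision=1_000_000):
    """Return an n x n list of lists; every row is positive and sums to 1."""
    return [random_row(n, precision) for _ in range(n)]

random.seed(0)  # seed chosen only for reproducibility
for row in random_matrix(3):
    print(row, sum(row))
```

With `precision=1_000_000` each entry has at most 6 digits after the decimal point; `n` must not exceed `precision` for the sampling step to work.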
Output: